Big Data Challenge

STEM Fellowship Big Data Challenge 2017-2018



Now in its fourth year, the STEM Fellowship Big Data Challenge introduces high school students to Data Science and helps them develop their analytical abilities as digital citizens.

The STEM Fellowship Big Data Challenge is an online Big Data inquiry and experiential learning program where students participate in-person or remotely in orientation sessions, workshops, mentorship, and Big Data Day. 

Theme for 2017-2018 Challenge

“Think Global and Act Local with Big Data”

This year’s theme surrounds the UN’s 17 Sustainable Development Goals.

Recommended Competition Research Topics

  1. Correlation between Food & Climate Change
  2. Renewable Energy and Sustainable Infrastructure
  3. Climate Change & Economy


The competition is an opportunity to develop:

Global Citizenship – The opportunity to be part of designing and implementing a vision for a more sustainable world by 2030.
Computational Thinking – The ability to convert large amounts of data into a broad working problem (or How Might We… Problem) and conduct data-based reasoning to refine the problem’s scope.
Design Mindset – The ability to generate and create solutions in contexts where only part of the solutions’ requirements are known ahead of time.
Cognitive Load Management – The ability to evaluate the quality of obtained data and then filter the usable information that is needed to produce successful solutions.

Get to know the most pressing problems surrounding climate change through the online footprint of the world’s top academic research publications. Please see the Digital Learner Inroads into Climate Change and UNESCO sustainable development and environmental data.

Participating Teams

Thirty teams from the Abelard School, Bayview Secondary School, Earl Haig Secondary School, Erindale Secondary School, Lowell International Academy, Northern Secondary, Princeton International School of Mathematics and Science – P.R.I.S.M.S., St. Francis High School, The Academy for Gifted Children – P.A.C.E., TanenbaumCHAT Wallenberg, the University of Toronto Schools, Villanova College, and Webber Academy have joined the competition.


Teams are provided with recommended data sets, in addition to data science tools and learning resources from SciNet UofT, SAS, and IBM Cognitive Class. Teams are offered data science e-workshops and Slack-based peer and expert mentorship. Special training will be organised for the Overleaf scholarly writing and collaboration platform that participants will use to prepare and submit professional research reports.

If you have any questions regarding registration, data, or tools, please email us at


Timeline for STEM Fellowship Big Data Challenge 2017-2018

Key Milestone: Deadline Date

Deadline for Full Project Report submission: January 19, 2018
Judging panel reviews project submissions: January 22 – February 3, 2018
Announcement of Finalists: Week of February 5, 2018
Big Data Day (SAS Headquarters, 280 King St East, Toronto, ON M5A 1K7): February 22, 2018

  • Finalist teams will present in front of the judging panel
  • Roundtable discussion on the Future of Data Science and Analytics Careers with industry experts
  • Award ceremony where the top 3 teams are recognized

Orientation Session Recordings from the 2017-2018 Big Data Challenge 

Presentations of competition resources, tools and topics by SciNet UofT and SAS.

SAS Canada Academic Program Resources for Faculty in Canada


SAS Canada Academic Program Resources for Students in Canada

Orientation Session Full Teleconference


  • Arnold Chan Student Innovation Award
  • SAS prize ($1000)
  • Digital Science Award (Altmetric, Overleaf and Figshare)
  • Invitation to the SAS box for the Raptors game 
  • SciNet Supercomputer Tour
  • Scholarly publication of the top three projects with the STEM Fellowship Journal
  • Royal Bank of Canada Award (TBD)

More prizes are on the way!


 Arjun Asokakumar 

Arjun is currently a Senior Manager at BMO leading an HR Analytics practice for the retail and wealth management businesses. With over 10 years of experience in HR and a Master’s degree in Management Analytics from Queen’s, his career has been focused on evolving HR to be more scientific through descriptive, predictive and prescriptive analytics. He is an avid researcher, self-professed data geek and life-long learner. He hopes to instill some of this enthusiasm in the next generation of data scientists.

 Artem Zaloga

I studied Engineering Physics (Queen’s U) and Physical Oceanography (UBC), and am now working on some online mathematics education projects for students in grades 6-12. I would like students to be curious, be able to think for themselves, and think of the dynamics between people and the planet – which is why I am doing this.

Farooq Qaiser

Farooq is a Data Analyst with MaRS Data Catalyst where he uses data science techniques to advance understanding of innovation and entrepreneurial activity in Canada. Prior to that, he worked as a Market Research Analyst at Bell.

In his spare time, Farooq enjoys giving back to the community (Data4Good), competing in hackathons (winner at HackOnData 2017) and working on side projects (currently building a self-driving car).

Farooq has a BSc in Accounting and Finance from the London School of Economics and Political Science.

 Indrani Gorti

Indrani Gorti is currently working as a Data Scientist in Banking. She has a Master’s in Computer Science and has experience working as a Data Scientist/Data Engineer in her previous roles at Bell Canada and Nuance Communications. She was a winner in the Toronto Apache Spark Hackathon 2016 and has a keen interest in working with data from different domains. Indrani is interested in mentoring students and aiding discussions about career options in working with large-scale data.

 Jay Rajasekharan

Jay Rajasekharan started his career as a project engineer at Honeywell Aerospace, where he managed a portfolio of projects that focused on quality improvement, cost reduction, and process improvement for Boeing, Airbus, and Lockheed Martin. Subsequently, Jay made a switch from engineering to analytics and joined IBM as a business analyst under the Infrastructure Services division. Currently, he is driving several productivity programs – using data analytics to drive insights from business operations and implementing optimizations such as streamlining workflows, improving service levels, and ultimately reducing cost. Aside from his career, Jay is very passionate about teaching – he volunteers as a tutor at his community library and as an Alumni mentor at the University of Toronto.

Mark Fruman

Mark studied physics and atmospheric physics at York University and the University of Toronto.  After his Ph.D., he spent nine years researching multi-scale waves in the atmosphere and oceans, first at Ifremer in Brest, France, and then at the Goethe University in Frankfurt, Germany.  He currently works as a data scientist at CaseWare International in Toronto, applying machine learning algorithms to problems involving financial data.

 Michael Levinshtein 

More than 20 years of SAS experience. 15 years of response model development, including data mining, and 2 years of credit risk model development at TD Bank.

Motasem Salem

Motasem is a Data Scientist / Data Engineer at Flipp. His background is in Computer Science and he is currently pursuing a Master of Information and Data Science degree at UC Berkeley. He previously worked in Software Engineering, Enterprise Application Integration and technology-focused strategy consulting.

Pranav Barot

I am a second-year mechatronics engineering student at the University of Waterloo, currently working as a data scientist for co-op at TalkIQ, looking to share my knowledge with young, eager data scientists however possible! I have experience with machine learning, deep learning and natural language processing and I’d be happy to help anyone looking to learn more about these topics. I am a hackathon enthusiast (winner at McHacks 2016) and am very much into fitness, music and learning new (linguistic) languages and technologies and exploring their applications in the modern world.

 Roy Gupta

Roy is currently working as Manager, MIS Analytics in Banking, with more than 6 years of experience in Business Analytics, including the past year in machine learning / predictive modelling. While applying Machine Learning methodologies in Global Banking and Markets, Roy is also involved in extracting and delivering key actionable business insights for senior-level executives. Passionate about continuous learning and training, Roy wants to mentor the students participating in the STEM Fellowship Big Data Challenge to share the knowledge he gained during his education and work experience in business management, computer science and management analytics.

Shruti Bhanderi

Shruti is a Master’s (Computer Science) student at McGill University. She has been working as a Data Scientist intern at Ericsson. She applied to be a STEM Fellowship mentor to help and motivate students in their data science careers. She wants to share the insights in Computer Science she gained through her undergraduate, master’s and industry experience.


Recommended Competition Research Topics

Correlation between Food & Climate Change

Understanding the resources (e.g. water and energy) that go into the food we consume can make us more aware of the relationship between food and climate change. The United Nations Food and Agriculture Organization (FAO) estimates that livestock production is responsible for 18% of greenhouse gas emissions. What impact could replacing protein from a meat-based diet with protein from a bean-based diet have on climate change mitigation? Many big data problems could shed light on the connection between food and climate change and help influence our future at both the government and individual levels.

Renewable Energy and Sustainable Infrastructure

Energy efficiency and renewable energy programs are having a huge impact on climate change. Moving away from a fossil-fuel-based society is paramount to the well-being of future generations. A recent study estimates that we can emit approximately 200 to 250 gigatonnes (Gt) more carbon (~800 Gt of CO2) before global warming reaches 1.5°C above pre-industrial values, which is the level set by the Paris Climate Accord (there is still time to change our path). What can big data tell us about our current trajectory? Can it influence our individual behaviour and bring about a more sustainable path? How can the biggest impacts be made in the developing nations, which continue to increase emissions while emissions in some of the more developed world are beginning to level off? Are governments doing enough to mitigate climate change through a change in energy use and infrastructure? Infrastructure is a major part of the climate change problem if it is done poorly, and it is a major solution if it is done properly. Infrastructure accounts for 60% of global greenhouse gas emissions. We must consider climate risks in future investments in infrastructure such as renewable energy, cleaner transport and efficient water systems. How can big data guide us in this direction?
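The emission limit above is quoted both in gigatonnes of carbon and in gigatonnes of CO2, and teams working with emissions data sets will often meet both units. The following is a minimal sketch of the conversion, assuming standard molar masses for carbon and CO2; the function name carbon_to_co2 is illustrative and does not come from any competition resource.

```python
# Standard molar masses in g/mol (carbon, and CO2 = one carbon + two oxygen).
MOLAR_MASS_C = 12.011
MOLAR_MASS_CO2 = 44.009


def carbon_to_co2(gt_carbon: float) -> float:
    """Convert a mass of carbon (Gt C) to the equivalent mass of CO2 (Gt CO2).

    carbon_to_co2 is a hypothetical helper name used only for illustration.
    """
    # Each tonne of carbon corresponds to about 3.66 tonnes of CO2.
    return gt_carbon * MOLAR_MASS_CO2 / MOLAR_MASS_C


if __name__ == "__main__":
    # The 200 to 250 Gt carbon range quoted in the paragraph above.
    for gt_c in (200, 250):
        print(f"{gt_c} Gt C is about {carbon_to_co2(gt_c):.0f} Gt CO2")
```

For the 200 to 250 Gt range of carbon, this works out to roughly 730 to 920 Gt of CO2, which is consistent with the approximate 800 Gt figure quoted in the paragraph.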

Climate Change & Economy

The impact of storms, flooding and other natural disasters on our society and the economy has been demonstrated to be directly attributable to climate change. The increasing number of humanitarian crises in the wake of hurricanes and flooding (e.g. Hurricanes Harvey and Maria) requires tens of billions of dollars of aid. How will rising sea levels and hurricane intensity affect the nations of our world? What can big data tell us about the risks that are most pressing to both the richer and poorer nations of the world? Big data can tell us not only the monetary impact that climate change is having on various countries of the world, but also its social impact and useful strategies for mitigating disasters.

The challenge is recognized by the Parliament of Canada




Big Data Challenge 2016-2017 Winners

(as published in the STEM Fellowship Journal)




SAS Grand Prize Winners - Pierre Elliott Trudeau High School

Leon Chen, Curtis Chong, Emily Huang, Nathan Lo
'Effects of Climate Change on Canadian Forest Fires'

IBM Big Data University Award Winners - Earl Haig Secondary School

Tony Xu, Shayan Khalili, Cynthia Deng
'A Study on Factors Related to Readership of Scientific Articles'

Digital Science/Altmetric Award Winners - Earl Haig Secondary School

Peter Chou, Kevin Hong, Chandler Lei, Haolin Zhang
'Correlation Between Cancer Research Trends and Real World Data: An Analysis of Altmetric Data'

SAS Prize Winners - TanenbaumCHAT Wallenberg Campus

Joseph Train, David Roizenman, Seth Damiani, Ronny Rochwerg
'An Analysis of Time and Engagement for Articles Relating to Oncology'

Big Data Day Finalist Presentation Livestream:
Part 1 | Part 2 | Part 3
Big Data Challenge 2016-2017 Figshare