Big Data Challenge for High School Students




Go further into predictive analysis of optimal characteristics for sustainable living environments:
Collect and analyse geostatic terrestrial data from the CSA, ESA and NASA along with your city, province, federal and humanitarian open data. Explore Canada's 18 UNESCO Biosphere Reserves.

In partnership with Let's Talk Science
Under the patronage of the Canadian Commission for UNESCO

What is the big data challenge?

The STEM Fellowship Big Data Challenge is a competition that helps high school students get excited about Data Science and its potential to support inquiry-based learning and problem solving with open data. The Big Data Challenge involves STEAM students undertaking independent research projects to tackle real-world problems with the best industry analytical tools while learning about digital citizenship.  

How it works: We present the general competition theme, within which you define a research topic of interest. We supply you with relevant open data and leading data analytics tools with which you carry out your inquiry, come up with ideas, develop solutions and present your findings in the form of a research paper.

The best part is you don’t even need to have prior data science knowledge to join! We deliver webinars and workshops teaching you the basics to get you started on your research!

The top 3 finalists will have their full papers published in the STEM Fellowship Journal.


Theme: Big Data de terre

This year’s competition is focused on exploratory data analysis of Canadian Space Agency (CSA), National Aeronautics and Space Administration (NASA) and European Space Agency (ESA) open data to produce descriptive and graphical summaries of data, with the goal of revealing the impact of environmental conditions on human health and well-being. It invites high school students to go further into predictive analytics of optimal environmental characteristics for long-term, long-distance space travel.

  • Collect and analyse geostatic terrestrial data along with your city, province, federal and humanitarian open data. Explore Canada’s 18 UNESCO Biosphere Reserves.
  • Discover human health and well-being issues (physical health, public health, mental health, etc.) in the context of local, regional and global environmental problems.
  • Seek and suggest environmental and socioeconomic solutions for sustainable living environments.
Check out our full itinerary here.

Why participate

  • Engage in self-directed learning through trial and error.

Participate in workshops to learn the data science skills needed to analyse problems 

Interact in person and remotely with mentors and students across Canada

Have your work published in the STEM Fellowship Journal! The top 3 finalists have full manuscripts published and all participants have their abstracts published.

Practice your communication skills and present your findings to a panel of judges 


Academic prizes:

SciNet Supercomputer Tour

Scholarly publication of all project abstracts, and full-manuscript publication of winning project papers, in the STEM Fellowship Journal, published by Canadian Science Publishing

Monetary prizes:

$1000 SAS Analytics Talent Award - Toronto

$1000 SAS Analytics Talent Award - Calgary

$1000 RBC Arnold Chan Memorial Award for Student Innovation - Toronto

$1000 RBC Arnold Chan Memorial Award for Student Innovation - Calgary

$500 Digital Science Scholarly Communication Award - Toronto

$500 Digital Science Scholarly Communication Award - Calgary


We emphasize four key skills

Computational Thinking

The ability to translate aggregates of data into abstract concepts and conduct data-based reasoning.

Design Mindset

The ability to create solutions in contexts where only part of the requirements is known.

Cognitive Load Management

The ability to discriminate and filter the information needed to produce successful solutions.

Social Intelligence

The ability to participate in the collaborative construction of solutions.


September 1, 2018

Registration begins via Eventbrite

October 20, 2018

Team Registration Deadline via Eventbrite
Form your team(s) of up to 4 students and register them online.
The Challenge Registration form is here

February 1, 2019

Deadline for Full Project Report submission

February 21, 2019

BIG DATA DAY: Take part in Big Data Day in Toronto at the SAS Canada Headquarters (280 King St East, Toronto, ON M5A 1K7) or in Calgary (location to be announced!). Attend in person or join online for the presentations! Take part in the expert roundtable with invited dignitaries and academic experts. Tune in to the award ceremony, where the top 3 teams are recognized!

Week of October 2, 2018

Orientation Sessions


  1. Toronto (and telepresence): SciNet UofT

MaRS West Tower, 661 University Avenue, Suite 1140, Toronto ON M5G 1M1
October 2nd, 2018 - register here.


  2. Calgary: Cassio A/B, University of Calgary
    October 4th, 2018 - register here.


*Orientation sessions are run by data science experts from SciNet, IBM Cognitive Class & SAS.

Orientation session recordings will be provided after the sessions end.

Check out our past orientation session here.

October 21, 2018

  1. Form your team(s) of up to 4 students and register them online. The Challenge Registration form is available on Eventbrite. Pay the participation/abstract publication fee of $125 (tax included) per team.
    Crowdsource knowledge and investigate analytics tools such as CISCO Academy Python Pandas, SAS Academy Programming, and open-source data analysis courses and tools. Choose one that you will learn and use.
  2. Workshops: covering entrepreneurial innovations, Overleaf, Cisco Python, R, SAS and Tableau.
  3. Organize and plan your project: We have recruited a number of industry experts who are willing to mentor student teams. Based on your team's choice of data tools, we will help you find mentors from among the data analytics and scholarly publishing companies and community. You may also find a mentor through family connections or contacts made at the orientation sessions.
  4. Work on your data set for 2.5 months: Learn together, from your mentor and online. Slice and dice it, zoom in and out, find patterns, trends, and important segments.
  5. Tell the story of your data discovery through a scientific report. Use Overleaf, a professional scholarly communication platform, to prepare and submit your project report.
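
To give a feel for the "slice and dice" step in item 4, here is a minimal Python sketch that uses only the standard library. The records, the field names (`region`, `year`, `pm25`) and all values are made up for illustration and are not challenge data. With a real data set, a tool such as pandas would typically handle the same grouping and summarizing.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample records (made-up values, for illustration only).
records = [
    {"region": "North", "year": 2016, "pm25": 8.0},
    {"region": "North", "year": 2017, "pm25": 9.0},
    {"region": "North", "year": 2018, "pm25": 10.0},
    {"region": "South", "year": 2016, "pm25": 14.0},
    {"region": "South", "year": 2017, "pm25": 12.0},
    {"region": "South", "year": 2018, "pm25": 10.0},
]


def slice_by(rows, key):
    """Group rows into a dict of lists keyed by the value of `key`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    return dict(groups)


def summarize(rows, field):
    """Return simple descriptive statistics for one numeric field."""
    values = [row[field] for row in rows]
    return {"count": len(values), "min": min(values),
            "max": max(values), "mean": mean(values)}


def trend(rows, x_field, y_field):
    """Least-squares slope: average change in y per unit of x.

    Assumes the rows contain at least two distinct x values.
    """
    xs = [row[x_field] for row in rows]
    ys = [row[y_field] for row in rows]
    x_bar, y_bar = mean(xs), mean(ys)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator


# Slice by region, then summarize and estimate a yearly trend per slice.
by_region = slice_by(records, "region")
for region, rows in by_region.items():
    print(region, summarize(rows, "pm25"),
          "trend per year:", trend(rows, "year", "pm25"))
```

Changing the key passed to `slice_by` (for example, grouping by `year` instead of `region`) is one simple way to "zoom in and out" on the same data.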

Week of February 4, 2019


The finalists (top 10 teams) will be announced!

If your team is selected, you will deliver a presentation at Big Data Day, the culminating event for the BDC.

2019 Finalists

Calgary Finalists

Westmount Charter School: Elena Pan, Claire Schroeder, Amna Sheikh, Julia Zhang

Westmount Charter School: Allan Cao, Sheridan Feucht

Burnaby North Secondary School: Jay Zou, Kevin Ye, Brian Ning, Brian Yao

Strathcona-Tweedsmuir School: Lindy Zhai, Gracia Angeline Soenarjo, Cecilia Liu, Brennan Cowley Adam

Webber Academy: Shounak Ray, Jaqueline Seal, Evin Chin

St. George’s School: Kevin Li, Joshua Xu, David Zuo

Webber Academy: Nicholas Wilger, Nicholas Sweerts, Rohin Dhadli, Daniel Awotundun

Toronto Finalists

Earl Haig Secondary School: Milad Saadati, Christopher Chifor, Isaac Liao, Seyed Sepehr Seyed Ghasemipour

Mission San Jose High School: Era Dewan, Ayush Dewan

Earl Haig Secondary School: Maria Pasyechnyk, Robin Nash, Arya Shababi, Ali Seena Shakeri

St. Francis High School: Rohit Menon, Aditi Menon

TanenbaumCHAT: Jonah Garmaise, Ethan Ohayon, Ryan Goldberg, Mason Silver

PACE: Claire Beckley, Zev Friedman, Shezreen Khan

St. Malachy’s Memorial High School: Yanfei Wei, Tianming Han

2018/19 Reviewers

Dave Carter (M.A.Sc., Electrical Engineering) is a Research Council Officer with Digital Technologies and an engineer with the Text Analytics group at National Research Council Canada, based in Ottawa. He pursues work in syndromic surveillance and situational awareness and is the technical lead for NRC’s situational awareness/outbreak detection platform.

Dr. Svetlana Kiritchenko is a research scientist at the National Research Council Canada. She received her Ph.D. in Computer Science from the University of Ottawa (Canada) and her M.Sc. in Applied Mathematics and Computer Science from Moscow State University (Russia). She primarily works in the areas of Computational Linguistics and Natural Language Processing (NLP). Her research interests include ethics and fairness in NLP, sentiment and emotion analysis, text classification, social media analysis, and medical informatics. As part of the NRC-Canada team, she has developed several text classification systems (for sentiment analysis and health-related social media mining) that ranked first in international shared task competitions.


Deepika knows that driving transformation starts from the inside out and is passionate about making sure that marketing has a seat at the table to help lead change. She has overseen all aspects of marketing in large organizations (Wiley, Capcom) and entrepreneurial ventures (RedLink, Tapjoy), building brands, web and e-commerce, digital marketing, messaging, demand generation, thought leadership and strategic events. Before embarking on her marketing career, she was responsible at Intel for analyzing savings for successful product rollouts, and as an IT consultant she worked with clients to roll out AT&T high-speed data services and lead VoIP features for Sprint. Deepika holds a Bachelor of Engineering from Bangalore University and a Master of Business Administration from Fordham University.

2017/18 Big Data Challenge

  • SAS Award - Tony Xu and Shayan Khalili (Earl Haig Secondary School)