Big Data Courses

The following is proprietary material of STEM Fellowship and its creators and is under the protection of the STEM Fellowship Licensing Agreement. It cannot be modified, reproduced or distributed without written consent from STEM Fellowship.

 

Please download Jupyter, which is available for Linux, Mac, and Windows. You will use it to run the workshops. Installation instructions are provided. Once Jupyter is installed, the first thing you should do is open the R - Jupyter Essentials in 5 Minutes.ipynb notebook if you are working in R, or the Python - Jupyter Essentials in 5 Minutes.ipynb notebook if you are working in Python.

 

 
 

INTRODUCTION TO STATISTICS 

  

 
 
 

 

The purpose of this course is to introduce students to the basics of data, which they can then apply to data analysis. The following topics are covered in the course: variables and cases; samples vs. populations; describing data; mean, median and mode; variance and standard deviation; confidence intervals; hypothesis testing.
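As a rough illustration of the descriptive statistics listed above, here is a minimal Python sketch using only the standard library. The sample values are made up for this example and are not taken from the course material. The confidence interval uses a normal approximation, which is an assumption made to keep the sketch short; for a sample this small, a t-distribution would give a somewhat wider interval.

```python
import math
import statistics

# Hypothetical sample: test scores for 10 students (made-up values).
scores = [72, 85, 90, 66, 85, 78, 92, 85, 70, 77]

# Measures of central tendency.
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

# Measures of spread. These are the *sample* versions (divide by n - 1),
# appropriate when the data are a sample rather than the whole population.
variance = statistics.variance(scores)
std_dev = statistics.stdev(scores)

# Approximate 95% confidence interval for the population mean.
# Assumption: normal approximation with critical value z (about 1.96).
z = statistics.NormalDist().inv_cdf(0.975)
margin = z * std_dev / math.sqrt(len(scores))
confidence_interval = (mean - margin, mean + margin)

print("mean:", mean)
print("median:", median)
print("mode:", mode)
print("sample variance:", variance)
print("sample standard deviation:", std_dev)
print("approx. 95% CI for the mean:", confidence_interval)
```

The distinction between sample and population matters here: `statistics.pvariance` and `statistics.pstdev` are the population counterparts and divide by n instead of n - 1.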

 

 

DATA MANIPULATION USING R

The purpose of this course is to teach you how to make your data more organised and easier to read. Topics covered in this course include: introductory data science vocabulary; basic R; data structures; the data analysis process.

  

 
 
 

 

 
 

DATA MANIPULATION USING PYTHON

  

 
 
 

 

 

The purpose of this course is to teach you how to make your data more organised and easier to read. Topics covered in this course include: introductory data science vocabulary; basic R; data structures; the data analysis process.

 

DATA VISUALISATION USING R

This course focuses on the visualisation forms available and on choosing the one that best represents each type of data. Topics covered in this course include: data visualisation types (correlation heat maps, scatter plots, bar charts, pie charts); the grammar of graphics; layering; local vs. global features; geographical mapping; data cleaning.

  

 
 
 

 

 
 

DATA VISUALISATION USING PYTHON

  

 
 
 

 

This course will teach you about the visualisation forms available for representing different types of data. Topics covered in this course include: preparing data for visualisation; data visualisation types; scatter plots; correlation matrices; geographic mapping; packages available for data manipulation and analysis.

 

 

 

BASIC MACHINE LEARNING USING R

This course will introduce you to k-means clustering and linear regression.

  

 
 
 

 

 
 

BASIC MACHINE LEARNING USING PYTHON

  

 
 
 

 

 

 

This course will introduce you to k-means clustering.
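
To give a sense of what k-means clustering does, here is a minimal sketch of the standard iterative procedure (often called Lloyd's algorithm) in plain Python. It is not taken from the course notebooks: the function name `k_means`, the sample points, and the starting centroids are all hypothetical and chosen only for illustration. In practice a library implementation would normally be used instead.

```python
import math


def k_means(points, initial_centroids, max_iterations=100):
    """Group points into clusters around the nearest centroid.

    Repeats two steps until the centroids stop moving:
    1. assign every point to its nearest centroid;
    2. move each centroid to the mean of the points assigned to it.
    """
    centroids = list(initial_centroids)
    clusters = [[] for _ in centroids]
    for _ in range(max_iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: math.dist(point, centroids[i]),
            )
            clusters[nearest].append(point)

        # Update step: recompute each centroid as the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                new_centroids.append(
                    tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
                )
            else:
                new_centroids.append(old)

        # Stop once the centroids no longer change.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters


# Hypothetical 2-D data with two visibly separate groups (made-up values).
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]

# Assumption: starting centroids are picked by hand; real implementations
# usually choose them randomly or with a scheme such as k-means++.
final_centroids, clusters = k_means(points, [(1.0, 1.0), (8.0, 8.0)])

print("centroids:", final_centroids)
print("clusters:", clusters)
```

The number of clusters is fixed in advance by the number of starting centroids, which is the "k" in k-means.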