The Data Incubator

Businesses are drowning in data
but starving for insights
Forrester

Introduction to Data Wrangling

Summary

Much of the world's data cannot be easily processed in Excel or other spreadsheets: it is too inaccessible, too messy, too unstructured, too varied, or too large for elementary handling. This course equips students with the core tools they need to start down the path towards becoming data scientists. We begin with basic data structures and file formats, move on to classes and SQL, and then cover the basics of Pandas, NumPy, and matplotlib. Along the way we introduce the fundamental building blocks of data manipulation, demonstrate how to translate simple Excel commands into more powerful languages like SQL and Python, and show how to build on existing open-source libraries.
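As a rough illustration of translating a simple Excel command into SQL and Python, the sketch below reproduces a conditional sum such as `=SUMIF(A:A, "East", B:B)` in both. The sales rows, the `sales` table, and the function names `sumif_python` and `sumif_sql` are hypothetical and chosen only for this example; it uses Python's built-in `sqlite3` module, which is not necessarily the database used in the course.

```python
import sqlite3

# Hypothetical (region, amount) rows, standing in for two spreadsheet columns.
rows = [("East", 120.0), ("West", 80.0), ("East", 50.0), ("West", 30.0)]


def sumif_python(rows, region):
    """Plain-Python equivalent of Excel's SUMIF over (region, amount) rows."""
    return sum(amount for r, amount in rows if r == region)


def sumif_sql(rows, region):
    """SQL equivalent of the same SUMIF, run against an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
        conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
        # COALESCE turns the NULL that SUM returns for no matching rows into 0.
        (total,) = conn.execute(
            "SELECT COALESCE(SUM(amount), 0) FROM sales WHERE region = ?",
            (region,),
        ).fetchone()
        return total
    finally:
        conn.close()


east_python = sumif_python(rows, "East")
east_sql = sumif_sql(rows, "East")
print(east_python, east_sql)
```

Both functions compute the same total. The SQL version pushes the filtering and summing into the database, which matters once the data no longer fits comfortably in a spreadsheet.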


Associated project work

Students solve several simple programming exercises in Python to demonstrate the proficiency needed for more advanced courses.


Students will gain experience with Python-based data wrangling technologies to extract insights from a structured, web-API-based dataset. Students will learn the fundamental building blocks of data extraction, manipulation, and aggregation via Pandas DataFrames and good Python programming practice.
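The extract, manipulate, and aggregate steps described above might look roughly like the following sketch. It assumes the `pandas` package is installed; the JSON payload, its field names (`city`, `year`, `visits`), and the variable names are invented for illustration and stand in for whatever a real web API would return. No network call is made.

```python
import json

import pandas as pd

# Hypothetical JSON payload standing in for a web API response body.
payload = """[
  {"city": "Boston", "year": 2020, "visits": 10},
  {"city": "Boston", "year": 2021, "visits": 15},
  {"city": "Austin", "year": 2020, "visits": 7},
  {"city": "Austin", "year": 2021, "visits": null}
]"""

# Extraction: parse the JSON text into a list of dicts, then into a DataFrame.
records = json.loads(payload)
df = pd.DataFrame(records)

# Manipulation: drop rows where the measurement is missing.
df = df.dropna(subset=["visits"])

# Aggregation: total visits per city.
totals = df.groupby("city")["visits"].sum()
print(totals)
```

Dropping the incomplete row before aggregating is only one possible choice; depending on the dataset, filling in or flagging missing values may be more appropriate.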


Prerequisites