Students practice making a variety of chart types and then begin to investigate a real-world dataset, which they will continue to work with for the remainder of the course.
- categorical data: data whose values are qualities that are not subject to the laws of arithmetic
- data science: the science of collecting, organizing, and drawing general conclusions from data, with the help of computers
- dataset: a collection of related information that is composed of separate elements, but can be manipulated as a unit by a computer
- quantitative data: number values for which arithmetic makes sense
- random sample: a subset of individuals chosen from a larger set, such that each individual has the same probability of being chosen
- statistical inference: using information from a sample to draw conclusions about the larger population from which the sample was taken
🔗Review 20 minutes
Overview
Students practice making lots of chart types, focusing specifically on the "Consider Data" step in the Data Cycle and how it can be used alongside Contracts to help go from questions to code.
Launch
Let’s get some practice isolating the Rows and Columns needed to answer various questions, and use our knowledge of Contracts to help turn those questions into working code!
Complete Consider and Analyze.
Be sure to review student answers.
🔗Choosing a Dataset 30 minutes
Overview
Students select a dataset that interests them, and do some thinking about why it interests them, what questions they’d like to answer and what hypotheses they have. They’ll be analyzing this data for a long time, so it’s critical to ensure a high degree of buy-in before signing off on a student’s choice!
If you are opting to focus your whole class on a single dataset, we recommend skipping to the Exploring Your Dataset section of this lesson and using the dataset provided there. It focuses on global food supply and production through environmental, geographic, and cultural lenses, and its variables were carefully selected so that it lends itself to all kinds of data displays and discussions. You can, of course, choose any dataset you’d like, from our library or otherwise.
Launch
Data Science: it’s all about YOU!
What data matters to you? What questions do you care about? We live in a world filled with data, gathered about almost every subject you can imagine.
- Climate sensors are gathering data on temperature, humidity, oxygen and more…practically everywhere on the globe.
- Census data tracks the number of different groups of people, as well as their education, income level, and more.
- Companies like Facebook, Amazon, and Google gather massive amounts of data on the websites you visit, what you chat about online, what you purchase, etc.
This data is used to set public policy, draw voting districts, approve drugs, calculate school funding, decide which advertisements you see, and more.
- Where else do you see data being gathered?
- What are some other ways data is used in the world around you?
Below is a list of every dataset already provided to students, with a corresponding Starter File that instantly imports the (cleaned) data into Pyret. We suggest giving students a direct link to this page, which lists all of the relevant links found in the lesson plan.
Students can also find their own dataset, and use this Blank Dataset Starter File for Bootstrap:Data Science. See this tutorial video for help importing your own data into Pyret.
Investigate
Have students choose a dataset that is interesting to them and save a copy of it in their programs!
Looking for a shorter list? We’ve starred a few good beginner datasets.
The Environment & Health
- Global Waste by Country 2019
- World Cities' Proximity to the Ocean
- Earthquakes
- Air Quality, Pollution Sources & Health in the U.S.
- Health by U.S. County
- COVID in the U.S. by County
- Arctic Sea Ice
Politics
- Countries of the World
- Gerry Mandering
- Marijuana Laws & Arrests by State 2018
- LAPD Arrests 2010-2019
- NYPD Stop, Search & Frisk 2019
- Refugees 2018
- State Demographics
- U.S. Income
- U.S. Jobs
- U.S. Voter Turnout 2016
Sports
- Esports Earnings
- MLB Hitting Stats
- NBA Players
- NFL Passing
- NFL Rushing
Entertainment
- ★Movies
- IGN video game Reviews
- International Exhibition of Modern Art
- North American Pipe Organs
- Pokemon
- Music
Education
- College Majors
- U.S. Colleges 2019-2020
- ★R.I. Schools
- Evolution of College Admissions in California
Nutrition
- Soda, Coffee & Other Drinks
- Fast Food Nutrition
Synthesize
Have students share which datasets they chose, and why they are interesting or important to them. What questions did they come up with?
For the rest of this course, you’ll be learning new programming and Data Science skills, practicing them with the Animals Dataset, and then applying them to your own data.
🔗Exploring Your Dataset Start Today… continue in Upcoming Lessons
Overview
Students apply what they’ve learned about describing and making subsets from the Animals Dataset to their own dataset. If your students will all be focusing on the same dataset, we recommend using Global Food Supply & Production.
Launch
By now you’ve already learned what to do when you approach a new dataset.
- With the Animals Dataset, you first read the data itself, and wrote down your Notices and Wonders.
- You described the columns in the Animals Dataset, identifying which were categorical and which were quantitative, and whether they were Numbers, Strings, Booleans, etc.
- You took random samples of the dataset, to explore inference and probability.
Now, you’re going to do the same thing with your own dataset.
Investigate
- Look at the spreadsheet or table for your dataset. What do you Notice? What do you Wonder?
- Complete My Dataset, making sure to include at least two questions that can be answered by your dataset and one that cannot.
- Save a copy of your starter file. In the Definitions Area, use random-rows to define at least three tables of different sizes: tiny-sample, small-sample, and medium-sample.
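As a rough illustration of the sampling step above, the definitions might look like the sketch below. It assumes the code is typed into the Definitions Area of a starter file that already loads the Bootstrap:Data Science library. It uses animals-table, the table from the Animals Dataset, as a stand-in, so substitute whatever table name your own starter file defines. The sample sizes are arbitrary examples rather than recommended values, and should be no larger than your dataset.

```pyret
# Assumes the starter file has already loaded the Bootstrap:Data Science
# library and defined animals-table; swap in your own table's name.
# random-rows consumes a Table and a Number, and produces a Table
# containing that many randomly chosen rows.

tiny-sample   = random-rows(animals-table, 3)   # sizes here are illustrative
small-sample  = random-rows(animals-table, 10)
medium-sample = random-rows(animals-table, 20)
```

Because the rows are chosen at random, clicking Run again will generally produce different samples.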
Today we will begin working on the Dataset Exploration, which will prepare students for writing their research papers. We will return to this in upcoming lessons. We are just going to work on the first section for now.
- Make a copy of Dataset Exploration, and open the starter file for your dataset.
- Complete the first set of questions in the exploration paper.
- What are the categorical columns in your dataset? How are those values distributed?
- Turn to Complete Data Cycle: Shape of My Dataset, and use the Data Cycle to generate pie and bar charts.
- What do these charts tell you? Add the images of these charts - along with your interpretation! - to the "Making Displays" section of the exploration document.
- Do these displays bring up any interesting questions? If so, add them to the end of the document.
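For the chart-making step, a minimal sketch is shown below, again using the Animals Dataset as a stand-in. It assumes a starter file that already loads the Bootstrap:Data Science library and defines animals-table, and it treats the species column as the example categorical column. With your own dataset, replace both the table name and the column name.

```pyret
# Contracts, as we understand them from the Bootstrap:Data Science library:
# pie-chart :: (t :: Table, col :: String) -> Image
# bar-chart :: (t :: Table, col :: String) -> Image

# Each call consumes a table and the name of a categorical column,
# and produces a display of how that column's values are distributed.
pie-chart(animals-table, "species")
bar-chart(animals-table, "species")
```

Reading the Contract first tells you what to fill in: the Domain asks for a Table and a String, so the only decisions left are which table and which column answer your question.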
Synthesize
Have students share their findings. Were any of them surprising?
These materials were developed partly through support of the National Science Foundation (awards 1042210, 1535276, 1648684, and 1738598). Bootstrap by the Bootstrap Community is licensed under a Creative Commons 4.0 Unported License. This license does not grant permission to run training or professional development. Offering training or professional development with materials substantially derived from Bootstrap must be approved in writing by a Bootstrap Director. Permissions beyond the scope of this license, such as to run training, may be available by contacting contact@BootstrapWorld.org.