(Also available in CODAP)
Students practice making a variety of chart types and then begin to investigate a real-world dataset, which they will continue to work with for the remainder of the course.
- data science: the science of collecting, organizing, and drawing general conclusions from data, with the help of computers
- dataset: a collection of related information that is composed of separate elements, but can be manipulated as a unit by a computer
- random sample: a subset of individuals chosen from a larger set, such that each individual has an equal probability of being chosen
- statistical inference: using information from a sample to draw conclusions about the larger population from which the sample was taken
Review: Consider Data (20 minutes)
Overview
Students practice making lots of chart types, focusing specifically on the "Consider Data" step in the Data Cycle and how it can be used alongside Contracts to help go from questions to code.
Launch
The Data Cycle is a roadmap that guides us in the process of data analysis. You’ve learned that the Data Cycle includes four steps. Let’s review what those steps entail.
- In the Ask Questions phase of the Data Cycle, what are some of the different types of questions we can ask?
  - Lookup, arithmetic, and statistical questions.
- What’s the difference between an arithmetic question and a statistical question?
  - A statistical question does not specify a particular arithmetic process, while an arithmetic question does.
- What does the Consider Data phase entail?
  - We need to ask two questions: "What rows should we investigate?" and "What columns do we need?"
- During the Analyze Data phase of the Data Cycle, we choose what kind of display we’ll need to answer our question. Which two displays work with categorical data? Why might you choose one over the other?
  - Bar charts and pie charts work with categorical data. A pie chart shows each category as a share of the whole, so it only makes sense when you have the full picture, whereas a bar chart shows the count for each category.
- In your own words, what happens during the Interpret the Data phase?
  - We answer questions and summarize results, which often leads to new questions.
Investigate
In this lesson, we’re going to get some practice with the second step of the cycle - Consider Data. This entails isolating the Rows and Columns needed to answer various questions, and using our knowledge of Contracts to help turn those questions into working code!
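For teachers who want to see the Consider Data step outside of the classroom tools, here is a minimal sketch in Python. It is an illustration only, not the Pyret code students write in this lesson. The rows are made up (they are not the real Animals Dataset), and the helper names `consider_data`, `bar_chart_counts`, and `pie_chart_percents` are hypothetical. Each helper's inputs play roughly the role that a Contract's Domain plays: they say what the function needs before it can produce a result.

```python
from collections import Counter

# Made-up rows for illustration only (not the real Animals Dataset).
animals = [
    {"name": "A", "species": "cat", "age": 1},
    {"name": "B", "species": "cat", "age": 4},
    {"name": "C", "species": "cat", "age": 2},
    {"name": "D", "species": "cat", "age": 7},
    {"name": "E", "species": "dog", "age": 3},
    {"name": "F", "species": "dog", "age": 1},
    {"name": "G", "species": "dog", "age": 5},
    {"name": "H", "species": "rabbit", "age": 2},
]


def consider_data(rows, column, keep=lambda row: True):
    """Consider Data: keep the rows we want to investigate,
    then pull out the single column we need."""
    return [row[column] for row in rows if keep(row)]


def bar_chart_counts(values):
    """The numbers behind a bar chart: a count for each category."""
    return dict(Counter(values))


def pie_chart_percents(values):
    """The numbers behind a pie chart: each category as a percent of the whole."""
    total = len(values)
    return {cat: 100 * n / total for cat, n in Counter(values).items()}


# Question: "How many animals of each species are there?"
# Rows: all of them. Column: species.
species = consider_data(animals, "species")
counts = bar_chart_counts(species)
percents = pie_chart_percents(species)
print(counts)    # {'cat': 4, 'dog': 3, 'rabbit': 1}
print(percents)  # {'cat': 50.0, 'dog': 37.5, 'rabbit': 12.5}

# Question: "Which species are the animals under 3 years old?"
# Rows: only those with age < 3. Column: species.
young_species = consider_data(animals, "species", keep=lambda row: row["age"] < 3)
print(bar_chart_counts(young_species))
```

Notice that both questions use the same column but different rows. Deciding on rows and columns first is what makes the display function's inputs easy to fill in.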
Complete Consider and Analyze.
Be sure to review student answers.
Synthesize
- What strategies did you use to determine which columns to isolate?
- Why do the contracts for some displays require more arguments than others?
Choosing a Dataset (30 minutes)
Overview
Students select a dataset that interests them, and do some thinking about why it interests them, what questions they’d like to answer and what hypotheses they have. They’ll be analyzing this data for a long time, so it’s critical to ensure a high degree of buy-in before signing off on a student’s choice!
Launch
Data Science: it’s all about YOU!
What data matters to you? What questions do you care about? We live in a world filled with data, gathered about almost every subject you can imagine.
- Climate sensors are gathering data on temperature, humidity, oxygen and more… practically everywhere on the globe.
- Census data tracks the number of different groups of people, as well as their education, income level, and more.
- Companies like Facebook, Amazon, and Google gather massive amounts of data on the websites you visit, what you chat about online, what you purchase, etc.
This data is used to set public policy, draw voting districts, approve drugs, calculate school funding, decide which advertisements you see, and more.
- Where else do you see data being gathered?
- What are some other ways data is used in the world around you?
Students can also find their own dataset, and use this Blank Dataset Starter File for Bootstrap:Data Science. For help, see this Tutorial Video: Importing Your Own Data into Pyret.
For teachers using a single dataset, we recommend the Global Food Supply & Production Starter File. This dataset examines global food supply and production through environmental, geographic, and cultural lenses, and its variables were carefully selected so that it lends itself well to all kinds of data displays and discussions. You can, of course, choose any dataset you’d like, from our library or otherwise.
NOTE: We have compiled some Notes on our provided datasets, to help you decide which might be most useful in your classroom.
Investigate
Have students choose a dataset that is interesting to them and save a copy of it in their programs!
Looking for a shorter list? We’ve starred a few good beginner datasets.
The Environment & Health
- Global Waste by Country 2019
- World Cities' Proximity to the Ocean
- Earthquakes
- Air Quality, Pollution Sources & Health in the U.S.
- Health by U.S. County
- COVID in the U.S. by County
- Arctic Sea Ice
Politics
- Countries of the World
- Gerrymandering
- Marijuana Laws & Arrests by State 2018
- LAPD Arrests 2010-2019
- NYPD Stop, Search & Frisk 2019
- Refugees 2018
- State Demographics
- U.S. Income
- U.S. Jobs
- U.S. Voter Turnout 2016
Sports
- Esports Earnings
- MLB Hitting Stats
- NBA Players
- NFL Passing
- NFL Rushing
Entertainment
- ★Movies
- IGN Video Game Reviews
- International Exhibition of Modern Art
- North American Pipe Organs
- Pokemon
- Music
Education
- College Majors
- U.S. Colleges 2019-2020
- ★R.I. Schools
- Evolution of College Admissions in California
Nutrition
- Soda, Coffee & Other Drinks
- Fast Food Nutrition
Synthesize
- What did you select, and why?
- What questions did you come up with?
For the rest of this course, you’ll be learning new programming and Data Science skills, practicing them with the Animals Dataset and then applying them to your own data.
Dataset Exploration Project (flexible)
Overview
Students are introduced to the Dataset Exploration Project. They will apply what they have learned to add four items to their Data Exploration Project Slide Template: (1) a description of their dataset, including its source, structure, and relevance, (2) at least one bar chart, (3) at least one pie chart, and (4) any interesting questions they develop. To learn more about the sequence and scope of the exploration project, visit Project: Dataset Exploration.
Launch
Today, we are going to start digging into the datasets we’ve chosen to study at length. Each time we learn about a new data science concept in this class, we will add displays, questions, and analyses to the Data Exploration Project Slide Template.
- Open the Data Exploration Project Slide Template.
- Create and save your own copy of the slide deck.
- Let’s take a look! Peruse the slides to get a sense of what this cumulative project includes.
- What do you Notice? What do you Wonder?
  - Students will likely notice that many displays they are unfamiliar with are referenced. They may wonder how there is going to be so much analysis on just one dataset!
- Encourage students to familiarize themselves with the template, highlighting some important features:
  - Blue text is included to provide examples.
  - Slides can be duplicated if students want to add additional displays or interpretations.
Investigate
By now you’ve already learned what to do when you approach a new dataset. Think back to your first exposure to the Animals Dataset. You read the data and wrote down your Notices and Wonders. You described the columns. You even took some random samples of the dataset to explore inference and probability.
Now, you’re going to do the same thing with your own dataset.
- Open your chosen dataset starter file in Pyret.
- Look at the spreadsheet or table for your dataset. What do you Notice? What do you Wonder?
- Complete My Dataset, making sure to include at least two questions that can be answered by your dataset and one that cannot.
- Save a copy of your starter file. In the Definitions Area, use random-rows to define at least three tables of different sizes: tiny-sample, small-sample, and medium-sample.
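To see why samples of different sizes are worth defining, teachers may find this Python sketch useful. It is an analogue for illustration, not the Pyret code students write. The table is made up, the sample sizes (5, 20, 50) are arbitrary, and the helper `random_rows` only borrows its name from the text: its behavior here (sampling without replacement, with an optional seed) is an assumption of this sketch, not a description of the Pyret function.

```python
import random

# Made-up table for illustration: 100 rows, 40 cats and 60 dogs.
table = [{"id": i, "species": "cat" if i < 40 else "dog"} for i in range(100)]


def random_rows(rows, n, seed=None):
    """Return n rows chosen at random without replacement,
    so that every row is equally likely to be picked."""
    rng = random.Random(seed)
    return rng.sample(rows, n)


def proportion(rows, column, value):
    """Fraction of rows whose column equals value."""
    return sum(1 for row in rows if row[column] == value) / len(rows)


# Three samples of different sizes, mirroring the names used in the lesson.
tiny_sample = random_rows(table, 5, seed=1)
small_sample = random_rows(table, 20, seed=2)
medium_sample = random_rows(table, 50, seed=3)

# Compare each sample's share of cats with the share in the full table.
print("full table:", proportion(table, "species", "cat"))
for name, sample in [("tiny", tiny_sample),
                     ("small", small_sample),
                     ("medium", medium_sample)]:
    print(name, len(sample), proportion(sample, "species", "cat"))
```

Changing or removing the seeds gives different samples on each run. Estimates from larger samples tend to land closer to the full-table value, which is the idea behind statistical inference.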
Today we will begin adding to our Data Exploration Project Slide Template. First, we are going to describe our dataset.
- It’s time to add to your Data Exploration Project Slide Template.
- Complete all of the slides you see in the "About this Dataset" portion of the slide deck. It may be helpful to refer to My Dataset.
- Choose one categorical column from your dataset that you will represent with a bar chart.
- What question does your display answer?
- Now, write down that question in the top section of Data Cycle: Categorical Data.
- Complete the rest of the data cycle, recording how you considered, analyzed, and interpreted the question.
- Repeat this process for at least one more categorical column - but this time, create a pie chart.
Copy/paste at least one bar chart and one pie chart into your slide deck. Be sure to also add any interesting questions that you developed while making and thinking about these displays.
Synthesize
Share your findings with the class!
Did you discover anything surprising or interesting about your dataset?
What questions did the bar and pie charts raise?
Did other students make any discoveries that were surprising or interesting to you?
These materials were developed partly through support of the National Science Foundation (awards 1042210, 1535276, 1648684, 1738598, 2031479, and 1501927). Bootstrap by the Bootstrap Community is licensed under a Creative Commons 4.0 Unported License. This license does not grant permission to run training or professional development. Offering training or professional development with materials substantially derived from Bootstrap must be approved in writing by a Bootstrap Director. Permissions beyond the scope of this license, such as to run training, may be available by contacting contact@BootstrapWorld.org.