1 In the Definitions Area of the Expanded Animals Starter File, define the following samples:

tiny-sample = random-rows(more-animals, 10)
small-sample = random-rows(more-animals, 20)
medium-sample = random-rows(more-animals, 40)
large-sample = random-rows(more-animals, 80)

2 Click "Run" and make a pie-chart of the species in the tiny-sample. What animals are in the sample?

  • Click "Run" for a new random tiny-sample, and make another pie-chart for species. What animals are there?

  • Click "Run" for a new random sample, and make yet another pie-chart for species. Based on these 3 samples, how many species do you think are at the shelter?

  • Which is the most common species at the shelter?
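
Each chart in this step can be built in the Interactions Area with a call along the lines of the one below. This is only a sketch: it assumes the starter file loads the Bootstrap pie-chart function and that the column holding each animal's species is named "species". If the call reports an error, check the column names in more-animals.

```pyret
# Draw a pie chart showing how the animals in tiny-sample are split by species.
# Assumption: the species column in more-animals is named "species".
pie-chart(tiny-sample, "species")
```

Because tiny-sample is defined with random-rows, clicking "Run" again draws a fresh set of 10 rows, so the same call may produce a different chart each time.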

3 What did you learn from taking multiple samples that you wouldn’t have known if you’d only taken one?

4 Repeat the steps above, but for small-sample. What animals are in the sample?

5 Now that you’ve seen small-sample, how has your sense of the distribution of the species changed?

6 Now use medium-sample to make a pie-chart of the species. If there are about 400 animals at the shelter, how many of each species would you predict there to be?
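
One possible way to scale a sample up to the whole shelter, offered as a sketch rather than the only valid approach: suppose medium-sample (40 animals) contains k animals of a given species. Here k is a placeholder for whatever count you read from your own chart, not a value taken from the data. The predicted number of that species among about 400 animals is then:

```latex
\[
\text{predicted count} \approx \frac{k}{40} \times 400 = 10k
\]
```

For instance, a hypothetical count of k = 6 in the sample would predict about 60 animals of that species at the shelter.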

7 Now use large-sample to make a pie-chart of the species. If there’s anything you’d like to change about your prediction now that you’ve seen large-sample, record it here.

8 Let’s see how accurate your prediction is. Feel free to click "Run" and build a few more pie charts from your samples if you want to collect more information first! When you’re ready, make a pie-chart of the species in more-animals.

  • Which predictions were closest?

  • Which predictions were off?

  • Were there any surprises?

9 In the real world, we usually don’t have access to a whole dataset to check predictions against! How could we test…

  • Every giraffe on the planet?

  • Everyone who has ever come in contact with a covid-positive person?

  • Every person who identifies as queer?

What strategies can we use to make sure that predictions from samples are as close to accurate as possible?

These materials were developed partly through support of the National Science Foundation (awards 1042210, 1535276, 1648684, and 1738598). Bootstrap by the Bootstrap Community is licensed under a Creative Commons 4.0 Unported License. This license does not grant permission to run training or professional development. Offering training or professional development with materials substantially derived from Bootstrap must be approved in writing by a Bootstrap Director. Permissions beyond the scope of this license, such as to run training, may be available by contacting contact@BootstrapWorld.org.