Referenced from lesson Randomness and Sample Size (Spring, 2021)

1 In the Definitions Area of the Expanded Animals Starter File, define the following samples:

```
tiny-sample = random-rows(big-animals-table, 10)
small-sample = random-rows(big-animals-table, 20)
medium-sample = random-rows(big-animals-table, 40)
large-sample = random-rows(big-animals-table, 80)
```

2 Click run and make a `pie-chart` of the species in the `tiny-sample`.

• What animals are in the sample?

• Click run for a new random sample and make another pie-chart of species in the `tiny-sample`. What animals are in the sample?

• Click run for a new random sample and make another pie-chart of species in the `tiny-sample`. Based on these samples, how many species of animals do you think are at the shelter?

• Which species do you think there are the most of at the shelter?
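If you are unsure how to build the chart, the call might look like the sketch below. It assumes the samples from step 1 are already defined in the Definitions Area and that the species column in the starter file's table is named `"species"`; check the column name in your own table before relying on it.

```
# Type this in the Interactions Area after clicking Run.
# Assumes tiny-sample is defined as in step 1 and that the
# table has a column named "species" (an assumption).
pie-chart(tiny-sample, "species")
```

Because `random-rows` picks new rows each time the program runs, the chart will probably look different after every click of Run. The same pattern should work for the other samples by swapping in their names.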

3 What did you learn from taking multiple samples that you wouldn’t have known if you’d only taken a single sample?

4 Now use `small-sample` to make a `pie-chart` of the species.

• What animals are in the sample?

• Click run for a new random sample and make another pie-chart of species in the `small-sample`. What animals are in the sample?

5 Now that you’ve seen `small-sample`, how has your sense of the distribution of the species changed?

6 Now use `medium-sample` to make a `pie-chart` of the species. If there are about 400 animals at the shelter, how many of each species would you predict there to be?
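One way to turn a sample into a prediction is to scale it up, assuming the sample's proportions roughly match the shelter's. `medium-sample` holds 40 of the roughly 400 animals, so each animal in the sample stands for about 10 at the shelter. The count of 12 cats below is hypothetical, chosen only to illustrate the arithmetic:

$$\text{predicted count} = \frac{\text{count in sample}}{40} \times 400, \qquad \text{for example } \frac{12}{40} \times 400 = 120$$

Your own counts will differ, and they will change each time you click Run.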

7 Now use `large-sample` to make a `pie-chart` of the species. If there’s anything you’d like to change about your prediction now that you’ve seen `large-sample`, record it here.

8 Let’s see how accurate your prediction is. Feel free to click run and build a few more pie charts from your samples if you want to collect more information first! When you’re ready, make a `pie-chart` of `animals-table-2`.

• Which predictions were closest?

• Which predictions were off?

• Were there any surprises?

9 In the real world, we usually don’t have access to a whole dataset to check predictions against! How could we test…

• Every giraffe on the planet?

• Everyone who has ever come in contact with a covid-positive person?

• Every person who identifies as queer?

What strategies can we use to make sure that predictions from samples are as close to accurate as possible?

