Unit 9: Threats to Validity

Unit Overview

Students consider possible threats to the validity of their analysis.

Product Outcomes:
    Length: 30 Minutes
    Glossary:
    • threats to validity: factors that undermine confidence in a conclusion

    Materials:
      Preparation:
      • Computer for each student (or pair), with access to the internet

      • Student workbooks, and something to write with

      Types, Functions, and Values

      Number
        Functions: +, -, *, /, num-sqrt, num-sqr
        Values: 4, -1.2, 2/3

      String
        Functions: string-repeat, string-contains
        Values: "hello", "91"

      Boolean
        Values: true, false

      Image
        Functions: triangle, circle, star, rectangle, ellipse, square, text, overlay, bar-chart, pie-chart, bar-chart-raw, pie-chart-raw, histogram, scatter-plot, lr-plot

      Table
        Functions: count, .row-n, .order-by, .filter, mean, median, mode



              Review (Time 10 minutes)

              • You’ve learned a lot in this class about how to analyze data. What questions matter to you?
                • Come up with a question that you want answered about the world around you.

                • Using what you know now, what information would you need to collect in order to answer it?

                • What subsets would you need to create? What analysis would you need to perform?

                Debrief as a class.

              Learning Objectives

              • Students learn about threats to validity, such as sample size, selection bias, sampling error, and confounding variables.

                    Threats to Validity (Time 20 minutes)

                    • As good Data Scientists, the staff at the animal shelter is constantly gathering data about their animals, their volunteers, and the people who come to visit. But just because they have data doesn’t mean the conclusions they draw from it are correct! For example: suppose they surveyed 1,000 cat-owners and found that 95% of them thought cats were the best pet. Could they really claim that people generally prefer cats to dogs?

                      Have students share back what they think. The issue here is that cat-owners are not a representative sample of the population, so the claim is invalid.

                    • There’s more to data analysis than simply collecting data and crunching numbers. In the example of the cat-owning survey, the claim that "people prefer cats to dogs" is invalid because the data itself wasn’t representative of the whole population (of course cat-owners are partial to cats!). This is just one example of what are called Threats to Validity.

                    • On Page 54 and Page 55, you’ll find four different claims backed by four different datasets. Each one of those claims suffers from a serious threat to validity. Can you figure out what those threats are?

                      Give students time to discuss and share back. Answers: The dog-park survey is not a random sample, the dogs are friendlier towards whoever is giving them food, etc.

                    • Life is messy, and there are always threats to validity. Data Science is about doing the best you can to minimize those threats, and to be up front about what they are whenever you publish a finding. When you do your own analysis, make sure you include a discussion of the threats to validity!

                      On Page 56, you’ll find some deliberately misleading claims made by slimy Data Scientists. Can you figure out why these claims should not be trusted? Once you’ve finished, consider your own dataset and analysis: what misleading claims could someone make about your work? Turn to Page 57, and come up with four misleading claims based on data or displays from your work. Then trade papers with another group, and see if you can figure out why each other’s claims are not to be trusted!
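
Optional demonstration: the cat-owner survey discussed earlier can be simulated to show how much a non-representative sample can distort a result. The sketch below is written in Python rather than the programming environment used in class, and everything in it is a made-up assumption chosen only for illustration: the population size, the 40% share of people who prefer cats, the cat-ownership rates, and the helper names make_population and percent_preferring_cats. It is not real survey data.

```python
import random


def make_population(n, share_preferring_cats, rng):
    """Build a hypothetical population; each person is a small dict."""
    people = []
    for _ in range(n):
        prefers_cats = rng.random() < share_preferring_cats
        # Assumption: people who prefer cats are far more likely to own one.
        owns_cat = rng.random() < (0.80 if prefers_cats else 0.05)
        people.append({"prefers_cats": prefers_cats, "owns_cat": owns_cat})
    return people


def percent_preferring_cats(sample):
    """Percentage of people in the sample who say cats are the best pet."""
    return 100 * sum(p["prefers_cats"] for p in sample) / len(sample)


rng = random.Random(9)  # fixed seed so the demonstration is repeatable
population = make_population(100_000, 0.40, rng)

# Biased survey: only cat-owners are asked (selection bias).
cat_owners = [p for p in population if p["owns_cat"]]
biased_sample = rng.sample(cat_owners, 1000)

# Fairer survey: 1,000 people drawn at random from everyone.
random_sample = rng.sample(population, 1000)

print("Cat-owners only:", percent_preferring_cats(biased_sample), "% prefer cats")
print("Random sample:  ", percent_preferring_cats(random_sample), "% prefer cats")
```

Both surveys ask 1,000 people, so sample size is not the problem here. Under these assumptions the cat-owner survey reports a far higher preference for cats than the random sample does, purely because of who was asked.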

                              Your research paper (Time flexible)

                              • Now that you’ve completed your analysis, it’s time to write up your findings!

                                Open the Research Paper template, and save a copy to your Google Drive.

                              • Each section of the research paper refers back to the work you’ve done in the Student Workbook. Use these pages and your program to write your findings!