Tahli and Fernando are looking at a scatter plot showing the relationship between poverty and test scores at schools in Michigan. They find a trend, with low-poverty schools generally having higher test scores than high-poverty schools. However, one school is an extreme outlier: the highest poverty school in the state also has higher test scores than most of the other schools!
Tahli thinks the outlier should be removed before they start analyzing, and Fernando thinks it should stay. Here are their reasons:
Tahli’s Reasons: | Fernando’s Reasons: |
---|---|
This outlier is so far from every other school - it has to be a mistake. Maybe someone entered the poverty level or the test scores incorrectly! We don’t want those errors to influence our analysis. Or maybe it’s a magnet, exam or private school that gets all the top-performing students. It’s not right to compare that to non-magnet schools. | Maybe it’s not a mistake or a special school! Maybe the school has an amazing new strategy that’s different from other schools! Instead of removing an inconvenient data point from the analysis, we should be focusing our analysis on what is happening there. |
Do you think this outlier should stay or go? Why? What additional information might help you make your decision?
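One way to see what is at stake in this decision is to measure the trend both ways. The sketch below is only an illustration: the numbers are invented (they are not the Michigan data), and the names `pearson_r`, `schools`, and `outlier` are made up for this example. It computes the correlation between poverty and test scores once with the unusual school included and once without it.

```python
from math import sqrt

def pearson_r(points):
    """Return the Pearson correlation for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # How x and y vary together, and how each varies on its own
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in points)
    x_variation = sum((x - mean_x) ** 2 for x, _ in points)
    y_variation = sum((y - mean_y) ** 2 for _, y in points)
    return co_variation / sqrt(x_variation * y_variation)

# Invented (poverty %, average test score) pairs, for illustration only
schools = [(10, 90), (20, 85), (30, 78), (40, 72),
           (50, 65), (60, 60), (70, 52), (80, 45)]

# A hypothetical high-poverty, high-scoring school like the one in the story
outlier = (95, 88)

r_without = pearson_r(schools)
r_with = pearson_r(schools + [outlier])

print(f"r without the outlier: {r_without:.2f}")
print(f"r with the outlier:    {r_with:.2f}")
```

With these made-up numbers, a single school noticeably weakens the measured relationship. The comparison does not say who is right. It only shows how much the answer depends on the choice, which is why finding out more about the school matters before deciding.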
These materials were developed partly through support of the National Science Foundation (awards 1042210, 1535276, 1648684, 1738598, 2031479, and 1501927). Bootstrap by the Bootstrap Community is licensed under a Creative Commons 4.0 Unported License. This license does not grant permission to run training or professional development. Offering training or professional development with materials substantially derived from Bootstrap must be approved in writing by a Bootstrap Director. Permissions beyond the scope of this license, such as to run training, may be available by contacting contact@BootstrapWorld.org.