Monthly Archives: November 2015

Intro to Statistics (Unit 2)

During the first few days of the new unit, students explored relationships in the data we had collected about the class.

Back to the Class Data

Looking at class data gives us the chance to ask questions about relationships among the variables. Here are some questions my students came up with:

  • Is arm span really equal to height?

The easiest way to dig into this question is to look at a scatter plot of the data. So, we plotted the variables, along with the line height = arm span.

[Image: scatter plot of the class data with the line height = arm span]

We noted that two people were on the line, two others were very close, and the rest were either above or below the line. What do those points above the line mean about height and arm span for those people? What about the points below the line?
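
One way to make the "above or below the line" question concrete is to compare each person's height with their arm span directly. The sketch below assumes arm span is on the horizontal axis and height on the vertical axis; the measurements and the function name are made up for illustration and are not the class data.

```python
# Hypothetical (arm span, height) pairs in cm; NOT the class's actual data.
measurements = [
    (160.0, 160.0),
    (172.5, 170.0),
    (165.0, 168.5),
    (180.0, 176.0),
    (158.0, 163.0),
]


def compare_to_identity_line(arm_span_cm, height_cm, tolerance_cm=0.0):
    """Describe where a point sits relative to the line height = arm span.

    Assumes arm span is plotted horizontally and height vertically.
    """
    difference = height_cm - arm_span_cm
    if abs(difference) <= tolerance_cm:
        return "on the line"
    # A positive difference means the person is taller than their arm span.
    return "above the line" if difference > 0 else "below the line"


for arm_span_cm, height_cm in measurements:
    position = compare_to_identity_line(arm_span_cm, height_cm)
    print(f"arm span {arm_span_cm} cm, height {height_cm} cm: {position}")
```

Under that axis assumption, a point above the line is a person who is taller than their arm span, and a point below the line is a person whose arm span is longer than their height.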

  • If my hand span is longer than my wrist circumference, then shouldn’t I be able to wrap my hand around my wrist and touch my pinky to my thumb?

[Image: hand span vs. wrist circumference for the class]

One hundred percent of students had hand spans longer than their wrist circumferences, but only a couple of students could wrap their hands around their wrists.

  • Is the age (in months) related to any other measure?

[Image: age (in months) plotted against the other class measurements]

It would seem that none of the other variables is a good predictor for age in months. It also seems as if age vs height has a negative association. Huh?
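
One common way to put a number on "not a good predictor" or "negative association" is the correlation coefficient r. The sketch below computes Pearson's r by hand. The ages and heights are invented for illustration (they are not the class's measurements), and `pearson_r` is just a helper name chosen here.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: ages in months and heights in cm for five students.
ages_months = [168, 170, 172, 175, 178]
heights_cm = [172, 169, 171, 166, 165]

r = pearson_r(ages_months, heights_cm)
print(f"r = {r:.2f}")  # a negative r means a negative association
```

With only a handful of students who are all about the same age, a few points can easily push r negative, which is one possible reason for a surprising result like this one.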

Digging a Little Deeper

If the line height = arm span doesn’t describe, or predict, that relationship well, then what would do a better job? We added a “movable line” and adjusted it until it looked about right.

[Image: the movable line fitted to the height and arm span data]


Our line predicted that height = 0.85 * arm span + 26 cm. Wait, what? Height is 85% of arm span? And what is that +26 cm all about? It made for an interesting conversation, especially this question from a student: “How can a person who has an arm span of 0 cm be 26 cm tall?” Which prompted: “What does an arm span of 0 cm even mean?” I certainly don’t have definitive answers to these questions. What I can do is encourage the curiosity, the conversation, and point out that the relationship we discovered is for these measurements. Does it make much sense to use our calculated relationship to make predictions about heights for arm spans that are relatively far away from the data we collected?
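
The movable line can be written as a small function, which makes it easy to try out the students' questions. The slope and intercept below come from the line above. The observed arm span range (150 to 190 cm) is an assumption made up for this sketch, not the class's actual range.

```python
SLOPE = 0.85         # from the movable line above
INTERCEPT_CM = 26.0  # from the movable line above


def predict_height(arm_span_cm):
    """Predicted height in cm: height = 0.85 * arm span + 26."""
    return SLOPE * arm_span_cm + INTERCEPT_CM


def crossover_arm_span():
    """Arm span at which the movable line meets the line height = arm span.

    Solving 0.85 * x + 26 = x gives x = 26 / (1 - 0.85).
    """
    return INTERCEPT_CM / (1 - SLOPE)


def is_extrapolation(arm_span_cm, observed_min_cm, observed_max_cm):
    """True when the arm span lies outside the range of the collected data."""
    return not (observed_min_cm <= arm_span_cm <= observed_max_cm)


# Hypothetical range of arm spans in the data set (an assumption).
OBSERVED_MIN_CM, OBSERVED_MAX_CM = 150.0, 190.0

for arm_span_cm in (0.0, 160.0, 175.0):
    flag = is_extrapolation(arm_span_cm, OBSERVED_MIN_CM, OBSERVED_MAX_CM)
    print(f"arm span {arm_span_cm} cm -> predicted height "
          f"{predict_height(arm_span_cm):.1f} cm (extrapolation: {flag})")

print(f"lines cross near an arm span of {crossover_arm_span():.1f} cm")
```

The two lines cross at an arm span of about 173 cm. For arm spans shorter than that, the movable line predicts a height greater than the arm span; for longer arm spans, it predicts a height less than the arm span. The 26 cm only shows up as a "height" at an arm span of 0 cm, far outside any data a class could collect.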

Correlation, Causation, Outliers, Influential Points

All of these topics follow from this initial discussion about the class data. Ultimately, students once again find their own variables of interest and complete an analysis demonstrating what they’ve learned. This time topics included unemployment rates, marriage rates, divorce rates, distances & temperatures of celestial objects, height & weight, obesity rate & life expectancy, and mean snowfall & mean low temperature.

Once again, the variety of topics that interested my students is greater than what I could have come up with. More importantly, because they chose their own variables, they were interested in analyzing the data and answering their own questions.



Intro to Statistics (Unit 1)

Statistics & probability in high school is often saved for 12th grade, though some progress has been made with integrating linear regression into algebra classes. My school operates on trimesters, so each class is only 12 weeks long. We’ve created an Intro to Statistics class to focus on descriptive statistics during those 12 weeks. It’s really designed for students who are entering high school, not leaving it. I probably should have created this post a couple of months ago, since the term ends on Tuesday, but I’ve been a little busy.

All About the Chips

Early in the term we investigated claims made by Keebler and Chips Ahoy about their chocolate chip cookies. Of course, in order to really investigate, we needed to dissect the cookies and count up the chips. Here are our results (from this term):

  • Fifty percent of Keebler cookies have more chips than 100% of Chips Ahoy.
  • Keebler has a mean of 34.4 chips per cookie. With 24 cookies per package, this means there are approximately 860 chips per package.
  • Chips Ahoy has a mean of 25.9 chips per cookie. With 35 cookies per package, this means there are approximately 907 chips per package.
  • Although Keebler has fewer chips per package, they have more than 25% more chips per cookie (on average) than Chips Ahoy. Keebler would need to have an average of 32.4 chips per cookie for their claim to be true. They had an average of 34.4 chips per cookie, which is more than 25% more chips per cookie.
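
The arithmetic behind these results can be checked in a few lines. The means and cookie counts below are the ones reported above; the variable names are just for illustration.

```python
# Figures reported in the results above.
keebler_mean_chips = 34.4       # mean chips per Keebler cookie
chips_ahoy_mean_chips = 25.9    # mean chips per Chips Ahoy cookie
keebler_cookies_per_package = 24
chips_ahoy_cookies_per_package = 35

# Estimated chips per package = mean chips per cookie * cookies per package.
keebler_per_package = keebler_mean_chips * keebler_cookies_per_package
chips_ahoy_per_package = chips_ahoy_mean_chips * chips_ahoy_cookies_per_package

# "25% more chips" per cookie means at least 1.25 times the Chips Ahoy mean.
threshold_mean = 1.25 * chips_ahoy_mean_chips

# How much more Keebler actually has per cookie, as a percentage.
percent_more = (keebler_mean_chips / chips_ahoy_mean_chips - 1) * 100

print(f"Keebler chips per package:    {keebler_per_package:.1f}")
print(f"Chips Ahoy chips per package: {chips_ahoy_per_package:.1f}")
print(f"Mean needed for '25% more':   {threshold_mean:.1f}")
print(f"Keebler has {percent_more:.1f}% more chips per cookie")
```

One thing to watch: multiplying the stated figures gives about 826 chips per Keebler package rather than the 860 quoted above (860 would correspond to 25 cookies per package), so one of those two figures may be slightly off. The comparison comes out the same either way: Keebler has fewer chips per package than Chips Ahoy but clears the 25% threshold per cookie.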

Students were asked to write an introductory paragraph and a concluding paragraph. Here’s one introduction:

Are they lying? That’s the question we asked ourselves when we conducted tests to see if either Chip’s Ahoy or Keebler told the truth in their advertisements. Chip’s Ahoy promised 1000 chocolate chips in every bag, and Keebler promised 25% more. Our findings surprised us.

The findings followed, and then this conclusion:

We believe, based on our findings, that Chip’s Ahoy told the truth, while Keebler tried to get away with a misleading slogan. While Chip’s Ahoy had approximately 907 chips per package, which is 93 less than they promised, it would be unreasonable to expect our estimate to be exact, as some cookies may have more chips than others. Because of this, we must grant Chips Ahoy some leeway, as it could simply be our estimate was low. However, Keebler promised 25% more chips than Chips Ahoy. However, the total number of chips in Keebler was actually less than Chips Ahoy. However, we believe “25% more” may be referring to the number of chips per cookie, not per package. Because of this, Keebler may be technically telling the truth, but they are misleading consumers. Chips Ahoy was telling the truth all along.

All About the Class

We also collected some data about the class, including height, arm span, and kneeling height. Students were asked to apply what they learned from the cookie activity to this new data set. They represented the data graphically:

[Images: box plot and histogram of the class data]

And then described what they saw:

The height is skewed to the left, whilst the kneeling height is symmetrical. Kneeling Height has a small interquartile range, and is less spread out than height. The minimum Height is larger than the maximum kneeling height. Kneeling Height and Standing Height do not share a single point.

They are similar because they are both a measure of distance/height. They are different because a person’s kneeling height will never be greater than their standing height, which leaves interesting data with you compare the two.

There is less variation in kneeling height than there is in standing height. No one in the class was so tall their kneeling height was greater than the minimum standing height recorded.


Height: The data for height are skewed to the left with a median of 170.5 cm and an interquartile range of 10 cm.

Armspan: The data for armspan are skewed to the right with a median of 166.3 cm and an interquartile range of 11 cm.

Comparison: The median of both sets of data have a difference of 4.2 cm and the interquartile range has a difference of 1 cm. The Height data are skewed to the left while the armspan data are Skewed to the right.

Conclusions: In conclusion, the rule of thumb that you are as tall as your arms are long is mostly true because the median of both data sets is only 4.2 cm off and the fact that the interquartile range is but one centimeter off proves this further.
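
The medians and interquartile ranges in descriptions like these can be computed with Python's standard `statistics` module. This is a minimal sketch with made-up heights and arm spans (not the class data). Note that `statistics.quantiles` uses one particular quartile convention, so its values can differ slightly from a textbook's or a graphing tool's.

```python
import statistics


def median_and_iqr(values):
    """Return (median, interquartile range) for a list of numbers."""
    # quantiles(..., n=4) returns the three quartile cut points Q1, Q2, Q3.
    q1, _q2, q3 = statistics.quantiles(values, n=4)
    return statistics.median(values), q3 - q1


# Hypothetical measurements in cm; NOT the class's actual data.
heights_cm = [158.0, 163.5, 166.0, 169.0, 170.5, 172.0, 174.5, 176.0]
arm_spans_cm = [155.0, 160.0, 164.5, 166.0, 167.5, 171.0, 176.5, 181.0]

height_median, height_iqr = median_and_iqr(heights_cm)
arm_median, arm_iqr = median_and_iqr(arm_spans_cm)

print(f"height:   median {height_median} cm, IQR {height_iqr} cm")
print(f"arm span: median {arm_median} cm, IQR {arm_iqr} cm")
print(f"difference in medians: {abs(height_median - arm_median)} cm")
```

Comparing medians and IQRs describes the two distributions as a whole. It does not say whether each individual's height matches their own arm span; that takes a paired comparison such as a scatter plot.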

All About What They Learned

The first unit of the course ends with students finding and analyzing their own data. Data choices included movies, bass fishing, hours in space, World Series appearances, touchdowns scored by the Giants and the Cowboys, wealth vs age, costliest hurricanes, and daily high temperatures for Portland, ME and Berlin, Germany. What I love the most about this assignment is that students are able to investigate something that interests them and show me what they’ve learned.

They always come up with topics that I would never think of!

