Unit 8 Data Sets And Distributions — Unit Plan

Lesson 1
Got Data?
What’s the Question?

Would each survey question produce categorical data or numerical data?

  1. What is your favorite vegetable?

  2. Have you been to the capital city of your state?

  3. How old is the youngest person in your family?

  4. In which zip code do you live?

  5. What is the first letter of your name?

  6. How many hours do you spend outdoors each day?

Show Solution
  1. Categorical
  2. Categorical
  3. Numerical
  4. Categorical
  5. Categorical
  6. Numerical
Lesson 2
Statistical Questions
Questions about Temperature

Here are two questions:

Question A: Over the past 10 years, what is the warmest temperature recorded, in degrees Fahrenheit, for the month of December in Miami, Florida?

Question B: At what temperature does water freeze in Miami, Florida?

  1. Decide if each question is statistical or non-statistical. Explain your reasoning.
  2. If you decide that a question is statistical, describe how you would find the answer. What data would you collect?
Show Solution
  1.
    • Question A is a statistical question. Sample reasoning: The temperature in Miami in December changes from day to day and from year to year.
    • Question B is not a statistical question. Water freezes at sea level at 32 degrees Fahrenheit. This is a known fact.
  2. To answer Question A (about the warmest temperature), find the temperature records for the past ten years and look for the highest value in degrees Fahrenheit.
Section A Check
Section A Checkpoint
Problem 1

Classify each set of data as numerical data or categorical data.

  1. The items on a shopping list for the grocery store.

  2. The total cost of all the items on the shopping list for the grocery store.

  3. The numbers on a barcode that can be scanned at the grocery store.

Show Solution
  1. Categorical
  2. Numerical
  3. Categorical
Problem 2
Write a statistical question about your favorite food. Explain why it is a statistical question.
Show Solution
Sample response: What percentage of people like pancakes better than waffles? It is a statistical question because the answer would require collecting data from people about their preference and I expect there to be different answers included in the data.
Lesson 3
Representing Data Graphically
Swimmers and Swimming Class
  1. Noah gathered information on the home states of the swimmers on a national team. He organized the data in a table. Would a dot plot be appropriate to display his data? Explain your reasoning.
  2. This dot plot shows the ages of students in a swimming class. How many students are in the class?

    Dot plot, age in months, labeled 16 through 36 by twos. For intervals of 1, starting above 18, the number of dots above each number is 2, 0, 1, 2, 0, 3, 0, 2, 1, 0, 1, 0, 2, 1, 1.

  3. Based on the dot plot, do you agree with each of the statements? Explain your reasoning.

    1. The frequency of each age represented is never greater than 3.
    2. Half of the students are between 2 and 3 years old.
Show Solution
  1. No. He could use a bar graph because the home states of swimmers are categorical data.
  2. 16 students are in the class.
  3.
    1. Agree. Sample reasoning: The number of dots stacked over a number is never greater than 3 for this dot plot.
    2. Agree. On the number line, eight of the 16 data points, or half of the class, are placed to the right of 24 months and to the left of 36 months.
Lesson 4
Dot Plots
Family Size

A group of students was asked, “How many children are in your family?” The responses are displayed in the dot plot.

A dot plot, number of children, 0 through 6 by ones.  Starting at 0, the number of dots above each increment is 0, 5, 7, 5, 2, 1, 0.

  1. How many students responded to the question?
  2. What percentage of the students have more than one child in the family?
  3. Write a sentence that describes the distribution of the data shown on the dot plot. Use a description of the center and spread in your description.
Show Solution
  1. There are 20 dots and each corresponds to one student in the group.
  2. 75%. 15 out of 20 students answered that there are 2 or more children in the family.
  3.  Sample response: A typical number of children for this group of families is around 2 because the center is around 2.5 or so, but some families had many more children than others. The distribution is not very spread out with most families having 1–3 children and only a few of them having more.
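The counts in this solution can be checked with a short Python sketch. It assumes the dot counts are read correctly from the plot description above; the names `dots`, `total_students`, and so on are illustrative and not part of the lesson.

```python
# Dot counts read from the dot plot: number of children -> number of students.
dots = {0: 0, 1: 5, 2: 7, 3: 5, 4: 2, 5: 1, 6: 0}

# Each dot is one student, so the total is the sum of the counts.
total_students = sum(dots.values())

# Students with more than one child in the family (2 or more).
more_than_one = sum(count for children, count in dots.items() if children > 1)
percent_more_than_one = 100 * more_than_one / total_students

# The mean is one possible description of the center.
mean_children = sum(children * count for children, count in dots.items()) / total_students

print(total_students)          # 20
print(more_than_one)           # 15
print(percent_more_than_one)   # 75.0
print(mean_children)
```

The mean comes out a little below 2.5, which is consistent with describing a typical family in this group as having about 2 children.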
Lesson 5
Using Dot Plots to Answer Statistical Questions
Packing Tomatoes

A farmer sells tomatoes in packages of ten. She would like the tomatoes in each package to all be about the same size and close to 5.5 ounces in weight. The farmer is considering two different tomato varieties: Variety A and Variety B. She weighs 25 tomatoes of each variety. These dot plots show her data.

A photograph of 10 red tomatoes of a similar size. Top row has 5 tomatoes. Bottom row has 5 tomatoes.

Variety A
A dot plot, Variety A, weight in ounces, labeled from 5.35 to 5.6 by five hundredths. Starting at 5.35 and going up by 0.01, the number of dots above each increment is 0, 1, 1, 0, 1, 1, 0, 0, 2, 2, 1, 1, 2, 1, 1, 3, 1, 2, 1, 1, 0, 1, 0, 1, 1.

Variety B
A dot plot, Variety B, weight in ounces, labeled from 5.35 to 5.6 by five hundredths. Starting at 5.47 and going up by 0.01, the number of dots above each increment is 1, 1, 7, 8, 4, 2, 1, 1.

  1. What would be a good description for the center of the distribution of weights of Variety A tomatoes, in general? What about for the weight of Variety B tomatoes?
  2. Which tomato variety should the farmer choose? Explain your reasoning.
Show Solution
  1. In general, Variety A tomatoes are about 5.49 ounces and Variety B tomatoes are about 5.5 ounces.

  2. She should choose Variety B. Sample reasoning: The two varieties of tomatoes have about the same center of their distributions, but there is much less variability in Variety B tomato weights. The weights are much more consistent than the weights for Variety A, so the tomatoes are more likely to be the same size and closer to 5.5 ounces in weight.

Lesson 6
Interpreting Histograms
Rain in Miami

Here is the average amount of rainfall, in inches, for each month in Miami, Florida.

month        rainfall (inches)    month        rainfall (inches)
January      1.61                 July         6.5
February     2.24                 August       8.9
March        2.99                 September    9.84
April        3.14                 October      6.34
May          5.35                 November     3.27
June         9.69                 December     2.05
  1. Complete the frequency table and use it to make a histogram.

    rainfall (inches)    frequency
    0–2                  1
    2–4                  5
    4–6
    6–8
    8–10

    A blank grid, horizontal axis labeled rainfall in inches, boxes from 0 to 11 by ones, labeled 0 to 10 by twos. Vertical axis 0 to 7 by ones.

  2. What can you say about the center of this distribution using the histogram?
Show Solution
  1.
    rainfall (inches)    frequency
    0–2                  1
    2–4                  5
    4–6                  1
    6–8                  2
    8–10                 3

    A histogram.

  2. Sample response: The center of the distribution appears to be between 4 and 6 inches of rain.
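The completed frequency table can be reproduced with a short Python sketch. It assumes each bin includes its left endpoint and excludes its right endpoint, matching the "up to but not including" wording used for histograms elsewhere in this unit. The variable names are illustrative.

```python
from statistics import median

# Average monthly rainfall in Miami, in inches, copied from the table
# (January through December).
rainfall = [1.61, 2.24, 2.99, 3.14, 5.35, 9.69, 6.5, 8.9, 9.84, 6.34, 3.27, 2.05]

# Each bin includes its left endpoint and excludes its right endpoint.
bins = [(0, 2), (2, 4), (4, 6), (6, 8), (8, 10)]
frequency = {
    (low, high): sum(1 for inches in rainfall if low <= inches < high)
    for low, high in bins
}

for (low, high), count in frequency.items():
    print(f"{low}-{high}: {count}")

# The median of the raw data is one way to check the estimated center.
print(median(rainfall))
```

The median of the twelve monthly values falls in the 4–6 bin, which agrees with the sample response.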
Lesson 7
Using Histograms to Answer Statistical Questions
A Tale of Two Seasons

The two histograms show the points scored per game by a basketball player in 2008 and 2016.

A histogram, points per game in 2008, 0 to 45 by fives. Beginning at 0 up to but not including 5, height of bar at each interval is 0, 1, 10, 11, 7, 1, 2, 1.

A histogram, points per game in 2016, 0 to 45 by fives. Beginning at 0 up to but not including 5, height of bar at each interval is 2, 7, 6, 9, 5, 3, 2, 0, 0.

  1. Describe the center of each distribution represented by the histograms. Explain your reasoning.
  2. Write 2–3 sentences that describe the spreads of the two distributions, including what spreads might tell us in this context.
Show Solution

Sample response:

  1. In both seasons, the player typically scored around 15 to 20 points in a game. In each histogram, there seems to be a similar frequency of values on each side of this interval.
  2. The spread of the distribution for 2008 seems less than the spread for the 2016 distribution. There are only 5 games in which the player did not score between 10 and 25 points per game in 2008, but in 2016 the data are more spread out. This means that, from game to game, the player was more consistent in 2008 than in 2016.
Lesson 8
Describing Distributions on Histograms
Point Spread

Here is a histogram that shows the number of points scored by a college basketball player during the 2008 season. Describe the shape and features of the data. Mention the center and spread as well as any symmetry, gaps, peaks, or other features that you notice.

A histogram, points per game, 0 to 50 by fives. Beginning at 5 up to but not including 10, height of bar at each interval is 1, 10, 11, 7, 1, 2, 0, 1.

Show Solution

Sample response: The distribution is not symmetrical because there is a peak on the left. The histogram shows a gap between 35 and 40, so there is no game in which the player scored 35, 36, 37, 38 or 39 points. There was one game that was unusually high scoring, between 40 and 44 points. The peak is between 15 and 20 points. The center is around 20 points, and the data is spread out pretty far above this center.

Section B Check
Section B Checkpoint
Problem 1

Describe the distribution. Include a mention of the center and spread in your description.

A dot plot with about the same number of dots for all whole numbers between 2 and 12.

Show Solution
Sample response: The data are fairly evenly distributed between the values 2 and 12. The center of the distribution is around 7 because there are about the same number of points on either side of that value.
Problem 2

A car service company keeps track of how long it takes between when a customer requests a car and when it arrives. The data are summarized in the histogram.

A histogram showing wait times in minutes

  1. What is a typical amount of time that customers have to wait for a car to arrive? Explain your reasoning.
  2. How many customers in this group had to wait longer than 20 minutes?
Show Solution
  1. Sample response: 8 minutes. Most wait times are in the 6–10 minute range, so I chose the middle value.
  2. 6
Lesson 9
Mean
Finding Means
  1. Last week, the daily low temperatures for a city, in degrees Celsius, were 5, 8, 6, 5, 10, 7, and 1. What was the average low temperature? Show your reasoning.
  2. The mean of four numbers is 7. Three of the numbers are 5, 7, and 7. What is the fourth number? Explain your reasoning.
Show Solution
  1. 6 degrees Celsius. The sum of the temperatures divided by the total number of recorded temperatures is $(5+8+6+5+10+7+1)\div 7=6$.
  2. 9. Sample reasoning: The 4 numbers must be distributed evenly around 7. Because 2 of the numbers are 7, and the third number is two less than 7, the fourth number must be 2 more than 7.
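Both answers can be verified with a few lines of Python. This is a minimal sketch with illustrative variable names. The second part relies on the fact that four numbers with a mean of 7 must sum to $7 \times 4 = 28$.

```python
# Problem 1: mean of the daily low temperatures (degrees Celsius).
temperatures = [5, 8, 6, 5, 10, 7, 1]
mean_temperature = sum(temperatures) / len(temperatures)

# Problem 2: four numbers with a mean of 7 must sum to 7 * 4 = 28,
# so the missing number is 28 minus the sum of the known three.
target_mean = 7
known_numbers = [5, 7, 7]
fourth_number = target_mean * 4 - sum(known_numbers)

print(mean_temperature)  # 6.0
print(fourth_number)     # 9
```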
Lesson 10
Finding and Interpreting the Mean as the Balance Point
Text Messages

The three data sets show the number of text messages sent to their parents by Jada, Diego, and Lin over 6 days.

One of the data sets has a mean of 4, one has a mean of 5, and one has a mean of 6.

Jada

  • 4
  • 4
  • 4
  • 6
  • 6
  • 6

Diego

  • 4
  • 5
  • 5
  • 6
  • 8
  • 8

Lin

  • 1
  • 1
  • 2
  • 2
  • 9
  • 9
  1. Which data set has which mean? What does this tell you about the text messages sent by the three students?

  2. Which data set has the greatest variability? Explain your reasoning.

Show Solution
  1. Jada's mean is 5, since $\frac{4+4+4+6+6+6}{6}=\frac{30}{6}=5$. Diego's mean is 6, since $\frac{4+5+5+6+8+8}{6}=\frac{36}{6}=6$. Lin's mean is 4, since $\frac{1+1+2+2+9+9}{6}=\frac{24}{6}=4$. On average, Diego sent the most text messages to his parents per day, and Lin sent the fewest text messages per day to her parents.
  2. Sample response: Lin's data has the highest variability. The sum of the distances to each side of the mean is the greatest.
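The means and the "sum of the distances" comparison can be checked with a short Python sketch. The helper names `mean` and `total_distance_from_mean` are illustrative. Dividing each total distance by the number of data points (6) gives the mean absolute deviation (MAD) that appears in the Section C Checkpoint.

```python
# Number of text messages sent over 6 days, copied from the three lists.
messages = {
    "Jada": [4, 4, 4, 6, 6, 6],
    "Diego": [4, 5, 5, 6, 8, 8],
    "Lin": [1, 1, 2, 2, 9, 9],
}

def mean(values):
    """Sum of the values divided by how many there are."""
    return sum(values) / len(values)

def total_distance_from_mean(values):
    """Sum of the distances between each value and the mean."""
    center = mean(values)
    return sum(abs(value - center) for value in values)

summary = {
    name: (mean(values), total_distance_from_mean(values))
    for name, values in messages.items()
}

for name, (center, distance) in summary.items():
    print(name, center, distance)
# Jada 5.0 6.0
# Diego 6.0 8.0
# Lin 4.0 20.0
```

Lin's total distance from the mean is the largest of the three, which supports the sample response about variability.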
Section C Check
Section C Checkpoint
Problem 1

A large company uses 2 manufacturing plants to package 50-pound bags of corn seed. The weights of the bags produced in a week are measured and summarized with this information.

Plant A

  • Mean weight of bags: 51.2 pounds
  • MAD weight of bags: 1.8 pounds

Plant B

  • Mean weight of bags: 50.1 pounds
  • MAD weight of bags: 0.1 pounds
  1. Write 2 sentences comparing the distribution of bag weights for the 2 plants based on the given information.
  2. The company is worried about one of the plants having too many bags that are under the advertised 50 pound weight. Which plant do you think is having this problem? Explain your reasoning.
Show Solution
  1. Sample response: Plant A typically makes heavier bags of corn seed based on the mean, but has much larger variability based on the MAD. Plant B is more consistent in bag weight and stays closer to the 50 pound bags that are claimed.
  2. Plant A. Sample reasoning: Although the mean weight is greater, the large MAD indicates that it is not uncommon to have bags that weigh less than the advertised 50 pounds ($51.2 - 1.8 = 49.4$).
Lesson 13
Median
Practicing the Piano

Jada and Diego are practicing the piano for an upcoming rehearsal. The numbers of minutes each of them practiced in the past few weeks are listed.

Jada's practice times:

  • 10
  • 10
  • 20
  • 15
  • 25
  • 25
  • 8
  • 15
  • 20
  • 20
  • 35
  • 25
  • 40

Diego's practice times:

  • 25
  • 10
  • 15
  • 30
  • 15
  • 20
  • 20
  • 25
  • 30
  • 45
  1. Find the median of each data set.
  2. Explain what the medians tell you about Jada's and Diego's piano practice.
Show Solution
  1. Jada's median: 20 minutes. Diego's median: 22.5 minutes.
  2. Sample response: Half of Jada's practices are 20 minutes or shorter and the other half of her practices are 20 minutes or longer. Half of Diego's practices are 22.5 minutes or shorter, and the other half are 22.5 minutes or longer.
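The medians can be confirmed with Python's standard `statistics.median`, which sorts the data and returns the middle value (or the mean of the two middle values when the count is even). The variable names below are illustrative.

```python
from statistics import median

# Practice times in minutes, in the order listed.
jada = [10, 10, 20, 15, 25, 25, 8, 15, 20, 20, 35, 25, 40]
diego = [25, 10, 15, 30, 15, 20, 20, 25, 30, 45]

# Jada has 13 values, so the median is the 7th value in sorted order.
jada_median = median(jada)

# Diego has 10 values, so the median is the mean of the 5th and 6th values.
diego_median = median(diego)

print(sorted(jada))
print(sorted(diego))
print(jada_median, diego_median)  # 20 22.5
```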
Lesson 14
Comparing Mean and Median
Which Measure of Center to Use?

For each dot plot or histogram:

  1. Predict if the mean is greater than, less than, or approximately equal to the median. Explain your reasoning.
  2. Which measure of center—the mean or the median—better describes a typical value for the distributions?

Heights of 50 basketball players
Histogram from 66 to 80 by 2’s. Height in inches. Beginning at 66 up to but not including 68, height of bar at each interval is 12, 3, 14, 18, 6, 4, 1.

Backpack weights of 55 sixth-grade students
Dot plot from 0 to 16 by 2’s. Backpack weight in kilograms. Beginning at 0, number of dots above each increment from 0 to 9 is 0, 7, 9, 12, 7, 6, 3, 3, 2, 1. 1 dot above 16.

Ages of 30 people at a family dinner party
Histogram from 5 to 50 by 5’s. Age in years. Beginning at 5 up to but not including 10, height of bar at each interval is 2, 3, 1, 1, 2, 3, 2, 5, 11.

Show Solution

Sample responses:

  1. Player heights
    1. The mean would be approximately equal to the median, because the data are roughly symmetric.
    2. Since I think the values would be pretty close, either the mean or the median would describe a typical height pretty well.
  2. Backpack weights
    1. The mean would be higher than the median. The value of 16 kilograms would bring the mean up and move it away from the center of the data.
    2. The median would better describe a typical backpack weight, since that value would lie in the center of the large cluster of data points.
  3. People's ages
    1. The mean would be lower than the median, because even though a large fraction of the people at the dinner party are 40 or older, the ages of the people that span from 5 to 40 would bring the average age down.
    2. The median, which is around 40–45 years old, would better describe the center of the distribution.
Lesson 15
Quartiles and Interquartile Range
How Far Can You Throw?

Diego wondered how far sixth-grade students could throw a heavy ball. He decided to collect data to find out. He asked 10 friends to throw the ball as far as they could and measured the distance from the starting line to where the ball landed. The data shows the distances he recorded in feet.

  • 40
  • 40
  • 47
  • 49
  • 50
  • 53
  • 55
  • 57
  • 63
  • 76
  1. Find the median and IQR of the data set.
  2. On a later day, he asked the same group of 10 friends to throw a ball again and collected another set of data. The median of the second data set is 49 feet, and the IQR is 6 feet.

    1. Did the 10 friends, as a group, perform better (throw farther) or worse in the second round compared to the first round? Explain how you know.
    2. Were the distances in the second data set more variable or less variable compared to those in the first round? Explain how you know.
Show Solution
  1. The median is 51.5 feet, because $(50+53)\div 2=51.5$. The IQR is 10 feet, because Q1 is 47, Q3 is 57, and $57-47=10$.
  2.
    1. Worse. Sample reasoning: The median of the second data set is 49 feet, which is 2.5 feet lower than in the first round. 
    2. Less variable. Sample reasoning: The IQR of the second data set is smaller, so the values are less spread out.
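The median and IQR for the first round can be computed with a short Python sketch. It assumes the quartile method used in this solution: Q1 is the median of the lower half of the sorted data and Q3 is the median of the upper half. The function name `quartiles` is illustrative, and it expects at least two data values. Python's built-in `statistics.quantiles` uses a different interpolation method and can give different quartiles.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, median, Q3) using the median-of-halves method.

    When the number of values is odd, the middle value is left out
    of both halves.
    """
    values = sorted(data)
    half = len(values) // 2
    lower_half = values[:half]
    upper_half = values[-half:]
    return median(lower_half), median(values), median(upper_half)

# Distances in feet from the first round.
first_round = [40, 40, 47, 49, 50, 53, 55, 57, 63, 76]
q1, first_median, q3 = quartiles(first_round)
first_iqr = q3 - q1

# Summary values given for the second round.
second_median, second_iqr = 49, 6

print(q1, first_median, q3, first_iqr)   # 47 51.5 57 10
print(first_median - second_median)      # 2.5
print(second_iqr < first_iqr)            # True
```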
Lesson 17
Using Box Plots
Humpback Whales

Researchers measured the lengths, in feet, of 20 male humpback whales and 20 female humpback whales. Here are two box plots that summarize their data.

Two box plots on a grid from 38 to 56 by 2's. Length in feet. Top box plot labeled male. Bottom box plot labeled female. Male box plot: whisker from 39.2 to 43, box from 43 to 46 with vertical line at 44.5, whisker from 46 to 48. Female box plot: whisker from 48 to 49, box from 49 to 51.8 with vertical line at 50.8, whisker from 51.8 to 54.5.

  1. How long is the longest whale measured? Is this whale male or female?

  2. What is a typical length for the male humpback whales in this study?

  3. Do you agree with each of these statements about the whales? Explain your reasoning.

    1. More than half of male humpback whales measured are longer than 46 feet.
    2. The male humpback whales tend to be longer than female humpback whales.
    3. The lengths of the male humpback whales tend to vary more than the lengths of the female humpback whales.
Show Solution
  1. The longest whale is about 55 feet long and is a female.
  2. A typical male humpback whale is about 44.5 feet long.
  3.
    1. Disagree. Sample explanation: The upper quartile of the data for the male humpbacks is 46 feet, which means a quarter of the whales are longer than 46 feet.
    2. Disagree. Sample explanation: The entire distribution for the lengths of female humpbacks is greater than that for male humpbacks, so female humpbacks tend to be longer than their male counterparts.
    3. Agree. Sample explanation: The IQR of the data for male humpbacks is slightly greater than that for female humpbacks, and the range of the data for the males is larger than that for females, so the lengths of male humpbacks tend to vary more.
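The comparison of IQR and range in the last answer can be checked with a short Python sketch. It assumes the five-number summaries are read correctly from the box plot description above; the dictionary keys and the `spread` helper are illustrative.

```python
# Five-number summaries read from the box plots (lengths in feet).
male = {"min": 39.2, "q1": 43, "median": 44.5, "q3": 46, "max": 48}
female = {"min": 48, "q1": 49, "median": 50.8, "q3": 51.8, "max": 54.5}

def spread(summary):
    """Return the IQR and range for a five-number summary."""
    return {
        "iqr": summary["q3"] - summary["q1"],
        "range": summary["max"] - summary["min"],
    }

male_spread = spread(male)
female_spread = spread(female)

# Rounded to one decimal place to hide floating-point noise.
print(round(male_spread["iqr"], 1), round(male_spread["range"], 1))
print(round(female_spread["iqr"], 1), round(female_spread["range"], 1))
```

Both measures of spread are larger for the males, and the female minimum equals the male maximum, which matches the answers given.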
Section D Check
Section D Checkpoint
Problem 1
In a large city, the median rent paid monthly for an apartment is $2000 and the interquartile range (IQR) is $850. If you are planning to move to this city, what information does each of the values mean to you?

Show Solution
Sample response: The median rent of $2000 means that half of apartments in the city cost at least $2000 per month and half cost less. The IQR of $850 means that the middle half of rents are within $850 of each other. This means that there is quite a lot of variability for apartments in this city. It tells me that I might be expected to pay about $2000 per month for an apartment, but, if I need to, I should be able to find a cheaper one without too much trouble.
Problem 2

A person who recently graduated from college is looking at the salaries for people who work for two different companies. The box plots summarize the information from each company.

2 boxplots for Company A and Company B showing salaries for workers at the companies.

Compare the distribution of pay from each company.

Show Solution
Sample response: Both companies have the same median, so typical workers at the companies are paid similarly. The IQR for Company A is much less than at Company B, so there is more variability of salaries for the middle half of workers at Company B. The range of salaries at Company A is about $250,000 and only about $150,000 at Company B, so there may be more pay inequality at Company A.
Lesson 18
Using Data to Solve Problems
Time Spent on Chores

Lin surveys her classmates on the number of hours they spend doing chores each week. She represents her data with a dot plot and a histogram.




Dot plot from 0 to 6 by 1’s. Hours spent on chores per week. Beginning at 0, number of dots above each increment is 0, 9, 6, 4, 2, 2, 1.

Histogram from 0 to 8 by 1’s. Hours spent on chores per week. Beginning at 0 up to but not including 2, height of bar at each interval is 10, 10, 4, 1.

  1. Lin thinks that she can find the median, the minimum, and the maximum of the data set using both the dot plot and the histogram. Do you agree? Explain your reasoning.
  2. Should Lin use the mean and MAD, or the median and IQR to summarize her data?  Explain your reasoning.
Show Solution

Sample responses:

  1. Disagree. The dot plot makes it possible to find the median, the minimum, and the maximum fairly easily because it shows each data value individually. The histogram makes it possible to estimate these values, but it is impossible to tell the exact values because the data points are grouped together.
  2. Lin should use the median and IQR because the data is not approximately symmetrical and has values far from the center. There are a few larger values that are not similar to most of the other values.
Unit 8 Assessment
End-of-Unit Assessment
Problem 1

Select all the true statements.

A. Given a box plot, it is always possible to calculate the mean of the data.
B. Given a box plot, it is always possible to find the median of the data.
C. Given a box plot, it is always possible to construct a corresponding dot plot.
D. Given a dot plot, it is always possible to construct a corresponding box plot.
E. Given a histogram, it is always possible to construct a corresponding box plot.

Show Solution
B, D
Problem 2

Here’s a dot plot of a data set.

A dot plot with the numbers 3 through 13 indicated. The data are as follows: 4, 4 dots; 5, 6 dots; 13, 1 dot.

Which statement is true about the mean of the data set?

A. The mean is less than 5.
B. The mean is equal to 5.
C. The mean is greater than 5.
D. There is not enough information to determine the mean.

Show Solution

C. The mean is greater than 5.

Problem 3

The air quality was tested in many office buildings in two cities. The results of the testing are shown in these box plots.

Double box plot from 0 to 65 by 5's. Parts per million. Top box plot labeled city P. Bottom box plot labeled city Q. Top box plot whisker from 10 to 15. Box from 15 to 35 with vertical line at 30. Whisker from 35 to 40. Bottom box plot whisker from 5 to 20. Box from 20 to 30 with vertical line at 25. Whisker from 30 to 60.

A level of less than 50 parts per million is considered healthy. A level of 50 or more parts per million is considered unhealthy.

Select all the statements that must be true.

A. The lowest recorded measurement was in city Q.
B. All buildings tested in city P are in the healthy range.
C. The mean for city P is greater than the mean for city Q.
D. The range for city Q is greater than the range for city P.
E. The median for city P is greater than the median for city Q.

Show Solution
A, B, D, E
Problem 4

This box plot displays information about the number of text messages that some students sent to their parents in one day.

Box plot from 0 to 50 by 5’s. Number of texts sent to parents. Whisker from 0 to 10. Box from 10 to 20 with vertical line at 14. Whisker from 20 to 50.
A box plot for “number of texts.” The numbers 0 through 50, in increments of 5, are indicated. The five number summary for the box plot is as follows: Minimum value, 0. Maximum value, 50. Q1, 10. Q2, 14. Q3, 20.

  1. What is the median number of texts sent by students?
  2. What is the IQR (interquartile range)?
  3. Is this data set symmetric? Explain how you know.
Show Solution
  1. 14 text messages (also accept anything between 13 and 14.5).
  2. 10 text messages (the IQR is $20-10$, which is 10).
  3. No. The top quartile (or top whisker) is much wider than the bottom quartile.

Tier 1 response:

  • Accurate, correct work.
  • Correct answers to all three questions, including a correct explanation of why the data set is not symmetric.
  • Acceptable errors: Claim that the data set is not symmetric because the right side of the box is wider than the left side, without reference to whiskers.

Tier 2 response:

  • Work shows general conceptual understanding and mastery, with some errors.
  • Sample errors: Incorrect median; incorrect IQR; incorrect answer or explanation on data symmetry question, including a general statement that the box plot is not symmetric (not specific enough).

Tier 3 response:

  • Significant errors in work demonstrate lack of conceptual understanding or mastery.
  • Sample errors: Two or more error types from Tier 2 response.
Problem 5

Two groups went bowling. Here are the scores from each group.

Group A

  • 70
  • 80
  • 90
  • 100
  • 110
  • 130
  • 190

Group B

  • 50
  • 100
  • 107
  • 110
  • 120
  • 140
  • 150
  1. Draw two box plots, one for the data in each group.

    A blank box plot for "score in points". The numbers 50 through 190, in increments of 10, are indicated.

  2. Which group shows greater variability?
Show Solution
  1. Box plot. Score in points.

  2. Group A shows greater variability. It has a wider range (120 to Group B’s 100), and a wider IQR (50 to Group B’s 40).
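The five numbers that each box plot should show can be computed with a short Python sketch. It assumes the median-of-halves quartile method (the middle value is left out of both halves when the count is odd), which reproduces the IQRs quoted above. The function name `five_number_summary` is illustrative.

```python
from statistics import median

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    values = sorted(data)
    half = len(values) // 2
    return (
        values[0],
        median(values[:half]),    # lower half, middle value excluded
        median(values),
        median(values[-half:]),   # upper half, middle value excluded
        values[-1],
    )

group_a = [70, 80, 90, 100, 110, 130, 190]
group_b = [50, 100, 107, 110, 120, 140, 150]

for name, scores in (("A", group_a), ("B", group_b)):
    low, q1, middle, q3, high = five_number_summary(scores)
    print(name, (low, q1, middle, q3, high), "range:", high - low, "IQR:", q3 - q1)
# A (70, 80, 100, 130, 190) range: 120 IQR: 50
# B (50, 100, 110, 140, 150) range: 100 IQR: 40
```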

Tier 1 response:

  • Accurate, correct work.
  • Both box plots are drawn correctly, correctly stating that Group A shows greater variability.

Tier 2 response:

  • Work shows general conceptual understanding and mastery, with some errors.
  • Sample errors: 1 or 2 types of minor errors in creating box plots (incorrect placement of median, quartiles, max or min, badly drawn box); incorrectly stating Group B shows greater variability or omitting question.

Tier 3 response:

  • Significant errors in work demonstrate lack of conceptual understanding or mastery.
  • Sample errors: More than 2 types of minor errors in creating box plots; major errors in creating box plots, such as not using 5 numbers to generate box plot; creating only one box plot.
Problem 6

Ten students each attempted 10 free throws. This list shows how many free throws each student made.

  • 8
  • 5
  • 6
  • 6
  • 4
  • 9
  • 7
  • 6
  • 5
  • 9

  1. What is the median number of free throws made?
  2. What is the IQR (interquartile range)?
Show Solution
  1. 6 free throws. (The ordered list is $4, 5, 5, 6, 6, 6, 7, 8, 9, 9$. The two middle terms in the ordered list are both 6.)
  2. 3 free throws. (The first half of the data is $4, 5, 5, 6, 6$; its median is 5. The second half of the data is $6, 7, 8, 9, 9$; its median is 8. The IQR is 3, since $8 - 5 = 3$.)
Problem 7

Jada asked some students at her school how many hours they spent watching television last week, to the nearest hour. Here are a box plot and a histogram for the data she collected.

Box plot:

Box plot from 0 to 26 by 2’s. Time in hours. Whisker from 0 to 2. Box from 2 to 10 with vertical line at 5. Whisker from 10 to 26.
A box plot for “time in hours.” The numbers 0 through 26, in increments of two, are indicated. The five-number summary for the box plot is as follows: Minimum value, 0. Maximum value, 26. Q1, 2. Q2, 5. Q3, 10.

Histogram:

Histogram from 0 to 30 by 5’s. Time in hours. Beginning at 0 up to but not including 5, height of bar at each interval is 40, 30, 20, 5, 4,3
A histogram: the horizontal axis is labeled “time in hours,” and the numbers 0 through 30, in increments of 5, are indicated. On the vertical axis, the numbers 0 through 40, in increments of 5, are indicated. The data represented by the bars are as follows: From 0 up to 5 hours, 40; From 5 up to 10 hours, 30; From 10 up to 15 hours, 20; From 15 up to 20 hours, 5; From 20 up to 25 hours, 3; From 25 up to 30 hours, 2

  1. About how many students did Jada ask?
  2. Is the mean or the median a more appropriate measure of center for this data set? Explain your reasoning.

  3. Can Jada use these data displays to find the exact median? Explain how you know.

  4. Can Jada use these data displays to find the exact mean?
  5. What would be an appropriate measure of variability for this data set? Find or estimate its value.
Show Solution
  1. Jada asked about 100 students.
  2. The median is more appropriate because the data is not symmetric.
  3. Yes, the box plot gives the exact median, 5 hours.
  4. No
  5. Sample response: The IQR (interquartile range) is appropriate because the median is being used as a measure of center. The box plot gives the IQR of 8 hours because $10 - 2 = 8$.

Tier 1 response:

  • Accurate, correct work.
  • Correct answer to each question, description of why IQR is an appropriate measure of spread, correct IQR.
  • Acceptable errors: Mistake in determining median or IQR caused by a misreading of the box plot.

Tier 2 response:

  • Work shows good conceptual understanding and mastery, with minor errors.
  • Sample errors: Incorrect response for histogram total, larger than 6; stating that the data is symmetric; attempt to calculate precise mean; incorrect or missing IQR calculation.
  • Acceptable errors: Incorrect MAD estimation, given (incorrect) statement that data is symmetric.

Tier 3 response:

  • Work shows a developing but incomplete conceptual understanding, with significant errors.
  • Sample errors: Two or more error types from Tier 2 response; incorrect response for histogram total, 6 or fewer; incorrect median; invalid use of box plot to determine mean.

Tier 4 response:

  • Work includes major errors or omissions that demonstrate a lack of conceptual understanding and mastery.
  • Sample errors: Three or more error types from Tier 2 response; two or more error types from Tier 3 response; multiple omitted parts.