Unit 8 Data Sets And Distributions — Unit Plan

Lesson 4
Dot Plots (Part A)
6.SP.4

Display quantitative data in plots on a number line, including dot plots, and histograms.

A dot plot is a way to show data on a number line. It is the same idea as the line plot from Grade 4 — just with dots instead of X marks.

Every dot plot has four parts:

  • Title — what the data is about (for example, “Hours on Homework”).
  • Key — what each dot stands for (usually each dot = 1 student).
  • Axis label — what the numbers along the bottom mean.
  • Dots — one dot for each piece of data, stacked above the matching number.

To read a dot plot, count the dots above a single number. That count is the frequency for that value. The number with the tallest stack is the mode — the most common value.

To build a dot plot from a frequency table, draw the right number of dots above each tick to match the frequency in the table. The total number of dots equals the total in the table.
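The build-and-read steps above can be sketched in a few lines of Python. This is only an illustration: the frequency table is made-up data, and `dot_plot_rows` is a hypothetical helper name, not part of the lesson.

```python
# Hypothetical frequency table (made-up data): hours of homework -> students.
frequency_table = {0: 1, 1: 3, 2: 4, 3: 2, 4: 0}


def dot_plot_rows(table):
    """Return the text rows of a dot plot, top row first, axis row last.

    Columns line up only for single-digit values, which is enough here.
    """
    values = sorted(table)
    tallest = max(table.values())
    rows = []
    for level in range(tallest, 0, -1):
        # A column gets a dot on this row only if its stack reaches this level.
        row = " ".join("*" if table[v] >= level else " " for v in values)
        rows.append(row.rstrip())
    rows.append(" ".join(str(v) for v in values))  # the number line
    return rows


rows = dot_plot_rows(frequency_table)
total_dots = sum(frequency_table.values())            # matches the table total
mode = max(frequency_table, key=frequency_table.get)  # value with the tallest stack

print("\n".join(rows))
print("total dots:", total_dots, "| mode:", mode)
```

The total number of dots (10) equals the sum of the frequencies, and the tallest stack sits above 2, so 2 is the mode of this made-up data set.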

Reading & Building a Dot Plot (3 problems)
Problem 1

Use the dot plot to find the frequency for 2 hours. How many students spent 2 hours on homework on Wednesday?

A dot plot titled “Hours on Homework (Wednesday).” The horizontal axis shows 0 through 4 hours.

Show Solution
  1. 3 students. The dot plot has 3 dots stacked above 2.
Problem 2

Use the dot plot to answer both questions:

A dot plot titled “Number of Siblings.” The horizontal axis shows 0 through 3 siblings.

A. Which value on the dot plot has the most students?

B. How many students chose that value?

Show Solution
  2. A: 1. B: 5 students. The column above 1 has 5 dots, more than any other column.
Problem 3

Use the frequency table to build the dot plot from scratch. Type the number of students for each value of hours.

Hours Students
0 2
1 4
2 3
3 1
4 0

A dot plot titled “Hours on Homework (Friday),” to be completed. The horizontal axis, labeled “Hours on Homework,” shows 0 through 4.

Show Solution
  3. The dot plot has 2 dots above 0, 4 dots above 1, 3 dots above 2, 1 dot above 3, and 0 dots above 4 — matching the frequency table.
Lesson 4
Dot Plots (Part B)
6.SP.4

Display quantitative data in plots on a number line, including dot plots, and histograms.

6.SP.B


6.SP.2

Understand that a set of quantitative data collected to answer a statistical question has a distribution which can be described by its center, spread, and overall shape.

We often collect and analyze data because we are interested in learning what is “typical,” or what is common and can be expected in a group.

Sometimes it is easy to tell what a typical member of the group is. For example, we can say that a typical shape in this set is a large circle.

A set that consists of 17 shapes. There are 10 large circles, 1 medium circle, 3 small circles, 1 large square, and 2 small squares.

Just looking at the members of a group doesn’t always tell us what is typical, however. For example, if we are interested in the side length typical of squares in this set, it isn’t easy to determine just by studying the set visually.

A set that consists of 18 squares of varying side lengths.

In a situation like this, it is helpful to gather the side lengths of the squares in the set and look at their distribution, as shown in this dot plot.

A dot plot for "side lengths in centimeters". The numbers 1 through 8 are indicated. The data are as follows: 2 centimeters, 4 dots. 3 centimeters, 5 dots. 4 centimeters, 3 dots. 5 centimeters, 3 dots. 6 centimeters, 2 dots. 7 centimeters, 1 dot.

We can see that squares with 3 centimeter sides are the most common and many others are about the same size. That means we could say that side lengths of about 3 centimeters are typical of squares in this set.
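As a rough check of this reading, the dot counts from the plot description can be tallied in Python. Treating “about the same size” as “within 1 centimeter of the most common length” is an assumption made only for this sketch.

```python
from collections import Counter

# Dots above each side length (cm), copied from the dot plot description.
dots = {2: 4, 3: 5, 4: 3, 5: 3, 6: 2, 7: 1}

# Expand the plot back into one value per square.
side_lengths = [length for length, count in dots.items() for _ in range(count)]

counts = Counter(side_lengths)
most_common_length, frequency = counts.most_common(1)[0]

# Assumption for illustration: "about the same size" means within 1 cm.
near_most_common = sum(
    count for length, count in dots.items()
    if abs(length - most_common_length) <= 1
)

print(len(side_lengths), "squares in all")
print("most common side length:", most_common_length, "cm,", frequency, "squares")
print(near_most_common, "squares are within 1 cm of that length")
```

Under that assumption, 12 of the 18 squares have side lengths of 2, 3, or 4 centimeters, which supports calling about 3 centimeters typical.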

Family Size (2 problems)
Problem 1

A group of students was asked, “How many children are in your family?” The responses are displayed in the dot plot below.

A dot plot, number of children, 0 through 6 by ones. Starting at 0, the number of dots above each increment is 0, 5, 7, 5, 2, 1, 0.

How many students responded to the question?

Show Solution
  1. There are 20 dots and each dot represents one student in the group, so 20 students responded.
Problem 2

Using the same dot plot, how many students have more than one child in their family?

Show Solution
  2. 15 students. Counting the dots above 2, 3, 4, and 5 (every value greater than 1) gives 7 + 5 + 2 + 1 = 15 students.
Lesson 6
Histograms (Part A)
6.SP.4

Display quantitative data in plots on a number line, including dot plots, and histograms.

A histogram shows how often values fall into different ranges. Each bar covers a bin — a range of numbers like “20 up to (but not including) 30”, written [20, 30).

  • The height of a bar tells you how many values fall in its bin.
  • Each value belongs to exactly one bin. The left number is included; the right number is not. So 30 goes in [30, 40), not in [20, 30).
  • The total number of values is the sum of all the bar heights.

To build a histogram from a grouped frequency table, draw one bar per row, with the bar's height matching the frequency in that row. Put the bars next to each other so that they touch, because each bin begins exactly where the previous one ends.

A histogram looks a lot like the bar graph from 2nd grade, but a histogram is for numbers, not categories. That’s why the bars touch.
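The bin rule (left number included, right number not) can be written as a small Python function. This is a minimal sketch that assumes every bin has the same width; `bin_for` is a hypothetical name chosen for the example.

```python
import math


def bin_for(value, bin_width, start=0):
    """Return (left, right) for the bin [left, right) that contains value.

    Assumes equal-width bins that begin at `start`.
    """
    index = math.floor((value - start) / bin_width)
    left = start + index * bin_width
    return (left, left + bin_width)


print(bin_for(30, 10))    # 30 sits on a boundary, so it goes in [30, 40)
print(bin_for(29.9, 10))  # just under 30, so it stays in [20, 30)
print(bin_for(2.24, 2))   # with bins of width 2, this lands in [2, 4)
```

Rounding down is what makes the left edge inclusive: a value exactly on a boundary starts the next bin instead of ending the previous one.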

Reading & Building a Histogram (3 problems)
Problem 1

Use the histogram to find the frequency for the [20, 30) bin. How many students read for 20 to less than 30 minutes?

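A histogram titled “Minutes Spent Reading.” The horizontal axis, labeled “minutes spent reading,” shows 0 through 50 by tens. The vertical axis, labeled “number of students,” shows 0 through 10.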

Show Solution
  1. 8 students. The bar over the bin [20, 30) has a height of 8.
Problem 2

The histogram has bins [0, 2), [2, 4), [4, 6), [6, 8), [8, 10). In which bin does the value 2.24 belong?

A number line from 0 to 10 by twos, with the value 2.24 marked.

Show Solution
  2. [2, 4). The decimal 2.24 is between 2 and 4, so it goes in the [2, 4) bin (2 is included, 4 is not).
Problem 3

Use the histogram to find the total number of students surveyed.

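A histogram titled “Books Read This Year.” The horizontal axis, labeled “books read,” shows 0 through 25 by fives. The vertical axis, labeled “number of students,” shows 0 through 10.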

Show Solution
  3. 20 students. Sum of bar heights: 4 + 7 + 5 + 3 + 1 = 20.
Lesson 6
Interpreting Histograms (Part B)
6.SP.1

Recognize that a statistical question is one that anticipates variability in the data related to the question and accounts for it in the answers. Understand that statistics can be used to gain information about a population by examining a sample of the population.

6.SP.3

Recognize that a measure of center for a quantitative data set summarizes all of its values with a single number while a measure of variation describes how its values vary with a single number.

6.SP.4

Display quantitative data in plots on a number line, including dot plots, and histograms.

6.SP.5.b

Summarize quantitative data sets in relation to their context.

In addition to using dot plots, we can also represent distributions of numerical data using histograms.

Here is a dot plot that shows the weights, in kilograms, of 30 dogs, followed by a histogram that shows the same distribution.

A dot plot for dog weights in kilograms
A dot plot, the numbers 10 through 35, in increments of 5, are indicated. The 30 data values are as follows: 10 kilograms, 1 dot. 11 kilograms, 1 dot. 12 kilograms, 2 dots. 13 kilograms, 1 dot. 15 kilograms, 1 dot. 16 kilograms, 2 dots. 17 kilograms, 1 dot. 18 kilograms, 2 dots. 19 kilograms, 1 dot. 20 kilograms, 3 dots. 21 kilograms, 1 dot. 22 kilograms, 3 dots. 23 kilograms, 1 dot. 24 kilograms, 2 dots. 26 kilograms, 2 dots. 28 kilograms, 1 dot. 30 kilograms, 1 dot. 32 kilograms, 2 dots. 34 kilograms, 2 dots.

A histogram for dog weights in kilograms.
A histogram, the horizontal axis is labeled “dog weights in kilograms” and the numbers 10 through 35, in increments of 5, are indicated. On the vertical axis the numbers 0 through 10, in increments of 2, are indicated. The data represented by the bars are as follows: Weight from 10 up to 15, 5. Weight from 15 up to 20, 7. Weight from 20 up to 25, 10. Weight from 25 up to 30, 3. Weight from 30 up to 35, 5.

In a histogram, data values are placed in groups, or “bins,” of a certain size, and each group is represented with a bar. The height of the bar tells us the frequency for that group.

For example, the height of the tallest bar is 10, and the bar represents weights from 20 to less than 25 kilograms, so there are 10 dogs whose weights fall in that group. Similarly, there are 3 dogs that weigh anywhere from 25 to less than 30 kilograms.
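To see how the bar heights follow from the individual weights, here is a short Python sketch that groups the 30 dot plot values into the same five bins. The variable names are illustrative only.

```python
# Dots above each weight (kg), copied from the dot plot description.
dot_plot = {
    10: 1, 11: 1, 12: 2, 13: 1, 15: 1, 16: 2, 17: 1, 18: 2, 19: 1, 20: 3,
    21: 1, 22: 3, 23: 1, 24: 2, 26: 2, 28: 1, 30: 1, 32: 2, 34: 2,
}

bin_edges = [10, 15, 20, 25, 30, 35]

bin_counts = {}
for left, right in zip(bin_edges, bin_edges[1:]):
    # Each bin includes its left edge and excludes its right edge.
    bin_counts[(left, right)] = sum(
        count for weight, count in dot_plot.items() if left <= weight < right
    )

for (left, right), count in bin_counts.items():
    print(f"{left} up to {right}: {count} dogs")
print("total:", sum(bin_counts.values()))
```

The counts come out as 5, 7, 10, 3, and 5, which match the bar heights in the histogram description and add up to 30 dogs.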

Notice that the histogram and the dot plot have a similar shape. The dot plot has the advantage of showing all of the data values, but the histogram is easier to draw and to interpret when there are a lot of values or when the values are all different.

Here is a dot plot showing the weight distribution of 40 dogs. The weights were measured to the nearest 0.1 kilogram instead of the nearest kilogram.

A dot plot for “dog weights in kilograms.”
A dot plot for “dog weights in kilograms”. The numbers 8 through 36, in increments of 2, are indicated. There is 1 dot on each of the following values: 10 kilograms, 11 kilograms, 11.3 kilograms, 12 kilograms, 12.1 kilograms, 13 kilograms, 14.7 kilograms, 15 kilograms, 15.1 kilograms, 16 kilograms, 16.5 kilograms, 17 kilograms, 18 kilograms, 18.5 kilograms, 19 kilograms, 19.1 kilograms, 20 kilograms, 20.2 kilograms, 20.4 kilograms, 21 kilograms, 21.5 kilograms, 22.6 kilograms, 22.7 kilograms, 22.8 kilograms, 23.2 kilograms, 23.4 kilograms, 24 kilograms, 24.9 kilograms, 26 kilograms, 26.1 kilograms, 26.7 kilograms, 28 kilograms, 28.4 kilograms, 30 kilograms, 31.5 kilograms, 32 kilograms, 32.1 kilograms, 33.5 kilograms, 34 kilograms, 34.4 kilograms.

Here is a histogram showing the same distribution.

Histogram from 10 to 35 by 5's. Dog weights in kilograms.

In this case, it is difficult to make sense of the distribution from the dot plot because the precision of the measurement means the dots are distinct and so close together. The histogram of the same data set does a much better job showing the distribution of weights by grouping similar values to show an overall trend, even though we can’t see the individual data values.

Rain in Miami (1 problem)

Here is the average amount of rainfall, in inches, for each month in Miami, Florida.

month rainfall (inches) month rainfall (inches)
January 1.61 July 6.5
February 2.24 August 8.9
March 2.99 September 9.84
April 3.14 October 6.34
May 5.35 November 3.27
June 9.69 December 2.05
  1. Complete the frequency table and use it to make a histogram.

    rainfall (inches) frequency
    0–2 1
    2–4 5
    4–6
    6–8
    8–10

    A blank grid, horizontal axis labeled rainfall in inches, boxes from 0 to 11 by ones, labeled 0 to 10 by twos. Vertical axis 0 to 7 by ones.

  2. What can you say about the center of this distribution using the histogram?
Show Solution
  1. rainfall (inches) frequency
    0–2 1
    2–4 5
    4–6 1
    6–8 2
    8–10 3

    A histogram.

  2. Sample response: The center of the distribution appears to be between 4 and 6 inches of rain.
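The completed frequency table can be checked with a short Python sketch. It assumes each bin includes its left edge and excludes its right edge, as in Lesson 6 Part A; no monthly value falls exactly on a bin edge, so that choice does not change the counts here.

```python
# Average monthly rainfall (inches), copied from the table in the problem.
rainfall_inches = {
    "January": 1.61, "February": 2.24, "March": 2.99, "April": 3.14,
    "May": 5.35, "June": 9.69, "July": 6.5, "August": 8.9,
    "September": 9.84, "October": 6.34, "November": 3.27, "December": 2.05,
}

bin_edges = [0, 2, 4, 6, 8, 10]

frequency = {}
for left, right in zip(bin_edges, bin_edges[1:]):
    # Assumed convention: left edge included, right edge excluded.
    frequency[(left, right)] = sum(
        1 for amount in rainfall_inches.values() if left <= amount < right
    )

for (left, right), count in frequency.items():
    print(f"{left}-{right}: {count}")
```

The counts are 1, 5, 1, 2, and 3, which agree with the table in the solution and add up to the 12 months.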
Section B Check
Section B Checkpoint
Lesson 9
Mean (Part A)
6.SP.3

Recognize that a measure of center for a quantitative data set summarizes all of its values with a single number while a measure of variation describes how its values vary with a single number.

6.SP.5.c

Summarize quantitative data sets in relation to their context.

The mean (also called the average) is one number we use to describe a whole data set. To find the mean, we use a 3-step procedure:

  1. ADD all the values in the data set.
  2. COUNT how many values there are. Call this number n.
  3. DIVIDE the sum by n.

For example, to find the mean of {4, 6, 8, 6, 6}:

  • Sum: 4 + 6 + 8 + 6 + 6 = 30
  • Count: there are 5 values.
  • Mean: 30 ÷ 5 = 6.

So the mean of this data set is 6.

The procedure works the same way for decimal data sets. To find the mean of {2.5, 3.0, 3.5, 4.0, 2.0}: the sum is 15.0, the count is 5, and the mean is 15.0 ÷ 5 = 3.0.
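The add, count, divide procedure translates directly into Python. This is a minimal sketch; the function name `mean_of` is an illustrative choice.

```python
def mean_of(values):
    """Find the mean with the three steps: ADD, COUNT, DIVIDE."""
    total = sum(values)   # ADD all the values
    n = len(values)       # COUNT how many values there are
    return total / n      # DIVIDE the sum by n


whole_number_mean = mean_of([4, 6, 8, 6, 6])
decimal_mean = mean_of([2.5, 3.0, 3.5, 4.0, 2.0])

print(whole_number_mean)  # 30 / 5
print(decimal_mean)       # 15.0 / 5
```

Python's division always returns a decimal number, so the first result prints as 6.0, which is the same value as 6.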

Computing the Mean (3 problems)
Problem 1

Find the mean of this data set: 7, 9, 11.

Show Solution

9. Sample reasoning: Add the values to get 7 + 9 + 11 = 27. There are 3 values, so the mean is 27 ÷ 3 = 9. Other valid reasoning is accepted as long as it demonstrates understanding that the mean is the sum divided by the number of values.

Problem 2

Find the mean of this data set: 4, 8, 6, 10, 7.

Show Solution

7. Sample reasoning: The sum is 4 + 8 + 6 + 10 + 7 = 35. There are 5 values, so the mean is 35 ÷ 5 = 7. Other valid reasoning is accepted as long as it demonstrates understanding of summing then dividing by the count.

Problem 3

Find the mean of this data set: 1.5, 2.0, 2.5.

Show Solution

2.0. Sample reasoning: The sum is 1.5 + 2.0 + 2.5 = 6.0. There are 3 values, so the mean is 6.0 ÷ 3 = 2.0. Other valid reasoning is accepted as long as it demonstrates understanding that the same add-and-divide procedure works for decimal data sets.

Lesson 9
Mean (Part B)
6.SP.3

Recognize that a measure of center for a quantitative data set summarizes all of its values with a single number while a measure of variation describes how its values vary with a single number.

6.SP.B


6.SP.5.c

Summarize quantitative data sets in relation to their context.

Sometimes a general description of a distribution does not give enough information, and a more precise way to talk about center or spread would be more useful. The mean, or average, is a number we can use for the center to summarize a distribution.

We can think about the mean in terms of “fair share” or “leveling out.” That is, a mean can be thought of as a number that each member of a group would have if all the data values were combined and distributed equally among the members.

For example, suppose there are 5 containers, each of which has a different amount of water: 1 liter, 4 liters, 2 liters, 3 liters, and 0 liters.

5 diagrams, each composed of 4 squares, some colored blue. From left to right, the number of blue squares in each diagram are 1, 4, 2, 3, 0.
There are 5 identical tape diagrams that are each partitioned into 4 equal parts. The first diagram has 1 part shaded. The second diagram has 4 parts shaded. The third diagram has 2 parts shaded. The fourth diagram has 3 parts shaded. The fifth diagram has no parts shaded.

To find the mean, first we add up all of the values. We can think of this as putting all of the water together: 1 + 4 + 2 + 3 + 0 = 10.

A tape diagram partitioned into 10 equal parts. All 10 parts are shaded.

To find the “fair share,” we divide the 10 liters equally into the 5 containers: 10 ÷ 5 = 2.

There are 5 identical tape diagrams each partitioned into 4 equal parts. Each diagram has 2 parts shaded.

The mean is useful when each unit of measurement has equal importance. For example, it may make sense to find the mean score of assignments of the same importance, such as all quizzes. If some grades are more important, it may not make sense to find the mean. For example, it may not make sense to find the mean score when there are 6 short homework assignments and one major essay.

Suppose the quiz scores of a student are 70, 90, 86, and 94. We can find the mean (or average) score by finding the sum of the scores (70 + 90 + 86 + 94 = 340) and dividing the sum by four (340 ÷ 4 = 85). We can then say that the student scored, on average, 85 points on the quizzes.

In general, to find the mean of a data set with n values, we add all of the values and divide the sum by n.
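The “fair share” idea can be sketched in Python by pooling everything and then handing each member an equal amount. The function name `fair_share` is a hypothetical choice for this illustration.

```python
def fair_share(amounts):
    """Pool all the amounts, then give every member the same share."""
    total = sum(amounts)            # put everything together
    share = total / len(amounts)    # split it equally
    return [share] * len(amounts)


water_liters = [1, 4, 2, 3, 0]
leveled = fair_share(water_liters)

quiz_scores = [70, 90, 86, 94]
quiz_mean = sum(quiz_scores) / len(quiz_scores)

print(leveled)    # every container ends up with the mean amount
print(quiz_mean)  # 340 / 4
```

Leveling out does not change the total: the five containers still hold 10 liters altogether, but now each one holds the mean, 2 liters.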

Finding Means (1 problem)

Last week, the daily low temperatures for a city, in degrees Celsius, were 5, 8, 6, 5, 10, 7, and 1. What was the average low temperature? Show your reasoning.

Show Solution

6 degrees Celsius. The sum of the temperatures divided by the total number of recorded temperatures is (5 + 8 + 6 + 5 + 10 + 7 + 1) ÷ 7 = 6.

Section C Check
Section C Checkpoint
Lesson 13
Median (Part A)
6.SP.5.c

Summarize quantitative data sets in relation to their context.

The median is the middle value of a data set when the values are listed from least to greatest. Half of the values are at or below the median, and half are at or above the median.

To find the median:

  1. Order the values from least to greatest.
  2. Cross off from both ends of the list — one from the left, one from the right — until you reach the middle.
  3. If there is one middle value (an odd number of values), that value is the median.
  4. If there are two middle values (an even number of values), the median is the average of those two values: add them and divide by 2.

Odd N example: The data set 4, 7, 9, 12, 15 has 5 values. The middle (3rd) value is 9, so the median is 9.

Even N example: The data set 6, 10, 14, 18 has 4 values. The two middle values are 10 and 14. The median is (10 + 14) / 2 = 24 / 2 = 12.

Split-the-difference example: The data set 18, 22, 25, 27 has 4 values. The two middle values are 22 and 25. The median is (22 + 25) / 2 = 47 / 2 = 23.5. The median doesn't have to be a whole number, and it doesn't have to be a value that's actually in the data set.
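The cross-off procedure can be sketched in Python as follows. The function name `median_by_crossing_off` is illustrative, and the sketch assumes the data set is not empty.

```python
def median_by_crossing_off(values):
    """Find the median by crossing off one value from each end at a time."""
    if not values:
        raise ValueError("the data set must contain at least one value")
    ordered = sorted(values)              # step 1: order least to greatest
    while len(ordered) > 2:
        ordered = ordered[1:-1]           # step 2: cross off both ends
    if len(ordered) == 1:
        return ordered[0]                 # step 3: one middle value
    return (ordered[0] + ordered[1]) / 2  # step 4: average the two middle values


print(median_by_crossing_off([4, 7, 9, 12, 15]))   # odd number of values
print(median_by_crossing_off([6, 10, 14, 18]))     # even number of values
print(median_by_crossing_off([18, 22, 25, 27]))    # median not in the data set
```

The three calls reproduce the examples above: 9, then 12 (printed as 12.0), then 23.5.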

Find the Median (3 problems)
Problem 1

Find the median of this data set:

4, 7, 9, 12, 15

Show Solution
  1. Median of 4, 7, 9, 12, 15 is 9. The data is already in order; 9 is the middle (3rd of 5) value.
Problem 2

Find the median of this data set:

11, 5, 8, 3, 14

Show Solution
  2. Median of 11, 5, 8, 3, 14 is 8. Sorted: 3, 5, 8, 11, 14. The middle (3rd of 5) value is 8.
Problem 3

Find the median of this data set:

6, 10, 14, 18

Show Solution
  3. Median of 6, 10, 14, 18 is 12. The data is in order; the two middle values are 10 and 14. (10 + 14) / 2 = 12.
Lesson 13
Median (Part B)
6.SP.B


6.SP.5.c

Summarize quantitative data sets in relation to their context.

The median is another measure of center for a distribution. It is the middle value in a data set when values are listed in order. The number of values less than or equal to the median is the same as the number of values that are greater than or equal to the median.

To find the median, we order the data values from least to greatest and find the number in the middle.

Suppose we have 5 dogs whose weights, in pounds, are shown in the table. The median weight for this group of dogs is 32 pounds because three dogs weigh less than or equal to 32 pounds and three dogs weigh greater than or equal to 32 pounds.

20 25 32 40 55

Now suppose we have 6 cats whose weights, in pounds, are listed here. Notice that there are 2 values in the middle: 7 and 8.

4 6 7 8 10 10

The median weight must be between 7 and 8 pounds, because half of the cats weigh less than or equal to 7 pounds, and half of the cats weigh greater than or equal to 8 pounds.

When there is an even number of values, we take the number exactly in between the two middle values. In this case, the median cat weight is 7.5 pounds because (7 + 8) ÷ 2 = 7.5.
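The dog and cat examples can be checked with Python's standard `statistics.median` function. The helper `split_counts` is a hypothetical name; it counts how many values sit at or below, and at or above, a given middle value.

```python
from statistics import median


def split_counts(values, middle):
    """Count the values at or below `middle` and the values at or above it."""
    at_or_below = sum(1 for v in values if v <= middle)
    at_or_above = sum(1 for v in values if v >= middle)
    return at_or_below, at_or_above


dog_weights = [20, 25, 32, 40, 55]
cat_weights = [4, 6, 7, 8, 10, 10]

dog_median = median(dog_weights)
cat_median = median(cat_weights)

print(dog_median, split_counts(dog_weights, dog_median))
print(cat_median, split_counts(cat_weights, cat_median))
```

For both groups the two counts are equal (3 and 3), which is the property that makes 32 pounds and 7.5 pounds the medians.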

Jada's and Diego's Piano Practice (3 problems)
Problem 1

Jada is practicing the piano for an upcoming rehearsal. Here are the number of minutes she practiced on each of 13 days:

Jada's practice times (in minutes):

20, 10, 25, 15, 8, 25, 35, 20, 10, 15, 25, 40, 20

Find the median of Jada's practice times.

Show Solution
  1. Jada's median is 20 minutes. After ordering Jada's 13 values (8, 10, 10, 15, 15, 20, 20, 20, 25, 25, 25, 35, 40), the 7th value is the middle value, which is 20.
Problem 2

Diego is also practicing the piano. Here are the number of minutes he practiced on each of 10 days:

Diego's practice times (in minutes):

22, 15, 20, 25, 32, 26, 16, 18, 28, 30

Find the median of Diego's practice times.

Show Solution
  2. Diego's median is 23.5 minutes. After ordering Diego's 10 values (15, 16, 18, 20, 22, 25, 26, 28, 30, 32), the median is the average of the 5th and 6th values: (22 + 25) / 2 = 23.5.
Problem 3

Jada's median practice time is 20 minutes. Diego's median practice time is 23.5 minutes.

What does this tell you about how Jada's and Diego's piano practice are different? Explain your reasoning in 1-2 sentences.

Show Solution
  3. Sample reasoning: Diego's median of 23.5 minutes is greater than Jada's median of 20 minutes, so a typical practice session for Diego is longer than a typical practice session for Jada. Other valid reasoning is accepted as long as it demonstrates understanding that the median represents a typical value, and that comparing medians compares typical values across two data sets.
Lesson 14
Comparing Mean and Median (Part A)
6.SP.5.c

Summarize quantitative data sets in relation to their context.

The mean and the median are two different ways to describe the center of a data set. Sometimes they're equal. Sometimes they're very different. Looking at both side-by-side tells us about the shape of the distribution.

To compute the mean: add all the values, then divide by the number of values.

To compute the median: order the values from least to greatest, then find the middle (or average the two middle values for even N).

Symmetric example: The data set 4, 6, 8, 6, 6 has mean (4 + 6 + 8 + 6 + 6) / 5 = 30 / 5 = 6. Sorted: 4, 6, 6, 6, 8. The median is the 3rd value, 6. Mean = median = 6.

Skewed-right example: The data set 2, 3, 4, 5, 21 has mean (2 + 3 + 4 + 5 + 21) / 5 = 35 / 5 = 7. Sorted: 2, 3, 4, 5, 21. The median is 4. The mean (7) is much greater than the median (4) because the value 21 pulls the mean up. The median is not affected the same way.

The pattern:

  • When a data set is roughly symmetric (no extreme values), mean ≈ median.
  • When a data set has a value that's much greater than the others, the mean is pulled up and is greater than the median.
  • When a data set has a value that's much smaller than the others, the mean is pulled down and is less than the median.
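The pattern above can be explored with a short Python sketch that computes both measures and reports which is larger. The first two data sets come from the examples; the third, with one very small value, is made up for illustration, and `compare_centers` is a hypothetical helper name.

```python
from statistics import mean, median


def compare_centers(values):
    """Return the mean, the median, and how the two compare."""
    m, med = mean(values), median(values)
    if m > med:
        relation = "mean > median"
    elif m < med:
        relation = "mean < median"
    else:
        relation = "mean = median"
    return m, med, relation


symmetric = [4, 6, 8, 6, 6]
pulled_up = [2, 3, 4, 5, 21]
pulled_down = [1, 20, 21, 22, 23]  # made-up data with one very small value

for data in (symmetric, pulled_up, pulled_down):
    print(data, compare_centers(data))
```

In the made-up third set the value 1 pulls the mean down to 17.4 while the median stays at 21, mirroring the way 21 pulls the mean up in the second set.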
Compute Both Mean and Median (3 problems)
Problem 1

For the data set 4, 6, 8, 6, 6:

  1. Find the mean.
  2. Find the median.
Show Solution
  1. For the data set 4, 6, 8, 6, 6: mean = (4+6+8+6+6)/5 = 30/5 = 6; sorted is 4, 6, 6, 6, 8 so median = 6. Both equal 6.
Problem 2

For the data set 2, 3, 4, 5, 21:

  1. Find the mean.
  2. Find the median.
Show Solution
  2. For the data set 2, 3, 4, 5, 21: mean = (2+3+4+5+21)/5 = 35/5 = 7; sorted is 2, 3, 4, 5, 21 so median = 4.
Problem 3

For the data set 1, 2, 3, 4, 25: is the mean greater than, less than, or equal to the median? Explain.

Show Solution
  3. For the data set 1, 2, 3, 4, 25: mean = (1+2+3+4+25)/5 = 35/5 = 7; sorted is 1, 2, 3, 4, 25 so median = 3. The mean is greater than the median because 25 pulls the mean up.
Lesson 14
Comparing Mean and Median (Part B)
6.SP.5.b

Summarize quantitative data sets in relation to their context.

6.SP.5.c

Summarize quantitative data sets in relation to their context.

6.SP.5.d

Summarize quantitative data sets in relation to their context.

Both the mean and the median are ways of measuring the center of a distribution. They tell us slightly different things, however.

The dot plot shows the number of stickers on 30 pages. The mean number of stickers is 21 (marked with a triangle). The median number of stickers is 20.5 (marked with a diamond).

A dot plot for stickers on a page. The numbers 8 through 34, in increments of 2, are indicated. A diamond is indicated at 20.5 stickers and a triangle is indicated at 21 stickers. Data are as follows: 9 stickers, 1 dot; 10 stickers, 1 dot; 11 stickers, 2 dots; 12 stickers, 1 dot; 14 stickers, 1 dot; 16 stickers, 2 dots; 17 stickers, 1 dot; 18 stickers, 2 dots; 19 stickers, 1 dot; 20 stickers, 3 dots; 21 stickers, 1 dot; 22 stickers, 3 dots; 23 stickers, 1 dot; 24 stickers, 2 dots; 26 stickers, 2 dots; 28 stickers, 1 dot; 30 stickers, 1 dot; 32 stickers, 2 dots; 33 stickers, 1 dot; 34 stickers, 1 dot.

The mean tells us that if the number of stickers were distributed so that each page has the same number, then each page would have 21. We could also think of 21 stickers as a balance point for the number of stickers on all of the pages in the set. 

The median tells us that half of the pages have more than 20.5 stickers and half have less than 20.5 stickers. In this case, both the mean and the median could describe a typical number of stickers on a page because they are fairly close to each other and to most of the data points.

Here is a different set of 30 pages with stickers. It has the same mean as the first set, but the median is 23 stickers.

A dot plot for “stickers on a page.” The numbers 8 through 34, in increments of 2, are indicated. A triangle is indicated at 21 stickers, and a diamond is indicated at 23 stickers. The data are as follows: 9 stickers, 1 dot; 10 stickers, 1 dot; 13 stickers, 1 dot; 14 stickers, 1 dot; 16 stickers, 1 dot; 17 stickers, 1 dot; 19 stickers, 1 dot; 20 stickers, 2 dots; 21 stickers, 2 dots; 22 stickers, 3 dots; 23 stickers, 6 dots; 24 stickers, 5 dots; 25 stickers, 4 dots; 26 stickers, 1 dot.

In this case, the median is closer to where most of the data points are clustered and is therefore a better measure of center for this distribution. That is, it is a better description of the typical number of stickers on a page. The mean number of stickers is influenced (in this case, pulled down) by a handful of pages with very few stickers, so it is farther away from most data points.

In general, when a distribution is symmetrical or approximately symmetrical, the mean and median values are close. But when a distribution is not roughly symmetrical, the two values tend to be farther apart.
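The means and medians quoted for the two sticker sets can be reproduced from the dot counts in the plot descriptions. This Python sketch uses the standard `statistics` module; `expand` is a hypothetical helper that turns dot counts back into a list of values.

```python
from statistics import mean, median


def expand(dots):
    """Turn {value: number of dots} into a sorted list with one entry per dot."""
    return [value for value, count in sorted(dots.items()) for _ in range(count)]


# Dot counts copied from the two dot plot descriptions.
first_set = expand({
    9: 1, 10: 1, 11: 2, 12: 1, 14: 1, 16: 2, 17: 1, 18: 2, 19: 1, 20: 3,
    21: 1, 22: 3, 23: 1, 24: 2, 26: 2, 28: 1, 30: 1, 32: 2, 33: 1, 34: 1,
})
second_set = expand({
    9: 1, 10: 1, 13: 1, 14: 1, 16: 1, 17: 1, 19: 1, 20: 2, 21: 2, 22: 3,
    23: 6, 24: 5, 25: 4, 26: 1,
})

for name, pages in (("first set", first_set), ("second set", second_set)):
    print(name, "pages:", len(pages), "mean:", mean(pages), "median:", median(pages))
```

Both sets have 30 pages and a mean of 21 stickers, but the median moves from 20.5 in the first set to 23 in the second, where most pages cluster between 22 and 25 stickers.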

Which Measure of Center to Use? (1 problem)

For each dot plot or histogram:

  1. Predict if the mean is greater than, less than, or approximately equal to the median. Explain your reasoning.
  2. Which measure of center (the mean or the median) better describes a typical value for the distribution?

Heights of 50 basketball players
Histogram from 66 to 80 by 2’s. Height in inches. Beginning at 66 up to but not including 68, height of bar at each interval is 12, 3, 14, 18, 6, 4, 1.

Backpack weights of 55 sixth-grade students
Dot plot from 0 to 16 by 2’s. Backpack weight in kilograms. Beginning at 0, number of dots above each increment from 0 to 9 is 0, 7, 9, 12, 7, 6, 3, 3, 2, 1. 1 dot above 16.

Ages of 30 people at a family dinner party
Histogram from 5 to 50 by 5’s. Age in years. Beginning at 5 up to but not including 10, height of bar at each interval is 2, 3, 1, 1, 2, 3, 2, 5, 11.

Show Solution

Sample responses:

  1. Player heights
    1. The mean would be approximately equal to the median, because the data are roughly symmetric.
    2. Since I think the values would be pretty close, either the mean or the median would describe a typical height pretty well.
  2. Backpack weights
    1. The mean would be higher than the median. The value of 16 kilograms would bring the mean up and move it away from the center of the data.
    2. The median would better describe a typical backpack weight, since that value would lie in the center of the large cluster of data points.
  3. People's ages
    1. The mean would be lower than the median, because even though a large fraction of the people at the dinner party are 40 or older, the ages of the people that span from 5 to 40 would bring the average age down.
    2. The median, at around 40–45 years old, would better describe the center of the distribution.
Section D Check
Section D Checkpoint
Unit 8 Assessment
Unit 8 Assessment - Data Sets and Distributions