Unit 1 One Variable Statistics — Unit Plan

Lesson 1
Getting to Know You
Categorizing Questions

Categorize each of these questions as one of these types, then explain your reasoning for putting the question in that category.

  • Statistical question requiring numerical data to answer it
  • Statistical question requiring categorical data to answer it
  • Non-statistical question
  1. On average, how many books does each person in the United States read each year?
  2. How many acts are in the play Romeo and Juliet?
  3. Which book was read most by students in the class this summer?
  4. How many books are in the classroom right now?
Show Solution
  1. Statistical question requiring numerical data to answer it. The data will be numbers and will have some variability.
  2. Non-statistical question since there is one right answer to the question.
  3. Statistical question requiring categorical data to answer it. The data will be words or phrases and will have some variability.
  4. Non-statistical question since there is one right answer to the question.
Lesson 2
Data Representations
Reasoning about Representations

The dot plot, histogram, and box plot represent the distribution of the same data in 3 different ways.

  1. What information can be seen most easily in the dot plot?

  2. What information can be seen most easily in the histogram?

  3. What information can be seen most easily in the box plot?

<p>Dot plot from 1 to 8 by 0.5’s. battery life in hours. Beginning at 1, number of dots above each increment is 0,0,0,2,2,4,2,4,2,6,2,2,0,0,0.</p>

<p>Histogram from 1 to 8 by 1’s. battery life in hours. Beginning at 1 up to but not including 2, height of bar at each interval is 0, 2, 6, 6, 8, 3, 4, 0.</p>

<p>Boxplot from 1 to 8 by 0.5’s. battery life in hours. Whisker from 2.5 to 3.5. Box from 3.5 to 5.5 with a vertical line at 4.5. Whisker from 5.5 to 6.5.</p>

Show Solution

Sample response:

  1. The actual values, the shape of the distribution, and the most common value are easily seen in the dot plot.
  2. The shape of the distribution and the most common interval of data are easily seen in the histogram.
  3. The five-number summary (minimum, first quartile, median, third quartile, and maximum) is easily seen in the box plot.
Section A Check
Section A Checkpoint
Problem 1
What do dot plots and box plots have in common? What is different?
Show Solution

Sample response:

Both dot plots and box plots can show data visually. They both easily show the maximum and minimum values. Dot plots show all of the data values while box plots do not. Box plots show the median, quartiles, and interquartile range while dot plots do not easily show those summary statistics.

Lesson 4
The Shape of Distributions
Distribution Types

Describe each of these distributions. If more than one term applies, include all the terms that describe each distribution. Where possible, use the terms:

  • Symmetric distribution
  • Skewed distribution
  • Bell-shaped distribution
  • Uniform distribution
  • Bimodal distribution
  1.  
    <p>Dot plot from 0 to 12 by 1’s. Beginning at 0, number of dots above each increment is 0, 1, 1, 2, 2, 3, 5, 3, 2, 2, 1, 1, 0.</p>
  2.  
    <p>Dot plot from 0 to 12 by 1’s. Beginning at 0, number of dots above each increment is 0, 4, 5, 4, 3, 2, 2, 1, 1, 1, 0, 0, 0.</p>
  3.  
    <p>Dot plot from 0 to 12 by 1’s. Beginning at 0, number of dots above each increment is 0, 6, 3, 2, 1, 1, 0, 1, 1, 1, 6, 1, 0.</p>
  4.  
    <p>Dot plot from 0 to 12 by 1’s. Beginning at 0, number of dots above each increment is 0, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 0.</p>
  5. Which of these distributions is most likely to show data collected while studying the number of plates people use while eating at an all-you-can-eat buffet? Explain your reasoning.
Show Solution
  1. Symmetric, bell-shaped with center near 6
  2. Skewed right with center near 4
  3. Skewed right, bimodal with center near 6
  4. Symmetric, uniform with center near 6
  5. Sample response: The second dot plot, since most people will use between 1 and 3 plates while a rare person might go back again and again to use up to nine plates. The other distributions don't make much sense for this context.
Lesson 5
Calculating Measures of Center and Variability
Calculating MAD and IQR
  • 5
  • 18
  • 6
  • 18
  • 13

mean: 12

  1. Find the mean absolute deviation for the data.
  2. Find the interquartile range for the data.
Show Solution
  1. 5.2
  2. 12.5
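The two answers above can be checked with a short script. This is a minimal sketch, assuming the quartiles are found as the medians of the lower and upper halves of the sorted data (with the middle value left out when the count is odd); the function names are illustrative only.

```python
from statistics import mean, median

data = [5, 18, 6, 18, 13]

def mean_absolute_deviation(values):
    """Average distance of each value from the mean."""
    m = mean(values)
    return mean([abs(v - m) for v in values])

def interquartile_range(values):
    """Q3 - Q1, with quartiles taken as medians of the lower and upper halves.

    Assumes at least two values; the middle value is excluded when the
    number of values is odd.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]
    upper = ordered[-half:]
    return median(upper) - median(lower)

mad = mean_absolute_deviation(data)
iqr = interquartile_range(data)
print(mad)  # 5.2
print(iqr)  # 12.5
```

Other quartile conventions exist (some software interpolates), so a spreadsheet or calculator may report a slightly different IQR for the same data.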
Section B Check
Section B Checkpoint
Problem 1

Describe the shape of the distribution. Include a measure of center and a measure of variability that would make sense to use, and estimate values for each.

dot plot of values ranging from 23 to 37

Show Solution

Sample response: The distribution is bell-shaped and symmetric with a mean of 30 and a MAD of about 2.

Lesson 6
Mystery Computations
What Does This Do?

<p>A spreadsheet with rows 1 to 7 and columns A to C. A1 contains 2. A2 contains 5. A3 contains 7. A4 contains 12. A5 contains 18. C1 contains =(A1 + A2)*A3.  All other cells are blank.</p>

  1. What value will be in cell C1 after the formula is entered?
  2. What value will be in cell C1 if the numbers in column A are changed to 1, 2, 3, 4, and 5 in that order?
Show Solution
  1. 49
  2. 9
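As a quick check of both answers, the formula in C1 can be mirrored outside the spreadsheet. This is only a sketch: the function name `c1` is made up for illustration, and the list stands in for column A.

```python
def c1(column_a):
    """Mirror the spreadsheet formula =(A1 + A2)*A3 using the first three cells."""
    a1, a2, a3 = column_a[:3]
    return (a1 + a2) * a3

original = c1([2, 5, 7, 12, 18])
changed = c1([1, 2, 3, 4, 5])
print(original)  # 49
print(changed)   # 9
```

Cells A4 and A5 do not appear in the formula, so changing them has no effect on C1.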
Lesson 7
Spreadsheet Computations
Good Old Raisins and Peanuts

Diego's family is going on a camping trip and his job is to make a batch of GORP (Good Old Raisins and Peanuts) for a snack to take on the trip. In his kitchen, he finds many identical boxes of raisins and many identical bags of peanuts. He puts all of this information in a spreadsheet:

<p>Spreadsheet with rows 1 to 4 and columns A and B. Rows in A contain ounces of raisins in each box, number of boxes, ounces of peanuts in each bag, number of bags. Rows in B contain 3.5, 12, 4, 18.</p>

  1. Explain how Diego could use the spreadsheet to figure out how many total ounces of GORP he can make.
  2. Diego decides that he doesn't need that much GORP. Explain how he could use the spreadsheet to figure out how many boxes of raisins and how many bags of peanuts he needs to make around 60 ounces of GORP.
Show Solution

Possible responses:

  1. In cell B5, type = B1 * B2 + B3 * B4
  2. He could decrease the numbers in cells B2 and B4 until the total number of ounces was around 60.
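The same reasoning can be sketched in code. The constant and function names here are hypothetical, and "around 60 ounces" is assumed to mean within 1 ounce of 60; a different tolerance would give a different list.

```python
# Values from Diego's spreadsheet (cells B1 and B3).
RAISIN_OZ_PER_BOX = 3.5
PEANUT_OZ_PER_BAG = 4

def total_ounces(boxes, bags):
    """Mirror the spreadsheet formula = B1 * B2 + B3 * B4."""
    return RAISIN_OZ_PER_BOX * boxes + PEANUT_OZ_PER_BAG * bags

def combos_near(target, tolerance, max_boxes=12, max_bags=18):
    """List (boxes, bags) pairs whose total is within `tolerance` ounces of `target`."""
    return [
        (boxes, bags)
        for boxes in range(max_boxes + 1)
        for bags in range(max_bags + 1)
        if abs(total_ounces(boxes, bags) - target) <= tolerance
    ]

full_batch = total_ounces(12, 18)
near_60 = combos_near(60, tolerance=1)
print(full_batch)  # 114.0
print(near_60)     # includes (8, 8), which gives exactly 60 ounces
```

Trying values by hand in cells B2 and B4, as the sample response suggests, is the spreadsheet version of this search.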
Lesson 8
Spreadsheet Shortcuts
Doubling in a Spreadsheet

A list of numbers is made with this pattern: Start with 3, and multiply by 2 each time.

Here is the beginning of the list of numbers: 3, 6, 12, . . .

Explain how you could use "fill down" in a spreadsheet to find the tenth number in this list. (You do not need to actually find this number.)

Show Solution

Sample response: In cells A1, A2, and A3, type 3, 6, and 12. In cell A4, type = A3 * 2. Then, click the little square in the corner of A4 and drag it down far enough so that you can see the contents of A10.
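What "fill down" does here can be imitated with a loop. This is a rough sketch, assuming each new cell applies the same rule (previous cell times 2) that the formula in A4 uses; the list `column_a` stands in for column A.

```python
# Cells A1, A2, and A3 are typed in by hand.
column_a = [3, 6, 12]

# Filling down repeats the rule "= previous cell * 2" until row 10 exists.
while len(column_a) < 10:
    column_a.append(column_a[-1] * 2)

tenth = column_a[9]  # row 10 is index 9 because Python counts from 0
print(column_a)
print(tenth)
```

The spreadsheet adjusts the cell reference automatically in each row (A3 becomes A4, A5, and so on), which is what the loop does with `column_a[-1]`.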

Section C Check
Section C Checkpoint
Lesson 9
Technological Graphing
What Are These Values?

Here are some statistics given by a spreadsheet program:

  • n: 100
  • Mean: 51.68
  • σ: 29.2957
  • s: 29.4433
  • Σx: 5168
  • Σx²: 352906
  • Min: 1
  • Q1: 23
  • Median: 51
  • Q3: 77
  • Max: 105
  1. What are the mean and median for the data?
  2. How many values are in the data set?
  3. What is the interquartile range for the data? Explain or show your reasoning.
Show Solution
  1. mean: 51.68, median: 51
  2. 100
  3. 54, since 77 - 23 = 54
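The listed statistics are related to one another, and a few lines of code can show how. This is a sketch under the assumption that σ is the population standard deviation and s is the sample standard deviation (dividing by n - 1), which is the usual convention in spreadsheet and calculator output.

```python
from math import sqrt

# Values copied from the spreadsheet output above.
n = 100
sum_x = 5168
sum_x2 = 352906
q1, q3 = 23, 77

mean = sum_x / n                               # mean = (sum of x) / n
population_variance = sum_x2 / n - mean ** 2   # average of squares minus square of average
sigma = sqrt(population_variance)              # population standard deviation
s = sqrt(population_variance * n / (n - 1))    # sample standard deviation
iqr = q3 - q1

print(mean)             # 51.68
print(round(sigma, 4))  # 29.2957
print(round(s, 4))      # 29.4433
print(iqr)              # 54
```

Students are only asked to read the mean, median, count, and quartiles from the list; the standard deviation lines are included here to show that the output is internally consistent.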
Lesson 10
The Effect of Extremes
Shape and Statistics
  1. Is the mean greater than, less than, or equal to the median? Explain your reasoning.

    <p>Dot plot from 70 to 160 by 5’s. Beginning at 70, number of dots above each increment is 0, 0, 7, 9, 11, 9, 9, 9, 7, 4, 4, 5, 5, 6, 5, 5, 4, 1, 0.</p>

  2. Is the mean greater than, less than, or equal to the median? Explain your reasoning.

    <p>Dot plot from 5 to 55 by 5’s. Beginning at 5, number of dots above each increment is 0, 11, 11, 11, 11, 11, 11, 11, 11, 11, 0.</p>

Show Solution
  1. Sample response: The mean is greater than the median because the larger values to the right make the mean higher than it would be if the distribution were uniform.
  2. Sample response: The mean is equal to the median because the data is symmetric.
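Both answers can be confirmed by rebuilding the data from the dot plot descriptions. This is a minimal sketch; the helper `expand` is a made-up name that turns a list of dot counts into the individual data values.

```python
from statistics import mean, median

def expand(start, step, counts):
    """Turn dot counts above start, start + step, ... into a list of values."""
    values = []
    for i, count in enumerate(counts):
        values.extend([start + i * step] * count)
    return values

# Dot counts read from the two dot plot descriptions.
plot_1 = expand(70, 5, [0, 0, 7, 9, 11, 9, 9, 9, 7, 4, 4, 5, 5, 6, 5, 5, 4, 1, 0])
plot_2 = expand(5, 5, [0, 11, 11, 11, 11, 11, 11, 11, 11, 11, 0])

print(mean(plot_1), median(plot_1))  # mean is greater than the median
print(mean(plot_2), median(plot_2))  # mean and median are equal
```

In the first plot the long tail of larger values pulls the mean above the median, while the second plot is symmetric, so the two measures agree.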
Lesson 11
Comparing and Contrasting Data Distributions
Which Menu?

A restaurant owner believes that it is beneficial to have different menu items with a lot of variability so that people can have a choice of expensive and inexpensive food. Several chefs offer menus and suggested prices for the food they create. The owner creates dot plots for the prices of the menu items and finds some summary statistics. Which menu best matches what the restaurant is looking for? Explain your reasoning.

Italian:

mean: $9.03

median: $9

MAD: $2.45

IQR: $3.50

<p>Dot plot from 0 to 34 by 2’s. Price in dollars. Numbers of dots above 2 is 1, 4.50 is 1, 5 is 5, 6 is 2, 7 is 1, 8 is 9, 9 is 4, 10 is 3, 10.50 is 3, 11 is 3, 12.50 is 6, 14.50 is 2.</p>

Diner:

mean: $3.36

median: $2

MAD: $2.12

IQR: $4

<p>Dot plot from 0 to 34 by 2’s. Price in dollars. Numbers of dots above 1 is 12, 2 is 9, 3 is 5, 4 is 2, 5 is 7, 6.50 is 1, 12 is 1, 16 is 1.</p>

Japanese:

mean: $10.35

median: $10

MAD: $5.55

IQR: $9.50

<p>Dot plot from 0 to 34 by 2’s. Price in dollars. Numbers of dots above 2 is 3, 3 is 2, 4 is 5, 5 is 3, 6 is 4, 7 is 1, 9 is 1, 10 is 4, 12 is 4, 13 is 2, 14 is 3, 15 is 1, 17 is 1, 20 is 3, 21 is 1, 25 is 1, 33 is 1.</p>

Steakhouse:

mean: $11.51

median: $10.50

MAD: $3.69

IQR: $4.50

<p>Dot plot from 0 to 34 by 2’s. Price in dollars. Numbers of dots above 5 is 4, 6 is 4, 8 is 1, 9 is 3, 9.50 is 1, 10 is 7, 11 is 4, 12 is 3, 13 is 3, 14 is 1, 16 is 4, 17 is 1, 18 is 1, 22 is 1, 23 is 1, 25 is 1.</p>

Show Solution

Japanese. The variability, whether measured with IQR or MAD, is greater than any of the other menus available.

Lesson 12
Standard Deviation
True or False: Reasoning with Standard Deviation

The low temperatures in degrees Celsius for some cities on the same days in March are recorded in the dot plots.

<p>Dot plot</p>
Dot plot from 5 to 15 by 1's. Christchurch, New Zealand low temperature in degrees Celsius. Beginning at 5, number of dots above each increment is 0, 1, 1, 4, 2, 6, 2, 4, 1, 1, 0.

<p>Dot plot.</p>
Dot plot from negative 5 to 9 by 1's. Saint Louis, Missouri low temperature in degrees Celsius. Beginning at negative 5, number of dots above each increment is 0, 0, 0, 0, 1, 3, 1, 1, 5, 0, 1, 5, 1, 1, 0.

<p>Dot plot</p>
Dot plot from negative 5 to 9 by 1's. Chicago, Illinois low temperature in degrees Celsius. Beginning at negative 5, number of dots above each increment is 0, 1, 3, 1, 1, 5, 0, 1, 5, 1, 1, 0, 0, 0, 0.  

<p>Dot plot.</p>
Dot plot from negative 5 to 9 by 1's. London, United Kingdom low temperature in degrees Celsius. Beginning at negative 5, number of dots above each increment is 0, 1, 3, 1, 1, 5, 0, 1, 5, 1, 0, 0, 0, 1, 0.  

Decide if each statement is true or false. Explain your reasoning.

  1. The standard deviation of Christchurch’s temperatures is zero because the data distribution is symmetric.
  2. The standard deviation of St. Louis’s temperatures is equal to the standard deviation of Chicago’s temperatures.
  3. The standard deviation of Chicago’s temperatures is less than the standard deviation of London’s temperatures.
Show Solution
  1. False. Sample response: The standard deviation is a measure of variability and there is some variability in the data set.
  2. True. Sample response: Chicago’s distribution of temperatures is the same as St. Louis’s, but 3 degrees cooler. The two cities have the same variability in temperature, and so they have the same standard deviation.
  3. True. Sample response: London has the same low temperatures as Chicago, except that on the warmest day London’s temperature is 3 degrees warmer than Chicago’s. Therefore, the temperatures in London have more variability than Chicago’s temperatures on these days.
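The three statements can also be tested numerically from the dot plot descriptions. This is a sketch that assumes the population standard deviation (`pstdev`); the comparisons come out the same way with the sample version. The helper name `expand` is illustrative.

```python
from math import isclose
from statistics import pstdev

def expand(start, counts):
    """Turn dot counts above start, start + 1, ... into a list of temperatures."""
    values = []
    for offset, count in enumerate(counts):
        values.extend([start + offset] * count)
    return values

# Dot counts read from the four dot plot descriptions.
cities = {
    "Christchurch": expand(5, [0, 1, 1, 4, 2, 6, 2, 4, 1, 1, 0]),
    "St. Louis": expand(-5, [0, 0, 0, 0, 1, 3, 1, 1, 5, 0, 1, 5, 1, 1, 0]),
    "Chicago": expand(-5, [0, 1, 3, 1, 1, 5, 0, 1, 5, 1, 1, 0, 0, 0, 0]),
    "London": expand(-5, [0, 1, 3, 1, 1, 5, 0, 1, 5, 1, 0, 0, 0, 1, 0]),
}
sd = {city: pstdev(temps) for city, temps in cities.items()}

statement_1 = sd["Christchurch"] == 0                # symmetric does not mean zero spread
statement_2 = isclose(sd["St. Louis"], sd["Chicago"])  # shifting data keeps the spread
statement_3 = sd["Chicago"] < sd["London"]           # one warmer day adds spread

print(statement_1, statement_2, statement_3)  # False True True
```

Adding or subtracting the same number from every value moves the center but leaves the standard deviation unchanged, which is why St. Louis and Chicago match.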
Lesson 13
More Standard Deviation
Majors and Salaries

A college is looking at the data for its most recent college graduates based on their major.

  • The mean salary of 100 recent college graduates who majored in engineering is $63,750 with a standard deviation of $10,020.
  • The mean salary of 100 recent college graduates who majored in business is $52,200 with a standard deviation of $19,400.
  • The mean salary of 100 recent college graduates who majored in the social sciences is $45,230 with a standard deviation of $6,750.

Match each histogram to the majors based on the description.

  • engineering
  • business
  • social sciences
  1.  
    <p>Histogram.</p>
    Histogram from 20 to 95 by 5’s. Salary in thousands of dollars. Beginning at 20 up to but not including 25, height of bar at each interval is 0, 0, 0, 24, 22, 30, 18, 2, 2, 0, 2, 0, 0, 0, 0, 0.
  2.  
    <p>Histogram.</p>
    Histogram from 20 to 95 by 5’s. Salary in thousands of dollars. Beginning at 20 up to but not including 25, height of bar at each interval is 0, 0, 0, 0, 0, 10, 9, 6, 27, 24, 9, 5, 6, 4, 0, 0.  
  3.  
    <p>Histogram.</p>
    Histogram from 20 to 95 by 5’s. Salary in thousands of dollars. Beginning at 20 up to but not including 25, height of bar at each interval is 0, 3, 15, 5, 18, 1, 27, 1, 8, 0, 8, 0, 3, 0, 7, 4.
Show Solution
  1. social sciences
  2. engineering
  3. business
Lesson 14
Outliers
Expecting Outliers

A group of 20 students are asked to report the number of pets they keep in their house. The results are:

0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 3, 4, 4, 4, 21

  • mean: 2.4 pets
  • standard deviation: 4.47 pets
  • Q1: 0.5 pets
  • median: 1 pet
  • Q3: 2.5 pets
  1. Would any of these values be considered outliers? Explain your reasoning.
  2. After being told that they should not count any fish in the report, the value of 3 becomes a 2 and the value of 21 becomes 1. Would these changes affect the median, mean, standard deviation, or interquartile range? If so, would each measure decrease or increase from its original value?
Show Solution
  1. Yes, 21 pets is an outlier since it is greater than 5.5, which is 2.5 + 1.5 · 2.
  2. The mean and standard deviation would decrease with the changes. The median would stay the same and the IQR would decrease slightly.
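Both parts can be checked with a short script. This is a sketch that assumes quartiles are the medians of the lower and upper halves of the sorted data and that the reported standard deviation is the population version (`pstdev`), which matches the 4.47 given above; the function names are illustrative.

```python
from statistics import mean, median, pstdev

def quartiles(values):
    """Return (Q1, Q3) as medians of the lower and upper halves (needs 2+ values)."""
    ordered = sorted(values)
    half = len(ordered) // 2
    return median(ordered[:half]), median(ordered[-half:])

def iqr(values):
    q1, q3 = quartiles(values)
    return q3 - q1

def outliers(values):
    """Values more than 1.5 * IQR below Q1 or above Q3."""
    q1, q3 = quartiles(values)
    spread = 1.5 * (q3 - q1)
    return [v for v in values if v < q1 - spread or v > q3 + spread]

pets = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 3, 4, 4, 4, 21]
# After removing fish: the 3 becomes a 2 and the 21 becomes a 1.
revised = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 4, 4, 4, 1]

print(outliers(pets))                   # [21]
print(mean(pets), mean(revised))        # mean decreases
print(pstdev(pets), pstdev(revised))    # standard deviation decreases
print(median(pets), median(revised))    # median stays the same
print(iqr(pets), iqr(revised))          # IQR decreases slightly
```

The median barely reacts to the single extreme value, while the mean and standard deviation change noticeably once the 21 is replaced.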
Lesson 15
Comparing Data Sets
Comparing Mascots

A new pet food company wants to sell their product online and use social media to promote themselves. To determine whether to use a dog or a cat as their mascot, they research the number of clicks on links with an image of a dog or a cat. 

mean: 1,263.5 clicks

median: 1,282 clicks

standard deviation: 357.4 clicks

IQR: 409 clicks

<p>Histogram from 0 to 2,400 by 200’s. clicks for dog images. Beginning at 0, up to but not including 200, height of bar at each interval is 1, 0, 5, 2, 11, 21, 25, 20, 9, 4, 2, 0.</p>

mean: 1,105.4 clicks

median: 1,125.5 clicks

standard deviation: 239.3 clicks

IQR: 312.5 clicks

<p>Histogram from 0 to 2,400 by 200’s. Clicks for cat images. Beginning at 0, up to but not including 200, height of bar at each interval is 0, 0, 3, 6, 23, 32, 28, 6, 1, 0, 0, 0.</p>

  1. Based on the shape of the distributions, what measure of center and measure of variability would you use to compare the distributions? Explain your reasoning.
  2. Based on the data shown here, should the company use a dog or cat mascot? Explain your reasoning.
Show Solution
  1. Mean and standard deviation. Since the distributions are approximately symmetric, the mean and standard deviation are the best choice to represent the data.
  2. Sample responses:
    • The company should use a dog mascot since the mean is greater.
    • The company should use a cat mascot since the standard deviation shows that the images are more consistently clicked over 1,000 times while the dog images sometimes get fewer than 200 clicks.
Section D Check
Section D Checkpoint
Problem 1

What does standard deviation measure? How would a distribution with a large standard deviation compare to a distribution with a small standard deviation?

Show Solution
Standard deviation measures variability. A distribution with a large standard deviation would be more spread out than a distribution with a small one.
Problem 2
Describe what an outlier is. Do outliers affect mean or median more?
Show Solution
Sample response: An outlier is an extreme value in a data set that is more than 1.5 times the interquartile range away from the nearest quartile. Outliers affect mean more than median.
Lesson 16
Analyzing Data
No cool-down