Fitting Lines

5 min

Narrative

The mathematical purpose of this activity is for students to visually assess which line, among a set of choices, best fits the data. Students are given a scatter plot and two lines that may fit the data, and must select the line that fits better. The given lines address many common errors in student thinking about best-fit lines, including going through the most points, dividing the data in half, and connecting the points on both ends of the scatter plot.

Listen for students using the terms "slope" and "y-intercept."

Launch

Provide students access to the images. Give students 2 minutes of quiet time to work on the questions.

Student Task

Which of the lines is the best fit for the data in each scatter plot? Explain your reasoning.

  1.  

    <p>Scatter plot with two lines.</p>
    A scatterplot. Horizontal, from 40 to 90, by 10's, labeled wins. Vertical, 0 to 5, by 0 point 5’s, labeled number of runs allowed per game. 27 dots trend linearly downward and to the right. A dashed line trending linearly upward and to the right, and a solid line trending linearly downward and to the right. Dot 1 at 42 comma 4 point 5, above solid line, dot 27 at 86 comma 1 point 6, on solid line. 13 dots above solid line, 12 dots below solid line. 16 dots below dashed line, 10 dots above dashed line.

  2.  

    <p>Scatter plot with two lines of best fit.</p>
    A scatterplot. Horizontal, from 30 to 90, by 10's, labeled average score from 10 surveys. Vertical, 0 to 50, by 5’s, labeled average amount spent on dinner. 27 dots trend linearly upward and to the right. A dashed and solid line of best fit.  Dot 1 at 40 comma 17, dot 27 at 88 comma 40, with dashed line. Dashed line passes through 10 points, with remaining points above the line. Solid line passes through 4 points, with 10 points above the line and 13 points below the line.

  3.  

    <p>Scatter plot with two lines of best fit.</p>
    A scatterplot. Horizontal, from 50 to 75, by 5's, labeled price of crude oil per barrel. Vertical, 2 point 5 to 3 point 8, by 0 point 1’s, labeled cost of gas at local station. 27 dots trend linearly upward and to the right. A dashed and solid line of best fit. Dot 1 at 50 comma 2 point 5 at the bottom left end of the dashed line. Dashed line passes through 10 dots. Solid line passes through 3 points, with 10 points below the line and 15 points above the line.

  4.  

    <p>Scatter plot with 2 lines of best fit.</p>
    A scatterplot. Horizontal, from 0 to 70, by 10's, labeled number of floors. Vertical, 300 to 900, by 50’s, labeled height of building, feet. 27 dots trend linearly upward and to the right. A dashed and solid line of best fit. Dot 1 at 25 comma 319, Dot 27 at 60 comma 807, with solid line of best fit, passing through 5 additional points. Dashed line passes through 14 points.

Sample Response

  1. Sample response: The solid line is the better fit for the data since it goes through the middle of the data and follows the negative trend of the data.
  2. Sample response: The solid line is the better fit for the data since it goes through the middle of the data with approximately equal numbers of data points on both sides of the line.
  3. Sample response: The dashed line is the better fit for the data since it goes through the middle of the data with a slope that follows the trend of the data. The solid line has a lot of points above the line to start and a lot of points below the line at the end.
  4. Sample response: The dashed line is the better fit for the data since it goes through the middle of the data, follows the trend of the data, and has a similar number of points on each side. The solid line fits the first and last point perfectly, but most of the data is above the line.
Activity Synthesis (Teacher Notes)

The purpose of this discussion is for students to distinguish among a bad fit, a good fit, and the best fit. In the first two scatter plots, the solid line is the better fit; in the last two, the dashed line is the better fit.

Ask a student who uses the term "slope" while working on the questions, “Can you explain the relationship between the two lines in the plot of runs and wins using the concept of slope?” (The slope of the dashed line is positive, and the slope of the solid line is negative.)

Ask a student who uses the term "y-intercept," “Can you explain the significance of the y-intercept in the question about average survey scores and amount spent on dinner?” (The solid line will have a y-intercept less than the y-intercept for the dashed line. Because the two lines have approximately the same slope, they appear parallel in the scatter plot.)

If time permits, discuss questions such as:

  • “Is the dashed line shown with the scatter plot of the data for runs and wins a bad fit, good fit, or best fit?” (The line is a bad fit because it does not show the correct relationship between the variables. It shows that the value of y increases as the value of x increases, rather than the value of y decreasing as the value of x increases.)
  • “Is the dashed line shown with the scatter plot of the data for oil and gas a bad fit, good fit, or best fit?” (It is the best fit because it is close to going through the middle of the data and follows the same trend as the data.)
  • “What factors helped you select the linear model that fits the data best?” (The line should go through the middle of the data, follow the trend of the data, and have a similar number of points on each side of the line.)
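
For teachers who want a quick numerical check of the "similar number of points on each side" criterion, the following is a minimal Python sketch. The function name `count_sides`, the tolerance `tol`, and the data set are all hypothetical and are not taken from the activity's scatter plots. Counting sides is only an informal check. A line must also follow the trend of the data, so a balanced count by itself does not establish a best fit.

```python
def count_sides(points, slope, intercept, tol=1e-9):
    """Count points above, on, and below the line y = slope * x + intercept."""
    above = on = below = 0
    for x, y in points:
        # Vertical distance from the line to the point (positive means above).
        difference = y - (slope * x + intercept)
        if difference > tol:
            above += 1
        elif difference < -tol:
            below += 1
        else:
            on += 1
    return above, on, below


# Hypothetical data with an upward linear trend (not from the activity).
points = [(1, 2), (2, 2), (3, 4), (4, 4), (5, 6), (6, 6)]

# Candidate line through the middle of the data.
middle = count_sides(points, slope=1.0, intercept=0.5)

# Candidate line with the same slope but shifted down, so the data sit above it.
shifted = count_sides(points, slope=1.0, intercept=-1.0)

print("middle line (above, on, below):", middle)
print("shifted line (above, on, below):", shifted)
```

In this made-up data set, the first candidate splits the points evenly, while the second has the same slope but leaves every point above it, which mirrors the kind of contrast students see between two nearly parallel lines.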
Standards
Building On
  • 8.SP.2·Know that straight lines are widely used to model relationships between two quantitative variables. For scatter plots that suggest a linear association, informally fit a straight line, and informally assess the model fit by judging the closeness of the data points to the line.
  • 8.SP.A.2·Know that straight lines are widely used to model relationships between two quantitative variables. For scatter plots that suggest a linear association, informally fit a straight line, and informally assess the model fit by judging the closeness of the data points to the line.
Building Toward
  • HSS-ID.B.6.b·Informally assess the fit of a function by plotting and analyzing residuals.
  • S-ID.6.b·Informally assess the fit of a function by plotting and analyzing residuals.
