Sometimes we can use a linear function to model the relationship between two variables. For example, here is a scatter plot that shows the heights and weights of 25 dogs, together with the graph of a linear function that models the relationship between a dog’s height and its weight.
For some dogs, we can see that the model does a good job of predicting the weight given the height. These correspond to points on or near the line. The model doesn’t do a very good job of predicting the weight given the height for the dogs whose points are far from the line.
For example, there is a dog that is about 20 inches tall and weighs a little more than 16 pounds. The model predicts that its weight would be about 48 pounds. Because the predicted weight is greater than the actual weight, we say that the model overpredicts the weight of this dog. There is also a dog that is 27 inches tall and weighs about 110 pounds. The model predicts that its weight would be a little less than 80 pounds. Because the predicted weight is less than the actual weight, we say that the model underpredicts the weight of this dog. For most of the dogs in this data set, though, the model does a good job of predicting the weight from the height.
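One way to make the comparison concrete is to subtract the predicted weight from the actual weight. The short sketch below does this for the two dogs described above, using the approximate values read from the text. The function name `classify_prediction` and the 5-pound tolerance for "close" are assumptions made for illustration. They are not part of the original model.

```python
def classify_prediction(actual, predicted, tolerance=5.0):
    """Compare an actual weight with the model's predicted weight (pounds).

    The tolerance is a hypothetical cutoff for calling a prediction "close".
    """
    difference = actual - predicted
    if difference < -tolerance:
        # Predicted weight is well above the actual weight.
        return "overpredicts"
    if difference > tolerance:
        # Predicted weight is well below the actual weight.
        return "underpredicts"
    return "close"


# Approximate values taken from the two dogs described in the text.
dogs = [
    {"height_in": 20, "actual_lb": 16, "predicted_lb": 48},
    {"height_in": 27, "actual_lb": 110, "predicted_lb": 80},
]

for dog in dogs:
    label = classify_prediction(dog["actual_lb"], dog["predicted_lb"])
    print(f'{dog["height_in"]} inches tall: the model {label} the weight')
```

A negative difference means the model's prediction is too high, and a positive difference means it is too low.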
Sometimes a data point is far away from the other points or doesn’t fit a trend that all the other points fit. We call such a point an outlier.