Understanding Residuals and Line of Fit through Scatter Plots and Sum of Squared Residuals

Olympic Game	$1$	$2$	$3$	$4$	$5$	$6$
Finishing Time (sec)	$9.87$	$9.85$	$9.69$	$9.63$	$9.81$	$9.80$

$x$	$y$ (Actual)	$y$ Predicted by $y = - 0.1 x + 10$	Residual for $y = - 0.1 x + 10$
$1$	$9.87$	$y = - 0.1 (1) + 10 = 9.9$	$9.87 - 9.9 = - 0.03$
$2$	$9.85$	$y = - 0.1 (2) + 10 = 9.8$	$9.85 - 9.8 = 0.05$
$3$	$9.69$	$y = - 0.1 (3) + 10 = 9.7$	$9.69 - 9.7 = - 0.01$
$4$	$9.63$	$y = - 0.1 (4) + 10 = 9.6$	$9.63 - 9.6 = 0.03$
$5$	$9.81$	$y = - 0.1 (5) + 10 = 9.5$	$9.81 - 9.5 = 0.31$
$6$	$9.80$	$y = - 0.1 (6) + 10 = 9.4$	$9.80 - 9.4 = 0.40$
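The residual column above can be reproduced with a short script. This is a minimal Python sketch, not part of the lesson itself; the helper name `residuals` and the variable names are illustrative assumptions. Results are rounded to two decimals to hide floating-point noise.

```python
# Olympic Game number (x) and finishing time in seconds (y), from the table.
games = [1, 2, 3, 4, 5, 6]
times = [9.87, 9.85, 9.69, 9.63, 9.81, 9.80]

def residuals(xs, ys, slope, intercept):
    """Hypothetical helper: actual y minus predicted y for each point."""
    return [round(y - (slope * x + intercept), 2) for x, y in zip(xs, ys)]

# Residuals for the line y = -0.1x + 10.
res_a = residuals(games, times, -0.1, 10)
print(res_a)
```

A positive entry means the actual time lies above the line, and a negative entry means it lies below.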

$x$	$y$ (Actual)	$y$ Predicted by $y = - 0.05 x + 10$	Residual for $y = - 0.05 x + 10$
$1$	$9.87$	$y = - 0.05 (1) + 10 = 9.95$	$9.87 - 9.95 = - 0.08$
$2$	$9.85$	$y = - 0.05 (2) + 10 = 9.9$	$9.85 - 9.9 = - 0.05$
$3$	$9.69$	$y = - 0.05 (3) + 10 = 9.85$	$9.69 - 9.85 = - 0.16$
$4$	$9.63$	$y = - 0.05 (4) + 10 = 9.8$	$9.63 - 9.8 = - 0.17$
$5$	$9.81$	$y = - 0.05 (5) + 10 = 9.75$	$9.81 - 9.75 = 0.06$
$6$	$9.80$	$y = - 0.05 (6) + 10 = 9.70$	$9.80 - 9.70 = 0.10$

$x$	Residual for $y = - 0.1 x + 10$	Residual for $y = - 0.05 x + 10$
$1$	$- 0.03$	$- 0.08$
$2$	$0.05$	$- 0.05$
$3$	$- 0.01$	$- 0.16$
$4$	$0.03$	$- 0.17$
$5$	$0.31$	$0.06$
$6$	$0.40$	$0.1$
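One way to decide which of the two lines fits the finishing times better is to compare their sums of squared residuals: the smaller sum indicates the closer fit. The sketch below recomputes the residuals directly from the data; the function name `sum_squared_residuals` and the variable names are assumptions made for illustration.

```python
# Olympic Game number (x) and finishing time in seconds (y).
games = [1, 2, 3, 4, 5, 6]
times = [9.87, 9.85, 9.69, 9.63, 9.81, 9.80]

def sum_squared_residuals(xs, ys, slope, intercept):
    """Hypothetical helper: add up (actual - predicted)^2 over all points."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

ssr_a = sum_squared_residuals(games, times, -0.1, 10)   # y = -0.1x + 10
ssr_b = sum_squared_residuals(games, times, -0.05, 10)  # y = -0.05x + 10

# Round only for display; the comparison uses the unrounded sums.
print(round(ssr_a, 4), round(ssr_b, 4))
better_line = "y = -0.05x + 10" if ssr_b < ssr_a else "y = -0.1x + 10"
print("Smaller sum of squared residuals:", better_line)
```

Squaring keeps positive and negative residuals from canceling each other and weights large misses, such as those at $x = 5$ and $x = 6$ for the first line, more heavily.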

$x$	$23$	$12$	$18$	$30$	$6$	$26$
$y$	$15$	$15$	$17$	$12$	$19$	$15$

$x$	$y$ (Actual)	$y$ Predicted by the equation	Residual
$6$	$19$	$y = - \frac{1}{3} (6) + \frac{64}{3} = \frac{58}{3}$	$19 - \frac{58}{3} = - \frac{1}{3}$
$12$	$15$	$y = - \frac{1}{3} (12) + \frac{64}{3} = \frac{52}{3}$	$15 - \frac{52}{3} = - \frac{7}{3}$
$18$	$17$	$y = - \frac{1}{3} (18) + \frac{64}{3} = \frac{46}{3}$	$17 - \frac{46}{3} = \frac{5}{3}$
$23$	$15$	$y = - \frac{1}{3} (23) + \frac{64}{3} = \frac{41}{3}$	$15 - \frac{41}{3} = \frac{4}{3}$
$26$	$15$	$y = - \frac{1}{3} (26) + \frac{64}{3} = \frac{38}{3}$	$15 - \frac{38}{3} = \frac{7}{3}$
$30$	$12$	$y = - \frac{1}{3} (30) + \frac{64}{3} = \frac{34}{3}$	$12 - \frac{34}{3} = \frac{2}{3}$

$x$	$y$ (Actual)	$y$ Predicted by the equation	Residual
$6$	$19$	$y = - 0.25 (6) + 20 = 18.5$	$19 - 18.5 = 0.5$
$12$	$15$	$y = - 0.25 (12) + 20 = 17$	$15 - 17 = - 2$
$18$	$17$	$y = - 0.25 (18) + 20 = 15.5$	$17 - 15.5 = 1.5$
$23$	$15$	$y = - 0.25 (23) + 20 = 14.25$	$15 - 14.25 = 0.75$
$26$	$15$	$y = - 0.25 (26) + 20 = 13.5$	$15 - 13.5 = 1.5$
$30$	$12$	$y = - 0.25 (30) + 20 = 12.5$	$12 - 12.5 = - 0.5$
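The two candidate lines for this data set, $y = -\frac{1}{3}x + \frac{64}{3}$ and $y = -0.25x + 20$, can be compared by their sums of squared residuals. The following sketch uses Python's `fractions.Fraction` so the thirds stay exact; the helper name `ssr` and the variable names are illustrative assumptions.

```python
from fractions import Fraction

# Data points from the table.
xs = [6, 12, 18, 23, 26, 30]
ys = [19, 15, 17, 15, 15, 12]

def ssr(slope, intercept):
    """Hypothetical helper: sum of squared residuals for y = slope*x + intercept."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

ssr_thirds = ssr(Fraction(-1, 3), Fraction(64, 3))  # y = -(1/3)x + 64/3
ssr_quarter = ssr(Fraction(-1, 4), Fraction(20))    # y = -0.25x + 20

print(ssr_thirds, float(ssr_quarter))
print("Second line fits better:", ssr_quarter < ssr_thirds)
```

Under this criterion, the line with the smaller sum is the better model for the six points.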

Davontay volunteers to take part in research that checks if a vitamin supplement shortens the length of the flu. The data collected from $10$ patients are shown with a line of fit in the following diagram.

[Figure: scatter plot of 10 data points with the line of fit $y = -0.5x + 6.5$]

Calculate the residual for $4$ months.

Having taken the supplements for 4 months,

A. the patient had flu that lasted for 0.5 days more than what was predicted.
B. the patient had flu that lasted for the predicted time.
C. the patient had flu that lasted for 0.5 days more than what was predicted.
D. the patient had a disease that lasted for 5 days more than what was predicted.

The residual is the actual value minus the predicted value.

Residual = Observed $y$-value - Predicted $y$-value

Let's identify the coordinates of the data point when $x = 4$. It is the point $(4, 5)$. Now we calculate the $y$-value predicted by the line of fit $y = -0.5x + 6.5$.

$y = -0.5(4) + 6.5 = 4.5$

With this information, we can calculate the residual.

Residual $= 5 - 4.5 = 0.5$

Since the $y$-values represent days, the residual also represents days. The residual is $0.5$ days.

We found that the residual for 4 months is 0.5 days.

This means that this person, having taken the supplements for 4 months, had flu that lasted for 0.5 days more than what was predicted. Therefore, the answer is option C.
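The same calculation can be written as a few lines of Python. This is only a sketch; the function name `predicted_days` is an assumption introduced for illustration, while the line of fit and the data point $(4, 5)$ come from the exercise.

```python
def predicted_days(months):
    """Line of fit from the diagram: y = -0.5x + 6.5."""
    return -0.5 * months + 6.5

observed_days = 5  # y-coordinate of the data point at x = 4
residual = observed_days - predicted_days(4)

# A positive residual means the flu lasted longer than the line predicts.
print(residual)
```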

Tiffaniqua records the winning times for various swim meets at Washington High School. Tiffaniqua and her teacher decide to check if the winning times are related to the height of the swimmer. They draw the following residual plot.

Which scatter plot best represents the data?

What can be predicted using the given residual plot?

A. The model is better at predicting the winning times of taller swimmers.
B. The model is better at predicting the winning times of shorter swimmers.
C. The model is better at predicting the winning times of swimmers of average height.
D. None of the above

The residual shows the difference between the actual value and the value predicted by the model.

Residual = Observed $y$-value - Predicted $y$-value

A positive residual means the actual value is above the line of fit, and a negative residual means the actual value is below the line of fit. Notice that the residuals near the vertical axis are larger in magnitude than those further to the right.

Therefore, when a line of fit is drawn on the scatter plot, the points to the right should lie closer to the line than the points to the left.

Compared to the other data sets, the data set in option A matches this description best, so it can be the scatter plot of the data.

Judging from the residual plot, the difference between the predicted and actual values gets smaller as the height of the swimmers increases.

This means our model is better at predicting the winning times of taller swimmers than shorter swimmers. The answer is then A.

LaShay writes a different linear equation for each of four different studies. To determine if the dependent and independent variables of each study have a linear relationship, she plots the residuals for each study.

Considering the residual plots, in which study can the variables be modeled with a linear equation?

From the scatter plot of residuals, we can determine if the dependent and independent variables are linearly related or not. To do so, we check two things in a residual plot.

Is there a pattern in the scatter plot?
Are the residuals randomly distributed about the x-axis?

Let's now consider the residual plots A and D.

We can recognize a U-shaped pattern in Plot A and a linear pattern in Plot D. Therefore, the variables in these studies are not linearly related.

For Plot B and Plot C, the residuals are randomly distributed about the x-axis. However, the residuals in Plot C vary significantly compared to the residuals in Plot B. Therefore, of the given choices, the variables in study B can be described with a linear equation.
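To see how a U-shaped residual pattern like the one in Plot A can arise, the sketch below fits a line to data that is curved on purpose and lists the residuals. Everything here is a labeled assumption: the data set is made up for illustration (it is not the data behind any of the four plots), and the line is found with the standard least-squares formulas, which may differ from how LaShay obtained her equations.

```python
# Hypothetical, deliberately curved data: y = x^2.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

# Least-squares line y = slope*x + intercept (an assumed fitting method).
n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x

# Residual = actual y minus predicted y.
residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
print(residuals)
```

The residuals are positive at both ends and negative in the middle, so they trace a U shape instead of scattering randomly about the x-axis. That pattern is the signal that a linear model does not suit the data.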
