This lesson shows how to find the linear function that best models a scatter plot or data set. It also shows how to use such linear functions to make predictions.

Catch-Up and Review

Here are a few recommended readings before getting started with this lesson.

Explore

Residual Sum on a Scatter Plot

Given a scatter plot, a line of fit can be used to predict values that are not known. Since there are many possible lines of fit, an important goal is finding the one that most accurately represents the given data points, that is, the line of fit for which the different sums of the residuals are as close to 0 as possible.
Interactive Line of Fit on Scatter Plot
Examine how the different sums of the residuals change when moving the line. The most commonly used measure is the sum of the squared residuals.
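
The different sums of residuals can also be computed directly. The following is a minimal sketch in Python, assuming a small hypothetical data set and a hypothetical line of fit (neither comes from the lesson); the function name `residual_sums` is made up for illustration.

```python
def residual_sums(points, slope, intercept):
    """Return (sum of residuals, sum of absolute residuals, sum of squared residuals)."""
    # A residual is the observed y minus the y predicted by the line.
    residuals = [y - (slope * x + intercept) for x, y in points]
    return (
        sum(residuals),
        sum(abs(r) for r in residuals),
        sum(r * r for r in residuals),
    )

# Hypothetical data points and line y = 1x + 1, chosen only for illustration.
points = [(1, 2), (2, 3), (3, 5), (4, 4)]
plain, absolute, squared = residual_sums(points, slope=1.0, intercept=1.0)
print(plain, absolute, squared)
```

In this sketch the plain sum is zero even though two points lie off the line, because positive and negative residuals cancel. This is one reason the squared sum is usually preferred.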
Discussion

Line of Best Fit

A line of best fit, also known as a regression line, is a line of fit that estimates the relationship between the values of a data set. Unlike a line of fit drawn by eye, its equation is determined using a strict mathematical method.

Points on a Scatter Plot and Line of Best Fit with an equation of y=1.55x+1.14

One commonly used method to determine a line of best fit is the method of least squares. The methods used to find the line of best fit are usually hard to do by hand. Therefore, a line of best fit can be found by performing a linear regression on a graphing calculator. As an example, consider the data set graphed above.

In the graph, the data points closely follow the line y=1.55x+1.14. Consequently, even though the data points may not align precisely with any specific line, a linear model can be considered to describe the data adequately.
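
To illustrate what a calculator's linear regression does, here is a minimal Python sketch of the method of least squares, using the standard closed-form formulas for the slope and intercept. The data values are hypothetical, not the data set graphed above, and the function name `least_squares_line` is chosen only for this example.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The least-squares line always passes through (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [3, 4, 6, 7, 9]
slope, intercept = least_squares_line(xs, ys)
print(f"y = {slope:.2f}x + {intercept:.2f}")
```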
Example

Finding the Line of Best Fit for Measures of Trees

For a school project, Ramsha wants to investigate if there is a correlation between the width of a tree and its height. To do so, she measured the diameter at chest height and the height of some trees in a local park. Her findings are shown in the following table.

Diameter at chest (cm) Height (m)
a What is the equation of the line of best fit using linear regression? Round the values in the equation to two decimal places.
b What is the correlation coefficient? Round the value to two decimal places. Are the data correlated?
c Graph the data points and the line of best fit.
d Write an interpretation of the intercept and the slope.

Answer

a
b Correlation Coefficient:
Are the Data Correlated? Yes, see solution.
c Graph:
d See solution.

Hint

a Use the linear regression feature on a graphing calculator.
b The correlation coefficient is in the linear regression output on a graphing calculator.
c Use the graphing features on a graphing calculator.
d What do these measures indicate in a line?

Solution

a The line of best fit can be found using a graphing calculator. First, the data values need to be entered into the calculator. This is done by pressing the STAT button and then selecting the option Edit.
The window in the calculator, which shows Stat and then Edit

Then the data values are written in the columns.

Calculator that shows two lists where you entered values

By pressing the STAT button and then selecting the CALC menu, the option LinReg(ax+b) can be found. This option gives the line of best fit, expressed as a linear function in slope-intercept form.

Rounding the values of a and b to two decimal places, the equation of the line of best fit can be written as follows.
b The correlation coefficient can be found on the linear regression results screen from Part A.
The correlation coefficient is the value of r on the screen.
The value of r varies from -1 to 1. A value close to -1 indicates a negative correlation, while a value close to 1 indicates a positive correlation. Since the correlation coefficient is close to 1, the data have a strong positive correlation.
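
For readers without a graphing calculator, the correlation coefficient can also be computed from its definition. The sketch below assumes hypothetical data that lie exactly on an increasing line, so the result should be very close to 1; the function name `correlation_coefficient` is an illustrative choice.

```python
import math

def correlation_coefficient(xs, ys):
    """Return the Pearson correlation coefficient r of the paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data lying exactly on the line y = 2x.
xs = [1, 2, 3, 4]
ys = [2, 4, 6, 8]
r = correlation_coefficient(xs, ys)
print(round(r, 2))
```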
c To graph the line of best fit, first press Y= and enter the equation of the line of best fit.

Then, to graph the scatter plot, push the buttons 2ND and Y=. Choose one of the plots in the list. Select the option ON, choose the type to be a scatter plot, and assign L1 and L2 as Xlist and Ylist, respectively.

The plot can be made by pressing the GRAPH button. It is possible that, after drawing the plot, the window size is not adequate for seeing all the information.

To fix this, press ZOOM and select the option ZoomStat. After doing so, the window will resize to show the important information.

d In Part A the equation for the line of best fit was found.
In this equation, the slope is and the intercept is
  • The slope indicates that for every centimeter that the tree grows in diameter, about meters are gained in height.
  • The intercept indicates that the minimum height for which this equation is valid is about meters. This can be because the diameter at chest height is not a good measure for smaller trees.
Pop Quiz

Practice Finding the Line of Best Fit

Use the linear regression feature of a graphing calculator to find the equation of the line of best fit for the given data set. Compare the obtained equation with the equations shown in the applet, and choose the closest one.

Bivariate data and equations
Example

Predicting the Pressure at a Certain Altitude

The following table displays some values of atmospheric pressures at different altitudes.

Altitude (thousand feet)
Pressure (PSI)
a Use linear regression to determine the equation of the line of best fit. Round the values in the equation to two decimal places.
b What is the correlation coefficient? Round the answer to two decimal places. Is the data correlated?
c Draw a graph of the data points and the line of best fit in the same viewing window.
d Interpret the slope and the intercept of the equation of the line of best fit.
e Make a prediction for the pressure at feet. Is this a good prediction?

Answer

a
b Correlation Coefficient:
Are the Data Correlated? Yes, because the correlation coefficient is really close to -1.
c Graph
d See solution.
e Prediction: About PSI
Is This a Good Prediction? Yes, see solution.

Hint

a Use the linear regression feature on a graphing calculator.
b The correlation coefficient is in the linear regression output on a graphing calculator.
c Use the graphing features on a graphing calculator.
d What do these measures indicate in a line?
e Are the data values close to the line of best fit? Does this indicate something?

Solution

a The line of best fit can be found using a graphing calculator. First, the data values need to be entered into the calculator. This is done by pressing the STAT button and selecting the option Edit.
The window in the calculator, which shows Stat and then Edit

Then the data values can be written in the columns.

Calculator that shows two lists where you entered values

Finally, by pressing the STAT button and then selecting the menu item CALC, the option LinReg(ax+b) can be found. This option gives the line of best fit, expressed as a linear function in slope-intercept form.

Rounding to two decimal places, the equation for the line of best fit using linear regression can be written as follows.
b The correlation coefficient can be found on the linear regression results screen from Part A.
The correlation coefficient is the value of r on the screen.
The value of r varies from -1 to 1. A value close to -1 indicates a negative correlation, while a value close to 1 indicates a positive correlation. Since the value is almost -1, the data have a very strong negative correlation.
c To graph the line of best fit, first press Y= and enter the equation of the line of best fit.

To graph the scatter plot, first push the buttons 2ND and Y=. Then, choose one of the plots in the list. Select the option ON, choose the type to be a scatter plot, and assign L1 and L2 as Xlist and Ylist, respectively.

The plot can be made by pressing the GRAPH button. It is possible that, after drawing the plot, the window size is not large enough to see all of the information.

To fix this, press ZOOM and select the option ZoomStat. After doing that, the window will resize to show the important information.

d In Part A the equation for the line of best fit was found.
In this equation, the slope is and the intercept is
  • The slope indicates that every one thousand feet of altitude, the pressure diminishes by about PSI.
  • The intercept indicates that the pressure at sea level is about PSI.
e A graphing calculator can also be used to make predictions. To do so, first the window size should be changed to fit the prediction. Since the value is given in thousands of feet, the value that should be included for feet is To change the window size, press

To find the value of when press ( and Then press to insert the value of for Finally, press again.

The value of the pressure at feet is about PSI. Since all the data values are close to the line of best fit and the data is strongly correlated, it can be said that this is a good approximation of the actual value.

Solving Using the Equation

This prediction can also be found by substituting for into the equation for the line of best fit.
Solve for
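
Substituting into the equation can be sketched in a few lines of Python. The slope, intercept, and altitude below are hypothetical placeholders, not the values found in this example; the point of the sketch is the unit conversion, since the model takes altitude in thousands of feet.

```python
def predict(slope, intercept, x):
    """Evaluate the line y = slope * x + intercept at the given x."""
    return slope * x + intercept

# Hypothetical coefficients, NOT the lesson's regression result.
slope = -0.5      # PSI per thousand feet (assumed)
intercept = 14.0  # PSI at sea level (assumed)

altitude_feet = 10_000                     # hypothetical altitude
altitude_thousands = altitude_feet / 1000  # the model expects thousands of feet
pressure = predict(slope, intercept, altitude_thousands)
print(f"Predicted pressure: {pressure:.2f} PSI")
```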
Example

Predicting the Time It Takes to Finish an Exercise

Davontay has a math assignment that consists of eight different exercises. He registered the time (in minutes) in which he completed the first seven exercises.

Exercise
Time (minutes)
a What is the equation for the line of best fit using linear regression? Round the values to two decimal places.
b What is the correlation coefficient? Round the answer to two decimal places. Are the data correlated?
c Draw a graph of the data points and the line of best fit in the same viewing window.
d Interpret the slope and the intercept of the equation of the line of best fit.
e Find a prediction for the time it will take Davontay to complete the eighth exercise. Is it a good prediction?

Answer

a
b Correlation Coefficient:
Are the Data Correlated? No, see solution.
c Graph:
d See solution.
e Prediction: About minutes
Is It a Good Prediction? No, see solution.

Hint

a Use the linear regression feature on a graphing calculator or computer.
b The correlation coefficient is in the linear regression output on a graphing calculator.
c Use the graphing features on a graphing calculator.
d What do these measures indicate in a line?
e Are the data values close to the line of best fit? Does this indicate something?

Solution

a The line of best fit can be found using a graphing calculator. First, the data values need to be entered into the calculator. This is done by pressing the STAT button and then selecting the option Edit.
The window in the calculator, which shows Stat and then Edit

The data values can be written in the columns.

Calculator that shows two lists where you entered values

Finally, by pressing the STAT button and then selecting the menu item CALC, the option LinReg(ax+b) can be found. This option gives the line of best fit, expressed as a linear function in slope-intercept form.

Rounding to two decimal places, the equation for the line of best fit using linear regression can be written as follows.
Note that in this equation, x is the number of the exercise and y is the time, in minutes, that Davontay took to complete that exercise.
b The correlation coefficient can be found on the linear regression results screen from Part A.
The correlation coefficient is the value of r on the screen.
The value of r varies from -1 to 1. A value close to -1 indicates a negative correlation, while a value close to 1 indicates a positive correlation. Here, however, the value of r is close to 0, which means that the data have no linear correlation.
c To graph the line of best fit, first press Y= and enter the equation of the line of best fit.

To graph the scatter plot, first push the buttons 2ND and Y=. Then, choose one of the plots in the list. Select the option ON, choose the type to be a scatter plot, and assign L1 and L2 as Xlist and Ylist, respectively.

The plot can be made by pressing the GRAPH button. It is possible that, after drawing the plot, the window size is not large enough to see all of the information.

To fix this, press ZOOM and select the option ZoomStat. After doing that, the window will resize to show the important information.

Looking at the graph, it can be seen that the line of best fit is not close to any of the provided data points.

d In Part A the equation for the line of best fit was found.
In this equation, the slope is and the intercept is
  • The slope indicates that after every exercise, the next one takes about an additional minutes to complete.
  • The intercept indicates that the minimum time to complete an exercise is about minutes.

From Part C it can be noted that the line is not representative of the given data points. This means that these measures do not reflect the reality of the exercises.

e A graphing calculator can also be used to make predictions. To do so, first the window size should be changed to fit the prediction value of To change the window size, press

Then, to find the value of when press ( and Then press to insert the value of for Finally, press again.

The value of when is about This means that Davontay will finish the eighth exercise in less than minutes. Since none of the given data values are really close to the line of best fit and the data are not correlated, it can be said that this is not a good approximation of the actual value.

Solving Using the Equation of the Line of Best Fit

This prediction can also be found by substituting for into the equation for the line of best fit.
Solve for
Closure

Are Lines of Best Fit Always Useful?

This lesson showed how to find the line of best fit for data sets and how to make predictions using these lines. From the examples discussed throughout the lesson, two conclusions can be drawn.

  • Lines of best fit are useful for making predictions only if the data have a linear correlation.
  • The stronger the correlation is, the more accurate the predictions are expected to be.
When the data have no linear correlation, predictions based on a linear model are not reliable. In that case, the line of best fit does not accurately represent even the given data points.

