Here are a few recommended readings before getting started with this lesson.
A line of best fit, also known as a regression line, is a line of fit whose equation has been determined using a strict mathematical method that estimates the relationship between the values of a data set.
One commonly used method to determine a line of best fit is the method of least squares. The calculations involved are usually tedious to do by hand, so a line of best fit is typically found by performing a linear regression on a graphing calculator. As an example, consider the following data set.
x | 0.6 | 1.2 | 2.6 | 3.6 | 4.5 | 6 | 6.6 | 7.1 |
---|---|---|---|---|---|---|---|---|
y | 1.5 | 3.6 | 5.2 | 6.3 | 8.7 | 10.3 | 11.8 | 11.7 |
Judging from the graph, the data points lie close to the line y=1.55x+1.14. Consequently, even though the data points do not all lie on a single line, a linear model describes the data well. In contrast, consider the following data set.
x | 0.6 | 1.2 | 2.6 | 3.6 | 4.5 | 6 | 6.6 | 7.1 |
---|---|---|---|---|---|---|---|---|
y | 1.5 | 8.1 | 9.5 | 12 | 7.1 | 2.5 | 11.6 | 1.5 |
Look at the data points graphed on a coordinate plane. They show no clear linear trend, so a linear model does not describe this data set well.
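The contrast between the two data sets can also be checked numerically. The following is a minimal Python sketch, not part of the original lesson, that applies the standard least-squares formulas to both tables and reports the slope, the y-intercept, and the correlation coefficient r. The function name `least_squares_line` and the variable names are illustrative assumptions.

```python
import math

def least_squares_line(xs, ys):
    """Return (slope, intercept, r) of the least-squares line for paired data."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    sum_yy = sum(y * y for y in ys)

    # These three quantities appear in both the slope and the r formulas.
    sxy = n * sum_xy - sum_x * sum_y
    sxx = n * sum_xx - sum_x ** 2
    syy = n * sum_yy - sum_y ** 2

    slope = sxy / sxx
    intercept = (sum_y - slope * sum_x) / n
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r

# Both tables share the same x-values.
xs = [0.6, 1.2, 2.6, 3.6, 4.5, 6, 6.6, 7.1]
ys_first = [1.5, 3.6, 5.2, 6.3, 8.7, 10.3, 11.8, 11.7]
ys_second = [1.5, 8.1, 9.5, 12, 7.1, 2.5, 11.6, 1.5]

for label, ys in (("first", ys_first), ("second", ys_second)):
    slope, intercept, r = least_squares_line(xs, ys)
    print(f"{label} data set: y = {slope:.2f}x + {intercept:.2f}, r = {r:.2f}")
```

For the first table, r comes out close to 1, while for the second table it is close to 0. This agrees with the visual impression that only the first data set is reasonably described by a line.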
Most graphing calculators have a function called linear regression, which can be used to find a precise line of fit using strict rules. This line of fit is then called the line of best fit. For example, consider the following data set.
x | y |
---|---|
1 | 33.12 |
2 | 24.4 |
3 | 16.6 |
4 | 9.3 |
5 | 3.9 |
The line of best fit can be calculated following these 3 steps.
On a graphing calculator, begin by entering the data points. To do so, press the STAT button and select the option Edit.
This gives a number of columns, labeled L1, L2, L3, and so on.
Use the arrow keys to choose where in the lists to fill in the data values. Enter the x-values of the data points in L1 and press ENTER after each value. The same can be done for the corresponding y-values in column L2.
After entering the values, press the STAT button and select the menu item CALC.
The option LinReg(ax+b) gives the line of best fit, expressed as a linear function in slope-intercept form. Press the ENTER button until the parameters are given.
In this case, the line of best fit is described by the function y=-7.354x+39.526. The correlation coefficient r is less than -0.99, indicating a strong negative correlation. If r does not appear, press 2ND and 0 to get to the CATALOG, then select DiagnosticOn and enable it by pressing ENTER twice. Then run the linear regression once more.
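As a cross-check outside the calculator, the quantities reported by LinReg(ax+b) can be computed directly. The following is a minimal Python sketch under the assumption that the lists `L1` and `L2` hold the table values, named after the calculator columns for readability. It uses the deviation-from-the-mean form of the least-squares formulas and is not a description of how the calculator works internally.

```python
import math

# Lists named after the calculator columns L1 and L2 (illustrative names).
L1 = [1, 2, 3, 4, 5]
L2 = [33.12, 24.4, 16.6, 9.3, 3.9]

mean_x = sum(L1) / len(L1)
mean_y = sum(L2) / len(L2)

# Sums of products of deviations from the means.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(L1, L2))
s_xx = sum((x - mean_x) ** 2 for x in L1)
s_yy = sum((y - mean_y) ** 2 for y in L2)

a = s_xy / s_xx                      # slope
b = mean_y - a * mean_x              # y-intercept
r = s_xy / math.sqrt(s_xx * s_yy)    # correlation coefficient

print(f"a = {a:.3f}")
print(f"b = {b:.3f}")
print(f"r^2 = {r * r:.4f}")
print(f"r = {r:.4f}")
```

The printed values a, b, r², and r correspond to the lines shown on the calculator screen when diagnostics are enabled.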
A graphing calculator can also be used to graph the line of best fit. After selecting the option LinReg(ax+b), choose the option Store RegEQ.
Press VARS and move to the Y-VARS menu. Then, select the option FUNCTION and press ENTER.
Press the ENTER button until the parameters are given. To graph the scatter plot, first push the buttons 2ND and Y=. Then, choose one of the plots in the list. Select the option ON, choose the type to be a scatter plot, and assign L1 and L2 as Xlist and Ylist, respectively.
Then, the plot can be made by pressing the GRAPH button. After the plot is drawn, the window may not be large enough to show all of the information.
To fix this, press ZOOM and select the option ZoomStat.
After doing that, the window will resize to show the important information.
For a school project, Ramsha wants to investigate if there is a correlation between the width of a tree and its height. To do so, she measured the diameter at chest height and the height of some trees in a local park. Her findings are shown in the following table.
Diameter at chest (cm) | Height (m) |
---|---|
8 | 7 |
10 | 10 |
15 | 14 |
18 | 15 |
20 | 18 |
22 | 21 |
25 | 15 |
30 | 20 |
To enter the data on a graphing calculator, press the STAT button and select the option Edit.
Then the data values are entered in the columns.
By pressing the STAT button and then selecting the CALC menu, the option LinReg(ax+b) can be found. This option gives the line of best fit, expressed as a linear function in slope-intercept form.
Then, to graph the scatter plot, push the buttons 2ND and Y=. Choose one of the plots in the list. Select the option ON, choose the type to be a scatter plot, and assign L1 and L2 as Xlist and Ylist, respectively.
The plot can be made by pressing the GRAPH button. After the plot is drawn, the window may not be large enough to show all of the information.
To fix this, press ZOOM and select the option ZoomStat.
After doing so, the window will resize to show the important information.
Use the linear regression feature of a graphing calculator to find the equation of the line of best fit for the given data set. Compare the obtained equation with the equations shown in the applet, and choose the closest one.
The following table displays some values of atmospheric pressures at different altitudes.
Altitude (thousand feet) | 0 | 1 | 2 | 3 | 4 | 5 |
---|---|---|---|---|---|---|
Pressure (PSI) | 14.71 | 14.18 | 13.75 | 13.21 | 12.69 | 12.20 |
To enter the data on a graphing calculator, press the STAT button and select the option Edit.
Then the data values can be entered in the columns.
Finally, by pressing the STAT button and then selecting the menu item CALC, the option LinReg(ax+b) can be found. This option gives a line of best fit, expressed as a linear function in slope-intercept form.
To graph the scatter plot, first push the buttons 2ND and Y=. Then, choose one of the plots in the list. Select the option ON, choose the type to be a scatter plot, and assign L1 and L2 as Xlist and Ylist, respectively.
The plot can be made by pressing the GRAPH button. After the plot is drawn, the window may not be large enough to show all of the information.
To fix this, press ZOOM and select the option ZoomStat.
After doing that, the window will resize to show the important information.
To find the value of y when x=6, press CALC (2ND and TRACE) and press ENTER to select the option value. Then enter 6 for x and press ENTER again.
The value of the pressure at 6000 feet is about 11.7 PSI. Since all the data values are close to the line of best fit and the data is strongly correlated, it can be said that this is a good approximation of the actual value.
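The same estimate can be reproduced without a calculator. The following is a minimal Python sketch, assuming the table values are stored in the lists `altitude` and `pressure`. The helper name `fit_line` is an illustrative assumption, not something defined in the lesson.

```python
altitude = [0, 1, 2, 3, 4, 5]                           # thousand feet
pressure = [14.71, 14.18, 13.75, 13.21, 12.69, 12.20]   # PSI

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    intercept = (sum_y - slope * sum_x) / n
    return slope, intercept

slope, intercept = fit_line(altitude, pressure)

# Evaluate the line at x = 6, which corresponds to 6000 feet.
estimate = slope * 6 + intercept

print(f"y = {slope:.3f}x + {intercept:.3f}")
print(f"Estimated pressure at 6000 feet: {estimate:.1f} PSI")
```

Note that x=6 lies just outside the range of the measured altitudes, so the estimate relies on the linear trend continuing slightly beyond the data.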
Davontay has a math assignment that consists of eight different exercises. He recorded the time (in minutes) it took him to complete each of the first seven exercises.
Exercise | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
---|---|---|---|---|---|---|---|
Time (minutes) | 4 | 15 | 7 | 16 | 8 | 15 | 5 |
To enter the data on a graphing calculator, press the STAT button and select the option Edit.
The data values can be entered in the columns.
Finally, by pressing the STAT button and then selecting the menu item CALC, the option LinReg(ax+b) can be found. This option gives a line of best fit, expressed as a linear function in slope-intercept form.
To graph the scatter plot, first push the buttons 2ND and Y=. Then, choose one of the plots in the list. Select the option ON, choose the type to be a scatter plot, and assign L1 and L2 as Xlist and Ylist, respectively.
The plot can be made by pressing the GRAPH button. After the plot is drawn, the window may not be large enough to show all of the information.
To fix this, press ZOOM and select the option ZoomStat.
After doing that, the window will resize to show the important information.
Looking at the graph, it can be seen that the line of best fit is not close to any of the provided data points.
From Part C it can be noted that the line is not representative of the given data points. This means that values read from the line do not reflect the actual times spent on the exercises.
Then, to find the value of y when x=8, press CALC (2ND and TRACE) and press ENTER to select the option value. Then enter 8 for x and press ENTER again.
The value of y when x=8 is about 10.57. According to the line of best fit, Davontay would finish the eighth exercise in less than 11 minutes. However, since none of the given data values are close to the line of best fit and the data shows almost no correlation, this is not a good approximation of the actual value.
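The weakness of this prediction can be made concrete by looking at the residuals, that is, the vertical distances between the recorded times and the line. The following is a minimal Python sketch, assuming the table values are stored in the lists `exercise` and `minutes`. All names are illustrative assumptions.

```python
import math

exercise = [1, 2, 3, 4, 5, 6, 7]
minutes = [4, 15, 7, 16, 8, 15, 5]

n = len(exercise)
mean_x = sum(exercise) / n
mean_y = sum(minutes) / n

# Sums of products of deviations from the means.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(exercise, minutes))
s_xx = sum((x - mean_x) ** 2 for x in exercise)
s_yy = sum((y - mean_y) ** 2 for y in minutes)

slope = s_xy / s_xx
intercept = mean_y - slope * mean_x
r = s_xy / math.sqrt(s_xx * s_yy)

# Residual: recorded time minus the time given by the line.
residuals = [y - (slope * x + intercept) for x, y in zip(exercise, minutes)]
prediction = slope * 8 + intercept

print(f"y = {slope:.3f}x + {intercept:.3f}, r = {r:.2f}")
print("residuals:", [round(e, 2) for e in residuals])
print(f"predicted time for exercise 8: {prediction:.2f} minutes")
```

Every residual is more than 2 minutes in size and r is close to 0, so the line says very little about how long any individual exercise takes.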
This lesson showed how to find the line of best fit for a data set and how to make predictions using such a line. Considering the examples discussed throughout the lesson, it is possible to draw two conclusions.