#### Analyzing Lines of Fit

##### Sections
Exercise name Free?
###### Monitoring Progress
Monitoring Progress 1
Monitoring Progress 2
Monitoring Progress 3
Monitoring Progress 4
###### Exercises
Exercises 1

A residual is the difference between the y-coordinate of an actual data point and the y-coordinate that the equation of the line of fit gives for the same x-value:

$$\text{residual} = y_{\text{actual point}} - y_{\text{line of fit}}$$

The residual is positive if the data point lies above the line of fit and negative if the data point lies below it.
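The definition can be sketched in a few lines of Python. The function names and the sample point and line (the point (2, 7) and the line y = 2x + 1) are hypothetical, chosen only for illustration.

```python
def residual(y_actual, y_fit):
    """Residual = actual y-coordinate minus the y-coordinate on the line of fit."""
    return y_actual - y_fit


def position(res):
    """Describe where the data point sits relative to the line of fit."""
    if res > 0:
        return "above the line"
    if res < 0:
        return "below the line"
    return "on the line"


# Hypothetical point (2, 7) checked against the hypothetical line y = 2x + 1.
y_line = 2 * 2 + 1            # the line gives y = 5 at x = 2
res = residual(7, y_line)     # 7 - 5 = 2
print(res, position(res))     # prints: 2 above the line
```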
Exercises 2

There are two ways to use residuals to check how well a line of fit models the data:

- By graphing
- With a calculator

Graphing: A residual is the difference between the y-coordinate of the data point and the y-coordinate produced by the line of fit:

$$y_{\text{data point}} - y_{\text{line of fit}} = \text{residual}$$

Positive residuals are plotted above the x-axis and negative residuals below it, so the scatter plot of residuals should be centered around the x-axis if the line is a good fit. If the line is a bad fit, there will be too many positive or too many negative residuals and not enough of the other. For example, a residual plot with 8 positive residuals and only 3 negative residuals indicates a line of fit that lies below most of the data points rather than running through the middle of them. If the line is a good fit, the residual points are divided evenly by the x-axis.

Calculator: If you have many data points, you may want to use a graphing calculator to measure the goodness of fit. You can enter all of your data points and use the linear regression functions to find an r value, also known as the correlation coefficient. The value of r always lies in the range $-1 \le r \le 1$.

- When r is close to -1, there is a strong negative correlation and the line is a good fit.
- When r is close to 1, there is a strong positive correlation and the line is a good fit.
- When r is close to 0, the correlation is weak: either the line is a bad fit or the data have no linear correlation.
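For readers without a graphing calculator, the correlation coefficient can be computed directly. The sketch below assumes the standard Pearson formula and uses the data from Exercise 6 of this section; the function name is hypothetical, and a calculator may round differently.

```python
from math import sqrt


def correlation_coefficient(xs, ys):
    """Pearson correlation coefficient r for paired data (standard formula)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of the deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Data from Exercise 6 of this section.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [13, 14, 23, 26, 31, 42, 45, 52, 62]

r = correlation_coefficient(xs, ys)
print(round(r, 3))  # a value close to 1: strong positive correlation
```

The result always falls between -1 and 1, so it can be read exactly as described above.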
Exercises 3

Interpolation and extrapolation are very similar: both are processes by which we use existing data to make educated guesses about values we have not observed. The main difference between the two has to do with their prefixes. What do "extra" and "inter" mean to you? Typically, they mean:

- Extra: above and beyond, in addition to something.
- Inter: inside, between, or among the group.

The prefixes mean the same thing here. Extrapolation uses the known data to make predictions outside the known range. Interpolation uses the known data to make predictions inside the known range.
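The distinction comes down to whether the x-value of the prediction lies within the range of the known x-values. A minimal sketch, with a hypothetical function name and hypothetical data:

```python
def prediction_type(x_new, known_xs):
    """Label a prediction at x_new as interpolation or extrapolation."""
    if min(known_xs) <= x_new <= max(known_xs):
        return "interpolation"   # inside the known range
    return "extrapolation"       # outside the known range


known_xs = [1, 2, 3, 4, 5, 6]            # hypothetical x-values of the data
print(prediction_type(3.5, known_xs))    # prints: interpolation
print(prediction_type(10, known_xs))     # prints: extrapolation
```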
Exercises 4

A correlation coefficient tells us two things:

- The strength of the correlation: it is strong if $|r|$ is close to 1 and weak if $|r|$ is close to 0.
- Whether the correlation is positive or negative.

Let's look at what each of the given values of r tells us.

| r | Positive or negative? | Strong or weak? |
| --- | --- | --- |
| -0.98 | Negative | Strong |
| 0.96 | Positive | Strong |
| -0.09 | Negative | Weak |
| 0.97 | Positive | Strong |

The only correlation coefficient that doesn't match the features of another is $r = -0.09$. It indicates an extremely weak correlation: since $r^2 \approx 0.008$, the line of fit accounts for less than 1% of the variation in the data. The other three values of r indicate extremely strong correlations, so the data points lie very close to the line of fit.
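The same classification can be sketched in code. The cutoff of 0.5 between "strong" and "weak" is an assumption made only for illustration; the exercise does not give a specific threshold, and the function name is hypothetical.

```python
STRONG_CUTOFF = 0.5  # assumed threshold for illustration; not from the exercise


def describe(r):
    """Return the direction and strength suggested by a correlation coefficient."""
    if r > 0:
        direction = "Positive"
    elif r < 0:
        direction = "Negative"
    else:
        direction = "None"
    strength = "Strong" if abs(r) >= STRONG_CUTOFF else "Weak"
    return direction, strength


for r in [-0.98, 0.96, -0.09, 0.97]:
    print(r, *describe(r))
```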
Exercises 5

Let's begin by making a table of the residual values.

| x | y | $y = 4x - 5$ | y-value from model | Residual |
| --- | --- | --- | --- | --- |
| -4 | -18 | $4(-4) - 5$ | -21 | $-18 - (-21) = 3$ |
| -3 | -13 | $4(-3) - 5$ | -17 | $-13 - (-17) = 4$ |
| -2 | -10 | $4(-2) - 5$ | -13 | $-10 - (-13) = 3$ |
| -1 | -7 | $4(-1) - 5$ | -9 | $-7 - (-9) = 2$ |
| 0 | -2 | $4(0) - 5$ | -5 | $-2 - (-5) = 3$ |
| 1 | 0 | $4(1) - 5$ | -1 | $0 - (-1) = 1$ |
| 2 | 6 | $4(2) - 5$ | 3 | $6 - 3 = 3$ |
| 3 | 10 | $4(3) - 5$ | 7 | $10 - 7 = 3$ |
| 4 | 15 | $4(4) - 5$ | 11 | $15 - 11 = 4$ |

Now we can create a scatter plot using the given x-values and our residuals. Remember, if the model is a good fit for the data, the residuals will be evenly distributed above and below the x-axis, with no apparent pattern.

This line of fit does not model the data well. The residuals are not evenly distributed above and below the x-axis: every residual is positive.
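The residual column can be reproduced with a short script, shown here as a sketch using the data and model from this exercise. The variable and function names are hypothetical.

```python
def model(x):
    """Line of fit from this exercise: y = 4x - 5."""
    return 4 * x - 5


xs = [-4, -3, -2, -1, 0, 1, 2, 3, 4]
ys = [-18, -13, -10, -7, -2, 0, 6, 10, 15]

# Residual = actual y minus the y-value from the model.
residuals = [y - model(x) for x, y in zip(xs, ys)]
print(residuals)             # prints: [3, 4, 3, 2, 3, 1, 3, 3, 4]

positives = sum(r > 0 for r in residuals)
negatives = sum(r < 0 for r in residuals)
print(positives, negatives)  # prints: 9 0
```

Nine positive residuals and no negative ones confirm that the line lies below every data point.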
Exercises 6

Let's begin by making a table of the residual values.

| x | y | $y = 6x + 4$ | y-value from model | Residual |
| --- | --- | --- | --- | --- |
| 1 | 13 | $6(1) + 4$ | 10 | $13 - 10 = 3$ |
| 2 | 14 | $6(2) + 4$ | 16 | $14 - 16 = -2$ |
| 3 | 23 | $6(3) + 4$ | 22 | $23 - 22 = 1$ |
| 4 | 26 | $6(4) + 4$ | 28 | $26 - 28 = -2$ |
| 5 | 31 | $6(5) + 4$ | 34 | $31 - 34 = -3$ |
| 6 | 42 | $6(6) + 4$ | 40 | $42 - 40 = 2$ |
| 7 | 45 | $6(7) + 4$ | 46 | $45 - 46 = -1$ |
| 8 | 52 | $6(8) + 4$ | 52 | $52 - 52 = 0$ |
| 9 | 62 | $6(9) + 4$ | 58 | $62 - 58 = 4$ |

Now we can create a scatter plot using the given x-values and our residuals. Remember, if the model is a good fit for the data, the residuals will be evenly distributed above and below the x-axis, with no apparent pattern.

This line of fit models the data well. The residuals are evenly distributed above and below the x-axis.
Exercises 7

Let's begin by making a table of the residual values.

| x | y | $y = -1.3x + 1$ | y-value from model | Residual |
| --- | --- | --- | --- | --- |
| -8 | 9 | $-1.3(-8) + 1$ | 11.4 | $9 - 11.4 = -2.4$ |
| -6 | 10 | $-1.3(-6) + 1$ | 8.8 | $10 - 8.8 = 1.2$ |
| -4 | 5 | $-1.3(-4) + 1$ | 6.2 | $5 - 6.2 = -1.2$ |
| -2 | 8 | $-1.3(-2) + 1$ | 3.6 | $8 - 3.6 = 4.4$ |
| 0 | -1 | $-1.3(0) + 1$ | 1 | $-1 - 1 = -2$ |
| 2 | 1 | $-1.3(2) + 1$ | -1.6 | $1 - (-1.6) = 2.6$ |
| 4 | -4 | $-1.3(4) + 1$ | -4.2 | $-4 - (-4.2) = 0.2$ |
| 6 | -12 | $-1.3(6) + 1$ | -6.8 | $-12 - (-6.8) = -5.2$ |
| 8 | -7 | $-1.3(8) + 1$ | -9.4 | $-7 - (-9.4) = 2.4$ |

Now we can create a scatter plot using the given x-values and our residuals. Remember, if the model is a good fit for the data, the residuals will be evenly distributed above and below the x-axis, with no apparent pattern.

This line of fit models the data well. The residuals are evenly distributed above and below the x-axis.
Exercises 8

Let's begin by making a table of the residual values.

| x | y | $y = -0.5x - 2$ | y-value from model | Residual |
| --- | --- | --- | --- | --- |
| 4 | -1 | $-0.5(4) - 2$ | -4 | $-1 - (-4) = 3$ |
| 6 | -3 | $-0.5(6) - 2$ | -5 | $-3 - (-5) = 2$ |
| 8 | -6 | $-0.5(8) - 2$ | -6 | $-6 - (-6) = 0$ |
| 10 | -8 | $-0.5(10) - 2$ | -7 | $-8 - (-7) = -1$ |
| 12 | -10 | $-0.5(12) - 2$ | -8 | $-10 - (-8) = -2$ |
| 14 | -10 | $-0.5(14) - 2$ | -9 | $-10 - (-9) = -1$ |
| 16 | -10 | $-0.5(16) - 2$ | -10 | $-10 - (-10) = 0$ |
| 18 | -9 | $-0.5(18) - 2$ | -11 | $-9 - (-11) = 2$ |
| 20 | -9 | $-0.5(20) - 2$ | -12 | $-9 - (-12) = 3$ |

Now we can create a scatter plot using the given x-values and our residuals. Remember, if the model is a good fit for the data, the residuals will be evenly distributed above and below the x-axis, with no apparent pattern.

This line of fit does not model the data well. The residuals are not randomly scattered about the x-axis: they are positive at both ends of the data and negative in the middle. The residual points form a U-shaped pattern, which suggests the data are not linear.
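One quick way to spot such a pattern without drawing the plot is to list the signs of the residuals in order of x. This is only an informal check, not a formal test of linearity, and the helper name `sign_pattern` is hypothetical. The data and model are the ones from this exercise.

```python
def sign_pattern(residuals):
    """Return one character per residual: '+', '-', or '0'."""
    return "".join("+" if r > 0 else "-" if r < 0 else "0" for r in residuals)


xs = [4, 6, 8, 10, 12, 14, 16, 18, 20]
ys = [-1, -3, -6, -8, -10, -10, -10, -9, -9]

# Residual = actual y minus the value of the line y = -0.5x - 2.
residuals = [y - (-0.5 * x - 2) for x, y in zip(xs, ys)]
print(sign_pattern(residuals))  # prints: ++0---0++
```

The signs cluster (positive, then negative, then positive) instead of alternating at random, which matches the U shape seen in the residual plot.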
Exercises 9

Let's begin by making a table of the residual values. Note that, in our table, y represents the growth in inches of an elk's antlers in week x.

| x | y | $y = -0.7x + 6.8$ | y-value from model | Residual |
| --- | --- | --- | --- | --- |
| 1 | 6.0 | $-0.7(1) + 6.8$ | 6.1 | $6.0 - 6.1 = -0.1$ |
| 2 | 5.5 | $-0.7(2) + 6.8$ | 5.4 | $5.5 - 5.4 = 0.1$ |
| 3 | 4.7 | $-0.7(3) + 6.8$ | 4.7 | $4.7 - 4.7 = 0$ |
| 4 | 3.9 | $-0.7(4) + 6.8$ | 4.0 | $3.9 - 4.0 = -0.1$ |
| 5 | 3.3 | $-0.7(5) + 6.8$ | 3.3 | $3.3 - 3.3 = 0$ |

Now we can create a scatter plot using the given x-values and our residuals. Remember, if the model is a good fit for the data, the residuals will be evenly distributed above and below the x-axis, with no apparent pattern.

This line of fit models the data well. The residuals are evenly distributed above and below the x-axis.
Exercises 10

Let's begin by making a table of the residual values. Note that, in our table, y represents the approximate number (in thousands) of movie tickets sold in month x.

| x | y | $y = 1.3x + 27$ | y-value from model | Residual |
| --- | --- | --- | --- | --- |
| 1 | 27 | $1.3(1) + 27$ | 28.3 | $27 - 28.3 = -1.3$ |
| 2 | 28 | $1.3(2) + 27$ | 29.6 | $28 - 29.6 = -1.6$ |
| 3 | 36 | $1.3(3) + 27$ | 30.9 | $36 - 30.9 = 5.1$ |
| 4 | 28 | $1.3(4) + 27$ | 32.2 | $28 - 32.2 = -4.2$ |
| 5 | 32 | $1.3(5) + 27$ | 33.5 | $32 - 33.5 = -1.5$ |
| 6 | 35 | $1.3(6) + 27$ | 34.8 | $35 - 34.8 = 0.2$ |

Now we can create a scatter plot using the given x-values and our residuals. Remember, if the model is a good fit for the data, the residuals will be evenly distributed above and below the x-axis, with no apparent pattern.

As we can see, the points are not evenly dispersed about the horizontal axis. Therefore, the line of fit does not model the data well.
Exercises 11

Let's begin by entering the data into our calculator and using the linear regression analysis tools. We can round the values of a and b and substitute them into the equation $y = ax + b$. This gives us the equation for the line of best fit.

$$y = 2.1x - 8$$

We can see how the line fits the data by plotting the data points and graphing the line on the same coordinate plane.

The calculator output also gives us the value of the correlation coefficient, r.

$$r = 0.9803 \approx 0.980$$

This tells us that the correlation is both positive and very strong. We can tell that it is strong because r is extremely close to 1. A value of exactly 1 would be a perfect positive correlation, where the line goes through all of the points.
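A calculator's linear regression feature finds the least-squares values of a and b. The sketch below assumes the standard least-squares formulas and a hypothetical function name. Because the data table for this exercise is not reproduced here, it borrows the movie ticket data from Exercise 10 instead.

```python
def least_squares_line(xs, ys):
    """Return slope a and intercept b of the least-squares line y = ax + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx              # slope
    b = mean_y - a * mean_x    # intercept: the line passes through the mean point
    return a, b


# Data borrowed from Exercise 10 (month x, tickets sold y in thousands).
xs = [1, 2, 3, 4, 5, 6]
ys = [27, 28, 36, 28, 32, 35]

a, b = least_squares_line(xs, ys)
print(round(a, 1), round(b, 1))  # prints: 1.3 26.6
```

Rounded to a convenient form, this is close to the line $y = 1.3x + 27$ that Exercise 10 works with.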
Exercises 12

Let's begin by entering the data into our calculator and using the linear regression analysis tools. We can round the values of a and b and substitute them into the equation $y = ax + b$. This gives us the equation for the line of best fit.

$$y = -1.3x + 8$$

We can see how the line fits the data by plotting the data points and graphing the line on the same coordinate plane.

The calculator output also gives us the value of the correlation coefficient, r.

$$r = -0.8858 \approx -0.886$$

This tells us that the correlation is both negative and strong. We can tell that it is strong because r is close to -1. A value of exactly -1 would be a perfect negative correlation, where the line goes through all of the points.
Exercises 13

Let's begin by entering the data into our calculator and using the linear regression analysis tools. We can round the values of a and b and substitute them into the equation $y = ax + b$. This gives us the equation for the line of best fit.

$$y = 1.4x + 16$$

We can see how the line fits the data by plotting the data points and graphing the line on the same coordinate plane.

The calculator output also gives us the value of the correlation coefficient, r.

$$r = 0.9986 \approx 0.999$$

This tells us that the correlation is both positive and very strong. We can tell that it is strong because r is extremely close to 1. A value of exactly 1 would be a perfect positive correlation, where the line goes through all of the points.
Exercises 14

Let's begin by entering the data into our calculator and using the linear regression analysis tools. We can round the value of b and substitute it, along with a, into the equation $y = ax + b$. This gives us the equation for the line of best fit.

$$y = -x + 11$$

We can see how the line fits the data by plotting the data points and graphing the line on the same coordinate plane.

The calculator output also gives us the value of the correlation coefficient, r.

$$r = -0.4435 \approx -0.444$$

This tells us that the correlation is both negative and moderate. We can tell that it is moderate because r is around -0.5, roughly halfway between a perfect negative correlation and no correlation.
Exercises 15

The written equation has interchanged the values of a and b. According to the display, we have $a = -4.47$ and $b = 23.16$. The coefficient of x should therefore be -4.47 and the constant term 23.16, so the equation should be $y = -4.47x + 23.16$.
Exercises 16

When looking at linear regression output on a calculator, we can learn about the correlation and the goodness of our line of fit by interpreting the correlation coefficient. Be sure that you are looking at the value of $r$, not the value of $r^2$. In this case, we have:

$$r = -0.9994724136$$

When $|r|$ is close to 1, there is a strong correlation, and when $|r|$ is close to 0, there is a weak correlation. This value of r indicates a very strong correlation.

A positive value of r indicates a positive correlation and a negative value indicates a negative correlation. In this case, we have a negative correlation. Therefore, the data have a strong negative correlation, not a strong positive correlation.
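The warning about reading $r$ and not $r^2$ can be illustrated with a short sketch: squaring discards the sign, so $r^2$ alone cannot tell a positive correlation from a negative one. The variable names are hypothetical; the value of r is the one from this exercise.

```python
r = -0.9994724136     # value of r from the calculator display in this exercise
r_squared = r ** 2    # always non-negative, so the direction is lost

direction = "negative" if r < 0 else "positive"
print(direction)      # prints: negative
print(r_squared > 0)  # prints: True
```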
Exercises 17
Exercises 18
Exercises 19
Exercises 20
Exercises 21

When you use your phone a lot, the battery dies faster than if you leave it untouched for the same amount of time. Therefore, the longer you spend talking on the phone, the more of its charge the battery loses. A line of fit that matches this situation would slope downward: as you talk, the battery is being drained.

This is a negative correlation and a causal relationship, because the phone usage is causing the battery life to decrease. Notice that the graph is limited to the first quadrant. You cannot talk for a negative number of minutes, and you cannot continue talking after the battery has been fully drained.
Exercises 22

Does the height of a toddler correlate with the size of their vocabulary? More often than not, taller toddlers will have a larger vocabulary. The line of fit would probably slope upward, showing a positive correlation.

Now the question is: does a change in height cause a change in vocabulary size? The answer is: definitely not! When looking at a situation like this, we must keep all possible outside factors in mind. Height depends largely on genetics and health, while vocabulary size depends on many things, including how often a child is read to by a parent or guardian. Correlation doesn't imply causation! The most plausible explanation here is that taller toddlers tend to be older, and children learn more words as they age.