Analyzing Lines of Fit

Sections
Communicate Your Answer
Exercise name Free?
Communicate Your Answer 2
Communicate Your Answer 3
Monitoring Progress
Monitoring Progress 1
Monitoring Progress 2
Monitoring Progress 3
Monitoring Progress 4
Exercises
Exercises 1 A residual is the difference between the y-coordinate of the actual data point and the y-coordinate found using the equation for the line of fit: residual = y(actual point) − y(line of fit). The residual is positive if the data point lies above the line of fit and negative if the data point lies below the line of fit.
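To make the definition concrete, here is a minimal sketch of the residual calculation. The function name `residual`, the line y=2x+1, and the two data points are made up for illustration and are not part of the exercise.

```python
def residual(x, y_actual, slope, intercept):
    """Return y_actual minus the y-value the line of fit gives at x."""
    y_line = slope * x + intercept
    return y_actual - y_line

# Hypothetical line of fit y = 2x + 1 and two made-up data points.
above = residual(3, 9, slope=2, intercept=1)  # (3, 9) lies above the line
below = residual(3, 5, slope=2, intercept=1)  # (3, 5) lies below the line
print(above, below)
```

The first value is positive because the point sits above the line, and the second is negative because the point sits below it.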
Exercises 2 There are two ways to use residuals to check the goodness of your line of fit: by graphing and with a calculator.
Graphing: Because a residual is the difference between the y-coordinate of the data point and the y-coordinate produced by the line of fit (data point's y − line of fit's y = residual), the scatter plot of residuals should be centered around the x-axis if the line is a good fit. Positive residuals will be above the x-axis and negative residuals will be below the x-axis. If the line is a bad fit, you will have too many positive or too many negative residuals and not enough of the other. For example, a residual graph with 8 positive residuals and only 3 negative residuals shows a line of fit that lies below most of the data points rather than being centrally placed. If the line is a good fit, the residuals will be evenly divided by the x-axis.
Calculator: If you have many data points, you may want to use a graphing calculator to measure the goodness of fit. You can enter all of your data points and use the linear regression functions to find an r value, also known as the correlation coefficient. The values for r will always be within the range -1≤r≤1. When r is close to -1, there is a strong negative correlation and the line is a good fit. When r is close to 1, there is a strong positive correlation and the line is a good fit. When r is close to 0, the correlation is weak: either the line is a bad fit or the data simply have no linear correlation.
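If you do not have a graphing calculator at hand, the correlation coefficient can also be computed directly. The sketch below uses the standard Pearson formula. The function name `correlation_coefficient` and the sample data are assumptions made for illustration, not values from the textbook.

```python
import math

def correlation_coefficient(xs, ys):
    """Pearson correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data: one perfectly increasing set and one perfectly decreasing set.
xs = [1, 2, 3, 4, 5]
r_up = correlation_coefficient(xs, [2, 4, 6, 8, 10])
r_down = correlation_coefficient(xs, [10, 8, 6, 4, 2])
print(r_up, r_down)
```

The increasing data give r at the top of the range and the decreasing data give r at the bottom, which matches the range -1≤r≤1 described above.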
Exercises 3 Interpolation and extrapolation are very similar. They are both processes by which we use existing data to make educated guesses about values we have not observed. The main difference between the two processes has to do with their prefixes. What do "extra" and "inter" mean to you? Typically, they mean:
Extra: Above and beyond, in addition to something.
Inter: Inside, between or among the group.
These are used the same way in this case as well! Extrapolation is using the known data to make predictions outside the known range. Interpolation is using the known data to make predictions inside the known range.
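The distinction comes down to one comparison against the smallest and largest known x-values. Here is a minimal sketch, where the function name `prediction_type` and the sample x-values are hypothetical:

```python
def prediction_type(x_new, known_xs):
    """Classify a prediction at x_new relative to the range of known x-values."""
    if min(known_xs) <= x_new <= max(known_xs):
        return "interpolation"
    return "extrapolation"

known_xs = [1, 2, 6]  # made-up x-values of the known data
print(prediction_type(4, known_xs))   # inside the known range
print(prediction_type(10, known_xs))  # outside the known range
```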
Exercises 4 A correlation coefficient tells us two things:
The strength of the correlation: it's strong if |r| is close to 1 and weak if r is close to 0.
Whether the correlation is positive or negative.
Let's look at what each of the given values of r tells us.
r        Positive or Negative?   Strong or Weak?
-0.98    Negative                Strong
0.96     Positive                Strong
-0.09    Negative                Weak
0.97     Positive                Strong
The only correlation coefficient that doesn't match the features of the others is r=-0.09. It is an extremely weak correlation: since r²≈0.008, less than 1% of the variation in y is explained by the line of fit. The other three values of r are extremely strong correlations, where almost all of the variation in y is explained by the line of fit.
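The same comparison can be sketched in a few lines. The helper `describe` is a made-up name. It reports the direction of each r and the value of r², which is the share of the variation in y that a least-squares line explains.

```python
def describe(r):
    """Return the direction of the correlation and r squared."""
    direction = "negative" if r < 0 else "positive"
    return direction, r ** 2

values = [-0.98, 0.96, -0.09, 0.97]  # the four given values of r
for r in values:
    direction, r_squared = describe(r)
    print(r, direction, round(r_squared, 4))

# The odd one out is the value whose magnitude is closest to 0.
odd_one_out = min(values, key=abs)
print(odd_one_out)
```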
Exercises 5 Let's begin by making a table of the residual values.
x     y      y=4x−5      y-Value from model   Residual
-4    -18    4(-4)−5     -21                  -18−(-21)=3
-3    -13    4(-3)−5     -17                  -13−(-17)=4
-2    -10    4(-2)−5     -13                  -10−(-13)=3
-1    -7     4(-1)−5     -9                   -7−(-9)=2
0     -2     4(0)−5      -5                   -2−(-5)=3
1     0      4(1)−5      -1                   0−(-1)=1
2     6      4(2)−5      3                    6−3=3
3     10     4(3)−5      7                    10−7=3
4     15     4(4)−5      11                   15−11=4
Now we can create a scatter plot using the given x-values and our residuals. Remember, if the model is a good fit for the data, the residuals will be evenly distributed above and below the x-axis. Also, there will be no apparent patterns. This line of fit does not model the data well. The residuals are not evenly distributed above and below the x-axis. In fact, the residual scatter plot shows that every residual is positive.
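The residual column can be double-checked with a short script. This is only a sketch: the data and the model y=4x−5 come from the exercise, while the variable names are our own.

```python
# Data points from the exercise and the model y = 4x - 5.
xs = [-4, -3, -2, -1, 0, 1, 2, 3, 4]
ys = [-18, -13, -10, -7, -2, 0, 6, 10, 15]

# Residual = actual y minus the y-value from the model.
residuals = [y - (4 * x - 5) for x, y in zip(xs, ys)]

positive = sum(r > 0 for r in residuals)
negative = sum(r < 0 for r in residuals)
print(residuals)
print(positive, negative)
```

Counting the signs confirms the conclusion: all nine residuals are positive and none are negative, so the line lies below every data point.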
Exercises 6 Let's begin by making a table of the residual values.
x    y     y=6x+4     y-Value from model   Residual
1    13    6(1)+4     10                   13−10=3
2    14    6(2)+4     16                   14−16=-2
3    23    6(3)+4     22                   23−22=1
4    26    6(4)+4     28                   26−28=-2
5    31    6(5)+4     34                   31−34=-3
6    42    6(6)+4     40                   42−40=2
7    45    6(7)+4     46                   45−46=-1
8    52    6(8)+4     52                   52−52=0
9    62    6(9)+4     58                   62−58=4
Now we can create a scatter plot using the given x-values and our residuals. Remember, if the model is a good fit for the data, the residuals will be evenly distributed above and below the x-axis. Also, there will be no apparent patterns. This line of fit models the data well. The residuals are evenly distributed above and below the x-axis.
Exercises 7 Let's begin by making a table of the residual values.
x     y      y=-1.3x+1      y-Value from model   Residual
-8    9      -1.3(-8)+1     11.4                 9−11.4=-2.4
-6    10     -1.3(-6)+1     8.8                  10−8.8=1.2
-4    5      -1.3(-4)+1     6.2                  5−6.2=-1.2
-2    8      -1.3(-2)+1     3.6                  8−3.6=4.4
0     -1     -1.3(0)+1      1                    -1−1=-2
2     1      -1.3(2)+1      -1.6                 1−(-1.6)=2.6
4     -4     -1.3(4)+1      -4.2                 -4−(-4.2)=0.2
6     -12    -1.3(6)+1      -6.8                 -12−(-6.8)=-5.2
8     -7     -1.3(8)+1      -9.4                 -7−(-9.4)=2.4
Now we can create a scatter plot using the given x-values and our residuals. Remember, if the model is a good fit for the data, the residuals will be evenly distributed above and below the x-axis. Also, there will be no apparent patterns. This line of fit models the data well. The residuals are evenly distributed above and below the x-axis.
Exercises 8 Let's begin by making a table of the residual values.
x     y      y=-0.5x−2      y-Value from model   Residual
4     -1     -0.5(4)−2      -4                   -1−(-4)=3
6     -3     -0.5(6)−2      -5                   -3−(-5)=2
8     -6     -0.5(8)−2      -6                   -6−(-6)=0
10    -8     -0.5(10)−2     -7                   -8−(-7)=-1
12    -10    -0.5(12)−2     -8                   -10−(-8)=-2
14    -10    -0.5(14)−2     -9                   -10−(-9)=-1
16    -10    -0.5(16)−2     -10                  -10−(-10)=0
18    -9     -0.5(18)−2     -11                  -9−(-11)=2
20    -9     -0.5(20)−2     -12                  -9−(-12)=3
Now we can create a scatter plot using the given x-values and our residuals. Remember, if the model is a good fit for the data, the residuals will be evenly distributed above and below the x-axis. Also, there will be no apparent patterns. This line of fit does not model the data well. The residuals are not randomly scattered about the x-axis: they are positive at both ends of the data and negative in the middle. The residual points form a U-shaped pattern, which suggests the data are not linear.
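The U-shaped pattern is easy to see if we list only the signs of the residuals in order of increasing x. The sketch below uses the data and model from the exercise. Writing the signs out as a string is our own illustrative device, not a textbook method.

```python
# Data points from the exercise and the model y = -0.5x - 2.
xs = [4, 6, 8, 10, 12, 14, 16, 18, 20]
ys = [-1, -3, -6, -8, -10, -10, -10, -9, -9]

residuals = [y - (-0.5 * x - 2) for x, y in zip(xs, ys)]

# One character per residual: "+" above the x-axis, "-" below, "0" on it.
signs = "".join("+" if r > 0 else "-" if r < 0 else "0" for r in residuals)
print(residuals)
print(signs)
```

A good fit would show plus and minus signs mixed with no clear order. Here the positive residuals cluster at both ends and the negative ones in the middle.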
Exercises 9 Let's begin by making a table of the residual values. Note that, in our table, y represents the growth in inches of an elk's antlers in week x.
x    y      y=-0.7x+6.8      y-Value from model   Residual
1    6.0    -0.7(1)+6.8      6.1                  6.0−6.1=-0.1
2    5.5    -0.7(2)+6.8      5.4                  5.5−5.4=0.1
3    4.7    -0.7(3)+6.8      4.7                  4.7−4.7=0
4    3.9    -0.7(4)+6.8      4.0                  3.9−4.0=-0.1
5    3.3    -0.7(5)+6.8      3.3                  3.3−3.3=0
Now we can create a scatter plot using the given x-values and our residuals. Remember, if the model is a good fit for the data, the residuals will be evenly distributed above and below the x-axis. Also, there will be no apparent patterns. This line of fit models the data well. The residuals are evenly distributed above and below the x-axis.
Exercises 10 Let's begin by making a table of the residual values. Note that, in our table, y represents the approximate number (in thousands) of movie tickets sold in month x.
x    y     y=1.3x+27      y-Value from model   Residual
1    27    1.3(1)+27      28.3                 27−28.3=-1.3
2    28    1.3(2)+27      29.6                 28−29.6=-1.6
3    36    1.3(3)+27      30.9                 36−30.9=5.1
4    28    1.3(4)+27      32.2                 28−32.2=-4.2
5    32    1.3(5)+27      33.5                 32−33.5=-1.5
6    35    1.3(6)+27      34.8                 35−34.8=0.2
Now we can create a scatter plot using the given x-values and our residuals. Remember, if the model is a good fit for the data, the residuals will be evenly distributed above and below the x-axis. Also, there will be no apparent patterns. As we can see, the points are not evenly dispersed about the horizontal axis. Therefore, the line of fit does not model the data well.
Exercises 11 Let's begin by entering the data into our calculator and using the linear regression analysis tools. We can round the values of a and b and substitute them into the equation y=ax+b. This gives us the equation for the line of best fit: y=2.1x−8. We can see how the line fits the data by plotting the data points and graphing the line on the same coordinate plane. The calculator output gives us the value of the correlation coefficient, r: r=0.9803≈0.980. This tells us that the correlation is both positive and very strong. We can tell that it is strong because it is extremely close to 1, which would be a perfect correlation in which the line goes through all of the points.
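For readers curious about what the calculator's linear regression tool does, here is a minimal least-squares sketch. The function name `linear_regression` and the sample data are assumptions made for illustration. The exercise's own data set is not reproduced here, so the numbers below are not the exercise's numbers.

```python
def linear_regression(xs, ys):
    """Return (a, b) for the least-squares line y = ax + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx            # slope
    b = mean_y - a * mean_x  # y-intercept
    return a, b

# Made-up data that lie exactly on the line y = 2x + 1.
a, b = linear_regression([0, 1, 2, 3], [1, 3, 5, 7])
print(a, b)
```

Because the made-up points lie exactly on a line, the function recovers that line's slope and y-intercept. With real data the values of a and b usually need rounding, as in the solution above.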
Exercises 12 Let's begin by entering the data into our calculator and using the linear regression analysis tools. We can round the values of a and b and substitute them into the equation y=ax+b. This gives us the equation for the line of best fit: y=-1.3x+8. We can see how the line fits the data by plotting the data points and graphing the line on the same coordinate plane. The calculator output gives us the value of the correlation coefficient, r: r=-0.8858≈-0.886. This tells us that the correlation is both negative and strong. We can tell that it is strong because it is close to -1, which would be a perfect correlation in which the line goes through all of the points.
Exercises 13 Let's begin by entering the data into our calculator and using the linear regression analysis tools. We can round the values of a and b and substitute them into the equation y=ax+b. This gives us the equation for the line of best fit: y=1.4x+16. We can see how the line fits the data by plotting the data points and graphing the line on the same coordinate plane. The calculator output gives us the value of the correlation coefficient, r: r=0.9986≈0.999. This tells us that the correlation is both positive and very strong. We can tell that it is strong because it is extremely close to 1, which would be a perfect correlation in which the line goes through all of the points.
Exercises 14 Let's begin by entering the data into our calculator and using the linear regression analysis tools. We can round the value of b and substitute it along with a into the equation y=ax+b. This gives us the equation for the line of best fit: y=-x+11. We can see how the line fits the data by plotting the data points and graphing the line on the same coordinate plane. The calculator output gives us the value of the correlation coefficient, r: r=-0.4435≈-0.444. This tells us that the correlation is both negative and moderate. We can tell that it is moderate because it is around -0.5, which is halfway between a perfect negative correlation and no correlation.
Exercises 15 The written equation has interchanged the values of a and b. According to the display, we have a=-4.47 and b=23.16. The coefficient of x should thus be -4.47 and the constant should be 23.16. Therefore, our equation should be y=-4.47x+23.16.
Exercises 16 When looking at linear regression output on a calculator, we can learn about the correlation and goodness of our line of fit by interpreting the correlation coefficient. Be sure that you are looking at the value of r, not the value of r². In this case, we have r=-.9994724136. When |r| is close to 1, there is a strong correlation, and when r is close to 0, there is a weak correlation. This r value indicates a very strong correlation. A positive value for r indicates a positive correlation and a negative value indicates a negative correlation. In this case, we have a negative correlation. Therefore, this data has a strong negative correlation, not a strong positive correlation.
Exercises 17
Exercises 18
Exercises 19
Exercises 20
Exercises 21 When you use your phone a lot, the battery dies faster than if you leave it untouched for that same amount of time. Therefore, if you are talking on the phone, the battery will lose more of its charge the longer you spend talking. A line of fit that matches this situation would be likely to resemble the graph below. As you talk, the battery life is being drained. This is a negative correlation and a causal relationship: the phone usage is causing the battery life to decrease. Notice that the graph is restricted to the first quadrant. You cannot talk for a negative number of minutes and you cannot continue talking after the phone battery has been fully drained.
Exercises 22 Does the height of a toddler correlate to the size of their vocabulary? More often than not, taller toddlers will have a larger vocabulary. The line of fit would probably look something like the graph below, showing a positive correlation. Now the question is: Does a change in height cause a change in vocabulary size? The answer is: Definitely not! When looking at a situation like this, we must keep all possible outside factors in mind. Height depends mainly on genetics and health, while vocabulary size depends on many things, including how often a child is read to by a parent or guardian. Correlation doesn't imply causation! The most plausible explanation here is that taller kids are more often older, and humans learn more words as they age.
Exercises 23 Buying a hat does not make your head bigger or smaller, and having a bigger or smaller head does not make you buy more hats. This means that a correlation between the number of hats you own and the size of your head is very unlikely.
Exercises 24 Since, on average, heavier dogs are bigger, they have longer tails as well. However, a dog gaining extra weight won't make its tail grow longer. This means that there is a positive correlation between the weight of a dog and the length of its tail, but there is no causal relationship.
Exercises 25 Examples of data with a strong correlation but without a causal relationship are often discussed in the statistics world because: Correlation doesn't imply causation! For example, did you know that ice cream sales can predict murder rates? It's true. The more ice cream that is sold in an area, the higher the rate of serious crimes. There is a strong correlation between the two variables. But does one cause the other? Does eating ice cream make people want to commit crimes? We thought ice cream makes people happy! The truth is that there are underlying factors that come into play. More ice cream is purchased in larger cities because there are more people. Larger cities have high crime rates because there are more people. More ice cream is purchased in summer because it is hot outside. There is a higher crime rate in summer because it is easier to get away after you've committed the crime.
Exercises 26 We can look at each scatter plot and determine its matching correlation coefficient by noting a few key features. Is it a strong or weak correlation? Is it a positive or negative correlation? Let's look at the four given graphs.
Graph   Strong or weak?   Positive or negative?
a       Strong            Positive
b       Strong            Negative
c       No correlation    No correlation
d       Weak              Positive
Now, let's look at the information we can gather from the correlation coefficients.
Coefficient    Strong or weak?   Positive or negative?
A, r=0.02      No correlation    No correlation
B, r=0.98      Strong            Positive
C, r=-0.97     Strong            Negative
D, r=0.69      Weak              Positive
Now that we have noted the key features from each piece of information, we can match the graphs to the correlation coefficients. We have that: a→B, b→C, c→A, d→D.
Exercises 27
Exercises 28 In order to see how the new point would affect the correlation, let's plot the original data and line of best fit. Then we can add the new data point and see how it relates to the others. This point is relatively far away from the other data points as well as the line of best fit. Including this point would definitely weaken the correlation.
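The effect of a far-away point can also be checked numerically by computing r with and without it. The sketch below uses made-up data, not the data from the exercise, and a hand-written Pearson formula under the hypothetical name `pearson_r`.

```python
import math

def pearson_r(points):
    """Pearson correlation coefficient for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    return sxy / math.sqrt(sxx * syy)

# Made-up data on a perfect line, then the same data plus one distant point.
original = [(1, 2), (2, 4), (3, 6), (4, 8), (5, 10)]
with_outlier = original + [(6, 0)]

r_before = pearson_r(original)
r_after = pearson_r(with_outlier)
print(r_before, r_after)
```

In this made-up case the single distant point pulls r much closer to 0, which is the same weakening described in the solution.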
Exercises 29
Exercises 30 The scatter plot below depicts the attendance numbers at two separate towns' local beaches for one week. Both beaches had relatively low attendance Monday through Friday, most likely because those are working days. On Saturday, the weather was beautiful and both beaches were very busy. On Sunday, the weather was great but only Beach 1 was open.
Correlation? The correlation is positive and, other than the Sunday point (19,0), very strong. A plausible correlation coefficient lies in the range 0.7<r<0.9. It might be smart to remove the outlier when calculating a model for this relationship.
Causal? Whether or not this is a causal relationship depends on quite a few factors, the main ones being:
How close are the towns? If they are very close, residents of one town may travel to the other town's beach occasionally. If they are far apart, this may not be possible.
Are the beaches private for citizens of that particular town? If yes, the other town's residents would not be allowed to go to that beach.
More than likely, the biggest factors in the attendance numbers for both beaches would be weather and day of the week. The attendance of Beach 1 does not directly influence the attendance of Beach 2; outside factors do. Therefore, it is not a causal relationship.
Exercises 31
Exercises 32 The graph of a linear function can be portrayed as a single, straight line in a coordinate plane. To begin determining if the data given in the table represents a linear function, let's first plot the data as (x,y) coordinate pairs. If the function is linear, connecting these points will form a straight line. Otherwise, we will have shown that the function is nonlinear. Now that we have connected all of our points, we can see that they do not all lie on the same line in the coordinate plane. Therefore, the function is nonlinear.
Exercises 33 The graph of a linear function can be portrayed as a single, straight line in a coordinate plane. To begin determining if the data given in the table represents a linear function, let's first plot the data as (x,y) coordinate pairs. If the function is linear, connecting these points will form a straight line. Otherwise, we will have shown that the function is nonlinear. Since we can use a straight edge to connect all of the given points, they lie on the same line in the coordinate plane. Therefore, the function is linear.
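The straight-edge test can also be done with arithmetic: the points lie on one line exactly when every point has the same slope relative to the first two. Here is a minimal sketch, assuming at least two points whose first two x-values differ. The function name `is_linear` and both sample tables are made up, since the exercise tables are not reproduced here.

```python
def is_linear(points):
    """Return True if all (x, y) points lie on the line through the first two."""
    (x0, y0), (x1, y1) = points[0], points[1]
    # Compare slopes by cross-multiplying, which avoids division.
    return all(
        (y - y0) * (x1 - x0) == (y1 - y0) * (x - x0)
        for x, y in points[2:]
    )

# Made-up tables: the first follows y = 2x + 1, the second follows y = x^2.
print(is_linear([(0, 1), (1, 3), (2, 5), (3, 7)]))
print(is_linear([(0, 0), (1, 1), (2, 4)]))
```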