
Bivariate Quantitative Data

Bivariate quantitative data involves two sets of numerical values and their relationship. Visual representations, such as scatter plots, help in spotting patterns or trends within these data sets. A line of fit, often drawn on scatter plots, provides a generalized trajectory, suggesting how one set of data might predict or influence the other. For instance, in real-world applications, bivariate data can be used to compare the ages and incomes of individuals, temperatures and ice cream sales, or even the years of experience and job performance ratings. Through these methods, analysts, researchers, and students can draw meaningful conclusions from complex sets of data.
Each lesson is meant to take 1-2 classroom sessions
This lesson will explore how two quantities are related and how to make predictions by finding a line of fit for scatter plots.

Catch-Up and Review

The recommended reading is information that is helpful or necessary to understand before beginning the lesson.

Explore

A Set of Points on a Coordinate Plane

The following applet shows three graphs — each with a set of points distributed on a coordinate plane.
Three scatter plots. The first with a positive correlation, the second with a negative correlation, and the third with no correlation
Pay close attention to the behavior of the dependent variable as the independent variable increases. Does y increase, decrease, or can it not be distinguished?
Challenge

Attendance at an Aquatic Park

Magdalena is fascinated by her local aquatic park and is eager to analyze how temperatures influence attendance. The following graph represents the data she collected — the average number of people that attend the park at specific temperatures.

Scatter Plot Showing the Number of Attendants at an Aquatic Park on Specific Temperatures
Next week's forecast calls for extreme heat, with temperatures around 110 degrees Fahrenheit. Using the forecast and her results, Magdalena wonders if she can predict the number of people who will visit the park. Can the park expect more than 10 000 visitors?
Discussion

Analyzing Scatter Plots and Identifying Relationships Between Data Sets

Valuable conclusions and predictions are made about a situation based on collected data. Before such statements can be made, the data is analyzed by using tools such as graphs. A scatter plot, for example, is used to identify the correlation between a pair of data sets.

Concept

Scatter Plot

A scatter plot is a graph that shows each observation of a bivariate data set as an ordered pair in a coordinate plane. Consider the following example, where a scatter plot illustrates the results gathered at a local ice cream parlor. This study records the number of ice creams sold and the corresponding air temperature.

Scatter plot of the number of ice creams sold based on temperatures with a positive correlation.
Among other insights, the graph shows that when the temperature is about 100°F, approximately 4000 ice creams are sold. Additionally, as the temperature increased, the number of sales also increased. In this case, it can be said that there is a positive correlation between the variables of the data set: the number of ice creams sold and the air temperature.
Concept

Correlation

A correlation is a relation between two data sets. For example, consider two data sets, one consisting of temperatures and the other consisting of the number of coats sold. A decrease in the temperature may imply an increase in the number of coats sold. Based on the trend of the bivariate data, three types of correlation are possible, each of which can be described using a scatter plot.

  • Positive Correlation: As x increases, y also increases. (Scatter plot with points near a non-visible line with positive slope.)
  • Negative Correlation: As x increases, y decreases. (Scatter plot with points near a non-visible line with negative slope.)
  • No Correlation: There is no relationship between the data sets, resulting in a random pattern in the scatter plot. (Scatter plot with points at random positions.)

Knowing the type of correlation helps analyze trends and make predictions based on data. Furthermore, the shape of the patterns formed by positive and negative correlations can be thought to have a positive and negative slope, respectively. The applet below shows how a data set transforms from a random pattern to a positive or a negative correlation.
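As a numerical complement to reading the trend off a scatter plot, the three cases can be sketched in code. The sketch below uses the Pearson correlation coefficient, which is one common measure and is not introduced in this lesson. The cutoff of 0.5, the function names, and the sample values are all assumptions made for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def classify_correlation(xs, ys, cutoff=0.5):
    """Label the trend as 'positive', 'negative', or 'none'.

    The cutoff of 0.5 is an arbitrary illustrative choice, not a standard.
    """
    r = pearson_r(xs, ys)
    if r >= cutoff:
        return "positive"
    if r <= -cutoff:
        return "negative"
    return "none"


# Made-up illustrative values (not taken from the lesson).
xs = [1, 2, 3, 4, 5]
rising = [2, 4, 5, 4, 6]
falling = [6, 4, 5, 4, 2]
scattered = [3, 5, 1, 5, 3]

print(classify_correlation(xs, rising))     # positive
print(classify_correlation(xs, falling))    # negative
print(classify_correlation(xs, scattered))  # none
```

A coefficient near 1 or -1 corresponds to points lying close to a line with positive or negative slope, while a coefficient near 0 corresponds to the random pattern.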

Pop Quiz

Identifying Correlations from Scatter Plots

The following applet shows different scatter plots. Select the type of correlation that matches the scatter plot shown.

An applet that asks to identify the type of correlation shown
Discussion

Lines of Fit for Scatter Plots

Once the scatter plot of a data set is drawn and the type of correlation is identified, predictions can be made about the trend of the data by using lines of fit.

Concept

Line of Fit

When data sets have a positive or negative correlation, the trend of the data can be modeled using a line of fit, also called a trend line. This line is drawn on a scatter plot near most of the data points, which appear evenly distributed above and below the line.

Line of fit on the scatter plot showing the kittens' mean weight against their age

The scatter plot above shows the mean weights of kittens from the same litter in relation to their age. In this case, a line of fit could be drawn quite seamlessly. When drawing a line of fit, the following characteristics should be considered.

  • The data needs to have either a positive or negative correlation.
  • A line of fit is not unique and need not split the points exactly. Ideally, though, about half of the points should be above the line and about half below it.
  • An equation of the line can be found using two of its points. These points do not necessarily belong to the bivariate data set.
Ultimately, a line of fit can be used to make predictions and generalize the trends of data sets. Additionally, when a line of fit is determined using strict mathematical methods, it is commonly referred to as a line of best fit.
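The lesson does not name a specific method for the line of best fit. One widely used choice is the least-squares line, and the sketch below assumes that method. The function name and the sample values are hypothetical and serve only to illustrate the calculation.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The least-squares line always passes through the point of means.
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up illustrative values (not taken from the lesson).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 6]

slope, intercept = least_squares_line(xs, ys)
print(f"y = {slope:.1f}x + {intercept:.1f}")  # y = 0.8x + 1.8
```

Unlike a line of fit drawn by eye, this calculation gives the same line every time for the same data.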
Example

Analyzing Daily Situations Using Scatter Plots and Lines of Fit

At an aquatic park, a student-volunteer named Tadeo noticed a dedicated person who swims long distances in the lazy lagoon every Saturday morning.

A swimmer going for it in the Lazy Lagoon

Tadeo is amazed and wants to analyze how many calories the swimmer burns compared to the distance swum. He observes and records the swimmer diligently.

Distance (km)    Calories Burned
16               980
15               880
14               860
13               740
12               720
11               680
10               595
9                560
8                490
7                400
6                380
a Make a scatter plot of the data.
b What type of correlation does the data have? Justify the answer.
c Draw a line of fit for the scatter plot.
d Find an equation for the line of fit.

Answer

a Example Answer:
Scatter plot of calories burned against distance swum in kilometers
b Positive, see solution.
c Example Answer:
line of fit of the scatter plot
d Example Equation: y=50x+110

Hint

a Let the x-variable represent the distance and the y-variable the calories burned.
b How does y change as x increases?
c Points should be evenly distributed above and below the line of fit.
d Use two points on the line.

Solution

a To draw the scatter plot, let x be the distance in kilometers and y the calories burned. With this in mind, the information from the table can be shown on a scatter plot.
Scatter plot of the number of calories burned against the number of kilometers swum
b From the scatter plot previously drawn, it can be seen that as the distance increases, the number of calories burned also increases. Therefore, the bivariate data has a positive correlation.
c Since the data has a positive correlation, it can be modeled with a line of fit. The line of fit is not unique. However, ideally, the number of points below and above the line is expected to be similar.
line of fit of the scatter plot
d Because the equation of a line can be found using any two points on the line, two points whose coordinates can be easily identified will be marked on the graph of the line of fit.
line of fit of the scatter plot
For this case, the points (4, 310) and (16, 910) will be used. Substituting these points into the Slope Formula will give the slope of the line. Note that the points on the line do not necessarily match the points in the data set.
m = (y_2 - y_1)/(x_2 - x_1)
m = (910 - 310)/(16 - 4)
Evaluate right-hand side
m = 600/12
m = 50
Now that the slope is known, the point-slope form of a line, y - y_1 = m(x - x_1), can be used to write a partial equation of the line of fit: y - y_1 = 50(x - x_1). To complete the equation, either of the two points can be substituted. For simplicity, (4, 310) will be used.
y - y_1 = 50(x - x_1)
y - 310 = 50(x - 4)
Solve for y
y - 310 = 50x - 200
y = 50x + 110
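The calculation in Part D can be checked with a short sketch. It rebuilds the line from the two chosen points and then counts how many of Tadeo's data points fall above, below, and on the line. The helper name `line_through` is hypothetical. The data values come from the table in this example.

```python
def line_through(p1, p2):
    """Return (slope, intercept) of the line through two points."""
    (x1, y1), (x2, y2) = p1, p2
    slope = (y2 - y1) / (x2 - x1)
    intercept = y1 - slope * x1
    return slope, intercept


# Data from Tadeo's table: (distance in km, calories burned).
data = [(16, 980), (15, 880), (14, 860), (13, 740), (12, 720), (11, 680),
        (10, 595), (9, 560), (8, 490), (7, 400), (6, 380)]

# Two points read from the line of fit (not necessarily data points).
slope, intercept = line_through((4, 310), (16, 910))

above = sum(1 for x, y in data if y > slope * x + intercept)
below = sum(1 for x, y in data if y < slope * x + intercept)
on_line = len(data) - above - below

print(slope, intercept)       # 50.0 110.0
print(above, below, on_line)  # 5 5 1
```

Five points lie above the line, five lie below, and one lies on it, which matches the guideline that the points should be split roughly evenly.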
Example

Making Predictions Using Lines of Fit

Zosia and Vincenzo are poster designers at the aquatic park. Right now, they are promoting a 3D movie about the life of dolphins called Above and Below the Line.

Movie poster of dolphins made by the designers

They recorded the number of tickets sold each week with the purpose of using the data to determine whether they should continue to advertise the movie on a billboard. The scatter plot shows the collected data.

scatter plot of the number of tickets sold per week
a Draw a line of fit for the scatter plot.
b Find an equation for the line of fit in slope-intercept form.
c If the expected number of tickets sold in week 12 is more than 150 000, Vincenzo and Zosia will keep the movie on the billboard for at least two more weeks. Use the line of fit to predict whether the movie stays on the billboard.

Answer

a Example Line:
line of fit for the number of tickets sold per week
b Example Equation: y=-25x+525
c Example Answer: Because the expected number of tickets sold in week 12 is about 225 000, they may decide to keep the movie.

Hint

a What type of correlation does the scatter plot have?
b Use two points on the line of fit to find the slope and the y-intercept of the line.
c Evaluate the equation found in Part B for x=12.

Solution

a The scatter plot shows that the number of tickets sold decreases as time passes. Therefore, the data has a negative correlation and can be modeled by a line of fit. Recall that the line of fit is close to most of the data points, while the points are ideally half above and half below the line of fit.
line of fit for the number of tickets sold per week
b The equation of a line in slope-intercept form is as follows.
y = mx + b
In this equation, m is the slope and b is the y-intercept of the line. The slope of the line of fit can be found by using the Slope Formula. The points (1, 500) and (7, 350) are on the line of fit, so they can be substituted into the formula.
m = (y_2 - y_1)/(x_2 - x_1)
m = (350 - 500)/(7 - 1)
Evaluate right-hand side
m = -150/6
m = -25
The slope m = -25 can be substituted into y = mx + b to obtain a partial equation of the line of fit: y = -25x + b. Now, the y-intercept can be found by substituting either of the points into the partial equation. In this case, (1, 500) will be used.
y = -25x + b
500 = -25(1) + b
Solve for b
500 = -25 + b
525 = b
b = 525
Finally, the equation of the line of fit can be completed by substituting 525 for b in y = -25x + b.
y = -25x + 525
c The line of fit describes the trend of the data, so it can be used to predict how many tickets would be sold in the 12th week. Evaluating the equation of the line of fit for x=12 gives the expected number of tickets sold, in thousands.
y = -25x + 525
y = -25(12) + 525
Evaluate right-hand side
y = -300 + 525
y = 225
In week 12 the expected number of tickets sold is 225 000, which is greater than the minimum of 150 000 required by Vincenzo and Zosia. Therefore, they may decide to keep the movie on the billboard for at least two more weeks.
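The prediction step can also be sketched in code. The helper names below are hypothetical. The sketch assumes that y is measured in thousands of tickets, as the units of the worked answer imply. It uses the two points read from the line of fit, (1, 500) and (7, 350).

```python
def line_through(p1, p2):
    """Return (slope, intercept) of the line through two points."""
    (x1, y1), (x2, y2) = p1, p2
    slope = (y2 - y1) / (x2 - x1)
    intercept = y1 - slope * x1
    return slope, intercept


def predict(slope, intercept, x):
    """Evaluate the line y = slope * x + intercept at x."""
    return slope * x + intercept


slope, intercept = line_through((1, 500), (7, 350))

# Expected ticket sales for week 12, in thousands (assumed unit).
tickets_week_12 = predict(slope, intercept, 12)

# The designers' threshold is 150 000 tickets, i.e. 150 in thousands.
keep_on_billboard = tickets_week_12 > 150

print(slope, intercept)   # -25.0 525.0
print(tickets_week_12)    # 225.0
print(keep_on_billboard)  # True
```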
Closure

Applying Lines of Fit to Solve Real-Life Situations

This lesson showed how to analyze bivariate data using scatter plots and lines of fit. These concepts can now be used to solve the Challenge. Recall that Magdalena created a scatter plot showing the number of aquatic park visitors in relation to the temperature.

Scatter Plot Showing the Number of Attendants at an Aquatic Park on Specific Temperatures
Help Magdalena predict whether more than 10 000 people are expected to attend the park next week, given that the temperature will be around 110°F. Justify the prediction.

Answer

Yes, see solution.

Hint

Begin by drawing a line of fit for the scatter plot. Then, use two points on the line to find its equation. Finally, evaluate the equation for x=110.

Solution

The scatter plot shows that the number of attendants increases as the temperature increases, which means the data has a positive correlation. Therefore, it can be modeled with a line of fit.

Scatter Plot Showing the Number of Attendants at an Aquatic Park on Specific Temperatures
By finding the equation of the line of fit, it can be predicted how many people would attend the park if the temperature is about 110°F. To do so, the slope-intercept form of a line can be used.
y = mx + b
In this equation, m is the slope and b is the y-intercept of the line. The slope can be calculated by using the Slope Formula. Because the points (70, 7000) and (80, 8000) are on the line of fit, they will be used to find the slope.
m = (y_2 - y_1)/(x_2 - x_1)
m = (8000 - 7000)/(80 - 70)
Evaluate right-hand side
m = 1000/10
m = 100
Now, the slope m = 100 can be substituted into y = mx + b to obtain a partial equation of the line of fit: y = 100x + b. By substituting one of the points on the line of fit, the y-intercept can be found. In this case, (70, 7000) will be used.
y = 100x + b
7000 = 100(70) + b
Solve for b
7000 = 7000 + b
0 = b
b = 0
Therefore, the y-intercept b is 0. With this information, the equation of the line of fit can be written.
y = 100x + 0
y = 100x
Finally, using this equation, the number of people that would attend the park if the temperature is about 110°F can be predicted. To do so, the equation needs to be evaluated for x=110.
y = 100x
y = 100(110)
y = 11 000
It is expected that more than 10 000 people will attend the park next week. What a brilliant way to make a prediction. Magdalena has really helped the aquatic park prepare for the influx of visitors sure to come.
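As an additional check that is not part of the lesson's solution, the question can be turned around: at what temperature does the line of fit y = 100x reach 10 000 visitors? The helper name `x_for_target` is hypothetical. The sketch assumes the line of fit found in the solution.

```python
def x_for_target(slope, intercept, target):
    """Solve slope * x + intercept = target for x (slope must be nonzero)."""
    return (target - intercept) / slope


# Line of fit from the solution: y = 100x.
slope, intercept = 100, 0

# Temperature at which the line predicts exactly 10 000 visitors.
threshold_temp = x_for_target(slope, intercept, 10_000)

# Prediction for the forecast temperature of 110 degrees Fahrenheit.
forecast_temp = 110
predicted_visitors = slope * forecast_temp + intercept

print(threshold_temp)      # 100.0
print(predicted_visitors)  # 11000
```

Because the forecast of 110°F is above the 100°F threshold, the line of fit predicts more than 10 000 visitors, in agreement with the solution.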


