Scatter Plots, Bivariate Data, and Trend Lines: Understanding Associations

Shoe Size	Height (in.)
8	63
7	59
6	60
5	55
7	58
8	60
6	56
7	64
5	54

Similarities	Differences
Both describe the relationship between two random variables. Both use scatter plots for analyzing relationships between variables.	Correlation detects linear relationships; association detects both linear and non-linear relationships. Correlation quantifies a relationship with a number between -1 and 1; association does not quantify.
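
As a rough illustration of how correlation puts a number between -1 and 1 on a linear relationship, the sketch below computes the Pearson correlation coefficient for the shoe size and height table at the top of the lesson. The function name `pearson_r` and the variable names are illustrative assumptions, not part of the lesson.

```python
import math

# Shoe size and height (in.) pairs from the table at the top of the lesson.
shoe_size = [8, 7, 6, 5, 7, 8, 6, 7, 5]
height = [63, 59, 60, 55, 58, 60, 56, 64, 54]


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(shoe_size, height)
print(round(r, 2))  # a positive value: taller people tend to have larger shoes
```

A value close to 1 suggests a strong positive linear relationship, a value close to -1 a strong negative one, and a value near 0 little linear relationship.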

Temperature	Number of Ice Cream Sales
15°C	20
18°C	18
21°C	42
24°C	55
27°C	70
30°C	65
33°C	120
36°C	116
39°C	130
42°C	140

Temperature	Number of Ice Cream Sales	Ordered Pairs
15°C	20	(15,20)
18°C	18	(18,18)
21°C	42	(21,42)
24°C	55	(24,55)
27°C	70	(27,70)
30°C	65	(30,65)
33°C	120	(33,120)
36°C	116	(36,116)
39°C	130	(39,130)
42°C	140	(42,140)
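
The ordered pairs in the last column come from pairing each temperature with its sales figure. A minimal sketch of that pairing, with variable names chosen here only for illustration:

```python
# Temperatures (in degrees Celsius) and ice cream sales from the table.
temperature_c = [15, 18, 21, 24, 27, 30, 33, 36, 39, 42]
ice_cream_sales = [20, 18, 42, 55, 70, 65, 120, 116, 130, 140]

# Each ordered pair is (x, y) = (temperature, sales).
ordered_pairs = list(zip(temperature_c, ice_cream_sales))

for pair in ordered_pairs:
    print(pair)
```

Each printed pair is one point of the scatter plot.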

Cluster	A group of data points that are close together on a graph.
Gap	An empty space or interval between groups of data points on a graph.
Outlier	A data point that lies noticeably apart from the other data points on a graph.

Time	Number of Ice Cream Sold
8	4
10	7
12	13
14	15
16	18
18	22
20	20
22	23

Ages of Buyers	Number of Ice Cream Sales
5	10
10	6
15	16
20	18
25	10
30	20
35	25
40	6
45	9
50	3
55	13
60	8

Ages of Buyers	Number of Ice Cream Sales	Ordered Pair
5	10	(5,10)
10	6	(10,6)
15	16	(15,16)
20	18	(20,18)
25	10	(25,10)
30	20	(30,20)
35	25	(35,25)
40	6	(40,6)
45	9	(45,9)
50	3	(50,3)
55	13	(55,13)
60	8	(60,8)

The table displays the number of birds that visit bird feeding houses during different times of the day.

Time of the Day	Number of Birds Observed
3:00AM	15
6:00AM	25
9:00AM	13
12:00PM	30
3:00PM	47
6:00PM	35
9:00PM	44

Select the scatter plot where the x-axis represents time and the y-axis represents the number of birds that visit the bird feeding house.

Choose the correct option that interprets the scatter plot of the data.

Make a conjecture about the number of birds that visit the bird feeding houses at midnight.

A scatter plot is a graph that shows the associations between the variables of a data set. The table shows the number of birds that visit a bird feeding house over time.

Time of the Day	Number of Birds Observed
3:00AM	15
6:00AM	25
9:00AM	13
12:00PM	30
3:00PM	47
6:00PM	35
9:00PM	44

We want to create a scatter plot. First, we need to represent the data as ordered pairs (x,y), where x is the time and y is the number of birds that visit the bird feeding house. For simplicity, we will write the time as hours after midnight.

Time of the Day	Number of Birds Observed	Ordered Pair
3:00AM	15	(3,15)
6:00AM	25	(6,25)
9:00AM	13	(9,13)
12:00PM	30	(12,30)
3:00PM	47	(15,47)
6:00PM	35	(18,35)
9:00PM	44	(21,44)
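
The conversion to hours after midnight can be sketched in code. The helper `hours_after_midnight` is a hypothetical name introduced for this illustration, and it assumes every label is on the hour, as in the table.

```python
def hours_after_midnight(label):
    """Convert a label such as '3:00PM' to whole hours after midnight.

    Assumes the label ends in AM or PM and is on the hour; minutes are ignored.
    """
    clock, period = label[:-2], label[-2:].upper()
    hour = int(clock.split(":")[0]) % 12  # 12 o'clock becomes 0 before the PM shift
    if period == "PM":
        hour += 12
    return hour


# (time of day, number of birds observed) from the table.
observations = [
    ("3:00AM", 15),
    ("6:00AM", 25),
    ("9:00AM", 13),
    ("12:00PM", 30),
    ("3:00PM", 47),
    ("6:00PM", 35),
    ("9:00PM", 44),
]

ordered_pairs = [(hours_after_midnight(t), birds) for t, birds in observations]
print(ordered_pairs)
```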

Now, we can plot the ordered pairs on a coordinate plane to graph the scatter plot.

The correct option is B.

We can analyze the type of association between the variables in the data set and identify outliers or clusters to interpret the scatter plot. Remember, scatter plots can illustrate various patterns of association between two variables.

We can use the shape of the distribution of a scatter plot to determine the type of association. Let's look at the scatter plot from Part A!

We can see that the points lie close to a line. As the time increases, the number of birds that visit the bird feeding house also increases. This means that the slope of the line is positive. Therefore, the scatter plot shows a positive linear association. Next, remember some important definitions.

Outlier	An outlier is a data point that is set off from the other data points.
Cluster	A cluster is a group of points that lie close together.

With these definitions in mind, we can draw the following conclusions from the scatter plot.

There are no outliers.
There are no clusters.

Note that clusters and outliers are found by observing a graph. This means that they are subjective and a different observer can interpret the graph differently. Our answer is just an example answer.

The scatter plot shows a positive linear association. There are no clusters or outliers.

To make a conjecture about the number of birds that visit the bird feeding houses at midnight, we can follow the pattern until the x-value on the scatter plot from Part A reaches 24. Let's do it!

We observe on the diagram that at midnight, we can expect to see around 60 birds. However, it is essential to remember that this is an estimate based on the given data, and the actual value may vary.
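
One way to follow the pattern numerically is to fit a line to the ordered pairs and evaluate it at x = 24. The sketch below uses a least-squares line, which is an assumption made here for illustration; the solution above reads its estimate from a line drawn by eye, so the two values need not agree. The function name `least_squares_line` is likewise illustrative.

```python
# Ordered pairs (hours after midnight, birds observed) from the table.
hours = [3, 6, 9, 12, 15, 18, 21]
birds = [15, 25, 13, 30, 47, 35, 44]


def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = least_squares_line(hours, birds)

# Midnight at the end of the day corresponds to x = 24.
estimate_at_midnight = slope * 24 + intercept
print(round(estimate_at_midnight))
```

For this data the least-squares estimate comes out near 50, somewhat lower than the value of about 60 read from the diagram. Different reasonable lines give different estimates, which is why the result is only a conjecture.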

Ten students took two tests covering the same content. The scatter plot displays their scores.

Select all statements that are true.

We are given a scatter plot that shows the scores of two tests of the same content for 10 students.

Here, we will select all statements that are true. The coordinates of the points in the scatter plot will help us interpret the statements, so we will make a table that shows the x- and y-values of the points.

	x-coordinate (Score for First Test)	y-coordinate (Score for Second Test)
Student 1	45	60
Student 2	45	70
Student 3	55	65
Student 4	60	65
Student 5	65	75
Student 6	70	70
Student 7	80	75
Student 8	85	90
Student 9	90	95
Student 10	95	30

After considering the statements separately we will select the correct ones. Let's begin with the first two statements.

First Two Statements

We will consider the first two statements.

I. Five of the scores for the first test were at least 70.
II. Five of the scores for the second test were less than 70.

We can highlight the first-test scores of at least 70 by bolding them. Additionally, we will emphasize the second-test scores that are less than 70 by coloring them red.

	x-coordinate (Score for First Test)	y-coordinate (Score for Second Test)
Student 1	45	60
Student 2	45	70
Student 3	55	65
Student 4	60	65
Student 5	65	75
Student 6	70	70
Student 7	80	75
Student 8	85	90
Student 9	90	95
Student 10	95	30

We can observe from the table that 5 of the scores for the first test were at least 70, while 4 of the scores for the second test were less than 70. Based on this information, we can conclude that the first statement is correct, whereas the second one is false.

Third Statement

To determine the number of students who scored higher on the second test than on the first one, we can refer to the table we created.

	x-coordinate (Score for First Test)	y-coordinate (Score for Second Test)
Student 1	45	60
Student 2	45	70
Student 3	55	65
Student 4	60	65
Student 5	65	75
Student 6	70	70
Student 7	80	75
Student 8	85	90
Student 9	90	95
Student 10	95	30

There are 7 students who scored higher on the second test, so the statement is false.
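
The counts used for the first three statements can be double-checked with a short script. This is only a sketch; the variable names are illustrative.

```python
# Scores read from the scatter plot, in student order 1 through 10.
first_test = [45, 45, 55, 60, 65, 70, 80, 85, 90, 95]
second_test = [60, 70, 65, 65, 75, 70, 75, 90, 95, 30]

# Statement I: first-test scores that are at least 70.
first_at_least_70 = sum(1 for score in first_test if score >= 70)

# Statement II: second-test scores that are less than 70.
second_below_70 = sum(1 for score in second_test if score < 70)

# Third statement: students who scored higher on the second test.
improved = sum(1 for a, b in zip(first_test, second_test) if b > a)

print(first_at_least_70, second_below_70, improved)  # 5 4 7
```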

Fourth Statement

Let's identify the outlier in the data set. Remember, an outlier is a data point that is significantly different from the other values in the data set.

From the scatter plot, we can observe that the point (95, 30) is an outlier, so this statement is true.

Last Two Statements

Lastly, we will analyze the last two statements.

V. There is a negative association between the scores of the first and second tests.
VI. There is a positive association between the scores of the first and second tests.

Notice from the scatter plot that as the first test scores of the students increase, their second test scores also increase, indicating a positive association.

This implies that the fifth statement is false, and the sixth statement is true.

The table displays the number of cars sold at a dealership over an eight-year period.

Year, x	Cars Sold, y
1	300
2	500
3	600
4	900
5	1400
6	1500
7	1750
8	2400

Choose the scatter plot and the line of fit where the x-axis represents the year and the y-axis represents the number of cars sold.

Which equation corresponds to the line of fit in the correct option from part A?

Choose the correct options that interpret the slope and the y-intercept of the line of fit.

We have a table that displays the number of cars sold at a dealership over an eight-year period. With this data, we want to create a scatter plot and establish a line of fit. Let's begin by organizing the given data into ordered pairs.

Year, x	Cars Sold, y	Ordered Pairs (x,y)
1	300	(1,300)
2	500	(2,500)
3	600	(3,600)
4	900	(4,900)
5	1400	(5,1400)
6	1500	(6,1500)
7	1750	(7,1750)
8	2400	(8,2400)

Plot these ordered pairs on a coordinate plane.

A line of fit is a line drawn on a scatter plot that closely aligns with most of the data points. It serves as a tool for estimating data on a graph. It is important to note that this line does not necessarily need to intersect any of the data points.

Now, we will determine the equation of our line of fit using two points that lie on the line. It is preferable to select points from the provided data set for ease of calculation.

However, in this case, it seems that none of these points lie on the line. This means that we will choose two points on the line that do not belong to the given data set. We can use the points (2,400) and (6,1600).

First, find the slope between these two points.

The slope of the line is 300. We can substitute this value into the equation of a line in slope-intercept form.

y = 300x + b

Now, let's determine the y-intercept b. We can substitute the coordinates of the point (2, 400) into the equation and solve for b, as this point lies on the line.

The y-intercept of the line is -200. We can also substitute this value into the equation to complete it.

y = 300x + (-200) ⇔ y = 300x - 200
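
The slope and y-intercept calculations above can be reproduced with a short sketch. The helper `line_through` is a hypothetical name used only for this illustration; it applies the slope formula m = (y2 - y1) / (x2 - x1) and then solves y1 = m * x1 + b for b.

```python
def line_through(p, q):
    """Return (slope, intercept) of the line through two points p and q."""
    (x1, y1), (x2, y2) = p, q
    slope = (y2 - y1) / (x2 - x1)  # rise over run
    intercept = y1 - slope * x1    # solve y1 = slope * x1 + b for b
    return slope, intercept


# The two points chosen on the line of fit.
slope, intercept = line_through((2, 400), (6, 1600))
print(slope, intercept)  # 300.0 -200.0
```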

Let's consider the equation we obtained in Part B. Recall that in our case, x represents the year and y represents the number of cars sold.

y = 300x - 200

We can interpret the slope and y-intercept of the equation in the given context as follows.

A slope of 300 means that the number of cars sold increases by about 300 cars per year.
A y-intercept of -200 has no meaningful interpretation in this context, because we cannot have a negative number of cars sold.

In conclusion, options I and III are the correct ones among the given options.
