| 12 Theory slides |
| 9 Exercises - Grade E - A |
| Each lesson is meant to take 1-2 classroom sessions |
Here are a few recommended readings before getting started with this lesson.
Bivariate data is a set of data that has been collected on two variables. It is used to study the relationship between those variables. Each data value in one variable corresponds to a data value in the other variable. A data set with the shoe sizes and heights of people is an example of bivariate data.
Shoe Size | Height (in.) |
---|---|
8 | 63 |
7 | 59 |
6 | 60 |
5 | 55 |
7 | 58 |
8 | 60 |
6 | 56 |
7 | 64 |
5 | 54 |
A scatter plot is used to represent bivariate data.
A scatter plot is a graph that shows each observation of a bivariate data set as an ordered pair in a coordinate plane. Consider the following example, where a scatter plot illustrates the results gathered at a local ice cream parlor. This study records the number of ice creams sold and the corresponding air temperature.
The table outlines the similarities and differences between association and correlation.
Similarities | Differences |
---|---|
|
|
The following applet shows various scatter plots. Choose the type of association that best describes the relationship depicted in each scatter plot.
In the small town of Mathville, the local ice cream shop plans to introduce a new flavor.
Zosia is eager to explore factors influencing sales before the big reveal. She begins by analyzing the relationship between ice cream sales and town temperature. The provided bivariate data is the result of the first investigation.
Temperature | Number of Ice Cream Sales |
---|---|
15°C | 20 |
18°C | 18 |
21°C | 42 |
24°C | 55 |
27°C | 70 |
30°C | 65 |
33°C | 120 |
36°C | 116 |
39°C | 130 |
42°C | 140 |
Clusters: None
Gaps: There is a gap between 70 and 116 sales.
Temperature | Number of Ice Cream Sales | Ordered Pairs |
---|---|---|
15°C | 20 | (15,20) |
18°C | 18 | (18,18) |
21°C | 42 | (21,42) |
24°C | 55 | (24,55) |
27°C | 70 | (27,70) |
30°C | 65 | (30,65) |
33°C | 120 | (33,120) |
36°C | 116 | (36,116) |
39°C | 130 | (39,130) |
42°C | 140 | (42,140) |
Now plot the points on a coordinate plane to construct the scatter plot.
From the scatter plot drawn previously, it can be seen that as the temperature increases, the number of ice creams sold also increases at a nearly constant rate.
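The pairing of the two columns and the overall upward trend can also be checked with a few lines of code. The following is only a minimal sketch: the variable names are chosen for illustration, and counting rises and falls between neighboring rows is a rough check, not a formal measure of association.

```python
# Temperature (°C) and ice cream sales, copied from the table above.
temperatures = [15, 18, 21, 24, 27, 30, 33, 36, 39, 42]
sales = [20, 18, 42, 55, 70, 65, 120, 116, 130, 140]

# Each observation becomes one ordered pair (x, y).
ordered_pairs = list(zip(temperatures, sales))

# The temperatures are already in ascending order, so comparing each
# sales value with the next one shows how often sales rise or fall.
rises = sum(1 for a, b in zip(sales, sales[1:]) if b > a)
falls = sum(1 for a, b in zip(sales, sales[1:]) if b < a)

print(ordered_pairs[:3])  # [(15, 20), (18, 18), (21, 42)]
print(rises, falls)       # 6 3
```

Sales rise in six of the nine steps, which agrees with the increasing trend seen in the scatter plot, while the three small dips show that the increase is not perfectly steady.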
Term | Definition |
---|---|
Cluster | A group of data points that are close together on a graph. |
Gap | An empty space or interval between groups of data points on a graph. |
Outlier | A data point that lies noticeably apart from the other data points on a graph. |
With this information in mind, take a look at the scatter plot.
Draw some conclusions about the clusters, gaps, and outliers observed in the scatter plot.
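A gap in the sales values can also be located numerically. The sketch below sorts the sales counts from the temperature table and looks for the widest jump between neighboring values. Treating the widest jump as "the gap" is an assumption made for illustration, since what counts as a gap ultimately depends on the observer.

```python
# Ice cream sales from the temperature table.
sales = [20, 18, 42, 55, 70, 65, 120, 116, 130, 140]

ordered = sorted(sales)

# For each pair of neighboring values, store (jump size, lower, upper).
jumps = [(upper - lower, lower, upper)
         for lower, upper in zip(ordered, ordered[1:])]

# The widest jump is a candidate for a gap in the data.
widest = max(jumps)
print(widest)  # (46, 70, 116)
```

The widest jump, 46 sales, lies between 70 and 116, which matches the gap identified on the scatter plot.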
When data sets have a positive or negative correlation, the trend of the data can be modeled using a line of fit, also called a trend line. This line is drawn on a scatter plot near most of the data points, which appear evenly distributed above and below the line.
The scatter plot above shows the mean weights of kittens from the same litter in relation to their age. In this case, a line of fit could be drawn quite seamlessly. When drawing a line of fit, the following characteristics should be considered.
Given a scatter plot and a line, determine whether the line is a trend line.
Zosia believes it might be best to launch the new flavor when the town's temperature is on the rise. Her next investigation focuses on the connection between ice cream sales and the time of day. She records the number of ice creams sold at different times of the day.
Time | Number of Ice Creams Sold |
---|---|
8 | 4 |
10 | 7 |
12 | 13 |
14 | 15 |
16 | 18 |
18 | 22 |
20 | 20 |
22 | 23 |
Notice that by changing what the x- and y-axes represent, a different scatter plot can be created.
Note that different observers may draw different lines of fit, as this depends on their observations of the data points.
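The equation of a line of fit can be written in slope-intercept form, y = mx + n. Select two points that are on the line of fit. Note that these points do not necessarily have to be from the data set, but they must lie on the line of fit.
To find the slope m, substitute the coordinates of the two points into the slope formula, subtract the terms, and simplify the resulting fraction by dividing its numerator and denominator by 2, since a/b = (a/2)/(b/2).
To find n, substitute x = 10 and y = 8 into the equation, rewrite the product using (a/c)·b = (a·b)/c, multiply, calculate the quotient, and subtract 14 from both sides. Finally, rearrange the equation.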
There is a positive linear association between the time of the day and the number of ice creams sold. Additionally, when the time of the day is 4, it is expected that there will be no ice cream sales.
Zosia analyzes the relationship between ice cream sales and the ages of buyers within a week. The provided bivariate data represents this investigation. Note that the ages were rounded to the nearest 5 years.
Ages of Buyers | Number of Ice Cream Sales |
---|---|
5 | 10 |
10 | 6 |
15 | 16 |
20 | 18 |
25 | 10 |
30 | 20 |
35 | 25 |
40 | 6 |
45 | 9 |
50 | 3 |
55 | 13 |
60 | 8 |
Ages of Buyers | Number of Ice Cream Sales | Ordered Pair |
---|---|---|
5 | 10 | (5,10) |
10 | 6 | (10,6) |
15 | 16 | (15,16) |
20 | 18 | (20,18) |
25 | 10 | (25,10) |
30 | 20 | (30,20) |
35 | 25 | (35,25) |
40 | 6 | (40,6) |
45 | 9 | (45,9) |
50 | 3 | (50,3) |
55 | 13 | (55,13) |
60 | 8 | (60,8) |
Now plot the points on a coordinate plane to construct the scatter plot.
Zosia ultimately concluded that the new ice cream flavor could be introduced during warmer weather and daylight hours, based on her investigation. However, it is crucial to recognize that this decision is based on observations. The ice cream seller may encounter different outcomes when introducing the new flavor due to external factors.
Remember that the lines of fit are drawn during the lesson to help in interpreting the scatter plots by assessing the closeness of all data points to the line. This suggests that different lines of fit can also be drawn for each example. However, there is only one line of best fit for the association.
A line of best fit, also known as a regression line, is a line of fit that estimates the relationship between the values of a data set. The equation of the line of best fit has been determined using a strict mathematical method.
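The lesson does not name the method, but one widely used choice is least squares, which picks the line that minimizes the sum of the squared vertical distances from the data points to the line. The sketch below applies it to the time-of-day data from earlier in the lesson. The function name `least_squares_line` is hypothetical and chosen only for this example.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Time of day (hours) and ice creams sold, from the earlier table.
hours = [8, 10, 12, 14, 16, 18, 20, 22]
sold = [4, 7, 13, 15, 18, 22, 20, 23]

slope, intercept = least_squares_line(hours, sold)
print(round(slope, 2), round(intercept, 2))  # 1.36 -5.11
```

Unlike a line of fit drawn by eye, this calculation gives the same result for every observer, which is why there is only one line of best fit for a given data set and method.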
The table displays the number of birds that visit bird feeding houses during different times of the day.
Time of the Day | Number of Birds Observed |
---|---|
3:00AM | 15 |
6:00AM | 25 |
9:00AM | 13 |
12:00PM | 30 |
3:00PM | 47 |
6:00PM | 35 |
9:00PM | 44 |
Select the scatter plot where the x-axis represents time and the y-axis represents the number of birds that visit the bird feeding house.
A scatter plot is a graph that shows the associations between the variables of a data set. The table shows the number of birds that visit a bird feeding house over time.
Time of the Day | Number of Birds Observed |
---|---|
3:00AM | 15 |
6:00AM | 25 |
9:00AM | 13 |
12:00PM | 30 |
3:00PM | 47 |
6:00PM | 35 |
9:00PM | 44 |
We want to create a scatter plot. First, we need to represent the data as ordered pairs (x,y), where x is the time and y is the number of birds that visit the bird feeding house. For simplicity, we will write the time as hours after midnight.
Time of the Day | Number of Birds Observed | Ordered Pair |
---|---|---|
3:00AM | 15 | (3,15) |
6:00AM | 25 | (6,25) |
9:00AM | 13 | (9,13) |
12:00PM | 30 | (12,30) |
3:00PM | 47 | (15,47) |
6:00PM | 35 | (18,35) |
9:00PM | 44 | (21,44) |
Now, we can plot the ordered pairs on a coordinate plane to graph the scatter plot.
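The conversion from clock times to hours after midnight can be sketched in code as well. The helper `hours_after_midnight` is a hypothetical name, and it assumes labels written exactly like those in the table, such as 3:00PM, with whole hours only.

```python
def hours_after_midnight(label):
    """Convert a label such as '3:00PM' to whole hours after midnight.

    Minutes are ignored because every time in the table is on the hour.
    """
    clock, period = label[:-2], label[-2:].upper()
    hour = int(clock.split(":")[0]) % 12  # 12 o'clock wraps around to 0
    if period == "PM":
        hour += 12
    return hour

times = ["3:00AM", "6:00AM", "9:00AM", "12:00PM", "3:00PM", "6:00PM", "9:00PM"]
birds = [15, 25, 13, 30, 47, 35, 44]

ordered_pairs = [(hours_after_midnight(t), b) for t, b in zip(times, birds)]
print(ordered_pairs)
# [(3, 15), (6, 25), (9, 13), (12, 30), (15, 47), (18, 35), (21, 44)]
```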
The correct option is B.
We can analyze the type of association between the variables in the data set and identify outliers or clusters to interpret the scatter plot. Remember, scatter plots can illustrate various patterns of association between two variables.
We can use the shape of the distribution of a scatter plot to determine the type of association. Let's look at the scatter plot from Part A!
We can see that the points lie close to a line. As the time increases, the number of birds that visit the bird feeding house also increases. This means that the slope of the line is positive. Therefore, the scatter plot shows a positive linear association. Next, remember some important definitions.
Term | Definition |
---|---|
Outlier | An outlier is a data point that is set off from the other data points. |
Cluster | A cluster is a group of points that lie close together. |
With these definitions in mind, we can draw the following conclusions from the scatter plot.
Note that clusters and outliers are found by observing a graph. This means that they are subjective and a different observer can interpret the graph differently. Our answer is just an example answer.
The scatter plot shows a positive linear association. There are no clusters or outliers.
To make a conjecture about the number of bird visits at midnight, we can follow the pattern until the x-value on the scatter plot from Part A reaches 24. Let's do it!
We observe on the diagram that at midnight, we can expect to see around 60 birds. However, it is essential to remember that this is an estimate based on the given data, and the actual value may vary.
Ten students took two tests covering the same content. The scatter plot displays their scores.
We are given a scatter plot that shows the scores of two tests of the same content for 10 students.
Here, we will select all statements that are true. The coordinates of the points in the scatter plot will help us interpret the statements, so we will make a table that shows the x- and y-values of the points.
Student | x-coordinate (Score for First Test) | y-coordinate (Score for Second Test) |
---|---|---|
Student 1 | 45 | 60 |
Student 2 | 45 | 70 |
Student 3 | 55 | 65 |
Student 4 | 60 | 65 |
Student 5 | 65 | 75 |
Student 6 | 70 | 70 |
Student 7 | 80 | 75 |
Student 8 | 85 | 90 |
Student 9 | 90 | 95 |
Student 10 | 95 | 30 |
After considering the statements separately, we will select the correct ones. Let's begin with the first two statements.
I. Five of the scores for the first test were at least 70.
II. Five of the scores for the second test were less than 70.
We can highlight the first-test scores of at least 70 by bolding them. Additionally, we will emphasize the second-test scores that are less than 70 by coloring them red.
Student | x-coordinate (Score for First Test) | y-coordinate (Score for Second Test) |
---|---|---|
Student 1 | 45 | 60 |
Student 2 | 45 | 70 |
Student 3 | 55 | 65 |
Student 4 | 60 | 65 |
Student 5 | 65 | 75 |
Student 6 | 70 | 70 |
Student 7 | 80 | 75 |
Student 8 | 85 | 90 |
Student 9 | 90 | 95 |
Student 10 | 95 | 30 |
We can observe from the table that 5 of the scores for the first test were at least 70, while 4 of the scores for the second test were less than 70. Based on this information, we can conclude that the first statement is correct, whereas the second one is false.
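The two counts can be double-checked with a short script. This is just a sketch that copies the scores from the table. The list names are chosen for illustration.

```python
# Scores of the ten students, in the order given in the table.
first_test = [45, 45, 55, 60, 65, 70, 80, 85, 90, 95]
second_test = [60, 70, 65, 65, 75, 70, 75, 90, 95, 30]

# Statement I: first-test scores of at least 70.
at_least_70_first = sum(1 for score in first_test if score >= 70)

# Statement II: second-test scores strictly less than 70.
below_70_second = sum(1 for score in second_test if score < 70)

print(at_least_70_first)  # 5
print(below_70_second)    # 4
```

Note that a score of exactly 70 counts for "at least 70" but not for "less than 70," which is why the two second-test scores of 70 are excluded from the second count.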
To determine the number of students who scored higher on the second test than on the first one, we can refer to the table we created.
Student | x-coordinate (Score for First Test) | y-coordinate (Score for Second Test) |
---|---|---|
Student 1 | 45 | 60 |
Student 2 | 45 | 70 |
Student 3 | 55 | 65 |
Student 4 | 60 | 65 |
Student 5 | 65 | 75 |
Student 6 | 70 | 70 |
Student 7 | 80 | 75 |
Student 8 | 85 | 90 |
Student 9 | 90 | 95 |
Student 10 | 95 | 30 |
There are 7 students who scored higher on the second test, so the statement is false.
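The same comparison can be sketched in code by checking, for each ordered pair from the table, whether the y-value exceeds the x-value. The name `improved` is chosen only for this illustration.

```python
# (first test, second test) for each of the ten students.
scores = [(45, 60), (45, 70), (55, 65), (60, 65), (65, 75),
          (70, 70), (80, 75), (85, 90), (90, 95), (95, 30)]

# Keep only the students whose second score is strictly higher.
improved = [(first, second) for first, second in scores if second > first]

print(len(improved))  # 7
```

The student with two scores of 70 is not counted, because the second score must be strictly higher than the first.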
Let's identify the outlier in the data set. Remember, an outlier is a data point that is significantly different from the other values in the data set.
From the scatter plot, we can observe that the point (95, 30) is an outlier, so this statement is true.
Lastly, we will analyze the last two statements.
V. There is a negative association between the scores of the first and second tests.
VI. There is a positive association between the scores of the first and second tests.
Notice from the scatter plot that as the first test scores of the students increase, their second test scores also increase, indicating a positive association.
This implies that the fifth statement is false, and the sixth statement is true.
The table displays the number of cars sold at a dealership over an eight-year period.
Year, x | Cars Sold, y |
---|---|
1 | 300 |
2 | 500 |
3 | 600 |
4 | 900 |
5 | 1400 |
6 | 1500 |
7 | 1750 |
8 | 2400 |
Choose the scatter plot and the line of fit where the x-axis represents the year and the y-axis represents the number of cars sold.
We have a table that displays the number of cars sold at a dealership over an eight-year period. With this data, we want to create a scatter plot and establish a line of fit. Let's begin by organizing the given data into ordered pairs.
Year, x | Cars Sold, y | Ordered Pairs (x,y) |
---|---|---|
1 | 300 | (1,300) |
2 | 500 | (2,500) |
3 | 600 | (3,600) |
4 | 900 | (4,900) |
5 | 1400 | (5,1400) |
6 | 1500 | (6,1500) |
7 | 1750 | (7,1750) |
8 | 2400 | (8,2400) |
Plot these ordered pairs on a coordinate plane.
A line of fit is a line drawn on a scatter plot that closely aligns with most of the data points. It serves as a tool for estimating data on a graph. It is important to note that this line does not necessarily need to intersect any of the data points.
Now, we will determine the equation of our line of fit using two points that lie on the line. It is preferable to select points from the provided data set for ease of calculation.
However, in this case, it seems that none of these points lie on the line. This means that we will choose two points on the line that do not belong to the given data set. We can use the points (2,400) and (6,1600).
First, find the slope between these two points.
The slope of the line is 300. We can substitute this value into the equation of a line in slope-intercept form.
y = 300x + b
Now, let's determine the y-intercept b. We can substitute the coordinates of the point (2, 400) into the equation and solve for b, as this point lies on the line.
The y-intercept of the line is -200. We can substitute this value into the equation to complete it.
y = 300x + (-200), which simplifies to y = 300x - 200
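The slope and y-intercept can be verified with a small helper that works for any two points with different x-values. The function name `line_through` is hypothetical and used only for this sketch.

```python
def line_through(p1, p2):
    """Return (slope, intercept) of the line through two points.

    Assumes the points have different x-values. A vertical line has
    no slope-intercept form and would raise ZeroDivisionError here.
    """
    (x1, y1), (x2, y2) = p1, p2
    slope = (y2 - y1) / (x2 - x1)   # rise over run
    intercept = y1 - slope * x1     # solve y1 = slope * x1 + b for b
    return slope, intercept

# The two points chosen on the line of fit.
slope, intercept = line_through((2, 400), (6, 1600))
print(slope, intercept)  # 300.0 -200.0
```

The result agrees with the equation y = 300x - 200. Choosing two different points on the same line would give the same slope and intercept.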
Let's consider the equation we obtained in Part B. Recall that in our case, x represents the year and y represents the number of cars sold. y= 300x - 200 We can interpret the slope and y-intercept of the equation in the given context as follows.
In conclusion, options I and III are the correct ones among the given options.