Match the Finnish name with the Latin name.
The following box plots show the distribution of the heights (in feet and inches) of the players on the Ohio State Buckeyes men's basketball and football teams in the 2020–2021 season.
Considering the chart, match each box plot with the correct team. Of the two sports, which tends to place more importance on a player's height? Identify the maximum and the median heights of each team.
The box plots show that the range of heights is similar for both teams, but Team B, on average, has taller players.
Height tends to be more advantageous in basketball than in football. Therefore, it is reasonable to conclude from the box plots that Team B is the basketball team and Team A is the football team.
The same data set can also be represented with histograms, which show the distribution of the players' heights on the two teams.
The table below shows the average monthly high temperatures across three small towns.
One town is located in the State of Alaska, another in Florida, and the other in Nebraska.
Analyzing the data set and map, try to match each town with the correct corresponding state. Note that, generally, northern states tend to be colder than southern states.
Think about the relationship between the location and climate of each state of Alaska, Nebraska, and Florida. Then, consider each month's average high temperatures as shown in the data set. Which town is the warmest? Which town is the coldest?
Investigating the given data set and map, the following observations can be made.
Considering each observation from the data set and map, it is likely that Noma is in Florida, Mekoryuk is in Alaska, and Nehawka is in Nebraska.
The following applet shows the histograms of two data sets. Move the slider to investigate the observations separately.
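$(a-b)^2 = a^2 - 2ab + b^2$
$\text{LHS} + a^2 + 2ab + b^2 \le \text{RHS} + a^2 + 2ab + b^2$
$\text{LHS}/4 \le \text{RHS}/4$
Factor out $2$
$a^2 + 2ab + b^2 = (a+b)^2$
Simplify quotient
$\text{LHS} \le \text{RHS}$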
The table below shows the average monthly low temperatures of two cities — Kansas City and Seattle. The two cities given are not necessarily in order.
According to the data set, City A and City B have annual average low temperatures of around 45°F and 43°F, respectively. Referencing the map below, Seattle is located much farther north than Kansas City. Northern states are typically colder, on average, than southern states. Nevertheless, Seattle's temperatures vary less from season to season because of the ocean's tempering effect on the climate.
Use the ranges and standard deviations of the data set, along with the given geographical information, to determine which city is which. Think of the climate similarities and differences of coastal areas compared to inland areas. Find the range and standard deviation of the temperatures.
Using the range and standard deviation — measures of spread — will help compare the two cities' average low temperatures. A graphing calculator can be used to find the standard deviation.
City | Range (°F) | Standard Deviation (°F) |
---|---|---|
City A | 55−36=19 | 6.7 |
City B | 66−18=48 | 16.4 |
These measures of spread show that the temperature throughout the year changes much less in City A than in City B. Based on that analysis, and considering the information given about the tempering effect of the ocean, it is reasonable to conclude that City A is Seattle and City B is Kansas City. What a cool conclusion to make.
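The comparison above can be sketched in a few lines of Python. This is only an illustration: the monthly values themselves are not reproduced here, so the sketch starts from the extremes and standard deviations quoted in the table, and the variable names are made up for this example.

```python
# Highest and lowest monthly average lows (in °F), as quoted in the table.
extremes = {
    "City A": {"max": 55, "min": 36},
    "City B": {"max": 66, "min": 18},
}
# Standard deviations as reported in the table (not recomputed here).
std_devs = {"City A": 6.7, "City B": 16.4}

# Range = maximum value minus minimum value.
ranges = {city: v["max"] - v["min"] for city, v in extremes.items()}

# The city with the smaller spread has the steadier temperatures.
steadier_by_range = min(ranges, key=ranges.get)
steadier_by_std = min(std_devs, key=std_devs.get)

for city in extremes:
    print(f"{city}: range = {ranges[city]}, standard deviation = {std_devs[city]}")
print("Steadier city:", steadier_by_range)
```

Both measures of spread point to the same city, which is what makes the conclusion about the coastal climate reasonable.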
In the US stock market, a measure of how much a stock price fluctuates during a certain period of time is called historical volatility. The following data set from the year 2020 contains information about the daily closing stock price (in dollars) of two companies.
Stock | Low | Mean | High | Standard Deviation |
---|---|---|---|---|
APDN | $2.52 | $6.89 | $15.21 | $2.24 |
DSS | $4.04 | $6.90 | $10.89 | $1.69 |
Which stock price was less volatile in 2020?
Which part of the table gives information about the spread of the stock price?
The low, the high, and the standard deviation in the given table describe the spread of each stock price. The range, which is the high minus the low, is $12.69 for APDN and $6.85 for DSS. The standard deviation is $2.24 for APDN and $1.69 for DSS.
Both the range and standard deviation are smaller for DSS. These interpretations indicate that DSS's stock price fluctuated less over the year than the stock price of APDN. Therefore, it can be concluded that the stock price of DSS was less volatile in 2020.
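As a rough check of this reasoning, the sketch below computes each range from the low and high values in the table and compares it with the reported standard deviation. The dictionary layout and variable names are assumptions made for this example.

```python
# Values copied from the table (daily closing prices in dollars, 2020).
stocks = {
    "APDN": {"low": 2.52, "high": 15.21, "std_dev": 2.24},
    "DSS": {"low": 4.04, "high": 10.89, "std_dev": 1.69},
}

# Range = high minus low, rounded to cents to hide floating-point noise.
ranges = {name: round(s["high"] - s["low"], 2) for name, s in stocks.items()}

# Pick the stock with the smaller spread under each measure.
smaller_range = min(ranges, key=ranges.get)
smaller_std = min(stocks, key=lambda name: stocks[name]["std_dev"])

print(ranges)
print("Smaller range:", smaller_range)
print("Smaller standard deviation:", smaller_std)
```

Since both measures agree, the conclusion does not depend on which measure of spread is preferred.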
Consider the following two histograms where neither the labels nor scales are specified.
Both of these histograms represent different distributions, and both have 26 columns.
In a lottery, all numbers are drawn with equal probability.
In a mathematics competition, only a few students answer all questions correctly. Still, a lot of students will be able to answer at least some of the questions correctly. Consequently, it is likely that the histogram is shaped like a mountain, with a peak in the center and low ends. The shape of Histogram A reflects this behavior.
In a lottery, all numbers are drawn with equal probability, so in the long run, it can be expected that there is little difference between the frequencies. The shape of Histogram B reflects this.
There is even more fascinating information to be discovered from the shapes of the histograms.
The lone tall bar furthest to the left in Histogram A shows that plenty of participants in the 2020 AMC 8 competition did not answer a single question correctly! It is much more likely, however, that these participants registered but did not attend the competition.
Histogram A's peak shows that in 2020, on average, students in the AMC 8 competition answered fewer than half of the questions correctly.
The fluctuation of the bar heights in Histogram B shows that although an even distribution of the numbers is expected in the Powerball draw, some numbers have historically been drawn less often than others.
The bar corresponding to 24 is more than twice as high as the bar corresponding to 16. However, this does not mean that 24 is twice as likely to come out in a draw. Nor does this mean that players should now play 16 because it will eventually catch up. The data is historical; it does not have any effect on the next draw.
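A small simulation can illustrate why uneven bar heights do not contradict equal probabilities. The sketch below assumes 26 equally likely numbers, matching the 26 columns of the histogram, and a made-up total of 1000 draws. It does not use real Powerball data.

```python
import random
from collections import Counter

def simulate_draws(num_draws, num_values=26, seed=0):
    """Draw num_draws numbers uniformly from 1..num_values and count each one."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    counts = Counter(rng.randint(1, num_values) for _ in range(num_draws))
    # Return the frequencies in order of the numbers 1, 2, ..., num_values.
    return [counts.get(n, 0) for n in range(1, num_values + 1)]

frequencies = simulate_draws(1000)
print("Least drawn count:", min(frequencies))
print("Most drawn count:", max(frequencies))
```

Every number has the same chance on each draw, yet the counts still differ from one another. Each draw is independent, so a number that is behind is not due to catch up.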
Latin Name | Mean Length to Height Ratio |
---|---|
Abramis bjorkna | 2.55 |
Leuciscus rutilus | 3.75 |
Osmerus eperlanus | 5.95 |
Esox lucius | 6.33 |
Next, the actual drawings can be used to find their length to height ratios. This measurement, however, is in pixels instead of centimeters. Most image software on a standard computer can show these measurements. Here, they are given. Recall that the drawings use the Finnish names.
The results, in increasing order, can be summarized as follows.
Finnish Name | Length to Height Ratio (Images) |
---|---|
Pasuri | 361/120 ≈ 3.01 |
Särki | 393/106 ≈ 3.71 |
Hauki | 358/59 ≈ 6.07 |
Norssi | 358/55 ≈ 6.51 |
The numbers in the two tables do not match exactly, which makes sense given that they come from different measurements and the images do not match in scale. Still, in both tables, two species have a ratio above 5 and two species have a ratio below 4. That means the following distinction can be made.
Group | Latin Name (Data Set) | Finnish Name (Images) |
---|---|---|
Longer Fishes | Osmerus eperlanus and Esox lucius | Hauki and Norssi |
Taller Fishes | Abramis bjorkna and Leuciscus rutilus | Pasuri and Särki |
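The grouping above can be sketched as follows. Each image entry is read as length over height in pixels (for example, 361 over 120 for Pasuri). The cutoff of 4.5 is an assumption chosen for this example, since any value between 4 and 5 separates the two groups.

```python
# Mean length-to-height ratios from the data set (Latin names).
latin_ratios = {
    "Abramis bjorkna": 2.55,
    "Leuciscus rutilus": 3.75,
    "Osmerus eperlanus": 5.95,
    "Esox lucius": 6.33,
}
# (length, height) in pixels measured on the drawings (Finnish names).
image_sizes = {
    "Pasuri": (361, 120),
    "Särki": (393, 106),
    "Hauki": (358, 59),
    "Norssi": (358, 55),
}
image_ratios = {name: length / height for name, (length, height) in image_sizes.items()}

def split_by_shape(ratios, cutoff=4.5):
    """Return (longer, taller): names with ratio above or not above the cutoff."""
    longer = sorted(name for name, r in ratios.items() if r > cutoff)
    taller = sorted(name for name, r in ratios.items() if r <= cutoff)
    return longer, taller

print(split_by_shape(latin_ratios))
print(split_by_shape(image_ratios))
```

The ratios only narrow the matching down to two groups of two. They are too close within each group to settle the exact pairing on their own.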
Latin Name | Finnish Name | English Name |
---|---|---|
Abramis bjorkna | Pasuri | Bream |
Leuciscus rutilus | Särki | Roach |
Osmerus eperlanus | Norssi | Smelt |
Esox lucius | Hauki | Pike |
A manager can choose between two machines producing gadgets. To help the manager decide which to choose, each machine has run for 30 days producing gadgets. The following box plot represents the data of what was produced.
If we look at the box plots, we can see that Machine 2 has a median that is above the median of Machine 1.
This means that, in general, Machine 2 produced more gadgets per day. Be aware that it does not mean that Machine 2 produces more gadgets every single day. The median just allows comparing the estimated numbers for the whole 30-day period.
The interquartile range (IQR) is the difference between Q_3 and Q_1.
From the diagram, we see that the interquartile range is greater for Machine 2.
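To make the definition concrete, here is a minimal sketch that finds Q_1 and Q_3 as the medians of the lower and upper halves of the sorted data, leaving out the middle value when the count is odd. The daily counts below are hypothetical and are not the data behind the box plots. Other quartile conventions exist and can give slightly different values.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]    # values below the middle
    upper = ordered[-half:]   # values above the middle
    return median(lower), median(upper)

def iqr(values):
    """Interquartile range: Q3 minus Q1."""
    q1, q3 = quartiles(values)
    return q3 - q1

# Hypothetical gadgets-per-day counts, for illustration only.
machine_1 = [10, 12, 13, 14, 15, 16, 18]
machine_2 = [8, 11, 14, 17, 20, 23, 26]

print("Machine 1 IQR:", iqr(machine_1))
print("Machine 2 IQR:", iqr(machine_2))
```

In this made-up example Machine 2 has both the higher median and the greater interquartile range, mirroring what the box plots show.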
Standard deviation measures how spread out the observations are from the mean. Observing the given diagram, we see that the box plot for Machine 1 is more compact. However, this does not necessarily mean that the standard deviation is smaller. We would have to know the individual observations to determine the standard deviation. Therefore, the answer cannot be determined.
Which machine has the smaller standard deviation?
Cannot be determined.
The box plots show how well two equally sized classes did on an English test with a maximum score of 15 points.
Which of the following statements are true? Let's go through the statements one at a time.
Examining the box plot, we see that Class 2 has one person who scored 13 out of 15 points, which was the best score out of both classes. However, this does not mean that Class 2 did better overall. To determine this, we would have to compare the mean score, because it takes all observations into account. Therefore, we cannot be certain that this statement is true.
The standard deviation tells us how spread out the observations are from the mean. If we look at the box plot, we see that Class 1 is more compact. However, this does not necessarily mean that the standard deviation is smaller. We would have to know the individual observations to determine the standard deviation. Therefore, we cannot say whether this statement is true or not.
Consider the following example scores.

Class 1 | Class 2 |
---|---|
2, 2, 2, 2, 2, 2 | 0, 2, 2, 2, 2, 2, 2 |
5, 5, 5, 5, 5, 5, 5 | 3, 3, 3, 3, 3, 3 |
6, 7, 7, 7, 7, 7, 7, 7 | 3, 3, 3, 3, 3, 3, 3 |
10, 10, 10, 10, 10, 10 | 7, 7, 7, 7, 7, 7, 13 |

These are possible scores for the given box plots. Here the standard deviation of Class 1, which is about 2.76, is greater than the standard deviation of Class 2, which is about 2.64. Even though the box plot for Class 1 is more compact, the standard deviation is greater for Class 1.
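The two standard deviations quoted for the example scores can be checked with a short script. It uses the population standard deviation from Python's standard library, which is the version that reproduces the quoted values. The variable names are chosen for this example.

```python
from statistics import pstdev

# Example scores from above, written compactly as repeated values.
class_1 = [2] * 6 + [5] * 7 + [6] + [7] * 7 + [10] * 6
class_2 = [0] + [2] * 6 + [3] * 13 + [7] * 6 + [13]

# Population standard deviation of each class.
sd_1 = pstdev(class_1)
sd_2 = pstdev(class_2)

# Range = maximum score minus minimum score.
range_1 = max(class_1) - min(class_1)
range_2 = max(class_2) - min(class_2)

print(f"Class 1: range = {range_1}, standard deviation = {sd_1:.2f}")
print(f"Class 2: range = {range_2}, standard deviation = {sd_2:.2f}")
```

Class 1 has the smaller range but the larger standard deviation, so a more compact box plot does not guarantee a smaller standard deviation.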
The range is the difference between the maximum value and the minimum value. From the diagram, we see that Class 2 has a lower minimum and a greater maximum compared to Class 1. Therefore, it is true that Class 2 has the greater range.
The median is the vertical bar inside the box of a box plot.
As we can see, Class 1 has the greater median. Therefore, this statement is true.
During the last year Diego and Emily have been running 10 kilometer races every week. They both applied for the track and field team this year. The school coach asks the assistant coach to present the runners' results with two box plots.
Based on the box plots, which runner is likely the better choice for the track and field team? Practice makes perfect. Over time, as with everything, we tend to get better if we practice. Examining the box plots, it would appear that Diego is doing better overall as he is a more consistent runner. However, Emily has a personal best of 41 minutes while Diego's personal best is 42 minutes.
Very likely, they achieved these personal bests during their most recent runs. Therefore, Emily is likely the better candidate for the track and field team.
In a factory with 2000 employees, management would like to lower production time. The mean today is 37 minutes per unit. To lower the time, management would like to improve working conditions for their employees.
Half of the employees, who work the morning shift, are given extra breaks during the day. The other half, who work the day shift, get to go home earlier. Two months later, management measures the average production time again for the two teams during twelve consecutive days. Let's calculate the mean production time for each shift.
To find the mean using our graphing calculator, we must first enter the values into lists. To do this, we press STAT and choose Edit.
Then, we enter the values in the first two columns.
Next, we press STAT and scroll right until we reach CALC. There we choose the first option, 1-Var Stats.
By default, the calculator will use List 1 when calculating these statistics, so we just have to press ENTER until we get a result.
The new mean for the morning shift is about 29.1 minutes per unit.
To calculate the mean for the day shift, remember that we already entered the data into List 2. Press STAT, scroll right to CALC, and choose 1-Var Stats.
Having chosen 1-Var Stats,
let's switch from L_1 to L_2 by pressing 2nd and 2. Then, we continue pushing ENTER until we get a result.
The day shift team has a mean of 25.4 minutes per unit.
As we can see, both strategies managed to lower the mean production time. However, shortening the working day produced the lower mean. This means the day shift had the best efficiency increase, reaching 25.4 minutes per unit. What is the mean of the team that had the best efficiency increase? 25.4 minutes/unit
Based on the context, the smaller the standard deviation, the more consistent the production times. Let's compare the standard deviations of both shifts. Notice that we already found the standard deviation in Part A. It is denoted as σx. Let's show those summaries side by side and mark the standard deviations.
As we can see, the morning shift has the smaller standard deviation. Therefore, they had more consistent results. What is the standard deviation of the team that had the most consistent results? 2.1 minutes
Notice that the standard deviation describes the spread of the values around the mean. However, it is the mean that tells us how efficient the team is on average.
Shift | Mean (minutes per unit) | Standard Deviation (minutes) | Strategy |
---|---|---|---|
Morning Shift | 29.1 | 2.1 | Give additional breaks |
Day Shift | 25.4 | 3.1 | Shorten the working day |
Therefore, management would get better results if they let their employees go home earlier.
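For readers without a graphing calculator, the main values reported by 1-Var Stats can be sketched in Python. The calculator lists two standard deviations: σx, which treats the list as the whole population, and Sx, which treats it as a sample. The times below are hypothetical and are not the lesson's twelve-day data. The function name is made up for this example.

```python
from statistics import mean, pstdev, stdev

def one_var_stats(values):
    """Return the mean and both standard deviations of a list."""
    return {
        "mean": mean(values),        # x-bar on the calculator
        "sigma_x": pstdev(values),   # population standard deviation
        "s_x": stdev(values),        # sample standard deviation
        "n": len(values),
    }

# Hypothetical production times in minutes per unit, for illustration only.
sample_times = [26, 28, 28, 28, 29, 29, 31, 33]
summary = one_var_stats(sample_times)

for label, value in summary.items():
    print(label, "=", round(value, 3))
```

The sample standard deviation is always a little larger than the population one for the same list, so it matters which of the two values is read off the screen.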
Diego loves a particular type of candy. However, some bags he finishes very fast and others take longer. He wonders why this is. Each bag is supposed to contain 40 pieces of candy and he does not eat very fast.
To find the standard deviation we will use a graphing calculator. First, we enter the values into a list. To do this we press STAT, choose Edit,
and enter the values in the first column.
Next, we press STAT and choose CALC. There we select the first option, 1-Var Stats.
The calculator will automatically use List 1 when calculating the statistics. Finally, we press ENTER until we get a result.
The standard deviation is about 5.8. This is why there is such a big difference in the number of candies per bag.
To calculate the standard deviation for the 15 new bags, we begin by entering these values in the first column.
Finally, we go to STAT and choose CALC. There we select the first option, 1-Var Stats,
and we get the standard deviation.
The new standard deviation is about 1.3.
Since the standard deviation of the second data set is much lower, the company is now more consistent and has improved.