| 11 Theory slides |
| 9 Exercises - Grade E - A |
| Each lesson is meant to take 1-2 classroom sessions |
Here are a few recommended readings before getting started with this lesson.
Match the Finnish name with the Latin name.
The following box plots show the distribution of the heights (in feet and inches) of the players on the Ohio State Buckeyes men's basketball and football teams in the 2020–2021 season.
Of the two sports, which tends to place more importance on a player's height? Identify the maximum and the median heights of each team.
The box plots show that the range of heights is similar for both teams, but Team B, on average, has taller players.
Height tends to be more advantageous in basketball than in football. Therefore, it is reasonable to conclude from the box plots that Team B is the basketball team and Team A is the football team.
Represented using a histogram, the same data set is used to show the distribution of the height of the players on the two teams.
The table below shows the average monthly high temperatures across three small towns.
One town is located in the State of Alaska, another in Florida, and the other in Nebraska.
Analyzing the data set and map, try to match each town with the correct corresponding state. Note that, generally, northern states tend to be colder than southern states.
Think about the relationship between the location and climate of each state of Alaska, Nebraska, and Florida. Then, consider each month's average high temperatures as shown in the data set. Which town is the warmest? Which town is the coldest?
Investigating the given data set and map, the following observations can be made.
Considering each observation from the data set and map, it is likely that Noma is in Florida, Mekoryuk is in Alaska, and Nehawka is in Nebraska.
The following applet shows the histograms of two data sets. Move the slider to investigate the observations separately.
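$(a-b)^2 = a^2 - 2ab + b^2$
$\text{LHS} + a^2 + 2ab + b^2 \leq \text{RHS} + a^2 + 2ab + b^2$
$\text{LHS}/4 \leq \text{RHS}/4$
Factor out $2$
$a^2 + 2ab + b^2 = (a+b)^2$
Simplify quotient
$\text{LHS} \leq \text{RHS}$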
The table below shows the average monthly low temperatures of two cities — Kansas City and Seattle. The two cities given are not necessarily in order.
According to the data set, City A and B have annual average low temperatures of around 45°F and 43°F, respectively. Referencing the map below, Seattle is located much farther north than Kansas City. Northern states are typically colder, on average, than southern states. Nevertheless, Seattle experiences less variation in temperature from season to season due to the ocean's tempering effect on the climate.
Think of the climate similarities and differences of coastal areas compared to inland areas. Find the range and standard deviation of the temperatures.
Using the range and standard deviation — measures of spread — will help compare the two cities' average low temperatures. A graphing calculator can be used to find the standard deviation.
City | Range | Standard Deviation |
---|---|---|
City A | 55−36=19 | 6.7 |
City B | 66−18=48 | 16.4 |
These measures of spread show that the temperature throughout the year changes much less in City A than in City B. Based on that analysis, and considering the information given about the tempering effect of the ocean, it is reasonable to conclude that City A is Seattle and City B is Kansas City. What a cool conclusion to make.
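As a rough illustration of the two measures of spread used above, the following Python sketch computes a range and a standard deviation. The helper name `spread` and the sample list are hypothetical. The monthly values from the table are not reproduced here, so only the ranges are rechecked from the stated extremes. The population standard deviation is assumed; a graphing calculator may report the sample version instead, depending on the setting.

```python
from statistics import pstdev


def spread(values):
    """Return (range, population standard deviation) of a list of numbers."""
    return max(values) - min(values), pstdev(values)


# Ranges rechecked from the highest and lowest values quoted in the table.
range_city_a = 55 - 36
range_city_b = 66 - 18
print(range_city_a, range_city_b)

# Hypothetical sample list, only to show how the helper is called.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
print(spread(sample))
```

A larger range and a larger standard deviation both point to temperatures that vary more over the year.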
In the US stock market, a measure of how much a stock price fluctuates during a certain period of time is called historical volatility. The following data set from the year 2020 contains information about the daily closing stock price (in dollars) of two companies.
Stock | Low | Mean | High | Standard Deviation |
---|---|---|---|---|
APDN | $2.52 | $6.89 | $15.21 | $2.24 |
DSS | $4.04 | $6.90 | $10.89 | $1.69 |
Which stock price was less volatile in 2020?
Which part of the table gives information about the spread of the stock price?
The numbers in the given table can be put into words as follows.
Both the range and standard deviation are smaller for DSS. These interpretations indicate that DSS's stock price fluctuated less over the year than the stock price of APDN. Therefore, it can be concluded that the stock price of DSS was less volatile in 2020.
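The comparison above can be rechecked with a short Python sketch. The values are copied from the table; the dictionary layout and variable names are illustrative choices, and treating the smaller standard deviation as "less volatile" follows the reasoning in the text rather than any formal definition of historical volatility.

```python
# Values copied from the table (daily closing prices in 2020, in dollars).
stocks = {
    "APDN": {"low": 2.52, "high": 15.21, "std": 2.24},
    "DSS": {"low": 4.04, "high": 10.89, "std": 1.69},
}

# Range = high - low, rounded to cents to avoid floating-point noise.
ranges = {name: round(s["high"] - s["low"], 2) for name, s in stocks.items()}

# The stock with the smaller standard deviation is treated as less volatile.
less_volatile = min(stocks, key=lambda name: stocks[name]["std"])

print(ranges)
print(less_volatile)
```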
Consider the following two histograms where neither the labels nor scales are specified.
Both of these histograms represent different distributions, and both have 26 columns.
In a lottery, all numbers are drawn with equal probability.
In a mathematics competition, only a few students answer all questions correctly. Still, a lot of students will be able to answer at least some of the questions correctly. Consequently, it is likely that the histogram is shaped like a mountain, with a peak in the center and low ends. The shape of Histogram A reflects this behavior.
In a lottery, all numbers are drawn with equal probability, so in the long run, it can be expected that there is little difference between the frequencies. The shape of Histogram B reflects this.
There is even more fascinating information to be discovered from the shapes of the histograms.
The height of the lone tall bar furthest to the left in Histogram A suggests that in the 2020 AMC 8 competition, plenty of participants did not answer a single question correctly! It is much more likely, however, that these participants registered but did not attend the competition.
Histogram A's peak shows that in 2020, on average, students in the AMC 8 competition answered less than half of the questions correctly.
The fluctuation of the bar heights in Histogram B shows that although an even distribution of the numbers is expected on the Powerball draw, some numbers historically came out fewer times.
The bar corresponding to 24 is more than twice as high as the bar corresponding to 16. However, this does not mean that 24 is twice as likely to come out in a draw. Nor does this mean that players should now play 16 because it will eventually catch up. The data is historical; it does not have any effect on the next draw.
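To see that uneven bar heights can arise even when every number is equally likely, a small simulation may help. This is a sketch with simulated draws, not the historical lottery data: the count of 26 numbers is assumed from the 26 columns of the histogram, and the number of draws and the seed are arbitrary choices.

```python
import random
from collections import Counter

NUMBERS = range(1, 27)  # 26 equally likely numbers, assumed from the 26 columns
DRAWS = 1000            # hypothetical number of simulated draws

rng = random.Random(2020)  # fixed seed so the run is repeatable
counts = Counter(rng.choice(NUMBERS) for _ in range(DRAWS))

expected = DRAWS / len(NUMBERS)
print(f"expected per number: {expected:.1f}")
print(f"lowest count: {min(counts[n] for n in NUMBERS)}")
print(f"highest count: {max(counts[n] for n in NUMBERS)}")
```

Each simulated draw is independent of the ones before it, which mirrors the point that past frequencies have no effect on the next draw.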
Latin Name | Mean Length to Height Ratio |
---|---|
Abramis Bjorkna | 2.55 |
Leuciscus Rutilus | 3.75 |
Osmerus Eperlanus | 5.95 |
Esox Lucius | 6.33 |
Next, the actual drawings can be used to find their length to height ratios. This measurement, however, is in pixels instead of centimeters. Most image software on a standard computer can show these measurements. Here, they are given. Recall that the drawings use the Finnish names.
The results, in increasing order, can be summarized as follows.
Finnish Name | Length to Height Ratio (Images) |
---|---|
Pasuri | 361/120 ≈ 3.01 |
Särki | 393/106 ≈ 3.71 |
Hauki | 358/59 ≈ 6.07 |
Norssi | 358/55 ≈ 6.51 |
The numbers in the two tables do not match exactly, which makes sense given that they come from different kinds of measurements and the images are not drawn to the same scale. Still, in both tables, two species have a ratio above 5 and two species have a ratio below 4. That means the following distinction can be made.
Group | Latin Name (Data Set) | Finnish Name (Images) |
---|---|---|
Longer Fishes | Osmerus eperlanus and Esox lucius | Hauki and Norssi |
Taller Fishes | Abramis bjorkna and Leuciscus rutilus | Pasuri and Särki |
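The grouping above can be reproduced with a short Python sketch. The pixel pairs are read from the image table, taking each ratio as length over height, and the thresholds 5 and 4 come from the text. The function name `split` is a hypothetical helper.

```python
# Pixel measurements (length, height) read from the image table.
pixels = {
    "Pasuri": (361, 120),
    "Särki": (393, 106),
    "Hauki": (358, 59),
    "Norssi": (358, 55),
}
# Mean ratios from the data set table.
data_set = {
    "Abramis bjorkna": 2.55,
    "Leuciscus rutilus": 3.75,
    "Osmerus eperlanus": 5.95,
    "Esox lucius": 6.33,
}

image_ratios = {name: length / height for name, (length, height) in pixels.items()}


def split(ratios):
    """Group names into longer fishes (ratio above 5) and taller fishes (ratio below 4)."""
    longer = sorted(n for n, r in ratios.items() if r > 5)
    taller = sorted(n for n, r in ratios.items() if r < 4)
    return longer, taller


print(split(data_set))
print(split(image_ratios))
```

The ratios only separate the species into two groups. Matching the names within each group needs further information, since the ranks inside a group need not line up between the two tables.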
Latin Name | Finnish Name | English Name |
---|---|---|
Abramis Bjorkna | Pasuri | Bream |
Leuciscus Rutilus | Särki | Roach |
Osmerus Eperlanus | Norssi | Smelt |
Esox Lucius | Hauki | Pike |
A company is interested in knowing how many miles their employees have to walk to and from work during a given week. They conduct a survey with 100 men and 100 women and present the results with two box plots.
One of the assistant managers, Tearrik, is asked to make a single boxplot showing every observation. The next morning, he produces the following box plot to show upper management.
Notice that each box plot contains 100 observations, which is an even number. This means the median will be the mean of the 50th and the 51st observations. Similarly, the lower quartile is the mean of the 25th and 26th observations, and the upper quartile is the mean of the 75th and 76th observations. Therefore, each section of our box plots contains 25 observations.
By using this information we can identify how many observations should fall in the different sections of our two boxplots.
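The quartile positions described above can be checked with a small Python sketch. It assumes quartiles are taken as the medians of the lower and upper halves of the sorted data, which matches the positions named in the text. The function name is hypothetical, and the values 1 to 100 simply stand in for observation positions.

```python
from statistics import median


def five_number_summary(values):
    """Min, Q1, median, Q3, max, with quartiles taken as medians of the two halves."""
    data = sorted(values)
    half = len(data) // 2  # assumes at least two observations
    lower, upper = data[:half], data[-half:]
    return data[0], median(lower), median(data), median(upper), data[-1]


# Hypothetical data: the values 1..100 stand in for observation positions.
print(five_number_summary(range(1, 101)))
```

With 100 values, the quartiles fall halfway between positions 25 and 26, 50 and 51, and 75 and 76, so each of the four sections holds 25 observations.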
Notice that the combined boxplot contains 200 observations. Therefore, each of its four sections must contain 50 observations. Let's replace the two boxplots in the diagram above with the combined boxplot presented by Tearrik. We will keep the three intervals we have identified.
From the women's boxplot, at least 75 observations are less than or equal to 11.5. From the men's boxplot, at least 50 observations are less than or equal to 11.5. Combined, at least 125 of the 200 observations are less than or equal to 11.5, and therefore less than 12. Since that is more than half of the observations, the median of the combined data cannot be 12 as Tearrik's box plot states. Therefore, Vincenzo is correct.
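The counting argument can be illustrated with a hypothetical worst case in Python. The data below are not the survey results. They are constructed so that exactly 125 of the 200 values equal 11.5 and the remaining 75 sit far to the right, at an arbitrary value.

```python
from statistics import median

# Hypothetical worst case: exactly 125 of the 200 values sit at 11.5,
# and the other 75 are placed far to the right (the value 30 is arbitrary).
combined = [11.5] * 125 + [30.0] * 75

# With 200 values, the median is the mean of the 100th and 101st values.
print(median(combined))
```

Even in this extreme arrangement, the 100th and 101st values are both 11.5, so the combined median cannot reach 12.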
Consider three numbers and add four to each number.
We will call the three numbers x_1, x_2, and x_3. Now let's add 4 to each of these numbers, which gives us a new data set.
$$x_1+4,\quad x_2+4,\quad x_3+4$$
We will first write two expressions: one for the old mean x_O, and another for the new mean x_N.
$$\begin{aligned} x_O &= \frac{x_1+x_2+x_3}{3} \\[0.5em] x_N &= \frac{(x_1+4)+(x_2+4)+(x_3+4)}{3} \end{aligned}$$
Next, we will simplify the right-hand side of the second expression.
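Written out, the simplification might look as follows. This is a sketch of the intermediate steps, reconstructed from the definitions of x_O and x_N.

```latex
\begin{aligned}
x_N &= \frac{(x_1+4)+(x_2+4)+(x_3+4)}{3} \\
    &= \frac{x_1+x_2+x_3+12}{3} \\
    &= \frac{x_1+x_2+x_3}{3} + \frac{12}{3} \\
    &= \frac{x_1+x_2+x_3}{3} + 4
\end{aligned}
```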
If we examine the right-hand side, we see that we have the sum of a fraction that matches the old mean and 4.
Therefore, adding four to each number increased the mean by 4.
Let's write the standard deviation for the original data set. We have three numbers, so the denominator is 3.
We will also write an expression for the new standard deviation. Remember that the new mean is x_N=x_O+4.
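The two expressions might be written as follows. This is a reconstruction that assumes the population standard deviation, consistent with the denominator of 3 stated above, with σ_O for the original data set and σ_N for the new one.

```latex
\begin{aligned}
\sigma_O &= \sqrt{\frac{(x_1-x_O)^2+(x_2-x_O)^2+(x_3-x_O)^2}{3}} \\[0.5em]
\sigma_N &= \sqrt{\frac{\big((x_1+4)-(x_O+4)\big)^2+\big((x_2+4)-(x_O+4)\big)^2+\big((x_3+4)-(x_O+4)\big)^2}{3}} \\
         &= \sqrt{\frac{(x_1-x_O)^2+(x_2-x_O)^2+(x_3-x_O)^2}{3}}
\end{aligned}
```

In each squared term, the added 4 in the observation cancels against the added 4 in the mean.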
As we can see, σ_N is the same expression as σ_O. This means the standard deviation remains unchanged by adding the same number to each observation from the data set.
Why did the mean increase by 4? Well, when we add 4 to each number, every observation in the data set is pushed to the right by 4 units. Therefore, it makes sense that the mean increased by 4.
However, since every number increased by the same amount, their relative distance to the mean remains unchanged. Therefore the standard deviation, which measures spread, is unchanged.
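Both conclusions can be checked numerically with a short Python sketch. The three numbers are hypothetical, and the population standard deviation is assumed, in line with the denominator of 3 used above.

```python
from statistics import mean, pstdev

original = [1, 2, 6]                 # hypothetical numbers x_1, x_2, x_3
shifted = [x + 4 for x in original]  # add 4 to each number

# The mean moves by 4, while the spread stays the same.
print(mean(original), mean(shifted))
print(pstdev(original), pstdev(shifted))
```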