| 12 Theory slides |
| 11 Exercises - Grade E - A |
| Each lesson is meant to take 1-2 classroom sessions |
Here are a few recommended readings before getting started with this lesson.
Izabella's favorite candy, Frutty, is sold in packs of thirty candies with three different flavors — apple, orange, and banana.
Begin by finding the range of the data, then draw a number line which covers this range.
The smallest number in the data set is 8 and the largest is 12. This means that the dot plot can be displayed above a horizontal number line that covers at least the numbers from 8 to 12. Here, a number line from 7 to 13 will be used.
From here, the dot plot can be drawn as follows.
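The two steps above (find the range, then stack dots over a number line) can be sketched in a few lines of Python. This is only an illustration: the `data` list is hypothetical, chosen so that its smallest value is 8 and its largest is 12, and `text_dot_plot` is a helper name invented here.

```python
from collections import Counter

def text_dot_plot(data, low, high):
    """Return a text dot plot: dots stacked above a number line from low to high."""
    counts = Counter(data)
    tallest = max(counts.values())
    width = len(str(high)) + 1  # column width, so the dots line up with the labels
    rows = []
    for level in range(tallest, 0, -1):
        row = "".join(
            ("o" if counts[value] >= level else " ").rjust(width)
            for value in range(low, high + 1)
        )
        rows.append(row.rstrip())
    # The last row is the number line itself.
    rows.append("".join(str(value).rjust(width) for value in range(low, high + 1)))
    return "\n".join(rows)

# Hypothetical data set (not the lesson's data): smallest value 8, largest value 12.
data = [8, 9, 9, 10, 10, 10, 11, 12, 12]

# Extend the number line one unit past the range on each side: 7 to 13.
plot = text_dot_plot(data, min(data) - 1, max(data) + 1)
print(plot)
```

Each column of dots has as many dots as the value appears in the data set.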
A multiple-choice test has ten questions. After grading the test, the teacher produced the following dot plot to show how many correct answers each student had on the test.
How many students are there in the class?
Each dot represents the performance of one student on the test.
Each dot represents the performance of a student on the test. For example, since there is one dot above the number 4, it means that one student answered four questions correctly. The rest of the dot plot can be interpreted similarly.
Number | Dots Above the Number | Conclusion |
---|---|---|
0,1,2,3 | 0 | There are no students who answered fewer than four questions correctly. |
4 | 1 | One student answered four questions correctly. |
5 | 3 | Three students answered five questions correctly. |
6 | 2 | Two students answered six questions correctly. |
7 | 4 | Four students answered seven questions correctly. |
8 | 5 | Five students answered eight questions correctly. |
9 | 3 | Three students answered nine questions correctly. |
10 | 2 | Two students answered all ten questions correctly. |
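Since each dot stands for one student, the class size is the total number of dots. A short check, using the counts from the table above (the variable names are chosen here for illustration):

```python
# Number of dots above each score, copied from the table above.
dots = {0: 0, 1: 0, 2: 0, 3: 0, 4: 1, 5: 3, 6: 2, 7: 4, 8: 5, 9: 3, 10: 2}

# One dot per student, so the class size is the sum of all the dots.
total_students = sum(dots.values())
print(total_students)  # 20
```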
A college hockey team played 23 games during a season. An enthusiastic fan made a dot plot of the number of goals the team scored in each game.
Group the data in a frequency table using the intervals asked in the prompt. The first interval will be the ages 40–44.
The frequency table below shows the grouping of the data starting at 40 and using 5-year intervals.
Interval | Frequency |
---|---|
40–44 | 2 |
45–49 | 7 |
50–54 | 12 |
55–59 | 13 |
60–64 | 8 |
65–69 | 2 |
70–74 | 1 |
Use these intervals and frequencies to draw the histogram.
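Grouping data into equal-width intervals can be sketched as follows. The original ages are not listed here, so the `ages` list below is hypothetical and `frequency_table` is a helper name made up for this sketch; it assumes whole-number data that is not smaller than `start`.

```python
def frequency_table(data, start, width):
    """Count the values in each interval start..start+width-1, and so on."""
    table = {}
    low = start
    # Create every interval up to the largest value, so empty intervals show 0.
    while low <= max(data):
        table[(low, low + width - 1)] = 0
        low += width
    for value in data:
        interval_low = start + ((value - start) // width) * width
        table[(interval_low, interval_low + width - 1)] += 1
    return table

# Hypothetical ages (not the lesson's data), grouped in 5-year intervals from 40.
ages = [42, 47, 45, 51, 53, 58, 61, 49, 55, 70]
table = frequency_table(ages, start=40, width=5)
for (low, high), frequency in table.items():
    print(f"{low}-{high}: {frequency}")
```

The frequencies always add up to the number of data values, which is a quick way to check a table like the one above.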
In 1936, Sir Ronald Aylmer Fisher published a paper entitled The Use of Multiple Measurements in Taxonomic Problems.
Fisher investigated several measurements of three species of flowers.
The histogram below shows the summary of the data about the sepal length of the Iris virginica flowers.
How many Iris virginica flowers did Fisher investigate in this paper?
Consider the height of the rectangles in the histogram.
In a histogram, the height of the rectangles shows the frequency of the data elements in the corresponding interval.
A ranger is surveying a forest. He randomly selected 40 loblolly pines (Pinus taeda) and measured their heights. The histogram below is the summary of the data.
Rearrange the data in increasing order and find the five-number summary.
The box plot is built using these points.
Putting all this together gives the box plot.
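The five-number summary can be computed with a short function. This is a sketch under one common convention, the one used later in this lesson: when the number of observations is odd, the median is left out of both halves before the quartiles are found. The data list is hypothetical.

```python
from statistics import median

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) for at least two observations."""
    values = sorted(data)           # rearrange in increasing order
    n = len(values)
    lower_half = values[: n // 2]         # observations left of the median
    upper_half = values[(n + 1) // 2 :]   # observations right of the median
    return (
        values[0],
        median(lower_half),
        median(values),
        median(upper_half),
        values[-1],
    )

# Hypothetical data set, not taken from the lesson.
summary = five_number_summary([7, 3, 9, 5, 1, 8, 4])
print(summary)  # (1, 3, 5, 8, 9)
```

Other conventions for quartiles exist, so software may report slightly different values for the same data.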
In the 1994 report The Population Biology of Abalone (Haliotis species) in Tasmania,
the authors presented and investigated the measurements of 4177 blacklip abalones.
The lengths of the shells in millimeters are summarized in the box plot below.
How many blacklip abalones' lengths were shorter than 90 millimeters in this experiment?
Which part of the box plot is at 90?
The left side of the box is at 90, so the first quartile of the lengths is 90 millimeters.
Note that from the box plot, the only conclusion we can make is that the number of blacklip abalones shorter than 90 millimeters is less than 1045.
In fact, there were 60 blacklip abalones with a length of 90 millimeters in the experiment. The answer option 1007 reflects the actual answer to the question, but to get this value, the full data is needed — the box plot is not enough.
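The bound of fewer than 1045 can be checked with a little arithmetic. The sketch below assumes the quartile convention in which the lower half of the data is everything to the left of the median; the variable names are chosen here for illustration.

```python
n = 4177  # number of blacklip abalones

# With an odd n, the median is one observation and each half has n // 2 values.
lower_half = n // 2                   # 2088 observations to the left of the median

# Q1 is the middle of the lower half, so at most half of those values
# can be strictly smaller than Q1.
most_below_q1 = lower_half // 2       # 1044

actual_below_90 = 1007                # the value quoted in the text above
print(most_below_q1, actual_below_90 <= most_below_q1)  # 1044 True
```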
The heights, in feet, of red alder (Alnus rubra) trees in a forest are summarized in the following box plot.
In some cases, scientists use visual representations that go beyond the three types of plots discussed in this lesson. For example, the report about the blacklip abalones also contains data about their sex. This can be used to present a summary of the length in a stacked histogram.
The following box plot represents a data set of 107 observations.
Let's consider the statements one at a time.
From the diagram, we see that the maximum value is 90. Therefore, this value is definitely in the data set, which means statement A is true.
Be careful not to confuse the median and mean of a data set. The vertical line inside of a box plot shows the median of a data set. The box plot does not show the mean.
We do not have enough information to calculate the mean. It could be 70, but it does not have to be. That means statement B is not true.
Notice that there are 107 observations in the data set. This means that the 54^(th) observation is the median when the data set is ordered from least to greatest.
Similarly, since we have an odd number of observations on the left and right sides of the median (53 on each side), the quartiles are represented by the 27^(th) and 81^(st) observations, respectively. That is, the set contains the first and third quartiles.
From the box plot, we see that the upper quartile is 80. Consequently, the data set contains the value 80. Statement C is true.
From the diagram, we see that the lower quartile is 60. In any data set, at most 25 % of the observations are below the lower quartile Q_1.
Therefore, statement D is true.
Let's first count the number of observations that are in each part of the box. Let's zoom in on the box and illustrate the number of observations in it.
Calculate the sum of the observations.
1+26+1+26+1=55
As we can see, 55 observations are somewhere in the box, including the start and end of the box. In theory, it is possible that all values in the box only fall on the quartiles and median. Let's calculate what percentage of the set these 55 observations represent.
55/107 ≈ 51.4 %
The 55 observations represent more than 50 % of the set. Therefore, statement E is false.
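The counting above can be reproduced with a short calculation. The function `quartile_positions` is a name invented for this sketch. It only covers the case in which the median and both quartiles are actual observations, that is, when the number of observations leaves a remainder of 3 when divided by 4.

```python
def quartile_positions(n):
    """Return the 1-based positions of Q1, the median, and Q3 in the ordered data."""
    if n % 4 != 3:
        raise ValueError("this sketch only covers n with remainder 3 when divided by 4")
    median_pos = (n + 1) // 2
    half = median_pos - 1            # observations on each side of the median
    q1_pos = (half + 1) // 2         # middle of the left side
    q3_pos = median_pos + q1_pos     # middle of the right side
    return q1_pos, median_pos, q3_pos

q1_pos, median_pos, q3_pos = quartile_positions(107)
print(q1_pos, median_pos, q3_pos)    # 27 54 81

# Observations from Q1 to Q3, both ends included.
in_box = q3_pos - q1_pos + 1
share = round(100 * in_box / 107, 1)
print(in_box, share)                 # 55 51.4
```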
In a statistics class, all the students had their heights measured. The values are presented in the following histogram.
How many students are at most 163 centimeters tall? Use the following dot plot to solve the question.
The students who are at least 175 centimeters tall are those in the last three bars of the histogram.
These bars show 2, 1, and 2 students respectively, for a total of 5 students.
Let's mark the part of the histogram showing students who are at least 160 centimeters tall but less than 180 centimeters tall.
Next, we will add the heights of these bars to obtain the number of students within this height range.
5+6+4+2=17
To determine the percentage of students who are in this interval, we must also find the total number of students in the class. To do that, we will sum the number of students in each interval of the histogram.
3+2+5+6+4+2+1+2=25
Now we can determine the percentage of students in this interval.
17/25 = 0.68 = 68 %
Therefore, 68 % are at least 160 centimeters tall but less than 180 centimeters tall.
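The same counts can be read off programmatically. In the sketch below, the bar heights come from the sums in the solution. The interval starts (150, 155, and so on) are an assumption inferred from the text, which places the bar with 5 students at 160-165.

```python
# Bar heights keyed by the lower end of each 5 cm interval.
# The interval starts are inferred from the solution text, not read from the figure.
bars = {150: 3, 155: 2, 160: 5, 165: 6, 170: 4, 175: 2, 180: 1, 185: 2}

total = sum(bars.values())                                          # 25
in_range = sum(c for low, c in bars.items() if 160 <= low < 180)    # 17
at_least_175 = sum(c for low, c in bars.items() if low >= 175)      # 5

percent = 100 * in_range / total
print(total, in_range, at_least_175, percent)  # 25 17 5 68.0
```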
Notice that the measure 163 centimeters is in the 160-165 bar.
However, a histogram does not tell us anything about the distribution of values within one of the bars. That means we cannot use the histogram to determine how many students are at most 163 centimeters tall. Here is where the dot plot comes in handy.
A dot plot shows the individual observations. As we can see, all students in the histogram's 160-165 bar are, in fact, 160 centimeters tall. We can now determine the total number of students who are at most 163 centimeters tall.
1+2+2+5=10
The following box plot represents a data set of 15 observations.
Which of the following dot plots could represent the box plot?
Examining the box plot, we notice that it has a minimum value of 0 and a maximum value of 10. Of the six dot plots, dot plot B has a minimum value of 1 and dot plot E has a maximum value of 9. Therefore, they cannot describe the box plot.
To determine which of the remaining options could fit the box plot, we should consider the number of observations in the data set. Since we have 15 observations, the median and quartiles are represented by observations in the data set.
As we can see, the lower and upper quartiles are the 4^(th) and 12^(th) observations, respectively, and the median is the 8^(th) observation.
Let's mark the 4^(th), 8^(th), and 12^(th) observations in the remaining dot plots. If a dot plot reflects the box plot, these observations should be at the tick-marks corresponding to 3, 5, and 8, respectively.
As we can see, A and D have the correct quartiles and median while C and F do not. Therefore, we also have to discard C and F.
Of the given dot plots, only A and D could be used to describe the box plot.
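The check used above can be sketched as a small function. The dot plots themselves are not reproduced here, so the two candidate data sets below are hypothetical, and `matches_box_plot` is a name invented for this illustration. It assumes a data set whose quartiles and median are actual observations, as is the case with 15 observations.

```python
def matches_box_plot(data, summary):
    """Check whether the data has the given (min, Q1, median, Q3, max)."""
    values = sorted(data)
    n = len(values)
    median_pos = (n + 1) // 2          # 8 when n = 15
    q1_pos = median_pos // 2           # 4 when n = 15
    q3_pos = median_pos + q1_pos       # 12 when n = 15
    observed = (
        values[0],
        values[q1_pos - 1],            # positions are 1-based, indexes 0-based
        values[median_pos - 1],
        values[q3_pos - 1],
        values[-1],
    )
    return observed == tuple(summary)

box_plot = (0, 3, 5, 8, 10)  # values read from the box plot in the text

# Hypothetical candidates with 15 observations each (not the lesson's dot plots).
candidate_1 = [0, 1, 2, 3, 4, 4, 5, 5, 6, 7, 8, 8, 9, 9, 10]
candidate_2 = [0, 1, 2, 2, 3, 4, 5, 5, 6, 7, 7, 8, 9, 9, 10]

print(matches_box_plot(candidate_1, box_plot))  # True
print(matches_box_plot(candidate_2, box_plot))  # False
```

The second candidate fails because its 4^(th) observation is 2 instead of 3.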
Let's first draw a diagram that illustrates the data set in ascending order.
To get the highest possible sum for this data set, we want each observation to be as high as possible.
From the exercise, we have been given the minimum and maximum value. This means the first and last observations are locked in.
To find the median, we have to consider the number of observations in the data set. The data set contains 7 observations, which is an odd number. Therefore, the median must be the 4^(th) observation.
To find the quartiles, we divide the observations that are on the left and right sides of the median in two equal halves. Because the number of observations on both sides is odd, the lower and upper quartile will be the 2^(nd) and 6^(th) observations, respectively. This locks in two more values.
We have two remaining values, which we can pick arbitrarily as long as they are not less than the previous observation and not greater than the observation that comes after. Since we want the sum to be the greatest possible, we will let these observations take the same value as the one that comes after.
By adding all of the observations, we get the greatest possible sum of the data set.
1+2+3+3+4+4+5=22
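The reasoning can be written out as a short function. The five-number summary (1, 2, 3, 4, 5) used below is an assumption inferred from the sum above, since the box plot itself is not shown here, and `greatest_sum` is a name chosen for this sketch.

```python
def greatest_sum(minimum, q1, median, q3, maximum):
    """Largest possible sum of 7 ordered observations with this five-number summary."""
    # Locked positions: 1st = minimum, 2nd = Q1, 4th = median, 6th = Q3, 7th = maximum.
    # The free 3rd and 5th observations take the value of the next locked one.
    observations = [minimum, q1, median, median, q3, q3, maximum]
    return sum(observations), observations

# Summary values inferred from the sum in the text, not read from the figure.
total, observations = greatest_sum(1, 2, 3, 4, 5)
print(observations, total)  # [1, 2, 3, 3, 4, 4, 5] 22
```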