
In this first lesson of the statistics unit, dot plots, box plots, and histograms will be used to analyze data.

### Catch-Up and Review

**Here are a few recommended readings before getting started with this lesson.**

Explore

Example

Izabella's favorite candy, Frutty, is sold in packs of thirty candies with three different flavors — apple, orange, and banana.

Izabella wants to know how many banana-flavored candies there are in each pack, so she bought ten packs and counted the number of banana candies in each. Her results are as follows.

$10, 8, 10, 9, 12, 9, 10, 10, 12, 10$

Draw a dot plot to represent the data.
Begin by finding the range of the data, then draw a number line which covers this range.

The smallest number in the data set is $8$ and the largest is $12.$ This means that the dot plot can be displayed above a horizontal number line that covers at least the numbers from $8$ to $12.$ Here, a number line from $7$ to $13$ will be used.

The number of dots drawn above a certain number on the dot plot should match the frequency of that number in the data set.

$10, 8, 10, 9, 12, 9, 10, 10, 12, 10$

Given the data set, compare the frequencies of the numbers.

- The number $10$ appears five times in the data set, so there should be five dots above the number $10$ on the number line.
- The number $8$ appears once in the data set, so there should be one dot above the number $8$ on the number line.
- The number $9$ appears twice in the data set, so there should be two dots above the number $9$ on the number line.
- The number $12$ appears twice in the data set, so there should be two dots above the number $12$ on the number line.
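The frequency count above can also be checked with a short Python sketch. It is only an illustration: the helper name `text_dot_plot` and the use of `*` for a dot are assumptions made here, not part of the lesson.

```python
from collections import Counter

# Number of banana candies in each of the ten packs (data from the example).
packs = [10, 8, 10, 9, 12, 9, 10, 10, 12, 10]

# Count how often each value occurs; missing values count as 0.
frequencies = Counter(packs)

def text_dot_plot(counts, low, high):
    """Return the rows of a text dot plot over the number line low..high.

    Hypothetical helper for illustration: each '*' stands for one dot.
    """
    tallest = max(counts.values())
    rows = []
    # Build the plot from the top row down so the dots stack upward.
    for level in range(tallest, 0, -1):
        rows.append(" ".join(
            " *" if counts.get(value, 0) >= level else "  "
            for value in range(low, high + 1)
        ))
    # The last row is the number line itself.
    rows.append(" ".join(f"{value:>2}" for value in range(low, high + 1)))
    return rows

# Use the same number line as the example, from 7 to 13.
plot_rows = text_dot_plot(frequencies, 7, 13)
for row in plot_rows:
    print(row)
```

The tallest column sits above $10,$ and the columns above $7,$ $11,$ and $13$ stay empty because those values do not occur in the data.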

From here, the dot plot can be drawn as follows.

Example

A multiple-choice test has ten questions. After grading the test, the teacher produced the following dot plot to show how many correct answers each student had on the test.

How many students are there in the class?


Each dot represents the performance of one student on the test.

Each dot represents the performance of a student on the test. For example, since there is one dot above the number $4,$ it means that one student answered four questions correctly. The rest of the dot plot can be interpreted similarly.

Number | Dots Above the Number | Conclusion |
---|---|---|
$0,1,2,3$ | $0$ | There are no students who answered fewer than four questions correctly. |
$4$ | $1$ | One student answered four questions correctly. |
$5$ | $3$ | Three students answered five questions correctly. |
$6$ | $2$ | Two students answered six questions correctly. |
$7$ | $4$ | Four students answered seven questions correctly. |
$8$ | $5$ | Five students answered eight questions correctly. |
$9$ | $3$ | Three students answered nine questions correctly. |
$10$ | $2$ | Two students answered all ten questions correctly. |

Adding the numbers of dots gives the total number of students.

$1+3+2+4+5+3+2=20 $

There are $20$ students in the class who took this test.
Pop Quiz

A college hockey team played $23$ games during a season. An enthusiastic fan made a dot plot of the number of goals the team scored in each game.

Example

The following data set shows the ages of the first $45$ presidents of the United States when their presidencies began.

Starting at age $40,$ group the data into $5$-year intervals and draw a histogram of the results.

Group the data in a frequency table using the intervals asked for in the prompt. The first interval will be the ages $40$–$44.$

The frequency table below shows the grouping of the data starting at $40$ and using $5$-year intervals.

Interval | Frequency |
---|---|
$40$–$44$ | $2$ |
$45$–$49$ | $7$ |
$50$–$54$ | $12$ |
$55$–$59$ | $13$ |
$60$–$64$ | $8$ |
$65$–$69$ | $2$ |
$70$–$74$ | $1$ |

Use these intervals and frequencies to draw the histogram.
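The grouping step can be sketched in Python as follows. The function names are hypothetical, the sample ages are made up purely for illustration (the presidents' ages are not listed in this lesson), and the sketch assumes whole-number values that are not smaller than the first interval's starting point. Only the second part uses the frequencies from the table above.

```python
def interval_label(start, width):
    """Label an interval of whole numbers, e.g. start 40, width 5 -> '40-44'."""
    return f"{start}-{start + width - 1}"

def group_into_intervals(values, first_start, width):
    """Count the values in each interval, keeping empty intervals.

    Assumes every value is a whole number >= first_start.
    """
    # Starting point of the interval that contains the largest value.
    last_start = first_start + ((max(values) - first_start) // width) * width
    counts = {
        interval_label(start, width): 0
        for start in range(first_start, last_start + 1, width)
    }
    for value in values:
        start = first_start + ((value - first_start) // width) * width
        counts[interval_label(start, width)] += 1
    return counts

# Hypothetical ages, invented for this sketch only.
sample_ages = [42, 47, 51, 51, 54, 58, 61, 69]
sample_counts = group_into_intervals(sample_ages, 40, 5)
print(sample_counts)

# Frequencies copied from the table above, shown as a sideways text histogram.
table = {"40-44": 2, "45-49": 7, "50-54": 12, "55-59": 13,
         "60-64": 8, "65-69": 2, "70-74": 1}
for label, frequency in table.items():
    print(f"{label} | {'#' * frequency} {frequency}")
```

As a quick consistency check, the frequencies in the table add up to $45,$ the number of presidents in the data set.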

Example

In $1936,$ Sir Ronald Aylmer Fisher published a paper entitled *The Use of Multiple Measurements in Taxonomic Problems*.

Fisher investigated several measurements of three species of flowers.

The histogram below shows the summary of the data about the sepal length of the *Iris virginica* flowers.

How many *Iris virginica* flowers did Fisher investigate in this paper?


Consider the height of the rectangles in the histogram.

In a histogram, the height of the rectangles shows the frequency of the data elements in the corresponding interval.

- There is one flower with a sepal length between $45$ and $49$ millimeters.
- There are no flowers with a sepal length between $50$ and $54$ millimeters.
- There are six flowers with a sepal length between $55$ and $59$ millimeters.
- There are seventeen flowers with a sepal length between $60$ and $64$ millimeters.
- There are fourteen flowers with a sepal length between $65$ and $69$ millimeters.
- There are six flowers with a sepal length between $70$ and $74$ millimeters.
- There are six flowers with a sepal length between $75$ and $79$ millimeters.

$1+0+6+17+14+6+6=50 $
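The addition can be double-checked with a few lines of Python. This is a minimal sketch: the dictionary simply records the bar heights listed above, keyed by the interval endpoints in millimeters, and the variable names are chosen here for illustration.

```python
# Bar heights read from the histogram, keyed by (lowest, highest) sepal length in mm.
bar_heights = {
    (45, 49): 1,
    (50, 54): 0,
    (55, 59): 6,
    (60, 64): 17,
    (65, 69): 14,
    (70, 74): 6,
    (75, 79): 6,
}

# The total number of flowers is the sum of all bar heights.
total_flowers = sum(bar_heights.values())
print(total_flowers)  # 50

# The interval with the tallest bar contains the most flowers.
most_common_interval = max(bar_heights, key=bar_heights.get)
print(most_common_interval)  # (60, 64)
```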

Fisher investigated the data of $50$ *Iris virginica* flowers.

Pop Quiz

A ranger is surveying a forest. He randomly selected $40$ loblolly pines (*Pinus taeda*) and measured their heights. The histogram below is the summary of the data.

Example

The following table shows the test scores of a class of $26$ students.

$8.5, 15.5, 5, 8.5, 13.5, 13.5, 11, 12, 15, 6, 7.5, 16, 7, 8, 12, 13, 12.5, 13, 9, 15, 10.5, 11, 10.5, 8, 15.5, 11.5$

Draw a box plot of the data.
Rearrange the data in increasing order and find the five-number summary.

A box plot is a visual representation of the five-number summary of data. It is a scaled diagram that shows the relative positions of the minimum and maximum values, the median, and the first and third quartiles. The first step in finding these values is to arrange the data in increasing order.

$5, 6, 7, 7.5, 8, 8, 8.5, 8.5, 9, 10.5, 10.5, 11, 11, 11.5, 12, 12, 12.5, 13, 13, 13.5, 13.5, 15, 15, 15.5, 15.5, 16$

With an ordered data set, the minimum and maximum are easily identifiable. Here, the minimum is $5$ and the maximum is $16.$ These are marked on a number line.
Since there are $26$ values, the median is the mean of the numbers at the $13$th and $14$th positions, which are underlined below.

$5, 6, 7, 7.5, 8, 8, 8.5, 8.5, 9, 10.5, 10.5, 11, \underline{11}, \underline{11.5}, 12, 12, 12.5, 13, 13, 13.5, 13.5, 15, 15, 15.5, 15.5, 16$

Now, the median can be determined by calculating the average of $11$ and $11.5.$

$\dfrac{11+11.5}{2}=11.25$

The median is $11.25.$ This is also marked on the number line. The first quartile is the median of the first half of the data.
$5, 6, 7, 7.5, 8, 8, \underline{8.5}, 8.5, 9, 10.5, 10.5, 11, 11$

The third quartile is the median of the second half of the data.

$11.5, 12, 12, 12.5, 13, 13, \underline{13.5}, 13.5, 15, 15, 15.5, 15.5, 16$

The first quartile is $8.5$ and the third quartile is $13.5.$ These are also marked on the number line.
The box plot is built using these points.

- The quartiles mark the boundaries of the box.
- The minimum and maximum values mark the ends of the whiskers.
- The median marks the position of the line that divides the box.

Putting all this together gives the box plot.
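The five-number summary behind this box plot can be verified with a short Python sketch. It assumes the convention used in this lesson, where each quartile is the median of one half of the ordered data. The function name `five_number_summary` is hypothetical, and other tools may use different quartile conventions and return slightly different values.

```python
from statistics import median

# The 26 test scores from the example.
scores = [8.5, 15.5, 5, 8.5, 13.5, 13.5, 11, 12, 15, 6, 7.5, 16, 7,
          8, 12, 13, 12.5, 13, 9, 15, 10.5, 11, 10.5, 8, 15.5, 11.5]

def five_number_summary(values):
    """Return (minimum, first quartile, median, third quartile, maximum).

    Quartiles are the medians of the lower and upper halves of the ordered
    data. When the number of values is odd, the middle value is left out
    of both halves.
    """
    ordered = sorted(values)
    n = len(ordered)
    lower_half = ordered[: n // 2]
    upper_half = ordered[(n + 1) // 2:]
    return (ordered[0], median(lower_half), median(ordered),
            median(upper_half), ordered[-1])

summary = five_number_summary(scores)
print(summary)  # (5, 8.5, 11.25, 13.5, 16)
```

The printed values match the minimum, quartiles, median, and maximum marked on the number line.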

Example

In the $1994$ report The Population Biology of Abalone (*Haliotis* species) in Tasmania, the authors presented and investigated the measurements of $4177$ blacklip abalones.

The lengths of the shells in millimeters are summarized in the box plot below.

How many blacklip abalones' lengths were shorter than $90$ millimeters in this experiment?

- $1007$
- $1045$
- $1067$

Which part of the box plot is at $90?$

The left side of the box is at $90,$ so the first quartile of the lengths is $90$ millimeters.

The problem is now to find out how many data points are less than the first quartile. The first quartile is the median of the lower half of the data set. In this experiment there are $4177$ data points, so by dividing this by $2,$ the number of data points in the lower half can be found.

$\dfrac{4177}{2}=2088.5$

Since the number of data points is odd, the middle value belongs to neither half. This means that in the lower half, there are $2088$ data points. Now, by dividing $2088$ by $2,$ the placement of the lower quartile can be found.

$\dfrac{2088}{2}=1044$

The lower quartile is the average of the $1044$th and the $1045$th data points. Since the lower quartile is $90,$ the $1045$th data point is not less than $90.$ Therefore, the number of blacklip abalones that are shorter than $90$ millimeters is less than $1045.$ The only option that meets this condition is $1007.$

Note that from the box plot, the only conclusion that can be made is that the number of blacklip abalones shorter than $90$ millimeters is less than $1045.$

- It is possible for the $1044$th data point to be $89,$ in which case the answer to the question would be $1044.$
- It is possible for the $1044$th data point to be $90,$ in which case the answer to the question would be less than $1044.$

In fact, there were $60$ blacklip abalones with a length of $90$ millimeters in the experiment. The answer option $1007$ reflects the actual answer to the question, but to get this value, the full data is needed — the box plot is not enough.
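The position argument can be sketched in Python. The function names are hypothetical, the quartile convention is the median-of-the-lower-half rule used in this lesson, and the two small data sets are invented for illustration only. They show that two data sets can share a first quartile of $90$ while having different numbers of values below $90.$

```python
def lower_quartile_positions(n):
    """1-based positions, in the ordered data, whose average is the first quartile.

    Assumes the quartile is the median of the lower half, and that the
    middle value is excluded from both halves when n is odd.
    """
    half = n // 2  # number of data points in the lower half
    if half % 2 == 0:
        return (half // 2, half // 2 + 1)
    return ((half + 1) // 2,)

def count_below(values, threshold):
    """Count the values strictly less than the threshold."""
    return sum(1 for value in values if value < threshold)

positions = lower_quartile_positions(4177)
print(positions)  # (1044, 1045)

# Two made-up data sets of nine values, both with a first quartile of 90.
sample_a = [88, 89, 91, 92, 95, 96, 97, 98, 99]  # Q1 = (89 + 91) / 2
sample_b = [88, 90, 90, 92, 95, 96, 97, 98, 99]  # Q1 = (90 + 90) / 2
print(count_below(sample_a, 90))  # 2
print(count_below(sample_b, 90))  # 1
```

Both samples would produce a box whose left side is at $90,$ yet the counts below $90$ differ, which is why the full data is needed for an exact answer.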

Pop Quiz

The heights, in feet, of red alder (*Alnus rubra*) trees in a forest are summarized in the following box plot.

Closure

In some cases, scientists use visual representations that go beyond the three types of plots discussed in this lesson. For example, the report about the blacklip abalones also contains data about their sex. This can be used to present a summary of the length in a *stacked histogram*.