
# {{ article.displayTitle }}

Data sets appear in many everyday situations. Statistical displays are used to analyze these data sets and to draw conclusions about them. This lesson introduces different types of statistical displays and explains what each is best used for.

Challenge

## Which Display to Choose?

Consider four different situations that contain a data set.

| Situation |
|---|
| A company wants to compare the monthly sales performance of its Product A over the course of a year. |
| A survey is conducted to gather data on the distribution of ages among participants in a community event. |
| An analysis is conducted to compare the distribution of exam scores in a class. |
| A study tracks the temperature variations over the course of a week in a particular city. |
Several statistical displays can be used to show data from these and similar situations. Which display fits each situation best?
Discussion

## Dot Plot

A dot plot, also known as a line plot, represents numerical data by placing a dot above a horizontal number line for each data point. Dots that represent the same value are stacked on top of each other. Consider the following data set.
• There are two s in this data set, so on the corresponding dot plot two dots are stacked above the number of the number line.
• There is one in this data set, so on the corresponding dot plot a single dot is drawn above the number of the number line.
• There are five s in this data set, so on the corresponding dot plot five dots are stacked above the number of the number line.
For data sets containing more than data points, dot plots are often inconvenient and other representations are preferred.
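The stacking rule can be sketched in a few lines of Python. This is only an illustration: the data set is hypothetical (it is not the one from the lesson), the helper name `dot_plot` is an assumption, and the layout assumes single-digit integer values so that the columns line up.

```python
from collections import Counter

def dot_plot(data):
    """Return a text dot plot: one mark per data point, stacked above its value."""
    counts = Counter(data)                      # frequency of each value
    values = range(min(data), max(data) + 1)    # consecutive labels on the number line
    tallest = max(counts.values())
    rows = []
    for level in range(tallest, 0, -1):         # build the plot from the top row down
        row = " ".join("*" if counts[v] >= level else " " for v in values)
        rows.append(row.rstrip())
    rows.append(" ".join(str(v) for v in values))
    return "\n".join(rows)

# Hypothetical data set, chosen only for illustration.
sample = [2, 3, 3, 5, 5, 5, 6]
print(dot_plot(sample))
```

Values that do not occur in the data, such as 4 in this sample, still get a label on the number line but no mark above it.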
Example

## Math Competition Results

Tadeo bought a new science magazine, Imagineer, which includes many graphs and math-related discussions of real-life topics. The first article discussed the results of a recent math competition among high school students.

The article illustrated the grades of the participants obtained on the scale from to Interpret the dot plot.

For comparison, the article also listed the grades of last year's participants. Construct a dot plot from the given data set.

a Median:

Range:

Shape: Symmetric
b Dot Plot

### Hint

a Determine the median of the data set, find the maximum and minimum values and the range of the data. Analyze the overall shape of the graph and any distinctive features.
b Draw a horizontal line, title the dot plot, and choose the labels for the line, either categories or consecutive numbers. Find the frequency of each value in the data set and place the corresponding number of dots over it on the line.

### Solution

a Interpreting a dot plot consists of three steps.
1. Determine the center (median) by finding the middle data point.
2. Find the maximum and minimum values on the graph. Use these values to calculate the spread (range) of the data.
3. Analyze the overall shape of the graph. Note any other features of interest on the graph.

Complete each step one at a time.

### Step 1

First, the total number of data points should be found. To do so, begin by counting all the dots in the given dot plot.

There are data points. This means that there were participants in the math competition that obtained a grade. The median of the data set is, therefore, the data point, as it divides the set into halves. Find its value on the dot plot.

The data point is which means that the median or the center of the data set is This measure indicates the middle value of all the grades obtained by the participants.

### Step 2

Now, find the maximum and minimum values on the dot plot.

The minimum grade obtained by a participant is and the maximum grade is To find the range, calculate the difference between these values.
This means that the range of grades obtained by the participants of the math competition is

### Step 3

The next step is to analyze the overall shape of the graph.

The graph is roughly bell-shaped, which suggests that the grades are approximately normally distributed and that the plot is symmetric.
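The center and spread found in the steps above can also be computed directly. The following is a minimal Python sketch, assuming a hypothetical list of grades rather than the competition data from the article. The helper name `center_and_spread` is chosen only for illustration.

```python
from statistics import median

def center_and_spread(data):
    """Return (median, minimum, maximum, range) for a list of numbers."""
    ordered = sorted(data)
    low, high = ordered[0], ordered[-1]
    # The range is the difference between the maximum and the minimum.
    return median(ordered), low, high, high - low

# Hypothetical grades, not taken from the article.
grades = [4, 5, 5, 6, 6, 6, 7, 7, 8]
print(center_and_spread(grades))  # (6, 4, 8, 4)
```

With nine data points, the median is the fifth value in the ordered list, which is why the sketch sorts the data first.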

b A dot plot can be constructed by following these three steps.
1. Draw a horizontal line to begin the dot plot. Title the dot plot based on the problem, and label the plot with the categories or numbers. When labeling the line with numbers, the numbers must be in consecutive order.
2. Determine the frequency for each piece of data provided in the problem.
3. Place dots over each category or number on the horizontal line that corresponds to the frequency for each piece of data as depicted in the table.

Complete each step one at a time.

### Step 1

First, a horizontal line should be drawn. The given data set includes the grades of the participants from last year, so the title of the dot plot can be Grades From Last Year.
Also, the grades vary from to so the dot plot can be labeled with the consecutive integers from to

### Step 2

The next step is to determine the frequency of each grade obtained by the participants from last year's math competition. To do so, count how many times each grade from to appears and write that number in a table next to the grade.

### Step 3

Lastly, place dots over each grade from to the number of times it appears in the data set. Use the values from the frequency table.

This way the dot plot for the given data set of values was formed.

Discussion

## Frequency Distribution

A frequency distribution, sometimes called a histogram distribution, is a representation that displays the number of observations within a given interval. It is used to show the empirical or theoretical frequency of occurrence of each possible value in a data set, often recorded in a frequency table. Frequency distributions of categorical data are typically presented using a bar graph.

The most common types are symmetric frequency distributions and skewed frequency distributions.
Discussion

## Histogram

A histogram is a graphical illustration of a frequency distribution of a data set that contains numerical data. Histograms have several defining characteristics.

• The data is grouped into specific ranges of values known as intervals.
• All intervals in a histogram must be the same size.
• Interval data is marked in groups along the horizontal axis.
• A histogram is a collection of rectangles drawn above intervals.
• The height of each rectangle is proportional to the frequency of the data in the corresponding interval.
Consider an example situation. A fruit store wants to examine the weights of the apples they sell. To see the distribution, it is not necessary to show each apple's weight individually. Instead, the apples can be grouped by their weights in intervals of and so on.
A histogram looks similar to a bar graph. The difference is that a histogram has a numerical scale on the horizontal axis, and its bars are drawn without gaps between them because the data is continuous.
Discussion

## Drawing a Histogram

A data set can be illustrated with a histogram. Consider the following data set.
To draw a histogram for this data set, there are four steps to follow.
### Step 1: Choose the Number of Intervals

The first step to drawing a histogram is deciding what intervals of numbers it will have. Remember that each interval must have the same length and all data points must lie in an interval. First, count the numbers in the data set.
One method to find a suitable number of intervals is to take the square root of the number of data points. Since there are numbers in the data set, calculate the square root of
This means that the histogram can have either three or four intervals. In this case, it will have four intervals.
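The square-root rule can be checked with a short Python sketch. The count of 12 data points below is a hypothetical value, picked only because its square root lies between three and four, and the function name `candidate_interval_counts` is an assumption.

```python
import math

def candidate_interval_counts(n_points):
    """Return the whole numbers just below and just above sqrt(n_points)."""
    root = math.sqrt(n_points)
    return math.floor(root), math.ceil(root)

# Hypothetical data set size: sqrt(12) is about 3.46.
print(candidate_interval_counts(12))  # (3, 4)
```

The rule only suggests a starting point. Either of the two candidates can be used, as long as the resulting intervals cover every data point.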

### Step 2: Determine the Size of the Intervals

Next, it is necessary to determine the size of the intervals. This can be done by identifying the lowest and highest data value in the set.
Since the lowest data value in the set is and the highest is using four intervals with a range of will cover numbers from to and, therefore, will encompass all data points. The intervals of the histogram will be the following.

### Step 3: Make a Frequency Table

The next step is to make a frequency table showing how many data points lie in each interval.

Interval Data Points Frequency

### Step 4: Draw a Histogram

From the frequency table, the histogram can be constructed by drawing a bar over each interval with a height corresponding to the found frequency.
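The last two steps, counting the data points in each interval and drawing a bar for each count, can be sketched in Python. The data, the interval width, and the starting value are all hypothetical, and the name `frequency_table` is an assumption. The sketch also assumes half-open intervals of the form [low, high), so a value on a boundary is counted in the interval that starts there.

```python
def frequency_table(data, n_intervals, width, start):
    """Count data points in equal-width intervals [low, high) beginning at start."""
    counts = [0] * n_intervals
    for x in data:
        index = int((x - start) // width)       # which interval x falls into
        if not 0 <= index < n_intervals:
            raise ValueError(f"{x} lies outside the chosen intervals")
        counts[index] += 1
    return [(start + i * width, start + (i + 1) * width, c)
            for i, c in enumerate(counts)]

# Hypothetical data set with 12 values, grouped into four intervals of width 10.
sample = [3, 7, 8, 12, 13, 14, 18, 21, 22, 25, 29, 31]
for low, high, count in frequency_table(sample, n_intervals=4, width=10, start=0):
    # One '#' per data point stands in for the height of the bar.
    print(f"[{low}, {high}) | {'#' * count} ({count})")
```

Raising an error for values outside the intervals mirrors the requirement that all data points must lie in an interval.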

Example

## Analyzing Ticket Sales

Another article in Imagineer focuses on the upcoming opening of a new screening room at the local cinema, Movieton. The theater had screens for the last years. The bar graph in the article illustrates the distribution of ticket sales for a fiscal week in the year

Interpret the bar graph by describing its shape, center, and any extreme values if they exist. Use the bar graph to determine which day tends to have the most ticket sales, and what the average number of tickets sold on that day is.

The article also examined the distribution of ages of people attending the cinema theater. As a reference, it listed the ages of cinema visitors on a random evening. Draw a histogram for this data set.

a Independent Variable: The days of the week

Dependent Variable: The number of tickets sold
Distribution: Left-Skewed

Most Ticket Sales: Saturday
b Histogram:

### Hint

a Identify the independent and dependent variables. List the frequencies of every bin and use the values to interpret the data. Then describe the bar graph's shape.
b Choose the number of intervals and determine their sizes. Make a frequency table and use the values to draw a histogram.

### Solution

a Interpreting a bar graph involves three steps.
1. Identify the independent and dependent variables.
2. List the frequency in each bin.
3. Interpret the data and describe the bar graph's shape. Use the interpretation to answer any questions about the data.

Complete each step one at a time.

### Step 1

First, the independent and dependent variables need to be identified. The horizontal axis lists the days of the week, while the vertical axis represents the number of tickets sold.

The article analyzes how many tickets are sold on different days of a week and which day has the most sales. This means that the independent variable is the days of the week and the dependent variable is the number of tickets sold on each day.

### Step 2

Next, the frequency in each bin should be listed and interpreted. Use the values given in the bar graph to read off the height of each bin. Remember that the vertical axis represents the number of tickets sold on each day.

Day Frequency
Monday tickets were sold.
Tuesday tickets were sold.
Wednesday tickets were sold.
Thursday tickets were sold.
Friday tickets were sold.
Saturday tickets were sold.
Sunday tickets were sold.

### Step 3

Lastly, interpret the data and describe the bar graph's shape. The bar graph shows that the distribution of ticket sales is left-skewed.

Friday and Saturday are the days with the most tickets sold, and respectively. Also, the largest number of tickets tends to be sold on Saturday, and that number of tickets is

b To draw a histogram, follow these four steps.
1. Choose the number of intervals.
2. Determine the size of the intervals.
3. Make a frequency table.
4. Draw a histogram.

Complete each step one at a time.

### Step 1

The first step is to decide which intervals of numbers the histogram will have. Remember that each interval must have the same length and all data points must lie in an interval. First, count the numbers in the data set.
To find a suitable number of intervals, take the square root of the number of data points. In this case, there are data points.
This means that the histogram can have either four or five intervals. In this case, it will have four intervals.

### Step 2

Next, the size of the intervals needs to be determined. This can be done by identifying the lowest and highest data value in the set.
Since the lowest data value in the set is and the highest is using four intervals with a range of will cover numbers from to and, therefore, will encompass all data points. The intervals of the histogram will be the following.

### Step 3

The next step is to make a frequency table showing how many data points lie in each interval.

Interval Data Points Frequency

### Step 4

From the frequency table, the histogram can be constructed by drawing a bar over each interval with a height corresponding to the found frequency.

Discussion

## Box Plot

A box plot or box and whisker plot can be used to illustrate the distribution of a data set. A box plot has three parts.

• A rectangular box that extends from the first quartile to the third quartile, with a line between them indicating the position of the median.
• A segment attached to the left of the box that extends from the first quartile to the minimum of the data set.
• A segment attached to the right of the box that extends from the third quartile to the maximum of the data set.

A box plot is a scaled figure, usually presented above a number line. The set of numbers used to draw the box plot is called the five-number summary of the data set. Each of the five numbers is labeled accordingly.

A box plot provides a visual illustration of the distribution of a data set. Each segment of the chart contains one quarter, or 25%, of the data, and the center of the data lies inside the box. The longer a segment is, the greater the spread is for that quarter of the data.
Discussion

## Drawing a Box Plot

A box plot can be used to display any data set of numbers. To draw it, the minimum, maximum, median, and first and third quartiles of the data set need to be identified. The following data set gives the test scores for a grade.
There are four steps to follow to draw a box plot for the data set.
### Step 1: Find the Minimum and Maximum

Sometimes, the data is given in ascending order. When it is not, it is necessary to begin by ordering the data points from least to greatest.
Now the minimum and maximum are easily identifiable in this ordered data set. Here, the minimum is and the maximum is These values are marked above a number line with a line segment, indicating the range of the box plot.

### Step 2: Determine the Median

Counting all the data points shows that there are values in the set.
To find the median, recall that it is the value that lies in the middle of a data set. Since there are values, the median is the mean of the numbers at the and position.
Now, determine the median by calculating the mean of and
The median is Mark this value as a vertical line segment in the range above the number line. Remember that the line for the median falls inside the box.

### Step 3: Determine the Quartiles

The next step is to find the first and third quartiles of the data set. The median divides the set into two smaller sets, each with values.
The first quartile is the middle, value in the first set with smaller values. It equals
The third quartile is the median of the second set with greater values. It equals the value of

### Step 4: Draw the Box Plot

The first and third quartiles are marked as the left and right sides of the box plot. The box plot can be completed by drawing a box between the quartiles.
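The values found in the steps above make up the five-number summary, which can be sketched in Python. The test scores below are hypothetical, and the name `five_number_summary` is an assumption. The quartiles are computed as the medians of the lower and upper halves, following the method described above. For an odd number of data points, the sketch leaves the median out of both halves, which is one common convention and not something the lesson states.

```python
from statistics import median

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) using the median-of-halves method."""
    ordered = sorted(data)                      # order from least to greatest
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]                      # values below the median
    # Assumption: with an odd count, the median itself belongs to neither half.
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return ordered[0], median(lower), median(ordered), median(upper), ordered[-1]

# Hypothetical test scores, not the data set from the lesson.
scores = [62, 70, 71, 75, 78, 80, 84, 85, 90, 96]
print(five_number_summary(scores))  # (62, 71, 79.0, 85, 96)
```

Here the ten scores split into two halves of five values each, so each quartile is the middle value of its half, and the median is the mean of the fifth and sixth scores.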

Example

## Air Pollution Levels

The next article in Imagineer focused on an analysis of air quality data collected from different urban and rural areas around the world. It included a box plot to visualize variations in pollution levels.

The box plot presented in the article is based on the scores from to where is the greatest level of pollution, given to different areas by experts according to various factors of air pollution. Interpret the box plot.

For comparison, the article also listed the air pollution levels in each of the considered areas years ago. Construct a box plot from the given data set.

a Minimum:

Maximum:
Median:
First Quartile:

Third Quartile:
b Box Plot

### Hint

a Determine the minimum, maximum, median, first quartile and third quartile of the data set. Use the definitions of these concepts to find the meanings of the values.
b Draw a line segment between the minimum and maximum above a number line. Mark the median as a vertical line segment inside the box. Use the first and third quartiles to find the left and right borders of the box.

### Solution

a Interpreting a box plot means identifying its minimum, maximum, median, first and third quartiles.
Recall what each part of a box plot represents.

Now, consider the given box plot.

By comparing the general box plot with this one, the minimum, maximum, median, first and third quartiles can be determined.
Next, use the definition of each concept to see what each of these values means.
Concept Value Meaning
Minimum The least level of pollution in the analyzed areas is out of
Maximum The greatest level of pollution in the analyzed areas is out of
Median The average level of pollution in the analyzed areas is out of
First Quartile of the analyzed areas have the level of pollution at or less.
Third Quartile of the analyzed areas have the level of pollution at or more.
b A box plot can be constructed by following these four steps.
1. Order the data set from least to greatest value. Find the minimum and maximum.
2. Determine the median.
3. Determine the first and third quartiles.
4. Draw a box plot.

Complete each step one at a time.

### Step 1

Begin by ordering the given data set from least to greatest value.
Now the minimum and maximum are easily identifiable in this ordered data set. Here, the minimum is and the maximum is

### Step 2

To find the median of the data set, the number of values in the set should first be determined. Begin by counting all the data points.