Statistical Displays

Statistical displays are essential tools for visually representing data and interpreting its distribution, trends, and patterns. These visual methods help simplify complex datasets, making it easier to analyze and communicate findings. Dot plots, histograms, and box plots each serve unique purposes in summarizing data, illustrating its central tendencies, spread, and overall shape. Understanding these displays allows for more effective comparisons and conclusions in various fields, such as education, research, and business. Mastering these tools makes interpreting data and making informed decisions more efficient and reliable.
Data sets appear in various scenarios in everyday life. Statistical displays are used to visualize data sets and some of their properties. They are also very useful when analyzing data sets and drawing conclusions about them. This lesson will introduce different types of statistical displays and explain how to choose the best statistical display for different data sets.

Challenge

Which Display to Choose?

Consider four different situations that involve a data set.

Situation 1 A tech company recently launched a new smartphone model. They conduct a survey where customers rate their satisfaction with the phone on a scale of 1 to 10.
Situation 2 A survey is conducted to gather data on the distribution of ages among participants in a community event.
Situation 3 A teacher wants to analyze the distribution of test scores in her class by finding the median and quartiles of the scores.
Situation 4 A study tracks the temperature variations over the course of a week in a particular city.
There are a few types of statistical displays that can be used to visualize data from these and similar situations. Which of the following four displays best fits each situation?
Statistical Displays: histogram, dot plot, line graph, box plot
Discussion

Dot Plot

A dot plot, also known as a line plot, is a way to represent numerical or categorical data in which each data point is represented with a dot above a horizontal line with numbers or categories. Dots representing the same measurement are stacked above each other. Consider the following data set. { 4, 4, 3, 1, 4, 4, 1, 4 }

  • There are two 1s in this data set, so two dots are stacked above the number 1 on the number line of the corresponding dot plot.
  • There is one 3 in this data set, so a single dot is drawn above number 3.
  • There are five 4s in this data set, so five dots are stacked above number 4 on the number line.
A dot plot illustrating the data set given in the text.
For data sets containing more than 20 data points, dot plots are often inconvenient and other representations are preferred.
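The stacking rule described above can be sketched in a few lines of Python. This is only an illustration, assuming single-digit whole-number data; the function name `text_dot_plot` and the use of `*` for each dot are choices made for this sketch, not part of the lesson.

```python
from collections import Counter

def text_dot_plot(data):
    """Return a text dot plot: stacks of '*' above a number line.

    Assumes single-digit whole numbers so the columns line up.
    """
    counts = Counter(data)
    values = range(min(data), max(data) + 1)
    tallest = max(counts.values())
    rows = []
    # Build the rows from the top of the tallest stack down to the number line.
    for level in range(tallest, 0, -1):
        row = " ".join("*" if counts[v] >= level else " " for v in values)
        rows.append(row.rstrip())
    # The last row is the number line itself.
    rows.append(" ".join(str(v) for v in values))
    return "\n".join(rows)

data = [4, 4, 3, 1, 4, 4, 1, 4]
print(text_dot_plot(data))
```

The printed figure has two marks above 1, none above 2, one above 3, and five above 4, matching the description of the data set.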
Example

Math Competition Results

Tadeo bought the latest edition of the scientific magazine Imagineer, which includes a lot of different graphs and math-related discussions on real-life topics. The first article discussed the results of a recent math competition among high school students.

Math competition results dot plot
a The article illustrated the scores the participants earned on a scale of 0 to 10. Interpret the dot plot by determining some of its characteristics, like the center and spread, and by analyzing its shape.
b For comparison, the article also listed the scores of last year's participants. Construct a dot plot from the given data set.

Last Year's Results 1, 3, 4, 4, 6, 6, 6, 7, 7, 7, 8, 8, 9

Answer

a Median: 6

Range: 8
Shape: Symmetric

b Dot Plot
The dot plot

Hint

a Determine the median of the data set, then find the maximum and minimum values and the range of the data. Analyze the overall shape of the graph and any distinctive features it may have.
b Title the dot plot and draw a number line. Find the frequency of each value in the data set and place that number of dots over the corresponding number on the line.

Solution

a Interpreting a dot plot consists of three steps.
Step 1 Determine the center (median) by finding the middle data point.
Step 2 Find the maximum and minimum values on the graph. Use these values to calculate the spread (range) of the data.
Step 3 Analyze the overall shape of the graph. Note any other interesting features it may have.

Complete each step one at a time.

Step 1

First, find the total number of data points in the set by counting all the dots in the given dot plot.

The dot plot

There are 15 data points, which suggests that there were 15 participants in the math competition that earned a score. The median of the data set is the 8th data point because it divides the set into two even halves. Locate this value on the dot plot.

The dot plot and the median equal 6

The 8th data point is a 6, which means that the median, or the center, of the data set is 6. This measure indicates the middle value of all the scores earned by the participants.

Step 2

The next step is to find the maximum and minimum values on the dot plot.

The dot plot with the minimum of 2 and the maximum of 10

The minimum score earned by a participant is 2 and the maximum score is 10. To find the range, calculate the difference between these values.

Range = 10 - 2 = 8

The range of scores earned by the participants of the math competition is 8.

Step 3

The final step is to analyze the overall shape of the graph.

The dot plot

Notice that the left and right halves of the dot plot are not exactly identical. However, since they still resemble each other very closely, the plot can be considered symmetric in shape.

b A dot plot can be constructed by following these three steps.
Step 1 Title the plot based on the problem. Draw a number line to begin the dot plot, being sure to use values that are appropriate for the data set.
Step 2 Determine the frequency of each value.
Step 3 Place dots over each number on the number line that corresponds to the frequency for each value in the data set.

Complete each step one at a time.

Step 1

First, think of a title for the dot plot. The given data set includes the scores of the participants from last year, so the title of the dot plot can be something like Scores From Last Year. Notice that the scores range from 1 to 9.

1, 3, 4, 4, 6, 6, 6, 7, 7, 7, 8, 8, 9

This means that the number line of the dot plot can be labeled with the consecutive integers from 1 to 9.

The dot plot

Step 2

The next step is to determine the frequency of each score earned by the participants from last year's math competition. Count how many times each score from 1 to 9 appears in the set and write that number in a table next to the score.

Score Frequency
1 1
2 0
3 1
4 2
5 0
6 3
7 3
8 2
9 1

Step 3

Lastly, place dots over each score from 1 to 9 equal to the number of times it appears in the data set. This is where the frequency table is helpful.

The dot plot

The dot plot for the given data set was successfully created.
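Step 2 of the construction, counting how often each score occurs, can be checked with a short Python sketch. The helper name `frequency_table` is hypothetical, and the sketch assumes whole-number scores on a known scale.

```python
from collections import Counter

def frequency_table(data, low, high):
    """Return (value, frequency) pairs for every whole number from low to high.

    Values that never occur are kept with a frequency of 0 so that
    they still get a place on the number line.
    """
    counts = Counter(data)
    return [(value, counts[value]) for value in range(low, high + 1)]

last_year = [1, 3, 4, 4, 6, 6, 6, 7, 7, 7, 8, 8, 9]
table = frequency_table(last_year, 1, 9)

for score, frequency in table:
    # One '*' per participant gives a sideways view of the dot plot.
    print(score, frequency, "*" * frequency)
```

The frequencies produced this way agree with the table in Step 2, including the zeros for the scores 2 and 5.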

Discussion

Frequency Distribution

Dot plots are not the only type of frequency distribution.

A frequency distribution is a representation that displays the number of observations within a given interval or category. It is used to show the empirical or theoretical frequency of occurrence of each possible value in a data set, often recorded in a frequency table. Frequency distributions of categorical data are typically presented using a bar graph.

Bar graph
The most common types of distributions are symmetric frequency distribution and skewed frequency distribution.
Discussion

Histogram

A special type of frequency distribution is the histogram.

A histogram is a graphical illustration of a frequency distribution of a data set that contains numerical data. Histograms have several defining characteristics.

  • The data is grouped into specific ranges of values known as intervals.
  • All intervals in a histogram must be the same size.
  • Interval data is marked in groups along the horizontal axis.
  • A histogram is a collection of rectangles drawn above the intervals.
  • The height of each rectangle is proportional to the frequency of the data in the corresponding interval.
Consider an example situation. A grocery store wants to examine the weights of the apples they sell. To read the distribution, it is not necessary to show each apple's weight individually. Instead, the apples can be grouped by their weights in intervals of 10 grams: 70-79 g, 80-89 g, and so on.
A histogram showing the distribution of the weight of apples
A histogram looks similar to a bar graph. The difference is that a histogram always has intervals of numbers on the horizontal axis and the bars cannot have a space between each other because the data is continuous.
Discussion

Drawing a Histogram

Some data sets can be illustrated with a histogram. Consider the following data set. 13, 11, 4, 11, 21, 25, 37, 17, 8, 19, 26, 15 To draw a histogram for this data set, there are four steps to follow.
1. Choose the Number of Intervals

The first step in drawing a histogram is deciding how many intervals it will have. Remember that each interval must be the same length and all data points must lie within one of the intervals. First, count the number of data points in the set.

13, 11, 4, 11, 21, 25, 37, 17, 8, 19, 26, 15

There are 12 data points. One method for finding a suitable number of intervals is to take the square root of the number of data points.

sqrt(12) = 3.464101...

This means that the histogram can have either three or four intervals. This example histogram will use four intervals.

2. Determine the Size of the Intervals

Determine the size of the intervals. Start by identifying the lowest and highest data values in the set.

Lowest: 4
Highest: 37

Since the lowest data value in the set is 4 and the highest is 37, using four intervals with a width of 10 will cover the numbers from 1 to 40 and, therefore, will encompass all data points. The intervals of the histogram will be the following.

1-10, 11-20, 21-30, 31-40

3. Make a Frequency Table

The next step is to make a frequency table showing how many data points lie in each interval.

Interval Data Points Frequency
1-10 4, 8 2
11-20 11, 11, 13, 15, 17, 19 6
21-30 21, 25, 26 3
31-40 37 1
4. Draw the Histogram

The histogram can be constructed by drawing a bar over each interval with a height corresponding to the frequency listed in the table.

Histogram
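The frequency table from the third step can be reproduced with a small Python sketch. The function name `interval_frequencies` is made up for this illustration, and the sketch assumes whole-number data and intervals such as 1-10 and 11-20 that include both endpoints.

```python
def interval_frequencies(data, start, width, count):
    """Group whole-number data into equal-width intervals.

    Returns one ((low, high), members, frequency) entry per interval,
    where both low and high belong to the interval.
    """
    table = []
    for i in range(count):
        low = start + i * width
        high = low + width - 1
        members = sorted(x for x in data if low <= x <= high)
        table.append(((low, high), members, len(members)))
    return table

data = [13, 11, 4, 11, 21, 25, 37, 17, 8, 19, 26, 15]
table = interval_frequencies(data, start=1, width=10, count=4)

for (low, high), members, frequency in table:
    print(f"{low}-{high}", members, frequency)
```

Each printed frequency is the height of the corresponding bar in the histogram.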
Example

Analyzing Ticket Sales

Another article in Imagineer focused on the upcoming opening of a new screen room in the local theater Movieton. The theater has had two screens for the last 10 years. The bar graph in the article shows the average number of tickets sold on each day of the week throughout the year 2023.

Bar graph showing the ticket sales from Monday to Sunday
a Interpret the bar graph to determine what day tends to have the most ticket sales. What is the average number of ticket sales on that day?
b The article also examined the distribution of ages of people attending the theater. As a reference, it listed the ages of moviegoers on a random evening. Draw a histogram for this data set.

24, 20, 23, 24, 9, 11, 42, 15, 17, 60, 18, 25, 26, 7, 28, 30, 32, 70, 45, 35, 37, 19, 13, 22, 49, 27, 21, 55, 39, 24

Answer

a Independent Variable: Day of the week

Dependent Variable: Number of tickets sold
Most Ticket Sales: Saturday with 342 tickets

b Histogram:
Histogram

Hint

a Identify the independent and dependent variables. List the frequencies of each bar and use the values to interpret the data.
b Choose the number of intervals and determine their sizes. Make a frequency table and use the values to draw the histogram.

Solution

a Interpreting a bar graph involves three steps.
Step 1 Identify the independent and dependent variables.
Step 2 List the frequency in each bar.
Step 3 Interpret the data and describe the bar graph's shape. Use the interpretation to answer any questions about the data.

Complete each step one at a time.

Step 1

First, the independent and dependent variables need to be identified. The horizontal line lists the days of a week, while the vertical line represents the average number of tickets sold.

Bar graph

The article analyzes how many tickets are sold on average on different days of a week and which day has the most sales. This means that the independent variable is the day of the week and the dependent variable is the average number of tickets sold on that day.

Step 2

Next, the frequency in each bar should be listed and interpreted. Use the values given in the bar graph to indicate the height of each bar. Remember, the vertical line represents the average number of tickets sold on each day.

Day Frequency
Monday 64
Tuesday 70
Wednesday 62
Thursday 137
Friday 295
Saturday 342
Sunday 260

Step 3

Lastly, interpret the data.

Bar graph

Fridays and Saturdays have the highest numbers of average tickets sold, 295 and 342 respectively. The greatest number of tickets tends to be sold on Saturdays, when an average of 342 tickets are sold.

b To draw a histogram, follow these four steps.
Step 1 Choose the number of intervals.
Step 2 Determine the size of the intervals.
Step 3 Make a frequency table.
Step 4 Draw the histogram.

Complete each step one at a time.

Step 1

The first step is determining how many intervals the histogram will have. All of the intervals must be the same length and the range covered by them must contain all of the data points in the set, so start by counting the number of values in the data set.

24, 20, 23, 24, 9, 11, 42, 15, 17, 60, 18, 25, 26, 7, 28, 30, 32, 70, 45, 35, 37, 19, 13, 22, 49, 27, 21, 55, 39, 24

To find a suitable number of intervals, take the square root of the number of data points. In this case, there are 30 data points.

sqrt(30) = 5.477225...

This means that the histogram can have either five or six intervals. This solution will draw a histogram with five intervals.

Step 2

Next, the size of the intervals needs to be determined. This can be done by identifying the lowest and highest data values in the set.

Lowest: 7
Highest: 70

Since the lowest data value in the set is 7 and the highest is 70, using five intervals with a width of 15 will cover the numbers from 1 to 75, which will include all data points. Then the intervals of the histogram will be the following.

1-15, 16-30, 31-45, 46-60, 61-75

Step 3

The next step is to make a frequency table showing how many data points lie in each interval.

Interval Data Points Frequency
1-15 7, 9, 11, 13, 15 5
16-30 17, 18, 19, 20, 21, 22, 23, 24, 24, 24, 25, 26, 27, 28, 30 15
31-45 32, 35, 37, 39, 42, 45 6
46-60 49, 55, 60 3
61-75 70 1

Step 4

The histogram can be constructed by drawing a bar over each interval with a height corresponding to the frequency listed in the table.

Histogram
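As a rough check of this solution, the square-root rule from Step 1 and the bars from Step 4 can be sketched in Python. The names `interval_count_options` and `text_histogram` are hypothetical, the interval width of 15 is taken from Step 2 rather than computed, and the bars are drawn sideways with `#` characters.

```python
import math

def interval_count_options(n):
    """Return the whole numbers just below and just above sqrt(n)."""
    root = math.sqrt(n)
    return math.floor(root), math.ceil(root)

def text_histogram(data, start, width, count):
    """Return one text bar per interval; each '#' is one data point."""
    lines = []
    for i in range(count):
        low = start + i * width
        high = low + width - 1
        frequency = sum(1 for x in data if low <= x <= high)
        lines.append(f"{low:>2}-{high:<2} | {'#' * frequency} {frequency}")
    return lines

ages = [24, 20, 23, 24, 9, 11, 42, 15, 17, 60, 18, 25, 26, 7, 28,
        30, 32, 70, 45, 35, 37, 19, 13, 22, 49, 27, 21, 55, 39, 24]

print(interval_count_options(len(ages)))  # the two candidate interval counts
for line in text_histogram(ages, start=1, width=15, count=5):
    print(line)
```

The longest bar belongs to the 16-30 interval, which is the tallest rectangle in the histogram.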
Discussion

Box Plot

Some statistical displays show the spread of a data set instead of its frequency.

A box plot, or box and whisker plot, can be used to illustrate the distribution of a data set. A box plot has three parts.

  • A rectangular box that extends from the first to the third quartiles (Q_1 and Q_3) with a line between Q_1 and Q_3 indicating the position of the median.
  • A segment attached to the left of the box that extends from the first quartile to the minimum of the data set.
  • A segment attached to the right of the box that extends from the third quartile to the maximum of the data set. The two segments are called whiskers.

If a data set has outliers, they are marked as separate points to the left and/or right of the whiskers. A box plot is a scaled figure and is usually presented above a number line. The set of numbers used to draw the box plot is called the five-number summary of the data set. Each of the five numbers is labeled below.

Boxplot shown above a number line with a five-number summary from left to right as 1, 3, 5, 8, 10.
A box plot provides a visual illustration of the distribution of a data set. Each segment of the plot contains one quarter, or 25 %, of the data, and the center 50 % of the data lies inside the box. The further apart the segments are, the greater the spread is for that quarter of the data.
Discussion

Drawing a Box Plot

One type of statistical display that is often used for illustrating data sets is the box plot.

A box plot can be used to display the distribution of a data set of numbers. To draw it, the minimum, maximum, median, and first and third quartiles of the data set need to be identified. The following data set gives the test scores of a particular class.

8.5, 11, 16, 12.5, 11, 15.5, 12, 7, 13, 10.5, 5, 15, 8, 9, 8, 8.5, 6, 12, 15, 15.5, 13.5, 7.5, 13, 10.5, 11.5, 13.5

There are four steps to follow to draw a box plot for the data set.
1. Find the Minimum and Maximum

Sometimes, the data is given in sequential order. When it is not, it is necessary to start by ordering the data points from least to greatest.

5, 6, 7, 7.5, 8, 8, 8.5, 8.5, 9, 10.5, 10.5, 11, 11, 11.5, 12, 12, 12.5, 13, 13, 13.5, 13.5, 15, 15, 15.5, 15.5, 16

Now the minimum and maximum are easily identifiable in this ordered data set. Note that outliers can be ignored when determining the minimum and maximum. There are no outliers in this data set, so the minimum is 5 and the maximum is 16. These values are marked above a number line with two short vertical segments, indicating the range of the box plot.

The number line and the beginning of the box plot showing the minimum value at 5 and the maximum value at 16


2. Determine the Median

To find the median of the data set, count the number of values in the set.

5, 6, 7, 7.5, 8, 8, 8.5, 8.5, 9, 10.5, 10.5, 11, 11, 11.5, 12, 12, 12.5, 13, 13, 13.5, 13.5, 15, 15, 15.5, 15.5, 16

There are 26 values in the set. The median of a data set is the value that lies in the middle of the set. Since there are 26 values in this set, the median is the mean of the numbers in the 13th and 14th positions, which are 11 and 11.5.

Median = (11 + 11.5) / 2 = 11.25

The median is 11.25. Mark this value with a vertical line segment in the range above the number line. Remember that the line for the median will later appear inside the box.

The unfinished box plot with the minimum value at 5, maximum value at 16, and median at 11.25
3. Determine the Quartiles

The next step is to find the first and third quartiles of the data set. The median divides the set into two smaller sets, each with 13 values.

Set 1: 5, 6, 7, 7.5, 8, 8, 8.5, 8.5, 9, 10.5, 10.5, 11, 11
Median: 11.25
Set 2: 11.5, 12, 12, 12.5, 13, 13, 13.5, 13.5, 15, 15, 15.5, 15.5, 16

The first quartile is the median of the first set. Since there are 13 values in the set, the median is the 7th value, 8.5. The third quartile is the median of the second set. This will also be the 7th value, or 13.5.

4. Draw the Box Plot

The first and third quartiles are marked as the left and right sides of the box plot.

The unfinished box plot with the minimum value at 5, maximum value at 16, and median at 11.25, first quartile at 8.5 and third quartile at 13.5.

The box plot can be completed by drawing a box between the first and third quartiles Q_1 and Q_3. Then, the left and right borders of the box are connected by horizontal segments to the minimum and maximum.

Boxplot shown above a number line with a five-number summary from left to right as 5, 8.5, 11.25, 13.5, 16.

The box plot is now complete.
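The five values found in the steps above can be computed with a short Python sketch. The function name `five_number_summary` is hypothetical. It follows the method used here, where the quartiles are the medians of the lower and upper halves of the ordered data. For a data set with an odd number of values, the sketch leaves the middle value out of both halves, which is an assumption because the lesson only shows even-sized sets.

```python
from statistics import median

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) for at least two values.

    Q1 and Q3 are the medians of the lower and upper halves of the
    ordered data. With an odd number of values, the middle value is
    excluded from both halves (an assumption of this sketch).
    """
    ordered = sorted(data)
    half = len(ordered) // 2
    lower_half = ordered[:half]
    upper_half = ordered[-half:]
    return (ordered[0], median(lower_half), median(ordered),
            median(upper_half), ordered[-1])

scores = [8.5, 11, 16, 12.5, 11, 15.5, 12, 7, 13, 10.5, 5, 15, 8,
          9, 8, 8.5, 6, 12, 15, 15.5, 13.5, 7.5, 13, 10.5, 11.5, 13.5]

print(five_number_summary(scores))
```

For the test scores, the sketch gives the same five numbers that are marked on the finished box plot: 5, 8.5, 11.25, 13.5, and 16.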

Example

Air Pollution Levels

The next article in Imagineer focused on an analysis of air quality data collected from different urban and rural areas around the world. It included a box plot to visualize variations in pollution levels.

The box plot with the numbers 24, 39, 63.5, 83, 98
a The box plot presented in the article is based on scores from 0 to 100, where 100 is the greatest level of pollution. Write the five-number summary for the box plot.
b For comparison, the article also listed the air pollution levels in each of the considered areas 10 years ago. Construct a box plot from the given data set.

14, 26, 63, 42, 36, 22, 71, 18, 45, 55, 32, 9, 60, 48, 12, 69, 52, 25, 38, 7

Answer

a Minimum: 24

Maximum: 98
Median: 63.5
First Quartile: 39
Third Quartile: 83

b
The box plot with the numbers 7, 20, 37, 53.5, 71

Hint

a Determine the minimum, maximum, median, first quartile and third quartile of the data set.
b Draw two short vertical segments to mark the minimum and maximum above a number line. Mark the median with a vertical line segment inside the box. Use the first and third quartiles to find the left and right borders of the box.

Solution

a To write the five-number summary for the box plot, its minimum, maximum, median, first and third quartiles need to be determined.

Minimum = ?
Maximum = ?
Median = ?
First Quartile = ?
Third Quartile = ?

Recall what each part of a box plot represents.

Boxplot shown above a number line with a five-number summary from left to right as 1, 3, 5, 8, 10

Now consider the given box plot again.

The box plot with the numbers 24, 39, 63.5, 83, 98

By comparing the general box plot to this one, the minimum, maximum, median, first and third quartiles can be easily identified.

Minimum = 24
Maximum = 98
Median = 63.5
First Quartile = 39
Third Quartile = 83

The values alone are not very helpful, so consider what each of them means in the given context.

Concept Value Meaning
Minimum 24 The lowest level of pollution in the analyzed areas earned 24 out of 100 points.
Maximum 98 The highest level of pollution in the analyzed areas earned 98 out of 100 points.
Median 63.5 Half of the analyzed areas have a pollution score of 63.5 or lower, and half have a score of 63.5 or higher.
First Quartile 39 A quarter of the analyzed areas have a pollution score of 39 or lower.
Third Quartile 83 A quarter of the analyzed areas have a pollution score of 83 or higher.
b A box plot can be constructed by following four steps.
Step 1 Order the data set from least to greatest. Identify the minimum and maximum values.
Step 2 Determine the median.
Step 3 Determine the first and third quartiles.
Step 4 Draw the box plot.

Complete each step one at a time.

Step 1

Start by ordering the given data values from least to greatest. 7, 9, 12, 14, 18, 22, 25, 26, 32, 36, 38, 42, 45, 48, 52, 55, 60, 63, 69, 71 Now the minimum and maximum are easily identifiable in this ordered data set. Here, the minimum is 7 and the maximum is 71.

Step 2

To find the median of the data set, count how many values there are in the data set.

7, 9, 12, 14, 18, 22, 25, 26, 32, 36, 38, 42, 45, 48, 52, 55, 60, 63, 69, 71

Now look for the value that lies in the middle of the data set. Since there are 20 values, the median is the mean of the numbers in the 10th and 11th positions, which are 36 and 38.

Median = (36 + 38) / 2 = 37

The median is 37.

Step 3

The next step is to find the first and third quartiles of the data set. The median divides the set into two smaller sets, each with 10 values.

Set 1: 7, 9, 12, 14, 18, 22, 25, 26, 32, 36
Median: 37
Set 2: 38, 42, 45, 48, 52, 55, 60, 63, 69, 71

The first quartile is the median of the first set. For this set, this means finding the mean of the 5th and 6th values, 18 and 22.

Q_1 = (18 + 22) / 2 = 20

The third quartile is the median of the second set. For this half of the data set, the median is the mean of the 5th and 6th values, 52 and 55.

Q_3 = (52 + 55) / 2 = 53.5

Step 4

To draw a box plot, organize the five-number summary of the data set.

Minimum = 7
Maximum = 71
Median = 37
Q_1 = 20
Q_3 = 53.5

Mark the minimum and maximum values above a number line with two vertical segments, indicating the range of the box plot.

The number line and the beginning of the box plot showing the minimum value at 7 and the maximum value at 71

Next, mark the median with a vertical line segment inside the range above the number line. Remember that the line for the median falls inside the box.

The unfinished box plot with the minimum value at 7, maximum value at 71, and median at 37

The first and third quartiles are marked as the left and right sides of the box plot. The box plot can be completed by drawing a box between the quartiles and two horizontal segments between the left and right sides of the box and the minimum and maximum values.

The box plot with the numbers 7, 20, 37, 53.5, 71

The box plot is complete.
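The earlier observation that each segment of a box plot holds about a quarter of the data can be checked on this data set with a small Python sketch. The function name `quarter_counts` is made up for this illustration, and the quartile values are the ones found in Steps 2 and 3. A value that equals a quartile would be counted in the segment to its right, which is an arbitrary choice of this sketch; no value in this data set is affected by it.

```python
def quarter_counts(data, q1, med, q3):
    """Count the values in each of the four segments of a box plot:
    left whisker, left part of the box, right part of the box,
    and right whisker.
    """
    return [
        sum(1 for x in data if x < q1),
        sum(1 for x in data if q1 <= x < med),
        sum(1 for x in data if med <= x < q3),
        sum(1 for x in data if x >= q3),
    ]

levels = [14, 26, 63, 42, 36, 22, 71, 18, 45, 55,
          32, 9, 60, 48, 12, 69, 52, 25, 38, 7]

counts = quarter_counts(levels, q1=20, med=37, q3=53.5)
print(counts)
print([count / len(levels) for count in counts])
```

With 20 values, each segment contains 5 of them, which is one quarter of the data.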

Discussion

Describing the Shape of a Distribution

The distribution of a data set shows the arrangement of data values. Here are a few concepts that can be used to describe a distribution.

Concept Definition
Cluster Data values that are grouped closely together
Gap Numbers that have no data values
Peak The most frequently occurring values, or the mode
Symmetry How the left side of the distribution looks compared to the right side
Outlier A data value that does not seem to fit with the rest of the set

Consider a distribution displayed with the following dot plot.

A dot plot illustrating a data set

Since the data is evenly distributed between the left and right sides, it is a symmetric distribution. It has a cluster of several data values within the 5-9 interval. There are gaps at 4 and 10 because there are no data values at these points on the number line. The value 7 is a peak because it is the most frequently occurring value. There appears to be an outlier at 15.

A dot plot illustrating a data set with the cluster, gap, peak, outlier marked
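Two of these concepts, peaks and gaps, are easy to find by counting. The sketch below is only an illustration with a hypothetical function name, `peaks_and_gaps`. Because the values behind the dot plot above are not listed, it reuses last year's math competition scores from the earlier example, and it assumes whole-number data.

```python
from collections import Counter

def peaks_and_gaps(data):
    """Return (peaks, gaps) for whole-number data.

    Peaks are the most frequently occurring values. Gaps are the whole
    numbers between the minimum and maximum that have no data values.
    """
    counts = Counter(data)
    highest = max(counts.values())
    peaks = sorted(value for value, count in counts.items() if count == highest)
    gaps = [value for value in range(min(data), max(data) + 1)
            if value not in counts]
    return peaks, gaps

last_year = [1, 3, 4, 4, 6, 6, 6, 7, 7, 7, 8, 8, 9]
peaks, gaps = peaks_and_gaps(last_year)
print("Peaks:", peaks)
print("Gaps:", gaps)
```

For this data set there are two peaks, at 6 and 7, and two gaps, at 2 and 5. Clusters, symmetry, and outliers still need to be judged by looking at the display.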
Discussion

Measures of Center and Spread

There are different measures of center and spread available for describing a data distribution. For example, the mean and median are measures of center, while the interquartile range and mean absolute deviation are measures of spread. Consider the following diagram to determine which measures to use to describe a distribution.

Flow chart that says: Is the data distribution symmetric? If yes, then use the mean to describe the center. Use the mean absolute deviation to describe the spread. If no, then use the median to describe the center. Use the interquartile range to describe the spread.
Note that if there is an outlier in the data distribution, the distribution is usually not symmetric.
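The decision in the flow chart can be sketched in Python. The function names are hypothetical, and whether a distribution is symmetric is passed in as a judgment made by looking at the display, not computed. The interquartile range uses the median-of-halves quartiles from the box plot section, and the sample data is the small set from the dot plot definition.

```python
from statistics import mean, median

def mean_absolute_deviation(data):
    """Average distance of the data values from the mean."""
    center = mean(data)
    return mean([abs(x - center) for x in data])

def interquartile_range(data):
    """Q3 - Q1, with quartiles taken as medians of the two halves."""
    ordered = sorted(data)
    half = len(ordered) // 2
    return median(ordered[-half:]) - median(ordered[:half])

def describe(data, symmetric):
    """Pick the measures of center and spread as in the flow chart."""
    if symmetric:
        return {"center": ("mean", mean(data)),
                "spread": ("mean absolute deviation",
                           mean_absolute_deviation(data))}
    return {"center": ("median", median(data)),
            "spread": ("interquartile range", interquartile_range(data))}

data = [4, 4, 3, 1, 4, 4, 1, 4]
# Most of the dots are stacked at 4, so this set is treated as not symmetric.
print(describe(data, symmetric=False))
```

Passing `symmetric=True` instead would report the mean and the mean absolute deviation for the same data.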
Example

Internet Usage Levels

Tadeo was especially interested in an article about Internet usage among teenagers. Curious to learn more, he decided to check out a website referenced in the article.

The dot plot for the numbers of hours spent on the internet
a Describe the shape of the distribution. Choose the appropriate measures to describe the center and spread of the distribution.
b The website also had a dot plot for the number of text messages sent by teenagers in one day.
A dot plot illustrating the number of text messages sent in one day

Choose the appropriate measures to describe the center and spread of the distribution. Describe the shape of the distribution.

Answer

a Shape: Symmetric

Measure of Center: Mean
Measure of Spread: Mean absolute deviation

b Shape: Not symmetric

Measure of Center: Median
Measure of Spread: Interquartile range

Hint

a A distribution is symmetric when the left side of the data is similar to the right side. Decide which measures of center and spread are the most appropriate based on the symmetry of the distribution.
b Determine if the distribution is symmetric or not by analyzing the shape of the dot plot. Choose the most appropriate measures of center and spread based on the symmetry of the distribution.

Solution

a Begin by closely examining the dot plot showing the number of hours spent on the Internet by teenagers.
A dot plot illustrating the number of hours spent on the internet

The data is fairly evenly distributed between the left and right side, so it is a roughly symmetric distribution. The value 5 is a peak because it is the most frequently occurring value. The data values are clustered around 5.

A dot plot illustrating the number of hours spent on the internet with a peak identified

Next, decide which measures of center and spread are the most appropriate based on the symmetry of the distribution. Remember which measures can be used in each situation.

Measure of Center Measure of Spread
Symmetric Distribution Mean Mean absolute deviation
Non-Symmetric Distribution Median Interquartile range

Since this distribution is symmetric, it is best to use the mean as the measure of center and the mean absolute deviation as the measure of spread.

b Consider the dot plot that shows the number of text messages sent by teenagers in one day.
A dot plot illustrating the number of text messages sent on one day

Here, the left side of the data is different than the right side, so the distribution is not symmetric. There are two clusters of data values within the intervals 17-20 and 22-24, separated by a gap at 21. The peak of the data set is at 23.

A dot plot illustrating the number of text messages sent on one day with the clusters, gap and peak identified

The most appropriate measures of center and spread can be determined by looking at the symmetry of the distribution. When the distribution is not symmetric, as in this case, it is best to use the median as the measure of center and the interquartile range as the measure of spread.

Discussion

Line Graph

Analyzing data sets can involve more than just considering the frequency of the data values or analyzing their distribution. There is a statistical display that shows the relationship between two values.

A line graph is used to show how a set of data changes with respect to another quantity, often a period of time. To make a line graph, a scale and intervals for the coordinate axes are chosen. The data points are then graphed and a line connecting the points is drawn. Consider a table of values that represents the growth of a plant over several weeks.

Plant Growth
Week 1 2 3 4 5
Height (in.) 1.5 2.3 4 6.2 8

The height data includes values from 1.5 to 8, so a scale from 0 to 10 inches with an interval of 1 inch is reasonable. The horizontal axis can represent time in weeks and the vertical axis can represent the plant height in inches. Now the points can be plotted on a coordinate plane and connected with a line.

Line graph
By observing the upward and downward slant of the lines connecting the points, the trends in the data can be described and future events can be predicted.
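The slant of each segment corresponds to the change between consecutive data points, which can be computed directly. The following Python sketch uses the plant growth table; the function names `changes` and `trend` are hypothetical, and rounding to two decimal places is only there to keep the printed differences tidy.

```python
def changes(values):
    """Differences between consecutive values, rounded to 2 decimals."""
    return [round(later - earlier, 2)
            for earlier, later in zip(values, values[1:])]

def trend(values):
    """Describe the overall slant of a line graph through the values."""
    steps = changes(values)
    if all(step > 0 for step in steps):
        return "upward"
    if all(step < 0 for step in steps):
        return "downward"
    return "mixed"

weeks = [1, 2, 3, 4, 5]
heights = [1.5, 2.3, 4, 6.2, 8]  # plant height in inches

print(changes(heights))
print(trend(heights))
```

Every week-to-week change is positive, so every segment of the line graph slants upward.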
Example

A Trip to Grandma's

Tadeo was so inspired by analyzing all the graphs in the magazine that he decided to make his own diagram. Luckily, the next day he and his parents planned to visit his grandparents, who live 420 miles away. Tadeo recorded how far they had traveled after each hour of the drive.
A car traveling and a table of values gets filled with times and distances
a Make a line graph using the table of values.
b Interpret the line graph. About how long will the drive to Tadeo's grandparents take?

Answer

a
Line graph
b 6 hours

Hint

a Choose a scale and intervals for the axes and define what they will represent. Plot the points from the table and connect them with segments.
b Does the graph show an upward or downward trend? Extend it to predict when the family will travel a total distance of 420 miles.

Solution

a Before drawing the line graph, consider the table of values Tadeo created.
Hour Distance Traveled (mi)
1 70
2 135
3 203
4 278
5 348

The horizontal axis can represent time in hours and the vertical axis can represent the distance traveled in miles. The distance data includes values from 70 to 348, so a scale from 0 to 420 miles with an interval of 70 miles is reasonable. Now the points can be plotted on a coordinate plane and connected with line segments.

Line graph

The line graph representing the distance traveled by Tadeo's family is complete.

b It is time to interpret the line graph from Part A.
Line graph

Notice that the line slants upward, showing a steady increase in distance from hour 1 to hour 5. To predict about how long the drive to Tadeo's grandparents will take, follow the trend of the line to extend the graph to a distance of 420 miles.

Line graph

It can be predicted that Tadeo's family will reach their destination after a total of about 6 hours of driving.
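The prediction can also be checked numerically. The sketch below does not extend the line by eye as the solution does. Instead, it assumes the family keeps driving at the average rate observed between hour 1 and hour 5, which is one simple way to follow the trend. The variable names are illustrative.

```python
# Values from Tadeo's table.
hours = [1, 2, 3, 4, 5]
miles = [70, 135, 203, 278, 348]
TRIP_MILES = 420  # total distance to the grandparents' home

# Average rate of change between the first and last recorded points.
avg_speed = (miles[-1] - miles[0]) / (hours[-1] - hours[0])  # miles per hour

# Assumption: the same average rate continues for the rest of the trip.
remaining_miles = TRIP_MILES - miles[-1]
predicted_total_hours = hours[-1] + remaining_miles / avg_speed

print("Average speed (mi/h):", avg_speed)
print("Predicted total time (h):", round(predicted_total_hours, 2))
```

Under this assumption the average speed is 69.5 miles per hour and the predicted total time is slightly more than 6 hours, which agrees with the estimate read from the extended graph.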

Discussion

Different Types of Displays

This lesson has presented several statistical displays that can be used to represent a data set. To determine which display to choose, consider the following table.

Type of Display Best Used to...
Bar Graph ... show values corresponding to specific categories
Box Plot ... show measures of spread for a data set
Dot Plot ... show how many times each value occurs in the set
Histogram ... show the frequency of data divided into equal intervals
Line Graph ... show change over a period of time or with respect to a different quantity
Example

Life Expectancies in Different Countries

Tadeo's grandparents are 83 and 85 years old. Over the years, they have told Tadeo many fascinating stories about their lives. He began to wonder how his grandparents got to live such long and wonderful lives. Later he started wondering about the life expectancy in different countries, so he did a little investigation.

Country Life Expectancy
United States 76.3
Japan 84.5
Germany 80.9
Brazil 77.3
China 78.2
India 68.3
Australia 83.3
South Africa 62.4
a What is the most appropriate display for the given data set? Explain.
b Construct the display chosen in Part A.

Answer

a Bar graph
b
Bar graph

Hint

a Recall the various types of displays covered in this lesson and what they are best used for.
b Let the horizontal axis represent the countries and the vertical axis represent the life expectancies. Draw a bar for each country with a height equal to its life expectancy.

Solution

a The most appropriate display needs to be selected for the given data set. Start by recalling the types of display covered in this lesson and when to use them.
Type of Display Best Used to...
Bar Graph ... show values corresponding to specific categories
Box Plot ... show measures of spread for a data set
Dot Plot ... show how many times each value occurs in the set
Histogram ... show the frequency of data divided into equal intervals
Line Graph ... show change over a period of time or with respect to a different quantity

The given data set lists countries and their corresponding life expectancies. After careful consideration of these types of displays, a bar graph looks like the best choice because it can show values that correspond to specific categories. In this case, the categories are the countries and the values are the life expectancies.

b The next step is to actually construct a bar graph for the given data set.
Country Life Expectancy
United States 76.3
Japan 84.5
Germany 80.9
Brazil 77.3
China 78.2
India 68.3
Australia 83.3
South Africa 62.4

The countries can be marked on the horizontal axis and the life expectancy can be marked on the vertical axis. Next, draw bars with heights equal to the life expectancies.

Bar graph
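As a rough illustration of the same idea, the sketch below builds a text version of the bar graph in Python. The bars are drawn sideways, so bar length plays the role of bar height. The function name `text_bar_chart` and the `scale` parameter (number of `#` characters per year) are assumptions made for this example only.

```python
# Values from the life expectancy table.
life_expectancy = {
    "United States": 76.3,
    "Japan": 84.5,
    "Germany": 80.9,
    "Brazil": 77.3,
    "China": 78.2,
    "India": 68.3,
    "Australia": 83.3,
    "South Africa": 62.4,
}


def text_bar_chart(data, scale=0.5):
    """Return one text line per category, with a bar proportional to its value."""
    width = max(len(name) for name in data)  # align the category labels
    lines = []
    for name, value in data.items():
        bar = "#" * round(value * scale)  # bar length grows with the value
        lines.append(f"{name:<{width}} | {bar} {value}")
    return lines


chart = text_bar_chart(life_expectancy)
for line in chart:
    print(line)
```

Each country is a category with its own bar, so the values can be compared at a glance. The longest bar belongs to Japan and the shortest to South Africa.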
Closure

Choosing the Best-Fitting Displays

Recall the four situations mentioned at the beginning of the lesson.

Situation 1 A tech company recently launched a new smartphone model. They conduct a survey where customers rate their satisfaction with the phone on a scale of 1 to 10.
Situation 2 A survey is conducted to gather data on the distribution of ages among participants in a community event.
Situation 3 A teacher wants to analyze the distribution of test scores in her class by finding the median and quartiles of the scores.
Situation 4 A study tracks the temperature variations over the course of a week in a particular city.
Now consider the statistical displays covered in this lesson.
Statistical Displays: histogram, dot plot, line graph, box plot
To match each situation to a graph, analyze each situation one at a time.

Situation 1

The first situation involves a survey where a group of customers rate their satisfaction with the phone on a scale of 1 to 10. This situation can be visualized with a diagram where the horizontal axis shows ratings from 1 to 10, with dots stacked over each number to show how often that rating was given. This description fits a dot plot.

A dot plot illustrating the satisfaction ratings with the new smartphone model

Situation 2

In the second situation, data on the distribution of ages of participants in a community event is collected. This data can be illustrated by showing the ages in different intervals on the horizontal axis and the number of people in the corresponding age interval on the vertical axis. This matches the description of a histogram.

Histogram showing the ages of the participants of an event

Situation 3

The third situation involves analyzing the distribution of test scores in a class. The two remaining statistical displays are a box plot and a line graph. Recall what they are best used for.

Type of Display Best Used to...
Box Plot ... show measures of spread for a data set
Line Graph ... show change over a period of time or with respect to a different quantity

Since the distribution of the test scores is to be illustrated, not the changes in scores over a period of time, a box plot seems like the best-fitting display for the situation.

Box plot shown above a number line

Situation 4

The last situation includes tracking temperature variations over the course of a week. This data can be demonstrated with a line graph where the horizontal axis represents the days of the week and the vertical axis represents the temperature.

Line graph

