Big Ideas Math: Modeling Real Life, Grade 6
3. Measures of Center

Exercise 26 Page 431

When is a value considered an outlier?

Data set              Mean   Median   Mode
With Outlier          48.5   53       No mode
Without Outlier (17)  53     54       No mode

The Most Affected Measure: Mean


If a value in a data set lies more than 1.5 times the interquartile range below the lower quartile or above the upper quartile, it is considered an outlier. Before identifying any outliers, we first find the mean, median, and mode of the full data set, with any outliers included.

Analyzing the Data With Any Outliers

We want to find the mean, median, and mode of the given data set.

45, 52, 17, 63, 57, 42, 54, 58

Let's begin by calculating the mean.

Mean

The mean of a data set is the sum of the values divided by the total number of values in the set. Let's start by calculating the sum of the given values.

45 + 52 + 17 + 63 + 57 + 42 + 54 + 58 = 388

There are 8 values in our set, so we have to divide the sum by 8.

Mean: 388/8 = 48.5

The mean is 48.5. We can continue by finding the median.
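As a quick check of the mean computed above, the same arithmetic can be sketched in Python. The variable names `data`, `total`, and `mean` are illustrative choices, not part of the exercise.

```python
# Data set from the exercise.
data = [45, 52, 17, 63, 57, 42, 54, 58]

# Mean = sum of the values divided by the number of values.
total = sum(data)
mean = total / len(data)

print(total, mean)  # prints: 388 48.5
```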

Median

When the data are arranged in numerical order, the median is the middle value, or the mean of the two middle values, in a set of data. Let's arrange the given values and find the median.

17, 42, 45, 52, 54, 57, 58, 63

The number of values in our set is 8. Because 8 is even, there are 2 middle values, 52 and 54, and the median is their mean.

Median: (52 + 54)/2 = 53

The last measure we need is the mode. Let's find it!
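The median rule used above (middle value for an odd count, mean of the two middle values for an even count) can be sketched as a small Python function. The function name `median` is an illustrative choice, and the sketch assumes a non-empty list.

```python
def median(values):
    """Return the median of a non-empty list of numbers."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: a single middle value exists.
        return ordered[mid]
    # Even count: take the mean of the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


data = [45, 52, 17, 63, 57, 42, 54, 58]
print(median(data))  # prints: 53.0
```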

Mode

The mode is the value or values that appear most often in a set of data. Arranging the data set from least to greatest makes it easier to see how often each value appears. Let's take a look at the ordered set one more time!

17, 42, 45, 52, 54, 57, 58, 63

Since the data set does not contain any repeated values, there is no mode.

Mode: This set does not have a mode
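The check for repeated values can be sketched with a frequency count. This sketch follows the convention used above, where a set with no repeated values has no mode, and returns an empty list in that case. The function name `modes` is an illustrative choice, and a non-empty list is assumed.

```python
from collections import Counter


def modes(values):
    """Return the most frequent value(s), or [] if no value repeats."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        # Every value appears exactly once, so there is no mode.
        return []
    return sorted(v for v, c in counts.items() if c == highest)


data = [45, 52, 17, 63, 57, 42, 54, 58]
print(modes(data))  # prints: []
```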

Identifying Outliers

To identify any outliers, we have to calculate the interquartile range (IQR). To do this, let's recall some information about the quartiles first!

  • Second Quartile (Q_2) is the median of the data set. It divides the set of data into two halves.
  • Lower Quartile (Q_1) is the median of the lower half of the data set.
  • Upper Quartile (Q_3) is the median of the upper half of the data set.
  • Interquartile Range is the difference between the upper quartile and the lower quartile (Q_3-Q_1).

Let's start by recalling the data set ordered from least to greatest value!

17, 42, 45, 52 | 54, 57, 58, 63

The median of the set is 53. This value divides the set into two halves. Each half has four values and therefore two middle values, so each quartile is the mean of the two middle values of its half.

Upper Quartile (Q_3): (57 + 58)/2 = 57.5
Lower Quartile (Q_1): (42 + 45)/2 = 43.5

The next step is to calculate the interquartile range, which is the difference between the upper and lower quartiles. Let's do it!

Interquartile Range: 57.5 - 43.5 = 14

Next, we need to determine the boundaries beyond which a value is considered an outlier. Outliers are more than 1.5 times the IQR below the lower quartile or above the upper quartile. Let's break it into two steps and start with the lower boundary.

Outlier < Q_1 - 1.5*IQR

Let's substitute 43.5 for Q_1 and 14 for IQR.
Q_1 - 1.5 * IQR
43.5 - 1.5 * 14
43.5 - 21
22.5
The only value less than 22.5 is 17. This means that 17 is an outlier. Now let's check for outliers at the other end of the data set by calculating the upper boundary.

Outlier > Q_3 + 1.5*IQR

We can substitute 57.5 for Q_3 and 14 for IQR.
Q_3 + 1.5 * IQR
57.5 + 1.5 * 14
57.5 + 21
78.5
Our data set does not contain values greater than 78.5. This means that 17 is the only outlier. We are ready to analyze the data set without the outlier!
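The outlier test above can be sketched in Python. This is a minimal sketch that assumes the quartile convention used in this solution: Q_1 and Q_3 are the medians of the lower and upper halves, and the middle value is left out of both halves when the count is odd. Other tools may compute quartiles differently. The helper names `median`, `quartiles`, and `find_outliers` are illustrative, and at least two values are assumed.

```python
def median(values):
    """Return the median of a non-empty list of numbers."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]    # values below the median position
    upper = ordered[-half:]   # values above the median position
    return median(lower), median(upper)


def find_outliers(values):
    """Return (lower bound, upper bound, outliers) using the 1.5 * IQR rule."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    low = q1 - 1.5 * iqr
    high = q3 + 1.5 * iqr
    outliers = [v for v in values if v < low or v > high]
    return low, high, outliers


data = [45, 52, 17, 63, 57, 42, 54, 58]
print(quartiles(data))      # prints: (43.5, 57.5)
print(find_outliers(data))  # prints: (22.5, 78.5, [17])
```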

Analyzing the Data Without the Outliers

Let's repeat the process, this time excluding 17 from the data set.

Mean

Let's start by calculating the sum of the given values without 17.

42 + 45 + 52 + 54 + 57 + 58 + 63 = 371

There are 7 values in our set, because we excluded the outlier, 17.

Mean: 371/7 = 53

Median

Recall the ordered set with 17 excluded.

42, 45, 52, 54, 57, 58, 63

This time the number of values in our set is 7. Because 7 is odd, there is a single middle value, the fourth one, which is 54.

Median: 54

Mode

Since the data set still does not contain any repeated values, there is no mode.

Mode: This set does not have a mode

Summary

Finally, we summarize our findings in the table below so it is easier to compare the results.

Data set              Mean   Median   Mode
With Outlier          48.5   53       No mode
Without Outlier (17)  53     54       No mode

We can see that removing the outlier changed both the mean and the median. The mean increased by 4.5 (from 48.5 to 53), while the median increased by only 1 (from 53 to 54), so the mean is the measure most affected by the outlier. The set without the outlier still does not have a mode.
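The comparison can be reproduced with Python's standard `statistics` module, as a rough check rather than part of the original solution. The names `without`, `mean_change`, and `median_change` are illustrative.

```python
from statistics import mean, median

data = [45, 52, 17, 63, 57, 42, 54, 58]
without = [v for v in data if v != 17]  # data set with the outlier removed

# How much each measure changes when the outlier is removed.
mean_change = mean(without) - mean(data)
median_change = median(without) - median(data)

print(mean_change, median_change)  # prints: 4.5 1.0
```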