Big Ideas Math: Modeling Real Life, Grade 6
3. Measures of Center

Exercise 29 Page 431

When is a value considered an outlier?

Data set             | Mean    | Median | Mode
With Outlier         | 101     | 102    | 110
Without Outlier (64) | 105.625 | 106    | 110

The Most Affected Measure: Mean


If a value in a data set is more than 1.5 times the interquartile range below the lower quartile or above the upper quartile, it is considered an outlier. To identify any outliers, we therefore first have to find the quartiles and the interquartile range of the full data set, with any outliers still included.

Analyzing the Data With Any Outliers

We want to find the mean, median, and mode of the given data set.
101, 110, 99, 100, 64, 112, 110, 111, 102
Let's begin by calculating the mean.

Mean

The mean of a data set is the sum of the values divided by the total number of values in the set. Let's start by calculating the sum of the given values.
101 + 110 + 99 + 100 + 64 + 112 + 110 + 111 + 102 = 909
There are 9 values in our set, so we have to divide the sum by 9.
Mean: 909/9 = 101
The mean is 101. We can continue by finding the median.
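The arithmetic above can be double-checked with a few lines of Python. This is only a minimal sketch; the variable names are our own and do not come from the textbook.

```python
# The data set from the exercise, in its original order.
data = [101, 110, 99, 100, 64, 112, 110, 111, 102]

total = sum(data)     # sum of all values
count = len(data)     # number of values
mean = total / count  # mean = sum / count

print(total, count, mean)  # 909 9 101.0
```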

Median

When the data are arranged in numerical order, the median is the middle value, or the mean of the two middle values, in a set of data. Let's arrange the given values and find the median.
64, 99, 100, 101, 102, 110, 110, 111, 112
The number of values in our set is 9. Because 9 is odd, there is exactly one middle value, the fifth one. This is why the median is 102.

Median: 102
The last measure we need is the mode. Let's find it!
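The same rule, one middle value for an odd count and the mean of two middle values for an even count, can be sketched as a small Python helper. The function name `median` is our own choice for illustration.

```python
def median(values):
    """Middle value of the sorted data, or the mean of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: exactly one middle value.
        return ordered[mid]
    # Even count: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2

data = [101, 110, 99, 100, 64, 112, 110, 111, 102]
print(median(data))  # 102
```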

Mode

The mode is the value or values that appear most often in a set of data. Arranging the data set from least to greatest makes it easier to see how often each value appears. Let's arrange the values before we find the mode.
64, 99, 100, 101, 102, 110, 110, 111, 112
We can see that the most common value in the given data set is 110, which appears twice. This is the mode of our data set.
Mode: 110
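Counting how often each value appears is easy to automate. The sketch below uses `collections.Counter` from the Python standard library. The helper name `modes` is hypothetical, and it returns a list because a data set can have more than one mode.

```python
from collections import Counter

def modes(values):
    """Return every value that appears the greatest number of times."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)

data = [101, 110, 99, 100, 64, 112, 110, 111, 102]
print(modes(data))  # [110]
```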

Identifying Outliers

To identify any outliers, we have to calculate the interquartile range (IQR). To do this, let's first recall some information about the quartiles!

  • Second Quartile (Q_2) is the median of the data set. It divides the set of data into two halves.
  • Lower Quartile (Q_1) is the median of the lower half of the data set.
  • Upper Quartile (Q_3) is the median of the upper half of the data set.
  • Interquartile Range is the difference between the upper quartile and the lower quartile (Q_3-Q_1).
Let's start by recalling the data set ordered from least to greatest value!
64, 99, 100, 101, 102, 110, 110, 111, 112
The median of the set is 102. This value divides the set into two halves of four values each: 64, 99, 100, 101 and 110, 110, 111, 112. Each half has two middle values, so each quartile is the mean of those two middle values.
Upper Quartile: (110 + 111)/2 = 110.5
Lower Quartile: (99 + 100)/2 = 99.5
The next step is to calculate the interquartile range, which is the difference between the upper and lower quartiles. Let's do it!
Interquartile Range: 110.5 - 99.5 = 11
Next, we need to determine the boundary values beyond which a data value is considered an outlier. Outliers are more than 1.5 times the IQR below the lower quartile or above the upper quartile. Let's break it into two steps and start with the lower boundary.
Outlier < Q_1 - 1.5 * IQR
Let's substitute 99.5 for Q_1 and 11 for IQR.
Q_1 - 1.5 * IQR
99.5 - 1.5 * 11
99.5 - 16.5
83
The only value less than 83 is 64. This means that 64 is an outlier. Now let's check if there are other outliers by calculating the upper boundary.
Outlier > Q_3 + 1.5 * IQR
We can substitute 110.5 for Q_3 and 11 for IQR.
Q_3 + 1.5 * IQR
110.5 + 1.5 * 11
110.5 + 16.5
127
Our data set does not contain any values greater than 127. This means that 64 is the only outlier. We are ready to analyze the data set without the outlier!
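The whole outlier check can be sketched in Python as follows. The helpers `median` and `quartiles` are our own illustrative names, and `quartiles` assumes the method used here: the quartiles are the medians of the lower and upper halves, with the overall median left out of both halves when the number of values is odd. Other quartile conventions exist and may give different results for other data sets.

```python
def median(values):
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def quartiles(values):
    """Q_1 and Q_3 as medians of the lower and upper halves (assumed convention)."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # For an odd count, skip the middle value so it belongs to neither half.
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return median(lower), median(upper)

data = [101, 110, 99, 100, 64, 112, 110, 111, 102]

q1, q3 = quartiles(data)
iqr = q3 - q1
lower_bound = q1 - 1.5 * iqr  # values below this are outliers
upper_bound = q3 + 1.5 * iqr  # values above this are outliers
outliers = [x for x in data if x < lower_bound or x > upper_bound]

print(q1, q3, iqr)               # 99.5 110.5 11.0
print(lower_bound, upper_bound)  # 83.0 127.0
print(outliers)                  # [64]
```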

Analyzing the Data Without the Outliers

Let's repeat the process, this time excluding 64 from the data set.

Mean

Let's start by calculating the sum of the given values without 64.
99 + 100 + 101 + 102 + 110 + 110 + 111 + 112 = 845
There are 8 values in our set, because we excluded the outlier 64.
Mean: 845/8 = 105.625

Median

Recall the ordered set with 64 excluded.
99, 100, 101, 102, 110, 110, 111, 112
This time the number of values in our set is 8. Because 8 is even, there are 2 middle values, 102 and 110. The median is their mean.
Median: (102 + 110)/2 = 106

Mode

The most repeated value is still 110. The mode of the set remains the same.
Mode: 110

Summary

Finally, we summarize our findings in the table below so it is easier to compare the results.

Data set             | Mean    | Median | Mode
With Outlier         | 101     | 102    | 110
Without Outlier (64) | 105.625 | 106    | 110

We can see that removing the outlier changed the mean and the median. Both increased, but the mean increased more (by 4.625, from 101 to 105.625) than the median (by 4, from 102 to 106). This is because the excluded outlier, 64, was the minimum value of the set and lay far below the other values. The set without the outlier still has the same mode, 110.
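To compare the two data sets side by side, the three measures can be recomputed with Python's standard `statistics` module. This is a rough sketch: the helper `summarize` and the variable names are our own, and `statistics.mode` is suitable here only because each set has a single mode.

```python
import statistics

data = [101, 110, 99, 100, 64, 112, 110, 111, 102]
trimmed = [x for x in data if x != 64]  # the data set without the outlier 64

def summarize(values):
    """Collect the three measures of center in one dictionary."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
    }

before = summarize(data)
after = summarize(trimmed)

# Print each measure with and without the outlier, plus the change.
for name in ("mean", "median", "mode"):
    change = after[name] - before[name]
    print(name, before[name], after[name], change)
```

The printed changes show the mean moving the most, which matches the conclusion above.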