When is a value considered an outlier?
| Data set | Mean | Median | Mode |
|---|---|---|---|
| With Outlier | 101 | 102 | 110 |
| Without Outlier (64) | 105.625 | 106 | 110 |
The Most Affected Measure: Mean
If a value in a data set is more than 1.5 times the interquartile range below the lower quartile or above the upper quartile, it is considered an outlier. Before identifying any outliers, we first find the mean, median, and mode of the full data set, with any outliers still included.
We want to find the mean, median, and mode of the given data set: 101, 110, 99, 100, 64, 112, 110, 111, 102. Let's begin by calculating the mean.
The mean of a data set is the sum of the values divided by the total number of values in the set. Let's start by calculating the sum of the given values: 101 + 110 + 99 + 100 + 64 + 112 + 110 + 111 + 102 = 909. There are 9 values in our set, so we have to divide the sum by 9. Mean: 909/9 = 101. The mean is 101. We can continue by finding the median.
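The median is the middle value of the data set when the values are written in order. From least to greatest, the values are 64, 99, 100, 101, 102, 110, 110, 111, 112. There are 9 values, so the middle one is the fifth value. Median: 102. The last measure we need is the mode. Let's find it!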
The mode is the value or values that appear most often in a set of data. Arranging the data set from least to greatest makes it easier to see how often each value appears. Let's arrange the values before we find the mode: 64, 99, 100, 101, 102, 110, 110, 111, 112. We can see that the most common value in the given data set is 110. This is the mode of our data set. Mode: 110
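The three measures found above can be double-checked with a short script. This is only a sketch using Python's standard `statistics` module (the variable names are illustrative, and `multimode` assumes Python 3.8 or later); it is not part of the original solution.

```python
from statistics import mean, median, multimode

# The given data set, outlier included.
data = [101, 110, 99, 100, 64, 112, 110, 111, 102]

data_mean = mean(data)        # sum of the values divided by how many there are
data_median = median(data)    # middle value of the sorted data
data_modes = multimode(data)  # list of the most frequent value(s)

print("Mean:", data_mean)
print("Median:", data_median)
print("Mode(s):", data_modes)
```

`multimode` returns a list because a data set can have more than one mode; here the list holds a single value.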
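To identify any outliers, we have to calculate the interquartile range (IQR). To do this, let's recall some information about the quartiles first! The first quartile Q_1 is the median of the lower half of the ordered data, the third quartile Q_3 is the median of the upper half, and the IQR is the difference Q_3 - Q_1.
Lower half: 64, 99, 100, 101, so Q_1 = (99 + 100)/2 = 99.5
Upper half: 110, 110, 111, 112, so Q_3 = (110 + 111)/2 = 110.5
IQR = Q_3 - Q_1 = 110.5 - 99.5 = 11
A value is an outlier if it is less than Q_1 - 1.5(IQR) or greater than Q_3 + 1.5(IQR).
Q_1 - 1.5(IQR) = 99.5 - 1.5(11) = 99.5 - 16.5 = 83
Q_3 + 1.5(IQR) = 110.5 + 1.5(11) = 110.5 + 16.5 = 127
The only value outside the interval from 83 to 127 is 64, so 64 is an outlier.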
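The outlier check can also be sketched in code. The helper `quartiles` below is a hypothetical function written for this example; it assumes the "median of each half" convention, where the middle value is left out of both halves when the number of values is odd. Other quartile conventions can give slightly different results.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumes at least two values. When the count is odd, the middle
    value belongs to neither half.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    lower_half = ordered[:half]   # the smallest `half` values
    upper_half = ordered[-half:]  # the largest `half` values
    return median(lower_half), median(upper_half)

data = [101, 110, 99, 100, 64, 112, 110, 111, 102]

q1, q3 = quartiles(data)
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr  # values below this are outliers
upper_fence = q3 + 1.5 * iqr  # values above this are outliers

outliers = [x for x in data if x < lower_fence or x > upper_fence]
print("Q1:", q1, "Q3:", q3, "IQR:", iqr)
print("Fences:", lower_fence, upper_fence)
print("Outliers:", outliers)
```

Writing the quartile step by hand keeps the sketch tied to the method used in this solution rather than to a library's default interpolation rule.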
Let's repeat the process, this time excluding 64 from the data set.
Let's start by calculating the sum of the given values without 64: 99 + 100 + 101 + 102 + 110 + 110 + 111 + 112 = 845. There are 8 values in our set, because we excluded the outlier, 64. Mean: 845/8 = 105.625
Recall the ordered set with 64 excluded: 99, 100, 101, 102, 110, 110, 111, 112. This time the number of values in our set is 8, which is even, so there are 2 middle values, 102 and 110. The median is the mean of these two values. Median: (102 + 110)/2 = 106
The most repeated value is still 110. The mode of the set remains the same. Mode: 110
Finally, we summarize our findings in the table below so it is easier to compare the results.
| Data set | Mean | Median | Mode |
|---|---|---|---|
| With Outlier | 101 | 102 | 110 |
| Without Outlier (64) | 105.625 | 106 | 110 |
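We can see that removing the outlier changed the mean and the median. Both increased, but the mean increased more (by 4.625) than the median (by 4). This is because the excluded outlier, 64, was the minimum value and lay far below the rest of the data, and the mean depends on the size of every value in the set. The mode is unchanged: the set without the outlier still has a mode of 110.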
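The comparison can be reproduced with a short script. This is a sketch under the same assumptions as the solution: 64 is the only outlier, and the helper `summarize` is a hypothetical name introduced for this example (`multimode` requires Python 3.8 or later).

```python
from statistics import mean, median, multimode

data = [101, 110, 99, 100, 64, 112, 110, 111, 102]
without_outlier = [x for x in data if x != 64]  # drop the outlier 64

def summarize(values):
    """Return the mean, median, and list of modes of the values."""
    return {
        "mean": mean(values),
        "median": median(values),
        "modes": multimode(values),
    }

with_stats = summarize(data)
without_stats = summarize(without_outlier)

# How much each measure moved when the outlier was removed.
mean_change = without_stats["mean"] - with_stats["mean"]
median_change = without_stats["median"] - with_stats["median"]

print("With outlier:   ", with_stats)
print("Without outlier:", without_stats)
print("Mean change:", mean_change, "Median change:", median_change)
```

Comparing `mean_change` with `median_change` shows directly which measure was affected most by the outlier.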