When is a value considered an outlier?
Outlier: 54
| Data set | Mean | Median | Mode |
|---|---|---|---|
| With Outlier | 22.8 | 19.5 | 18 |
| Without Outlier | 19.3 | 19 | 18 |
A value in a data set is considered an outlier if it lies more than 1.5 times the interquartile range below the lower quartile or above the upper quartile. Therefore, to identify any outliers, we first have to find these statistical measures for the full data set, with any potential outliers still included.
Consider the given data set.
16, 19, 21, 18, 18, 54, 20, 22, 23, 17
Let's organize the information in a list using a graphing calculator. Press STAT and choose Edit.
Enter all of the data into the first list.
Next, analyze the data by pressing the STAT button again and navigating to the CALC menu. Press ENTER once to select the 1-Var Stats option and then again to select the list in which we entered the data (usually L1). This will produce most of the statistical measures we are looking to find.
We can see from our calculator that the mean $\bar{x}$ is 22.8. To find the median and quartiles, we need to scroll down.
Examining the output will give us most of the desired statistical measures.

$$
\begin{array}{rcl}
\bar{x}: & 22.8 & \text{(Mean)} \\
Q_1: & 18 & \text{(First Quartile)} \\
\text{Med}: & 19.5 & \text{(Median)} \\
Q_3: & 22 & \text{(Third Quartile)}
\end{array}
$$

In order to find the mode, we have to identify the value that occurs most frequently.

16, 19, 21, 18, 18, 54, 20, 22, 23, 17

We see that 18 occurs the most frequently, so this is the mode of the given data set.
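The calculator output can be cross-checked with a short Python sketch. The helper name `quartiles` is our own, and it assumes the median-of-halves convention for quartiles, which reproduces the values shown here. Other software may use a different quartile convention and return slightly different numbers.

```python
from statistics import mean, median, multimode

data = [16, 19, 21, 18, 18, 54, 20, 22, 23, 17]

def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumption: the middle value is left out of both halves when the
    number of values is odd.
    """
    s = sorted(values)
    half = len(s) // 2
    lower = s[:half]    # smallest half of the sorted data
    upper = s[-half:]   # largest half of the sorted data
    return median(lower), median(upper)

q1, q3 = quartiles(data)
print("Mean:  ", mean(data))
print("Q1:    ", q1)
print("Median:", median(data))
print("Q3:    ", q3)
print("Mode:  ", multimode(data))
```

`multimode` returns a list, so a data set with several equally frequent values would show all of them.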
To identify any outliers, we start by calculating the interquartile range (IQR). This is the difference between the third and first quartiles.

$$
Q_3 - Q_1 = \text{IQR} \quad\Leftrightarrow\quad 22 - 18 = 4
$$

Next, we need to determine the boundaries beyond which a value is considered an outlier. Outliers lie more than 1.5 times the IQR below the lower quartile or above the upper quartile.

$$
\begin{aligned}
Q_1 - 1.5 \cdot \text{IQR} &= 18 - 1.5(4) = 12 \\
Q_3 + 1.5 \cdot \text{IQR} &= 22 + 1.5(4) = 28
\end{aligned}
$$

These calculations tell us that any values lower than 12 or greater than 28 are outliers. Since 54 > 28, 54 is an outlier. Every other value lies between 12 and 28, so 54 is the only outlier.
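The same check can be written as a minimal Python sketch. The function name `iqr_outliers` and its parameter `k` are illustrative choices, and the quartiles are again assumed to be the medians of the lower and upper halves of the sorted data.

```python
from statistics import median

def iqr_outliers(values, k=1.5):
    """Return (lower boundary, upper boundary, outliers) using the k * IQR rule."""
    s = sorted(values)
    half = len(s) // 2
    q1 = median(s[:half])    # first quartile: median of the lower half
    q3 = median(s[-half:])   # third quartile: median of the upper half
    iqr = q3 - q1
    low = q1 - k * iqr
    high = q3 + k * iqr
    outliers = [v for v in s if v < low or v > high]
    return low, high, outliers

data = [16, 19, 21, 18, 18, 54, 20, 22, 23, 17]
low, high, outliers = iqr_outliers(data)
print(low, high, outliers)  # 12.0 28.0 [54]
```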
Let's repeat the process, excluding 54 from the data set. We will type the new data set into another list, L2.
Once we have finished adding the new data set into L2, we can analyze the data the same way as before. After selecting the 1-Var Stats option, however, we must remember to choose L2 by pushing 2nd and 2.
Examining the new output will give us most of the desired statistical measures.

$$
\begin{array}{rcl}
\bar{x}: & 19.3 & \text{(Mean)} \\
\text{Med}: & 19 & \text{(Median)}
\end{array}
$$

Note that 18 is still the most frequently occurring value. Therefore, the mode of the data set remains the same.
Finally, we summarize our findings in the table below so it is easier to compare the results.
| Data set | Mean | Median | Mode |
|---|---|---|---|
| With Outlier | 22.8 | 19.5 | 18 |
| Without Outlier | 19.3 | 19 | 18 |
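We can see that removing the outlier lowered the mean from 22.8 to about 19.3 and the median from 19.5 to 19, while the mode stayed at 18. The mean was affected the most.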
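The comparison can be reproduced with a brief Python sketch. The helper `summarize` is a hypothetical name, and the mean is rounded to one decimal place to match the table.

```python
from statistics import mean, median, multimode

data = [16, 19, 21, 18, 18, 54, 20, 22, 23, 17]
trimmed = [v for v in data if v != 54]  # the data set with the outlier removed

def summarize(values):
    """Return the mean (rounded to one decimal), median, and mode(s)."""
    return {
        "mean": round(mean(values), 1),
        "median": median(values),
        "mode": multimode(values),
    }

for label, values in (("With Outlier", data), ("Without Outlier", trimmed)):
    print(label, summarize(values))
```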