When is a value considered an outlier?
Outlier: 22.2
| Data set | Mean | Median | Mode |
|---|---|---|---|
| With Outlier | ≈ 60.738 | 58 | - |
| Without Outlier (12) | ≈ 63.95 | 60 | - |
If a value in a data set is more than 1.5 times the interquartile range below the lower quartile or above the upper quartile, it is considered an outlier. Therefore, to identify any outliers, we first have to find the quartiles of the full data set, with any potential outliers still included.
Let's organize the information in a list using a graphing calculator. Press STAT, and choose Edit.
Enter all of the data into the first list.
Next, analyze the data by pressing the STAT button again and navigating to the CALC menu. Press ENTER once to select the 1-Var Stats option and then again to select the list in which we entered the data (usually L1). This will produce most of the statistical measures we are looking to find.
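For readers without a graphing calculator, the same measures can be sketched in Python. This is a minimal sketch, not the calculator's own routine: the function name `one_var_stats` is hypothetical, the sample list is made up and is not the data set from this exercise, and the quartiles are computed as medians of the lower and upper halves on the assumption that this matches the calculator's convention.

```python
from statistics import mean, median

def one_var_stats(data):
    """Return mean, Q1, median, and Q3 using the median-of-halves rule."""
    values = sorted(data)
    n = len(values)
    half = n // 2
    lower = values[:half]            # values before the median position
    upper = values[half + (n % 2):]  # values after the median position
    return {
        "mean": mean(values),
        "Q1": median(lower),
        "med": median(values),
        "Q3": median(upper),
    }

# Hypothetical sample, NOT the data set from this exercise.
sample = [2, 4, 5, 7, 8, 10, 13]
stats = one_var_stats(sample)
print(stats)
```

When the number of values is odd, the middle value is left out of both halves. Other quartile conventions exist and can give slightly different results.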
Let's write the obtained mean, lower quartile, median, and upper quartile.
$$
\begin{array}{rcl}
\bar{x}: & 60.738 & \text{(mean)} \\
Q_1: & 50.45 & \text{(lower quartile)} \\
\text{med}: & 58 & \text{(median)} \\
Q_3: & 72 & \text{(upper quartile)}
\end{array}
$$
To find the mode, we have to identify the value that occurs most frequently. In this data set, no value occurs more frequently than the others, so there is no mode.
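The search for a mode can also be sketched in Python. The helper name `find_modes` is hypothetical, and the example lists are made up rather than taken from this exercise. The sketch assumes that a data set in which every value occurs equally often has no mode.

```python
from collections import Counter

def find_modes(data):
    """Return the most frequent values, or [] when there is no mode."""
    if not data:
        return []
    counts = Counter(data)
    # If all values occur equally often, none stands out as a mode.
    if len(set(counts.values())) == 1:
        return []
    highest = max(counts.values())
    return [value for value, count in counts.items() if count == highest]

print(find_modes([1, 2, 3]))        # all values occur once
print(find_modes([1, 2, 2, 3]))     # one value repeats
print(find_modes([1, 1, 2, 2, 3]))  # two values tie for most frequent
```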
To identify any outliers, we start by calculating the interquartile range (IQR). This is the difference between the upper and lower quartiles.

$$
\text{IQR} = Q_3 - Q_1 = 72 - 50.45 = 21.55
$$

Next, we need to determine the boundary values beyond which a data value is considered an outlier. Outliers are more than 1.5 times the IQR away from the upper or lower quartile.

$$
\begin{array}{rl}
Q_1 - 1.5 \cdot \text{IQR}: & 50.45 - 1.5(21.55) = 18.125 \\
Q_3 + 1.5 \cdot \text{IQR}: & 72 + 1.5(21.55) = 104.325
\end{array}
$$

These calculations tell us that any values lower than 18.125 or greater than 104.325 are outliers. Our list has no values less than 18.125 or greater than 104.325. However, the maximum, 99.9, and the minimum, 22.2, are very close to these boundaries, and the differences between them and the rest of the values are significant enough to consider these values outliers.
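The boundary calculation can be checked with a short Python sketch. The function name `iqr_fences` and the parameter `k` are illustrative choices. The quartiles and the two extreme values are the ones quoted in this solution.

```python
def iqr_fences(q1, q3, k=1.5):
    """Return the lower and upper outlier boundaries for the k * IQR rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Quartiles reported by 1-Var Stats for the full data set.
lower, upper = iqr_fences(50.45, 72)

# Check the minimum and maximum of the data set against the boundaries.
flags = {value: (value < lower or value > upper) for value in (22.2, 99.9)}

print(round(lower, 3), round(upper, 3))
print(flags)
```

Neither extreme falls outside the boundaries under the strict rule, which is why treating such a value as an outlier here is a judgment call rather than a result of the formula.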
Let's repeat the process, excluding 22.2 from the data set. Type the new data set into another list, L2.
Once you have finished adding the new data set into L2, analyze the data the same way as before. After selecting the 1-Var Stats option, however, remember to choose L2 by pushing 2nd and 2.
Examining the new output will give us most of the desired statistical measures.

$$
\begin{array}{rcl}
\bar{x}: & 63.95 & \text{(mean)} \\
\text{med}: & 60 & \text{(median)}
\end{array}
$$

Note that there is still no value that occurs more frequently than the others. Therefore, the data set still has no mode.
Finally, we summarize our findings in the table below so it is easier to compare the results.
| Data set | Mean | Median | Mode |
|---|---|---|---|
| With Outlier | ≈ 60.738 | 58 | - |
| Without Outlier (12) | ≈ 63.95 | 60 | - |
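We can see that removing the outlier changed both the mean and the median. Without the outlier, the mean increased from about 60.738 to about 63.95, and the median increased from 58 to 60. The data set has no mode in either case.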
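The shift in the mean can be sanity-checked without retyping the data. The sketch below assumes the full data set has 13 values, so that 12 remain after removing 22.2. That count is an inference from the "(12)" label in the table, not something stated in the solution. The helper name `mean_without` is hypothetical.

```python
def mean_without(value, old_mean, n):
    """Mean of the remaining n - 1 values after removing one value."""
    return (old_mean * n - value) / (n - 1)

# Assumption: 13 values in the full data set (inferred, not stated).
new_mean = mean_without(22.2, 60.738, 13)
print(round(new_mean, 3))
```

Because 22.2 lies well below the original mean, removing it pulls the mean upward, in line with the values in the table.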