12 does not fit the data, see solution.
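We need to identify the value that does not fit the data and then see how it affects the mean and the median of the data set. We will do these things one at a time, starting by recalling the definitions of the median and the mean.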
| Measure | Definition | Example |
|---|---|---|
| Median | The middle observation in the data set written in numerical order. If the number of observations is even, it is the average of the two middle numbers. | Data Set: 3, 8, 12, 20, 21 Median: 12 |
| Mean | The sum of all values in the data set divided by the number of values. | Data Set: 3, 8, 12, 20, 21 Mean: (3 + 8 + 12 + 20 + 21)/5 = 12.8 |
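The two definitions in the table can be checked with a short sketch. It assumes Python's standard `statistics` module is an acceptable stand-in for the hand calculation; the variable name `example` is only illustrative.

```python
from statistics import mean, median

# Example data set from the table above
example = [3, 8, 12, 20, 21]

# median() sorts the values and picks the middle one; with an even
# number of values it averages the two middle ones
example_median = median(example)

# mean() adds all values and divides by how many there are
example_mean = mean(example)

print(example_median)  # 12
print(example_mean)    # 12.8
```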
Keeping this in mind, let's take a look at the given data set.

62, 65, 93, 51, 55, 12, 79, 85, 55, 72, 78, 83, 91, 76

Next, let's rewrite this data set in numerical order. This way, it will be easier to spot any unusual values.

12, 51, 55, 55, 62, 65, 72, 76, 78, 79, 83, 85, 91, 93

As we can see, only one value is far from the others: 12. We can say that it is an outlier in the given data set. To find out how the outlier affects the mean and the median, we will calculate each of them with and without 12. First, let's calculate the mean.
| Outlier | Data Set | Value |
|---|---|---|
| Included | 62, 65, 93, 51, 55, 12, 79, 85, 55, 72, 78, 83, 91, 76 | 957/14 ≈ 68.36 |
| Not Included | 62, 65, 93, 51, 55, 79, 85, 55, 72, 78, 83, 91, 76 | 945/13 ≈ 72.69 |
As we can see, the change in the mean is noticeable: including the outlier lowers the mean by about 4.33. Now, let's calculate the median for both sets. Notice that the original set has 14 observations, an even number, so its median is the average of the two middle numbers, the 7th and 8th values in the ordered list. Without the outlier there are 13 observations, so the median is the 7th value.
| Outlier | Ordered Data Set | Value |
|---|---|---|
| Included | 12, 51, 55, 55, 62, 65, 72, 76, 78, 79, 83, 85, 91, 93 | (72 + 76)/2 = 74 |
| Not Included | 51, 55, 55, 62, 65, 72, 76, 78, 79, 83, 85, 91, 93 | 76 |
The change in the median is 76 - 74 = 2. An outlier changes both the mean and the median, but the change in the median is smaller. Therefore, when a data set contains outliers, the median is usually a better measure of center than the mean.
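The comparison can also be reproduced programmatically. The following is a minimal sketch, not part of the original solution: it takes the data set exactly as given in the exercise, removes the outlier 12, and measures how far each statistic moves. The names `data`, `without_outlier`, and `shifts` are illustrative.

```python
from statistics import mean, median

# Data set as given in the exercise
data = [62, 65, 93, 51, 55, 12, 79, 85, 55, 72, 78, 83, 91, 76]
outlier = 12

# Same data with the outlier removed
without_outlier = [x for x in data if x != outlier]

# For each measure, record how much it rises once the outlier is dropped
shifts = {}
for name, stat in (("mean", mean), ("median", median)):
    included = stat(data)
    excluded = stat(without_outlier)
    shifts[name] = excluded - included
    print(f"{name}: with outlier {included:.2f}, "
          f"without {excluded:.2f}, change {shifts[name]:.2f}")

# The median moves less than the mean
print(shifts["median"] < shifts["mean"])
```

Running the sketch on other data sets with a single extreme value is a quick way to see the same pattern: the mean is pulled toward the outlier, while the median shifts by at most one position in the ordered list.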