The mean or average of a data set is one representation of the center of the data set. To calculate the mean, add all the data points together, then divide by the number of data points.
Suppose a data set represents the heights of different towers. The mean of this data set gives an idea of a typical height. Calculating the mean can be seen as rearranging the blocks so that all the towers have the same height.
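The calculation described above can be sketched in a few lines of Python. The function name `mean` and the tower heights are assumptions chosen for illustration; they do not come from the original data.

```python
def mean(values):
    """Add all the data points together, then divide by how many there are."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


# Hypothetical tower heights, measured in blocks (illustrative values only).
tower_heights = [2, 5, 3, 6, 4]

# 2 + 5 + 3 + 6 + 4 = 20 blocks shared evenly among 5 towers.
print(mean(tower_heights))  # 4.0
```

In the block picture, the 20 blocks are rearranged so that each of the 5 towers is 4 blocks tall.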
The mode is a measure of center that shows the most common value in a data set. For the set the mode is since it occurs the greatest number of times. When two or more values are equally common, there can be more than one mode. However, if every value occurs only once, then the data set does not have a mode.
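The three cases described above (one mode, several modes, no mode) can be illustrated with a short sketch. The helper name `modes` and the sample lists are hypothetical. The rule that a data set with no repeated value has no mode follows the convention stated in this text.

```python
from collections import Counter


def modes(values):
    """Return every most-common value, or an empty list if no value repeats."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        # Every value occurs exactly once, so the data set has no mode.
        return []
    return sorted(value for value, count in counts.items() if count == highest)


print(modes([1, 2, 2, 3, 2]))  # [2]     one mode
print(modes([1, 1, 2, 3, 3]))  # [1, 3]  two values are equally common
print(modes([4, 7, 9]))        # []      no value repeats
```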
Find the mean, median and the mode for the following data.
We'll find each statistic one at a time.
To determine the median, we first write the data points in ascending order. The median is the value that lies directly in the middle. Since there is an even number of values, there is no single middle number. The median is then found by calculating the mean of the two middle numbers, and The median is
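As a rough sketch of the procedure just described, the following Python function sorts the data and then either picks the single middle value or averages the two middle values. The function name `median` and the sample values are assumptions for illustration only, not the data from this example.

```python
def median(values):
    """Return the middle value of the data, averaging the two middle values if needed."""
    if not values:
        raise ValueError("the median of an empty data set is undefined")
    ordered = sorted(values)  # write the data points in ascending order
    count = len(ordered)
    middle = count // 2
    if count % 2 == 1:
        # Odd number of values: there is a single middle number.
        return ordered[middle]
    # Even number of values: take the mean of the two middle numbers.
    return (ordered[middle - 1] + ordered[middle]) / 2


print(median([7, 3, 9, 5]))  # sorted: 3, 5, 7, 9 -> (5 + 7) / 2 = 6.0
print(median([7, 3, 9]))     # sorted: 3, 7, 9 -> 7
```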
The mode is the value that is most common in the data set. In this case, it is , which occurs three times. Therefore, the mode is
An outlier is a data point that is significantly different from the other values in the data set. It can be significantly larger or significantly smaller than the others. The presence of an outlier can affect the mean (and, in turn, the standard deviation) of a data set, but not necessarily the median or mode. Consider the following data set. Notice that most of the data points are between and However, there is one value significantly less than the others: Thus, it can be said that is an outlier.
There isn't anything Benji loves more than a bumble bee. Every day he keeps track of how many he sees. The data below gives the number of bumble bees Benji saw for the first days in April. On the th day, Benji sees bumble bees! Determine how this value affects the mean, median and mode of the data.
To begin, we should notice that bumble bees is significantly higher than the number of bees Benji saw on any other day. Thus, is probably an outlier. To see how this outlier affects the measures of center, we can calculate all three with and without the outlier.
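The comparison can be sketched with Python's standard `statistics` module. The counts below are made up for illustration and are not Benji's actual data. The helper name `measures_of_center` is likewise hypothetical.

```python
import statistics


def measures_of_center(values):
    """Collect the mean, median, and mode(s) of a data set in one place."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "modes": statistics.multimode(values),
    }


# Hypothetical bee counts for nine ordinary days (illustrative values only).
usual_days = [4, 5, 5, 6, 5, 4, 6, 3, 7]
# Hypothetical count for one unusually busy day, treated as the outlier.
unusual_day = 25

without_outlier = measures_of_center(usual_days)
with_outlier = measures_of_center(usual_days + [unusual_day])

for name in without_outlier:
    print(name, without_outlier[name], "->", with_outlier[name])
```

With these assumed values, the mean rises from 5 to 7 when the outlier is included, while the median and the mode both stay at 5.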
Each of the three measures of center (mean, median, and mode) has advantages and drawbacks.
The mean of a data set takes into account the actual value of every data point. Thus, a benefit of the mean as a measure of center is that it truly represents all data points in a data set. This also means that it can be significantly impacted by an outlier. Another drawback is how tedious the calculation of the mean can be for large data sets.
In contrast to the mean, finding the median is a relatively simple process. In some cases, one might wish to exclude or avoid the effects of an outlier. Since the median depends on the position of the data points in the ordered list and not on the actual value of each one, this statistic might be a better choice. However, if it is not intentionally chosen as the measure of center, the analysis of the data set it gives can be limited.