
# Finding Measures of Center

Measures of center are used to give an idea of a typical value of a data set. Instead of presenting all the data points, a measure of center can be used to represent the entire set.

## Measure of Center

A measure of center is a statistic that summarizes a data set by finding its center. The most common measures of center are the mean, median, and mode.

## Mean

The mean or average of a data set is one representation of the center of the data set. To calculate the mean, add all the data points together, then divide by the number of data points.

$$\text{mean} = \frac{\text{sum of the data points}}{\text{number of data points}}$$

Suppose a data set represents the heights of different towers. The mean of this data set gives an idea of a typical height. Calculating the mean could be seen as rearranging the blocks so that all the towers have the same height.


After the blocks are rearranged, the towers each have a height of 4. Therefore, the mean is 4. If the heights are written as $x$, then the mean is sometimes written as $\bar{x}$. The towers' mean height can then be written as $\bar{x} = 4$.
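The add-then-divide rule can be sketched in a few lines of Python. The tower heights below are hypothetical (the original figure's values are not reproduced here); they were chosen only so that the blocks even out to a height of 4. The names `mean` and `tower_mean` are illustrative.

```python
# Hypothetical tower heights (not the values from the original figure),
# chosen so that the blocks can be evened out to a height of 4.
heights = [2, 6, 3, 5, 4]


def mean(values):
    """Add all data points, then divide by the number of data points."""
    return sum(values) / len(values)


tower_mean = mean(heights)
print(tower_mean)  # 4.0, since 20 blocks are shared among 5 towers
```

The division always produces a float in Python, which is why the result prints as `4.0` rather than `4`.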

## Median

The median is a measure of center that lies in the middle of a data set when it is written in numerical order. For an odd number of data points, there is only one value in the middle, and that is the median. For an even number of data points, the median is the mean of the two middle numbers.
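The odd and even cases can be made concrete with a short Python sketch. The function name `median` and the sample lists are illustrative assumptions, and the sketch assumes the data set is not empty.

```python
def median(values):
    """Return the middle value of the data once it is sorted.

    Assumes `values` contains at least one number.
    """
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd number of data points: a single middle value exists.
        return ordered[mid]
    # Even number of data points: take the mean of the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([7, 1, 5]))     # 5   (sorted: 1, 5, 7)
print(median([7, 1, 5, 9]))  # 6.0 (sorted: 1, 5, 7, 9; mean of 5 and 7)
```

Sorting first matters: the middle position only identifies the median when the data is in numerical order.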

## Mode

The mode is a measure of center that shows the most common value in a data set. For the set
4, 3, 1, 2, 4, 4, 5, 4,
the mode is 4, since it occurs most often. When two or more values are tied for most common, the data set has more than one mode. However, if every value occurs only once, then the data set does not have a mode.
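The three cases described here (one mode, several modes, no mode) can be sketched in Python by counting how often each value occurs. The helper name `modes` is an assumption for illustration, and it follows this lesson's convention that a data set whose values all occur once has no mode. It assumes a non-empty data set.

```python
from collections import Counter


def modes(values):
    """Return a list of the most common values, or [] if none repeats."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        # Every value occurs only once: the data set has no mode.
        return []
    # Keep every value that is tied for the highest count.
    return [value for value, count in counts.items() if count == highest]


print(modes([4, 3, 1, 2, 4, 4, 5, 4]))  # [4]    (4 occurs four times)
print(modes([1, 1, 2, 2, 3]))           # [1, 2] (two modes)
print(modes([1, 2, 3]))                 # []     (no mode)
```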
Exercise
Find the mean, median and the mode for the following data.
Solution
We'll find each statistic one at a time.

### Mean

In order to determine the mean, we need to add all the data points together. Then we divide the sum by the total number of points, which, in this case, is 10.
The mean is 5.9.

### Median

To determine the median, we first write the data points in ascending order.
The median is the value that lies directly in the middle. Since there is an even number of values, there is no single middle number. The median is then found by calculating the mean of the two middle numbers, 5 and 7.
The median is $\frac{5 + 7}{2} = 6$.

### Mode

The mode is the value that is most common in the data set. In this case, it is 9, which occurs three times. Therefore, the mode is 9.

## Outlier

An outlier is a data point that is significantly different than the other values in the data set. It can be significantly larger or significantly smaller than the others. The presence of an outlier can affect the mean (and in turn, the standard deviation) of a data set, but not necessarily the median or mode. Consider the following data set.
Notice that most of the data points are between 7 and 11. However, there is one value significantly less than the others: 4. Thus, it can be said that 4 is an outlier.
Exercise
There isn't anything Benji loves more than a bumble bee. Every day he keeps track of how many he sees. The data below gives the number of bumble bees Benji saw for the first 13 days in April.
On the 14th day, Benji sees 7 bumble bees! Determine how this value affects the mean, median and mode of the data.
Solution

To begin, we should notice that 7 bumble bees is significantly higher than the number of bees Benji saw on any other day. Thus, 7 is probably an outlier. To see how this outlier affects the measures of center, we can calculate all three with and without the outlier.

#### Excluding 7

First, we'll exclude the outlier. The mean can be calculated as follows.
Thus, the mean is approximately 1.85. To find the median, we must write the data points in ascending order. The median will fall directly in the middle of the points.
Thus, 2 is the median. Most often, Benji only sees 1 bumble bee. Thus, 1 is the mode.

#### Including 7

Next, we'll perform the same steps including the outlier. The mean can be calculated as follows.
Thus, the mean is approximately 2.21. To find the median, we'll add 7 to the end of the ordered list from above. The median still lies directly in the middle.
Now that there is an even number of data points, the median will be the mean of the two middle numbers. Since these are the same, the median is 2. Although the additional data point shifted the location of the median, notice that the value did not change. Lastly, seeing 1 bumble bee still occurred most often. Thus, the mode is 1.
Adding one more data point, even one this extreme, affected only the mean. The median and mode remained the same. This is because the mean is the only measure of center computed from the actual value of every data point, whereas the median depends on the order of the values and the mode on how often each value occurs. Thus, the outlier affects the mean.
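The same with-and-without comparison can be sketched using Python's standard `statistics` module. The data below is hypothetical and is not Benji's bumble bee data; it was chosen only so that the effect of one extreme value is easy to see. The helper name `summarize` is likewise an assumption.

```python
import statistics


def summarize(values):
    """Collect the three measures of center for one data set."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
    }


# Hypothetical data: seven ordinary values, then one extreme value added.
typical = [1, 1, 2, 2, 2, 3, 3]
with_outlier = typical + [16]

before = summarize(typical)       # mean 2, median 2, mode 2
after = summarize(with_outlier)   # mean 3.75, median 2, mode 2

print(before)
print(after)
```

In this sketch the single value 16 raises the mean from 2 to 3.75, while the median and mode both stay at 2. With other data the median can shift slightly when a point is added, so this shows a typical outcome rather than a guarantee.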

## Benefits and Drawbacks of the Measures of Center

Each of the three measures of center, mean, median, and mode, has advantages and drawbacks.

### Mean

The mean of a data set takes into account the actual value of every data point. A benefit of the mean as a measure of center is therefore that it truly represents all data points in a data set. The flip side is that it can be significantly impacted by an outlier. Another drawback is how tedious the calculation of the mean can be for large data sets.

### Median

In contrast to the mean, finding the median is a relatively simple process. In some cases, one might wish to exclude or avoid the effects of an outlier. Since the median depends on the position of the data points in the ordered list and not on how large or small the extreme values are, it might be a better choice in those cases. However, because it ignores most of the actual values, the analysis of the data set it gives can be limited.

### Mode

Out of all three measures of center, the mode is the simplest to determine. Additionally, it can be used when data is categorical. However, the mode emphasizes only a few data values, usually just one, while ignoring all others. This can be considered a drawback when using the mode to determine a typical value in a data set.
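To illustrate the point about categorical data, here is a minimal Python sketch that finds the mode of a list of labels, for which a mean or median would not be meaningful. The survey responses are hypothetical and purely illustrative.

```python
from collections import Counter

# Hypothetical survey responses (illustrative only, not from the lesson).
favorite_colors = ["blue", "green", "blue", "red", "blue", "green"]

# Count each category, then take the most frequent one.
counts = Counter(favorite_colors)
mode_value, mode_count = counts.most_common(1)[0]

print(mode_value, mode_count)  # blue 3
```

The sketch also shows the drawback noted above: the result reports only "blue" and says nothing about how the other responses are distributed.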