Reference

Measures of Spread

Concept

Measure of Spread

A measure of spread is a way of quantifying how spread out, or different, the points in a data set are. Some commonly used measures of spread are range, interquartile range, and standard deviation. Measures of spread are often used together with measures of center to give an idea both of what a typical value is and how much the data can be expected to deviate from it.

Interactive applet where points of the dot plot can be moved around.

Move the points around in the dot plot to generate new data. The applet identifies the range, interquartile range, mean absolute deviation, and standard deviation of the data set.

Concept

Range

The range is a measure of spread equal to the difference between the maximum and minimum values of the data set.
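The definition can be illustrated with a minimal Python sketch. The function name `data_range` and the sample values are hypothetical, chosen only for illustration.

```python
def data_range(values):
    """Return the difference between the largest and smallest value."""
    if not values:
        raise ValueError("data set must not be empty")
    return max(values) - min(values)


# Hypothetical data set, not taken from the text.
scores = [4, 7, 9, 12, 15]
print(data_range(scores))  # 15 - 4 = 11
```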

Concept

Quartile

Quartiles are three values that divide a data set into four equal parts. The quartiles are denoted as $Q_{1},$ $Q_{2},$ and $Q_{3}.$

The second quartile $Q_{2},$ also known as the median, divides the ordered data set into two halves.

Lower half: a, b, c | $Q_{2}$: d | Upper half: e, f, g

The median of the lower half is the first quartile $Q_{1},$ while the median of the upper half is the third quartile $Q_{3}.$

Lower half: a, b, c with $Q_{1}$: b | $Q_{2}$: d | Upper half: e, f, g with $Q_{3}$: f

The first quartile is also called the lower quartile, and the third quartile is also called the upper quartile. To find the quartiles of a data set, the values must first be written in numerical order.

Example of how three quartiles can be identified in a set

The difference between the upper and lower quartiles is the interquartile range.
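The procedure described above (order the data, take the median, then take the median of each half) can be sketched in Python as follows. This sketch assumes that, for an odd number of values, the median itself belongs to neither half, as in the seven-value layout above. Other quartile conventions exist and can give different results. The function name `quartiles` and the sample data are hypothetical.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    ordered = sorted(values)  # quartiles require ordered data
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two values")
    lower = ordered[: n // 2]        # values below the median position
    upper = ordered[(n + 1) // 2:]   # values above it (median skipped if n is odd)
    return median(lower), median(ordered), median(upper)


# Hypothetical data set with seven values.
data = [12, 3, 9, 5, 15, 7, 10]
q1, q2, q3 = quartiles(data)
print(q1, q2, q3)  # 5 9 12
```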

Concept

Interquartile Range

The interquartile range, or IQR, of a data set is a measure of spread that measures the difference between $Q_{3}$ and $Q_{1},$ the upper and lower quartiles.

IQR $= Q_{3} - Q_{1}$
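As a rough illustration of the formula, the sketch below computes the IQR of a data set with an even number of values, where each half contains exactly half of the data. It assumes the median-of-halves convention for quartiles. The function name `interquartile_range` and the data are hypothetical.

```python
from statistics import median


def interquartile_range(values):
    """Return Q3 - Q1, with quartiles taken as medians of the two halves."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two values")
    q1 = median(ordered[: n // 2])        # median of the lower half
    q3 = median(ordered[(n + 1) // 2:])   # median of the upper half
    return q3 - q1


# Hypothetical data set with eight values.
data = [2, 4, 4, 5, 7, 8, 10, 13]
print(interquartile_range(data))  # Q3 = 9.0, Q1 = 4.0, so IQR = 5.0
```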

The following applet shows how to find the IQR of different data sets.

Applet that calculates the interquartile range of a data set

Concept

Mean Absolute Deviation

The mean absolute deviation (MAD) is a measure of the spread of a data set that measures how much the data elements differ from the mean. The mean absolute deviation is the average distance between each data value and the mean.

MAD $= \frac{|x_{1} - \overline{x}| + |x_{2} - \overline{x}| + \dots + |x_{n} - \overline{x}|}{n}$

Calculating the MAD involves determining the absolute difference between every data point and the mean, followed by averaging these absolute differences. The applet below calculates the mean absolute deviation for the data set on the number line. Move the points around to change the data.

Applet to calculate the mean absolute deviation

A large MAD value indicates that data points deviate considerably from the mean; that is, there is significant variation within the data set. The mean absolute deviation is less sensitive to outliers than the standard deviation and variance because it averages the absolute differences rather than their squares, so a single extreme value is not amplified.
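The calculation can be sketched in a few lines of Python. This is a minimal illustration of the definition above, where the function name `mean_absolute_deviation` and the sample data are hypothetical.

```python
def mean_absolute_deviation(values):
    """Average distance between each data value and the mean."""
    n = len(values)
    if n == 0:
        raise ValueError("data set must not be empty")
    mean = sum(values) / n
    # Average the absolute differences from the mean.
    return sum(abs(x - mean) for x in values) / n


# Hypothetical data set: the mean is 6, the distances are 4, 2, 0, 2, 4.
data = [2, 4, 6, 8, 10]
print(mean_absolute_deviation(data))  # 12 / 5 = 2.4
```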

Concept

Standard Deviation

The standard deviation is a measure of spread of a data set that measures how much the data elements differ from the mean. The standard deviation, often represented by the Greek letter $\sigma$ (sigma), is calculated by taking the square root of the variance of the data set. Let $x_{1}, x_{2}, \dots, x_{n}$ be the data values in a set and $\overline{x}$ their mean.

$\sigma = \sqrt{\frac{(x_{1} - \overline{x})^{2} + (x_{2} - \overline{x})^{2} + \dots + (x_{n} - \overline{x})^{2}}{n}}$

The applet below calculates the standard deviation for the data set on the number line. Move the points around to change the data.

Applet that calculates the standard deviation of a set of five numbers

As shown, finding the standard deviation involves calculating the average of the squared differences between each data point and the mean, and then taking the square root of that average. The sum of squares can also be written in sigma notation.

$\sigma = \sqrt{\frac{\sum_{i=1}^{n} (x_{i} - \overline{x})^{2}}{n}}$

Standard deviation is sensitive to outliers because squaring the differences gives large deviations disproportionately more weight. It is commonly used when analyzing a data set that exhibits a normal distribution.
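The steps described above (square each difference from the mean, average the squares, take the square root) can be sketched in Python as follows. The function name `population_std_dev` and the sample data are hypothetical. The sketch divides by $n$, as the formula in this section does.

```python
import math


def population_std_dev(values):
    """Square root of the average squared difference from the mean."""
    n = len(values)
    if n == 0:
        raise ValueError("data set must not be empty")
    mean = sum(values) / n
    sum_of_squares = sum((x - mean) ** 2 for x in values)
    return math.sqrt(sum_of_squares / n)  # divide by n, then take the root


# Hypothetical data set: mean is 5, squared differences sum to 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(population_std_dev(data))  # sqrt(32 / 8) = 2.0
```

Python's standard library function `statistics.pstdev` uses the same divide-by-$n$ convention and can serve as a cross-check.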

Concept

Variance

The variance is a measure of spread of a set of data that measures how much the data elements deviate from the mean. Mathematically, the variance is the average of the squared differences between each data value $x_{i}$ and the mean $\overline{x}.$

$\frac{(x_{1} - \overline{x})^{2} + (x_{2} - \overline{x})^{2} + \dots + (x_{n} - \overline{x})^{2}}{n}$

The variance is the square of the standard deviation $\sigma,$ so it is usually denoted as $\sigma^{2}.$

$\sigma = \sqrt{\frac{(x_{1} - \overline{x})^{2} + (x_{2} - \overline{x})^{2} + \dots + (x_{n} - \overline{x})^{2}}{n}} \quad \Leftrightarrow \quad \sigma^{2} = \frac{(x_{1} - \overline{x})^{2} + (x_{2} - \overline{x})^{2} + \dots + (x_{n} - \overline{x})^{2}}{n}$
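The relationship between the variance and the standard deviation can be checked numerically with a short Python sketch. The function name `population_variance` and the sample data are hypothetical, and the sketch divides by $n$ as in the formula above.

```python
import math


def population_variance(values):
    """Average of the squared differences from the mean."""
    n = len(values)
    if n == 0:
        raise ValueError("data set must not be empty")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


# Hypothetical data set: mean is 5, squared differences sum to 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]
variance = population_variance(data)
print(variance)             # 32 / 8 = 4.0
print(math.sqrt(variance))  # standard deviation: 2.0
```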

The applet below calculates the variance in the data set on the number line. Points can be moved to change the data.
