Reference

Concept

A measure of spread is a way of quantifying how spread out, or different, the points in a data set are. Some commonly used measures of spread are range, interquartile range, and standard deviation. Measures of spread are often used together with measures of center to give an idea both of what a typical value is and how much the data can be expected to deviate from it.

Move the points around in the dot plot to generate new data. The applet identifies the range, interquartile range, mean absolute deviation, and standard deviation of the data set.

Concept

Range is a measure of spread that measures the difference between the maximum and minimum values of the data set.
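As a minimal sketch of this definition, the range can be computed in Python by subtracting the smallest value from the largest. The function name `data_range` and the sample data are made up for illustration.

```python
def data_range(values):
    """Return the range: the maximum value minus the minimum value."""
    if not values:
        raise ValueError("the range is undefined for an empty data set")
    return max(values) - min(values)


scores = [4, 7, 2, 9, 5]      # hypothetical data
print(data_range(scores))     # 9 - 2 = 7
```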

Concept

Quartiles are three values that divide a data set into four equal parts. The quartiles are denoted as $Q_{1},$ $Q_{2},$ and $Q_{3}.$ The second quartile $Q_{2},$ also known as the median, divides the ordered data set into two halves.

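$\underbrace{a \quad b \quad c}_{\text{Lower half}} \quad \underset{\underset{Q_{2}}{\uparrow}}{d} \quad \underbrace{e \quad f \quad g}_{\text{Upper half}}$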

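The median of the lower half is the first quartile $Q_{1},$ while the median of the upper half is the third quartile $Q_{3}.$

$\underbrace{a \quad \overset{\overset{Q_{1}}{\downarrow}}{b} \quad c}_{\text{Lower half}} \quad \underset{\underset{Q_{2}}{\uparrow}}{d} \quad \underbrace{e \quad \overset{\overset{Q_{3}}{\downarrow}}{f} \quad g}_{\text{Upper half}}$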

The first quartile is also called the lower quartile, and the third quartile is also called the upper quartile. To find the quartiles of a data set, the values must first be written in numerical order. The difference between the upper and lower quartiles is the interquartile range.
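The procedure described above can be sketched in Python as follows. This is one possible reading of the method, not the only convention: it assumes that when the number of values is odd, the median is left out of both halves, as in the seven-value illustration. The function name `quartiles` and the sample data are hypothetical.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method.

    Assumption: for an odd number of values, the median itself
    belongs to neither the lower half nor the upper half.
    """
    data = sorted(values)            # quartiles require ordered data
    n = len(data)
    if n < 2:
        raise ValueError("need at least two values")
    half = n // 2
    lower = data[:half]              # values before the median position
    upper = data[half + n % 2:]      # skip the median itself when n is odd
    return median(lower), median(data), median(upper)


sample = [13, 2, 8, 5, 10, 4, 7]     # hypothetical data, unordered on purpose
q1, q2, q3 = quartiles(sample)
print(q1, q2, q3)                    # 4 7 10
```

Other quartile conventions exist, and software packages may return slightly different values for the same data.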

Concept

The interquartile range, or **IQR**, of a data set is a measure of spread that measures the difference between $Q_{3}$ and $Q_{1},$ the upper and lower quartiles.

$\text{IQR}=Q_{3}-Q_{1}$
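A short Python sketch of this formula is given below. It assumes the quartiles are found as the medians of the lower and upper halves of the ordered data, with the median excluded from both halves when the number of values is odd. The function name `iqr` and the data are illustrative only.

```python
from statistics import median


def iqr(values):
    """Return the interquartile range Q3 - Q1 (median-of-halves method)."""
    data = sorted(values)
    n = len(data)
    if n < 2:
        raise ValueError("need at least two values")
    half = n // 2
    q1 = median(data[:half])         # median of the lower half
    q3 = median(data[n - half:])     # median of the upper half
    return q3 - q1


print(iqr([2, 4, 5, 7, 8, 10, 13]))  # Q3 - Q1 = 10 - 4 = 6
```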

The following applet shows how to find the IQR of different data sets.

Concept

The mean absolute deviation (MAD) is a measure of spread that describes how much the data values differ from the mean. It is the average distance between each data value and the mean. Let $x_{1},x_{2},…,x_{n}$ be the data values in a set and $\bar{x}$ their mean.

$\text{MAD}=\dfrac{|x_{1}-\bar{x}|+|x_{2}-\bar{x}|+\cdots+|x_{n}-\bar{x}|}{n}$

A large MAD value indicates that the data points deviate considerably from the mean, meaning there is significant variation within the data set. The mean absolute deviation is less sensitive to outliers than the standard deviation and the variance because it averages absolute differences rather than squared differences, so large deviations are not amplified.
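The calculation can be sketched in Python in three steps: find the mean, take the distance of each value from it, and average those distances. The function name `mean_absolute_deviation` and the data are hypothetical.

```python
def mean_absolute_deviation(values):
    """Return the average distance between each value and the mean."""
    n = len(values)
    if n == 0:
        raise ValueError("the MAD is undefined for an empty data set")
    mean = sum(values) / n
    return sum(abs(x - mean) for x in values) / n


data = [2, 4, 6, 8]                    # hypothetical data, mean = 5
print(mean_absolute_deviation(data))   # (3 + 1 + 1 + 3) / 4 = 2.0
```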

Concept

The standard deviation is a measure of spread of a data set that measures how much the data elements differ from the mean. The standard deviation, often represented by the Greek letter $σ$ (sigma), is calculated by taking the square root of the variance of the data set. Let $x_{1},x_{2},…,x_{n}$ be the data values in a set and $\bar{x}$ their mean.

$σ=\sqrt{\dfrac{(x_{1}-\bar{x})^{2}+(x_{2}-\bar{x})^{2}+\cdots+(x_{n}-\bar{x})^{2}}{n}}$

The applet below calculates the standard deviation for the data set on the number line. Move the points around to change the data.
As shown, finding the standard deviation involves calculating the average of the squared differences between each data point and the mean, and then taking the square root of that average. The sum of squares can also be written in sigma notation.

$σ=\sqrt{\dfrac{\displaystyle\sum_{i=1}^{n}(x_{i}-\bar{x})^{2}}{n}}$
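The same steps can be sketched in Python. This version divides by $n,$ as in the formula above, so it gives the population standard deviation. The function name `population_std_dev` and the data are made up for illustration.

```python
import math


def population_std_dev(values):
    """Return the square root of the average squared deviation from the mean."""
    n = len(values)
    if n == 0:
        raise ValueError("the standard deviation is undefined for an empty data set")
    mean = sum(values) / n
    # Average of the squared differences between each value and the mean
    average_squared_diff = sum((x - mean) ** 2 for x in values) / n
    return math.sqrt(average_squared_diff)


data = [2, 4, 4, 4, 5, 5, 7, 9]     # hypothetical data, mean = 5
print(population_std_dev(data))     # sqrt(32 / 8) = 2.0
```

Python's standard library offers `statistics.pstdev` for the same population formula.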

Standard deviation is sensitive to outliers because of the squaring of differences. It is commonly used when analyzing a data set that exhibits a normal distribution.

Concept

The variance is a measure of spread of a set of data that measures how much the data elements deviate from the mean. Mathematically, the variance is the average of the squared differences between each data value $x_{i}$ and the mean $\bar{x}.$

$\dfrac{(x_{1}-\bar{x})^{2}+(x_{2}-\bar{x})^{2}+\cdots+(x_{n}-\bar{x})^{2}}{n}$

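The variance is the square of the standard deviation $σ,$ so it is usually denoted as $σ^{2}.$

$σ=\sqrt{\dfrac{(x_{1}-\bar{x})^{2}+(x_{2}-\bar{x})^{2}+\cdots+(x_{n}-\bar{x})^{2}}{n}} \quad\Longleftrightarrow\quad σ^{2}=\dfrac{(x_{1}-\bar{x})^{2}+(x_{2}-\bar{x})^{2}+\cdots+(x_{n}-\bar{x})^{2}}{n}$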
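The relationship between the two measures can be checked with a small Python sketch. It divides by $n,$ matching the formula above, which is the population variance. The function name `population_variance` and the data are hypothetical.

```python
import math


def population_variance(values):
    """Return the average of the squared deviations from the mean."""
    n = len(values)
    if n == 0:
        raise ValueError("the variance is undefined for an empty data set")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


data = [2, 4, 4, 4, 5, 5, 7, 9]     # hypothetical data, mean = 5
variance = population_variance(data)
sigma = math.sqrt(variance)         # standard deviation is the square root
print(variance, sigma)              # 4.0 2.0
```

The standard-library counterpart is `statistics.pvariance`. A sample variance would divide by $n-1$ instead, which is a different convention from the one used here.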

The applet below calculates the variance in the data set on the number line. Points can be moved to change the data.