Here are some recommended readings before getting started with this lesson.
Emily and Ignacio love learning about animals. They believe they can make meaningful discoveries by studying data about any animal, beginning with cats. They choose to create a data set, consisting of seven data points, showing the lifespan of cats in their neighborhood. They surveyed their neighbors to get this information.
Lifespan of Cats (in years) | | | |
---|---|---|---|
15 | 11 | 14 | 15 |
14 | 17 | 13 | |
Answer the following questions using this data set.
A data set is a collection of values that provides information. These values can take various forms, such as numbers or categories. The values are typically gathered through measurements, surveys, or experiments. Consider a data set that consists of the heights of a group of actors.
Actor | Height |
---|---|
Madzia | 5 ft 4 in. |
Magda | 5 ft 2 in. |
Ignacio | 6 ft 1.6 in. |
Henrik | 5 ft 10 in. |
Ali | 6 ft 1 in. |
Diego | 5 ft 2 in. |
Miłosz | 5 ft 2 in. |
Paulina | 5 ft 3 in. |
Aybuke | 5 ft 7 in. |
Mateusz | 6 ft 1.2 in. |
Gamze | 5 ft 3 in. |
Marcin | 5 ft 7 in. |
Marcial | 5 ft 8 in. |
Heichi | 5 ft 5 in. |
Arkadiusz | 5 ft 6 in. |
Enrique | 5 ft 10.5 in. |
Aleksandra | 5 ft 4 in. |
Mateusz | 5 ft 9 in. |
Jordan | 5 ft 5 in. |
Paula | 5 ft 2 in. |
MacKenzie | 5 ft 6 in. |
Joe | 6 ft 1 in. |
Flavio | 5 ft 10 in. |
Jeremy | 5 ft 4 in. |
Umut | 6 ft 1 in. |
The mean, or the average, of a numerical data set is one of the measures of center. It is defined as the sum of all of the data values in a set divided by the number of values in the set.
$$\text{Mean} = \frac{\text{Sum of Values}}{\text{Number of Values}}$$
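As a quick check of this formula, the following Python sketch computes the mean of the cat lifespans from the table at the start of the lesson. The names `mean` and `cat_lifespans` are illustrative and not part of the lesson.

```python
def mean(values):
    """Return the sum of the values divided by the number of values."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)

# Cat lifespans (in years), copied from the table earlier in the lesson.
cat_lifespans = [15, 11, 14, 15, 14, 17, 13]

cat_mean = mean(cat_lifespans)
print(round(cat_mean, 2))  # 99 / 7, about 14.14
```

The sum of the seven values is 99, so the mean is 99 divided by 7, or roughly 14.14 years.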
The following applet calculates the mean of the data set on the number line. Points can be moved to change the data values.
Ignacio volunteers at a dog shelter. He asks Emily to help him study a data set he made concerning the lifespan of some of the dogs. The information they gather will help the shelter!
This time, the data set consists of eight data points rather than seven.
Lifespan of Dogs (in years) | | | |
---|---|---|---|
10 | 21 | 16 | 15 |
13 | 15 | 17 | 11 |
Substitute values
Add terms
Calculate quotient
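The worked solution behind these three steps is not shown here. Assuming they refer to finding the mean of the dog lifespans, they could be sketched in Python as follows. The variable names are illustrative.

```python
# Dog lifespans (in years), copied from the table above.
dog_lifespans = [10, 21, 16, 15, 13, 15, 17, 11]

# Substitute values and add terms: the sum of all eight data values.
total = sum(dog_lifespans)

# Calculate quotient: divide the sum by the number of values.
dog_mean = total / len(dog_lifespans)

print(total, dog_mean)  # 118 14.75
```

Under that assumption, the mean lifespan of the dogs in the data set is 14.75 years.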
Just as measures of center summarize a data set with a single value, measures of spread describe with a single value how much the values in a data set differ from each other.
The range is a measure of spread equal to the difference between the maximum and minimum values of the data set.
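For illustration, here is a minimal Python sketch that applies this definition to the cat and dog lifespans from the tables above. The function name `data_range` is an assumption, chosen to avoid clashing with Python's built-in `range`.

```python
def data_range(values):
    """Return the difference between the largest and smallest value."""
    return max(values) - min(values)

# Lifespans (in years), copied from the tables in this lesson.
cat_lifespans = [15, 11, 14, 15, 14, 17, 13]
dog_lifespans = [10, 21, 16, 15, 13, 15, 17, 11]

print(data_range(cat_lifespans))  # 17 - 11 = 6
print(data_range(dog_lifespans))  # 21 - 10 = 11
```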
The interquartile range, or IQR, of a data set is a measure of spread equal to the difference between Q3 and Q1, the upper and lower quartiles.
IQR=Q3−Q1
The following applet shows how to find the IQR of different data sets.
First, identify the median of the given data set. Since the number of values is even, the median is the mean of the two middle values.
The median of the data is 6.
The median divides the data into two halves, a lower half and an upper half. For this data, the lower half includes the first six values and the upper half includes the following six.
When there is an odd number of values in the data set, the middle value is excluded from both the lower and upper sets.
Find the first and third quartiles. The first quartile, Q1, is the median of the lower set, while the third quartile, Q3, is the median of the upper set. Here, both quartiles are found the same way the median was found.
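The steps described above can be sketched in Python as follows, using the cat and dog lifespans from earlier in the lesson as sample data. The function names are illustrative. The sketch follows the convention stated in this lesson, where the middle value is left out of both halves when the number of values is odd. Other quartile conventions exist and can give slightly different results.

```python
def median(sorted_values):
    """Median of a list that is already ordered from least to greatest."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]  # odd count: the middle value
    # even count: the mean of the two middle values
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2

def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves."""
    if len(values) < 2:
        raise ValueError("at least two values are needed to find quartiles")
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # With an odd count, skip the middle value so it is in neither half.
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return median(lower), median(upper)

def iqr(values):
    """Interquartile range: Q3 minus Q1."""
    q1, q3 = quartiles(values)
    return q3 - q1

cat_lifespans = [15, 11, 14, 15, 14, 17, 13]
dog_lifespans = [10, 21, 16, 15, 13, 15, 17, 11]

print(quartiles(cat_lifespans), iqr(cat_lifespans))  # (13, 15) 2
print(quartiles(dog_lifespans), iqr(dog_lifespans))  # (12.0, 16.5) 4.5
```

For the seven cat lifespans, the middle value 14 is excluded, leaving lower and upper halves of three values each. For the eight dog lifespans, each half has four values, so each quartile is the mean of two middle values.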
Here, it is necessary to order the values from least to greatest. Then identify the median of the given data set. Since the number of values is an odd number, the median is the middle value.
The median of the data is 9. Both the lower and upper halves contain four data values. Therefore, there are two middle values in each half. The median of each half is the mean of the two middle values.
Q1=11, Q3=