Outliers

There are 13 people in a nightclub. Their average age is 30 years. If a patron must be at least 21 years old to enter the nightclub, what is the lowest number of people that must enter the nightclub for the average age to drop to under 25 years?

Name	Height	Height in inches
Jahmani Hot Shot Swanson	4'5''	12* 4+5=53
Jonte Too Tall Hall	5'2''	12* 5+2=62
Donald Ducky Moore	6'0''	12* 6+0=72
Solomon Bam Bam Bamiro	6'5''	12* 6+5=77
Sean Elevator Williams	6'10''	12* 6+10=82
Paul Tiny Sturgess	7'8''	12* 7+8=92

	Mean	Median	Standard Deviation	Interquartile Range
Data A	20.15	20	5.13	7
Data B	25.36	28	8.54	10
Data C	20.58	21	9.92	19
Data D	7.26	6	7.52	3
Data E	21.05	22	11.57	20

Data's Distribution	Some Observations	Preferred Statistics
Symmetric Distribution	Both the mean and the median are close to the center.	If the histogram has one peak, then both mean and median describe a typical data value. In this case, the preferred statistic is the mean, since it also considers the actual data values, not just the order.
Skewed Distribution or With Outliers	The extreme values can distort the mean and increase the standard deviation.	In this case, the preferred statistic to describe a typical data value is the median, since it only considers the order of the data. Furthermore, it is less sensitive to extreme values.

Possible Reason	Example
It can be a result of a data recording error. If this is obvious, then this data can be removed or modified.	Suppose a scientist records the length of some leaves from a tree in centimeters, measured to the nearest millimeter. In that case, a typical data entry has one decimal place. Yet, if the value 104 appears in the data, then it is likely that it was mistakenly recorded and meant to be recorded as 10.4.
The nature of the data is such that unusual entries can occur. In this case, this entry should also be considered. Still, usually, the median better describes a typical data value than the mean.	When looking at a data set of the heights of Star Wars characters, one should expect to see a few extremely low and high values.

Outliers

Catch-Up and Review

Effects of One Data Value's Change on a Data Set

Outlier

Extra

Identifying Outliers in a Data Set

Hint

Solution

Extra

Analyzing the Effect of Outliers on the Mean

Answer

Hint

Solution

Extra

Statistics of Data with Different Histogram Shapes

Hint

Solution

Extra

Connection Between Different Data Distributions and Best Statistics

Choosing the Best Statistics for Data Sets

Why Do Outliers Appear?

Extra

Tiffaniqua's Claim

Ramsha's Claim

Outliers

Recommended exercises

	9 Theory slides
	8 Exercises - Grade E - A
	Each lesson is meant to take 1-2 classroom sessions