When collecting real-life data, there are many cases where most of the data values cluster close to the mean of the set. Consider, for example, men's shoe sizes.

Most of the data is grouped next to the mean value, which is $9.$ Therefore, a man who wears a size $9.5$ shoe is more likely to be randomly selected than a man who wears a size $11.5$ shoe. When a data set is distributed this way and the domain of the distribution is continuous (not discrete), it is said that the data is *normally distributed*. This lesson explores this distribution.

Challenge

Kevin has a summer internship at a tech company in his town. The daily number of calls that the company receives is normally distributed with a mean of $2240$ calls and a standard deviation of $150$ calls. The graph represents the distribution of the data.

Looking to make improvements in the company, Kevin's boss is interested in knowing the answers to the next couple of questions.

a What is the probability that more than $2540$ calls are received on a random day?

Answer: $0.025$

b What is the probability that between $2300$ and $2420$ calls are received on a random day? Round the answer to two decimal places.

Answer: $≈0.23$

Discussion

When dealing with probability distributions, there is one type that stands out above the rest because it is very common in real-life scenarios such as people's heights, shoe sizes, birth weights, average grades, IQ levels, and many other qualities. Because of this regularity, this type of distribution is called the *normal distribution*.

Concept

A normal distribution is a type of probability distribution where the mean, the median, and the mode are all equal to each other. The graph that represents a normal distribution is called a normal curve and it is a continuous, bell-shaped curve that is symmetric with respect to the mean $μ$ of the data set.

This type of distribution is the most common continuous probability distribution that can be observed in real life. When a normal distribution has a mean of $0$ and standard deviation of $1,$ it is called a standard normal distribution.

The total area under the normal curve is $100%,$ or $1.$ Because of this, the area under the normal curve in a certain interval represents the percentage of data within that interval or the probability of randomly selecting a value that belongs to that interval. The Empirical Rule can be used to determine the area under the normal curve at specific intervals. It is also worth noting that not all data sets are normally distributed. If the mean and median are not equal, then the data set is skewed.

Concept

In statistics, the Empirical Rule, also known as the $68–95–99.7$ **rule**, is a shorthand used to remember the percentage of values that lie within certain intervals in a normal distribution. The rule states the following three facts.

- About $68%$ of the values lie within one standard deviation of the mean.
- About $95%$ of the values lie within two standard deviations of the mean.
- About $99.7%$ of the values lie within three standard deviations of the mean.

According to this rule, almost all the values observed lie within three standard deviations of the mean. For this reason, the rule is also called the three-sigma rule.
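The three percentages in the rule can be checked numerically. The following is a minimal sketch, assuming the standard normal cumulative distribution function is computed from Python's `math.erf`; the helper names `phi` and `within` are illustrative and not part of the lesson.

```python
import math


def phi(z: float) -> float:
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def within(k: float) -> float:
    """Probability of a value lying within k standard deviations of the mean."""
    return phi(k) - phi(-k)


for k in (1, 2, 3):
    # Prints values close to 68%, 95%, and 99.7%.
    print(f"within {k} standard deviation(s): {within(k) * 100:.2f}%")
```

The printed values are slightly more precise than the rounded figures in the rule, which is why the rule says "about" in each statement.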

Example

In his spare time, Kevin works with the *Less Chat, More Talk* campaign to encourage people to share with their loved ones in person instead of through screens. He wants to give away T-shirts with a cool logo outside a shopping mall to help spread this message.

Kevin is in charge of preparing the men's T-shirts, but he does not know how many of each size he should order. To figure it out, he searched the City Hall website and he found that the heights of the men in the city are normally distributed with a mean of $183$ centimeters and a standard deviation of $5$ centimeters. Along with this information, there was also a graph.

a What is the range of the heights that represent the middle $68%$ of the distribution? Write the answer as a strict compound inequality.

Range: $178<X<188$

b What percent of the surveyed men are shorter than $173$ centimeters?

Answer: $2.5%$

c If $3000$ men participated in the survey, how many of them are between $188$ and $193$ centimeters tall?

Answer: $405$

a Use the Empirical Rule to determine the corresponding percentage.

b Use the Empirical Rule.

c Use the Empirical Rule to find the percent. Then, multiply it by the total number of men surveyed.

a According to the Empirical Rule, the middle $68%$ of the data in a normal distribution falls in the range that starts one standard deviation to the left of the mean and ends one standard deviation to the right of the mean.

$\text{Middle } 68%: \quad μ−σ<X<μ+σ$

According to the website, the mean is $183$ and the standard deviation is $5.$ Therefore, $μ=183$ and $σ=5.$

$\text{Middle } 68%: \quad 183−5<X<183+5 ⇔ 178<X<188$

Consequently, the middle $68%$ of the distribution represents the range of heights from $178$ to $188$ centimeters.

b Start by highlighting the corresponding interval in the graph. Note that $173$ is two standard deviations to the left of the mean.

$μ−2σ=183−2(5)=173$

According to the Empirical Rule, $95%$ of the data fall between $μ−2σ$ and $μ+2σ.$ It is known that the value of $μ−2σ$ is $173.$ Calculate the value of $μ+2σ.$

$μ+2σ=183+2(5)=193$

Therefore, $95%$ of the data fall between $173$ and $193.$ This implies that $5%$ of the data fall outside this range. Due to the symmetry of the normal curve, $2.5%$ of the data fall to the left of $173$ and $2.5%$ of the data fall to the right of $193.$ Consequently, $2.5%$ of the men surveyed are shorter than $173$ centimeters.

c The graph below shows the percentages represented by each interval according to the Empirical Rule.

According to the graph, $13.5%$ of the surveyed men are between $188$ and $193$ centimeters tall.

To find the number of men that belong to this range, multiply the corresponding percentage by the total number of men that participated in the survey.

$13.5%(3000)=\dfrac{13.5}{100}(3000)=\dfrac{40500}{100}=405$
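The three parts of this example can be reproduced with a small lookup of the Empirical Rule bands. This is only a sketch: the dictionary `BANDS` and the helper `percent_between` are hypothetical names, and the band percentages are the rounded values from the rule ($34%,$ $13.5%,$ and $2.35%$).

```python
# Empirical Rule bands as percentages, keyed by the number of standard
# deviations at the left and right edge of each band.
BANDS = {
    (-3, -2): 2.35,
    (-2, -1): 13.5,
    (-1, 0): 34.0,
    (0, 1): 34.0,
    (1, 2): 13.5,
    (2, 3): 2.35,
}


def percent_between(k_low: int, k_high: int) -> float:
    """Percent of data between mu + k_low*sigma and mu + k_high*sigma."""
    return sum(p for (a, b), p in BANDS.items() if k_low <= a and b <= k_high)


mu, sigma, surveyed = 183, 5, 3000

# Part a: the middle 68% runs from one sigma below to one sigma above the mean.
middle_68 = (mu - sigma, mu + sigma)

# Part b: half of the data lies below the mean; remove the part above 173.
below_173 = 50 - percent_between(-2, 0)

# Part c: 188 and 193 are one and two sigmas above the mean.
count_188_to_193 = percent_between(1, 2) * surveyed / 100

print(middle_68, below_173, count_188_to_193)
```

Working with whole numbers of standard deviations keeps the sketch within the reach of the Empirical Rule, which only covers intervals whose endpoints are labels on the axis.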

Discussion

Given a normal distribution, its graph can be drawn by hand. For example, consider a normally distributed data set with mean $μ=10$ and standard deviation $σ=2.$

Such a distribution can be drawn by following the next three steps.

1

Place the Mean on a Horizontal Axis

First, draw a horizontal axis and mark the mean of the data in the middle. In this case, the mean is $10.$

2

Find and Add More Labels

Find more labels to write on the axis such that each interval is one standard deviation long. In this case, the intervals must be $2$ units long. To accomplish this, add and subtract multiples of the standard deviation to and from the mean.

| Labels to the Left of the Mean | Labels to the Right of the Mean |
|---|---|
| $10−1⋅2=8$ | $10+1⋅2=12$ |
| $10−2⋅2=6$ | $10+2⋅2=14$ |
| $10−3⋅2=4$ | $10+3⋅2=16$ |

Adding three labels to each side of the mean is enough.

3

Draw the Normal Curve

Lastly, draw a bell-shaped curve with its peak at the mean. Remember, the curve is symmetric with respect to the mean. In this case, the peak occurs at $10.$
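The labels from the second step can also be generated programmatically. The sketch below uses a hypothetical helper, `axis_labels`, that adds and subtracts multiples of the standard deviation to and from the mean.

```python
def axis_labels(mean, sd, n=3):
    """Return the axis labels from mean - n*sd to mean + n*sd, one sd apart."""
    return [mean + k * sd for k in range(-n, n + 1)]


# Labels for the example with mean 10 and standard deviation 2.
print(axis_labels(10, 2))  # [4, 6, 8, 10, 12, 14, 16]
```

The default of three labels on each side matches the observation that three standard deviations cover almost all the data.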

Example

While reading some statistics about the people in the city, Kevin was surprised to learn that the weights of newborns are also normally distributed. He found the following information given by the local hospital.

a Graph the normal distribution labeling all the intervals and percentages.

b What percent of the newborn babies weigh $6.3$ pounds or more?

a

b $84%$

a Start by drawing the axis and placing the mean in the middle. Determine the standard deviation. Write labels so that the length of each interval is $1$ standard deviation. Then, draw the normal curve and use the Empirical Rule to label the percents.

b Identify $6.3$ in the graph from Part A. Shade the region to the right of $6.3$ pounds and add the corresponding percentages.

a To graph a normal distribution, draw a horizontal axis and place the mean of the data in the middle. According to the given information, the mean weight is $μ=7$ pounds.

$7−σ=6.3 ⇔ σ=0.7$ and $7+σ=7.7 ⇔ σ=0.7$

The standard deviation of the weights of the babies is $0.7$ pounds, so on the axis, write labels to the left and right of the mean such that each interval is $0.7$ units long.
Next, draw the normal curve — a bell-shaped curve that is symmetric with respect to the mean, where it has its peak.

According to the Empirical Rule, the percentages below the curve are distributed as follows.

- About $68%$ of the data fall between $6.3$ and $7.7$ pounds.
- About $95%$ of the data fall between $5.6$ and $8.4$ pounds.
- About $99.7%$ of the data fall between $4.9$ and $9.1$ pounds.

The percentages in every interval can be labeled by using the symmetry of the curve. This will complete the diagram of the distribution.

b The percent of newborn babies that weigh $6.3$ pounds or more corresponds to the region below the normal curve that is to the right of $6.3.$ Therefore, to calculate the percent of newborns that weigh $6.3$ pounds or more, highlight this part of the graph.

$34+34+13.5+2.35+0.15=84 $

Consequently, $84%$ of the newborn babies weigh $6.3$ pounds or more.
Discussion

The height of people is usually normally distributed. For example, the average height of a woman in the United States is about $162.5$ centimeters. Assuming a standard deviation of $2.5$ centimeters, the graph of this distribution looks as follows.

The Empirical Rule is used to determine the percentage of data that falls between any two labels on the axis. However, what if the endpoints of the interval are different from the labels? For example, what is the percentage of women that are shorter than $166$ centimeters?

To find such a percentage, the first step is converting the data value into its corresponding $z-$*score*.

Concept

The $z-$score, also known as the $z-$value, represents the number of standard deviations that a given value $x$ is from the mean of a data set. The following formula can be used to convert any $x-$value into its corresponding $z-$score.

$z=\dfrac{x−μ}{σ}$

For the height example, substitute $x=166,$ $μ=162.5,$ and $σ=2.5$ into the formula.

$z=\dfrac{166−162.5}{2.5} ⇔ z=1.4$

This means that the value $166$ is $1.4$ standard deviations to the right of $162.5.$ Once the corresponding $z-$score is known, the area below the curve that is to the left of this value can be found using a standard normal table.

Method

Consider a standard normal distribution and a randomly chosen $z-$score. The area below the normal curve that is to the left of this $z-$score can be calculated using a standard normal table. For example, consider $z=0.6.$

1

Locate the Whole Part of the $z-$Score

In the left column of the standard normal table, locate the whole part of the $z-$score. Since $z=0.6$ is positive, look at the four bottom rows. Because the whole part of $z$ is $0,$ shade the fifth row.

| | $.0$ | $.1$ | $.2$ | $.3$ | $.4$ | $.5$ | $.6$ | $.7$ | $.8$ | $.9$ |
|---|---|---|---|---|---|---|---|---|---|---|
| $-3$ | $.00135$ | $.00097$ | $.00069$ | $.00048$ | $.00034$ | $.00023$ | $.00016$ | $.00011$ | $.00007$ | $.00005$ |
| $-2$ | $.02275$ | $.01786$ | $.01390$ | $.01072$ | $.00820$ | $.00621$ | $.00466$ | $.00347$ | $.00256$ | $.00187$ |
| $-1$ | $.15866$ | $.13567$ | $.11507$ | $.09680$ | $.08076$ | $.06681$ | $.05480$ | $.04457$ | $.03593$ | $.02872$ |
| $-0$ | $.50000$ | $.46017$ | $.42074$ | $.38209$ | $.34458$ | $.30854$ | $.27425$ | $.24196$ | $.21186$ | $.18406$ |
| $0$ | $.50000$ | $.53983$ | $.57926$ | $.61791$ | $.65542$ | $.69146$ | $.72575$ | $.75804$ | $.78814$ | $.81594$ |
| $1$ | $.84134$ | $.86433$ | $.88493$ | $.90320$ | $.91924$ | $.93319$ | $.94520$ | $.95543$ | $.96407$ | $.97128$ |
| $2$ | $.97725$ | $.98214$ | $.98610$ | $.98928$ | $.99180$ | $.99379$ | $.99534$ | $.99653$ | $.99744$ | $.99813$ |
| $3$ | $.99865$ | $.99903$ | $.99931$ | $.99952$ | $.99966$ | $.99977$ | $.99984$ | $.99989$ | $.99993$ | $.99995$ |

The probability that corresponds to a $z-$score for which the integer part is $0$ appears in the shaded row.

2

Locate the Decimal Part of the $z-$Score

In the top row of the standard normal table, locate the decimal part of the $z-$score. Here, the decimal part is $6.$ Consequently, shade the seventh column.


3

Identify the Intersecting Cell

The shaded row and column intersect at $0.72575.$ Therefore, the percentage of data that is less than or equal to $0.6$ is $72.575%.$ This means that the probability that a value chosen at random is less than or equal to $0.6$ is $0.72575.$

$P(z≤0.6)=0.72575 $
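A table entry can be cross-checked by computing the area directly. The following sketch assumes the area to the left of a $z-$score is evaluated with the error function from Python's `math` module; the name `phi` is illustrative.

```python
import math


def phi(z: float) -> float:
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Compare computed areas with the corresponding cells of the table.
for z in (0.6, 1.4, -1.2):
    print(f"P(z <= {z}) is about {phi(z):.5f}")
```

Rounded to five decimal places, the computed areas agree with the cells $.72575,$ $.91924,$ and $.11507$ of the table.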

Other areas can also be found using the same standard normal table.

To find the area below the normal curve and between two $z-$scores, subtract the area to the left of the smaller $z-$score from the area to the left of the greater $z-$score.

The area to the right of a $z-$score is the complement of the area to the left of the same $z-$score.

Since the area under the normal curve represents a probability, by the Complement Rule, these two probabilities add up to $1.$

$P(z>z_{1})+P(z≤z_{1})=1$

Therefore, the area to the right of a $z-$score is the difference of $1$ and the area to the left of the $z-$score. $P(z>z_{1})=1−P(z≤z_{1})$

Returning to the height example, the standard normal table shows that the probability that a randomly selected value is less than or equal to $1.4$ is $0.91924.$ Therefore, about $91.92%$ of women are $166$ centimeters tall or shorter.
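The three kinds of areas described in this method (to the left, between two $z-$scores, and to the right) can be sketched with `statistics.NormalDist` from Python's standard library. The function names below are hypothetical, chosen only for this illustration.

```python
from statistics import NormalDist

STANDARD = NormalDist()  # mean 0, standard deviation 1


def z_score(x: float, mu: float, sigma: float) -> float:
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma


def area_left(z: float) -> float:
    return STANDARD.cdf(z)


def area_between(z_low: float, z_high: float) -> float:
    # Area left of the greater z-score minus area left of the smaller one.
    return STANDARD.cdf(z_high) - STANDARD.cdf(z_low)


def area_right(z: float) -> float:
    # Complement Rule: the two areas add up to 1.
    return 1 - STANDARD.cdf(z)


z = z_score(166, 162.5, 2.5)
print(z, area_left(z))
```

For the height example, the computed $z-$score is $1.4$ and the area to its left is close to the table value $0.91924.$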

Example

Kevin has become a stats fan. He has recorded the time it takes him to commute to his internship over the past few days. He observes that the times are normally distributed with a mean of $17$ minutes and a standard deviation of $2.5$ minutes.

Find the following probabilities and write them in decimal form rounded to two decimal places.

a What is the probability that Kevin's commute tomorrow will take less than $14$ minutes?

Answer: $≈0.12$

b What is the probability that Kevin's commute will take between $16$ and $19$ minutes next Monday?

Answer: $≈0.44$

c Kevin starts work every day at $8:00AM.$ One day he leaves his house at $7:41AM.$ What is the probability that Kevin will be late for work this day?

Answer: $≈0.21$

a Draw the normal distribution curve. If $14$ is not one of the labels on the axis, then convert it into a $z-$score. Use a standard normal table to find the probability that a random value is less than the corresponding $z-$score.

b Convert each value to its corresponding $z-$score. To find the desired probability, subtract the probability that a random value is less than the smaller $z-$score from the probability that a random value is less than the larger $z-$score.

c How much time does Kevin have on this day to get from his house to work on time? Convert that time into a $z-$score. The probability of Kevin being late is equal to the probability of a random value being greater than that $z-$score.

a Start by drawing the normal distribution curve. According to Kevin, the mean time it takes him to get to work is $17$ minutes, so this value should be located in the middle of the axis. The standard deviation is $2.5$ minutes. The labels of the axis are found by adding and subtracting integer multiples of the standard deviation to and from the mean.

The probability that Kevin spends less than $14$ minutes getting to work tomorrow is represented by the area below the curve that is to the left of $14.$

Since $14$ is not a label on the axis, the Empirical Rule cannot be used. Therefore, to find the area, first convert $x=14$ into its corresponding $z-$score.

$z=\dfrac{x−μ}{σ}=\dfrac{14−17}{2.5}=\dfrac{-3}{2.5}=-1.2$

In the standard normal table, the row $-1$ and the column $.2$ intersect at $.11507.$

According to the table, the probability that tomorrow Kevin will spend less than $14$ minutes traveling to work is about $0.12.$

b In the graph from Part A it can be seen that neither $16$ nor $19$ is a label on the axis.

Therefore, both values need to be converted into their corresponding $z-$scores first. Recall that $μ=17$ and $σ=2.5.$

| $x-$value | Substitute into $z=\dfrac{x−μ}{σ}$ | Simplify |
|---|---|---|
| $16$ | $z=\dfrac{16−17}{2.5}$ | $z=-0.4$ |
| $19$ | $z=\dfrac{19−17}{2.5}$ | $z=0.8$ |

$P(-0.4<z<0.8)=P(z<0.8)−P(z<-0.4) $

Each of these probabilities can be found using the standard normal table. The row $0$ and the column $.8$ intersect at $.78814,$ and the row $-0$ and the column $.4$ intersect at $.34458.$

$P(-0.4<z<0.8)=0.78814−0.34458=0.44356≈0.44$

c In order for Kevin to be on time, his commute cannot take more than $19$ minutes. In other words, he will be late for work if it takes more than $19$ minutes. This means that the probability that Kevin will be late for work that day is represented by the area below the normal curve that is to the right of $19.$

Since $19$ is not a label on the axis, the Empirical Rule cannot be used. Therefore, $z-$scores must be used to find the area. In Part B it was determined that the $z-$score that corresponds to $19$ is $0.8.$

| Probability of Kevin Being Late | Probability of Kevin Being on Time |
|---|---|
| $P(z>0.8)$ | $P(z≤0.8)$ |

$P(z>0.8)+P(z≤0.8)=1 $

It was also determined in Part B that $P(z≤0.8)$ is $0.78814.$ Substitute this value into the equation above and solve for $P(z>0.8).$
$P(z>0.8)+0.78814=1$

$P(z>0.8)=1−0.78814=0.21186$

$P(z>0.8)≈0.21$
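The three probabilities of this example can be cross-checked without a table. The sketch below assumes the commute times follow a normal distribution with the stated mean and standard deviation, and it uses `statistics.NormalDist` from Python's standard library; the variable names are illustrative.

```python
from statistics import NormalDist

# Commute times in minutes: mean 17, standard deviation 2.5.
commute = NormalDist(mu=17, sigma=2.5)

# Part a: area to the left of 14 minutes.
p_under_14 = commute.cdf(14)

# Part b: area between 16 and 19 minutes.
p_16_to_19 = commute.cdf(19) - commute.cdf(16)

# Part c: leaving at 7:41 gives 19 minutes, so Kevin is late if the
# commute takes longer than that (area to the right of 19).
p_late = 1 - commute.cdf(19)

for name, p in (("a", p_under_14), ("b", p_16_to_19), ("c", p_late)):
    print(f"Part {name}: {p:.5f}")
```

Rounded to two decimal places, these come out to about $0.12,$ $0.44,$ and $0.21.$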

Example

The company Kevin is interning with plans to release a new smartphone. He goes with the research team to a stadium with a prototype to let different people use the phone in order to determine what features and design people like.

After comparing and contrasting size preference with the ages of the participants, Kevin realizes that the data is normally distributed. Additionally, he notices that the middle $46%$ of participants prefer a larger phone.

a Find the $z-$scores that correspond to the limits of the ages of the middle $46%$ of people, those that prefer a larger phone. Write the limits from least to greatest, rounded to one decimal place.

Answer: $z=-0.6$ and $z=0.6$

b The mean age of the participants was $19$ years old and the standard deviation $5.$ With this information, determine the range of the ages that represent the middle $46%$ of the distribution. Write the answer as a strict compound inequality.

Age Range: $16<X<22$

a Find the area that is to the left of the middle $46%.$ Use a standard normal table to find the $z-$score that produces that area. Because a normal distribution is symmetrical, the upper bound is the opposite of the lower bound.

b Convert the $z-$scores found in Part A into their original values.

a Let $z_{1}$ and $z_{2}$ be the lower and upper limits of the middle $46%$ of the data. To find the corresponding values, start by finding the percentage of data outside the middle area. To do so, subtract $46%$ from $100%,$ which gives $54%.$

Due to the symmetry of the normal curve, the area to the left of $z_{1}$ is equal to the area to the right of $z_{2}.$ Therefore, each portion corresponds to $54%÷2=27%$ of the data. For the moment, focus on the area to the left of $z_{1}.$

According to the last graph, the probability that a randomly chosen value is less than $z_{1}$ is $0.27.$ In other words, $P(z<z_{1})=0.27.$ Now, look for the $z-$value that produces a probability of $0.27$ on a standard normal table.

In the standard normal table, the entry closest to $0.27$ is $.27425,$ which is located in the row $-0$ and the column $.6.$ This means that $z_{1}=-0.6.$ Again, due to symmetry, $z_{2}$ is the opposite of $z_{1},$ so $z_{2}=0.6.$

Therefore, the limits of the middle $46%$ of the data are $z=-0.6$ and $z=0.6.$

b In Part A it was determined that the limits of the middle $46%$ of the data are $-0.6$ and $0.6.$

To convert these $z-$scores back into ages, solve the $z-$score formula for $x.$

$z=\dfrac{x−μ}{σ} ⇔ x=μ+z⋅σ$

Substitute $μ=19,$ $σ=5,$ and each $z-$score into this equation.

For $z=-0.6$: $x=19+(-0.6)(5)=19−3=16$

For $z=0.6$: $x=19+(0.6)(5)=19+3=22$

$16<X<22$
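The reverse lookup (from an area to a $z-$score, then back to a data value) can be sketched with the inverse cumulative distribution function of `statistics.NormalDist`. The helper `middle_bounds` is a hypothetical name. Note that the unrounded $z-$score for an area of $0.27$ is roughly $-0.61$; the sketch rounds it to one decimal place to match the precision of the lesson's table.

```python
from statistics import NormalDist


def middle_bounds(fraction: float, mu: float, sigma: float, digits: int = 1):
    """Return the z-score limits and data limits of the middle `fraction`."""
    tail = (1 - fraction) / 2  # area to the left of the lower limit
    z_low = round(NormalDist().inv_cdf(tail), digits)
    z_high = -z_low  # symmetry of the normal curve
    # Convert each z-score back with x = mu + z * sigma.
    return (z_low, z_high), (mu + z_low * sigma, mu + z_high * sigma)


z_limits, age_limits = middle_bounds(0.46, mu=19, sigma=5)
print(z_limits, age_limits)
```

Keeping more decimal places in the $z-$score (a larger `digits`) would give slightly wider age limits than the table-based answer.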

Discussion

## Standardization of Normal Distribution

One interesting property of a normal distribution is that it can take any value as its mean and any positive value as its standard deviation. Because of this, comparing two normally distributed data sets has to be done carefully. Otherwise, erroneous conclusions can be drawn.

Additionally, the probability of an event happening is the area below the curve. Since there are infinitely many possible curves, the process for finding a certain probability changes between different normal distributions. However, through the use of $z-$scores, any normal distribution can be standardized, allowing the use of a standard normal table to find any probability.

This process is called standardization and allows objective comparison of data sets that are normally distributed but have different means and standard deviations. Furthermore, it makes it possible to use the same standard normal table to calculate probabilities for any normal distribution.

Method

Any normal distribution with mean $μ$ and standard deviation $σ$ can be converted into a standard normal distribution. For example, consider a normal distribution with $μ=35$ and $σ=1.22.$ To standardize the distribution, all its values have to be converted into their corresponding $z-$scores.

Since the domain is continuous, the conversion cannot be manually done for all the values. However, for illustrative purposes, it will be performed for the data set $\{33, 34, 34, 35, 35, 35, 36, 36, 37\}.$ Two steps will be followed.

1

Subtract the Mean From Each Data Value

First, shift all the values so that the mean of the new set is $0.$ To do this, subtract the mean $35$ from each data value.

$x$ | $x−μ$ |
---|---|
$33$ | $33−35$ |
$34$ | $34−35$ |
$34$ | $34−35$ |
$35$ | $35−35$ |
$35$ | $35−35$ |
$35$ | $35−35$ |
$36$ | $36−35$ |
$36$ | $36−35$ |
$37$ | $37−35$ |

Notice that translating the values does not change the standard deviation. The standard deviation of the new data set is still $1.22.$

The initial data set has been converted into $\{-2, -1, -1, 0, 0, 0, 1, 1, 2\}.$
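This first step can be reproduced with a short sketch. It assumes Python's standard `statistics` module, where `stdev` computes the sample standard deviation; for this data set that value rounds to $1.22.$ The variable names are illustrative.

```python
from statistics import mean, stdev

data = [33, 34, 34, 35, 35, 35, 36, 36, 37]

mu = mean(data)                     # mean of the original data set
shifted = [x - mu for x in data]    # subtract the mean from each value

print(shifted)
print(mean(shifted))
# The spread is unchanged by the shift
print(round(stdev(data), 2), round(stdev(shifted), 2))
```

The shifted values are centered at $0,$ while the standard deviation is the same before and after the translation.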

2

Divide the Results by the Standard Deviation

To obtain a data set with a standard deviation of $1,$ divide the values obtained in the previous step by the standard deviation of the set.

$x$ | $\dfrac{x-\mu}{\sigma}$ | $z-$Score |
---|---|---|
$33$ | $\dfrac{33-35}{1.22}$ | $-1.64$ |
$34$ | $\dfrac{34-35}{1.22}$ | $-0.82$ |
$34$ | $\dfrac{34-35}{1.22}$ | $-0.82$ |
$35$ | $\dfrac{35-35}{1.22}$ | $0$ |
$35$ | $\dfrac{35-35}{1.22}$ | $0$ |
$35$ | $\dfrac{35-35}{1.22}$ | $0$ |
$36$ | $\dfrac{36-35}{1.22}$ | $0.82$ |
$36$ | $\dfrac{36-35}{1.22}$ | $0.82$ |
$37$ | $\dfrac{37-35}{1.22}$ | $1.64$ |

After the standardization, the new data set is $\{-1.64, -0.82, -0.82, 0, 0, 0, 0.82, 0.82, 1.64\}.$ Here, the mean is $0$ and the standard deviation is $1.$
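The two steps combined can be sketched as follows. This assumes Python's standard `statistics` module and uses the rounded values $μ=35$ and $σ=1.22$ given above; because of that rounding, the standard deviation of the $z-$scores comes out as $1$ only after rounding.

```python
from statistics import mean, stdev

data = [33, 34, 34, 35, 35, 35, 36, 36, 37]
mu, sigma = 35, 1.22   # mean and standard deviation as given in the lesson

# z-score of each value: subtract the mean, then divide by the standard deviation
z_scores = [round((x - mu) / sigma, 2) for x in data]

print(z_scores)   # [-1.64, -0.82, -0.82, 0.0, 0.0, 0.0, 0.82, 0.82, 1.64]
print(round(mean(z_scores), 2), round(stdev(z_scores), 2))
```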

Notice that the resulting curve has a similar shape and distribution of data values as the original.

The applet shows the changes of a normal distribution as it is standardized.

Example

Kevin's friend LaShay took the SAT and scored $640$ points on the math section. Kevin took the ACT and scored $28.32$ points on the math section.

Since these tests use different scales (the SAT math section is scored out of $800$ points, while the ACT math section is scored out of $36$ points), they wondered who did better. They looked at the statistics for each test to find out.

- LaShay's class scores are normally distributed with a mean of $523$ and a standard deviation of $90.$
- Kevin's class scores are also normally distributed with a mean of $21$ and a standard deviation of $6.1.$

a Compared to their corresponding classmates, who stood out more, Kevin or LaShay?


b Kevin took the ACT with $2000$ people, including himself. How many people scored higher than Kevin?


c What is the probability that a randomly chosen classmate of LaShay's has scored less than or equal to her on the SAT math section? Do not round the answer.


d The university where LaShay wants to study will accept only the top $100$ math scores. If LaShay took the SAT with $1000$ people, including herself, will she be accepted?


b Use the $z-$score found in Part A and the standard normal table to find the percentage of people who scored lower than Kevin. Then, apply the Complement Rule. Multiply the percentage by the total number of people who took the test.

c Use the $z-$score found in Part A and the standard normal table.

d Use the probability found in Part C and the Complement Rule to determine how many people scored higher than LaShay. Are there more than $100$ people?

a Since the scores of both tests are normally distributed, to determine who did better, graph both normal distributions. According to the stats Kevin and LaShay found, the mean of the SAT is $523$ and the standard deviation is $90.$ On the other hand, the mean of the ACT is $21$ and the standard deviation is $6.1.$

Now Kevin's and LaShay's scores will be placed on the horizontal axis of their corresponding test. The score that is further to the right of the mean will tell who stood out the most compared to their class.

Unfortunately, it cannot be determined which score is further to the right of the mean just by looking at the graphs. Since the $z-$scores tell the number of standard deviations above or below the mean that a value is, it is convenient to find the corresponding $z-$scores.

$z=\dfrac{x-\mu}{\sigma}$

Since it will be further to the right of the mean, the higher positive $z-$score corresponds to the person who did better.

Student | Score | Mean | Standard Deviation | $z=\dfrac{x-\mu}{\sigma}$ | $z-$score |
---|---|---|---|---|---|
LaShay | $640$ | $523$ | $90$ | $z=\dfrac{640-523}{90}$ | $z=1.3$ |
Kevin | $28.32$ | $21$ | $6.1$ | $z=\dfrac{28.32-21}{6.1}$ | $z=1.2$ |

LaShay's $z-$score is greater than Kevin's $z-$score. This means that her score is further to the right of the mean. Consequently, LaShay excelled more in her class than Kevin did in his.
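The comparison can be double-checked with a few lines of Python. This is only a sketch: the helper name `z_score` is hypothetical, and the inputs are the scores, means, and standard deviations stated in the problem (with Kevin's ACT math score of $28.32$).

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

z_lashay = z_score(640, 523, 90)     # SAT math section
z_kevin = z_score(28.32, 21, 6.1)    # ACT math section

print(round(z_lashay, 2), round(z_kevin, 2))   # 1.3 1.2
print("LaShay" if z_lashay > z_kevin else "Kevin")
```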

b The percentage of people who scored higher than Kevin is represented by the area below the normal curve and to the right of Kevin's score.

$P(z>1.2)=1−P(z≤1.2) $
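The complement relation above can also be evaluated with software instead of a table. The following minimal sketch assumes Python 3.8 or later, where `statistics.NormalDist()` with no arguments represents the standard normal distribution; the variable names are illustrative.

```python
from statistics import NormalDist

z_kevin = 1.2          # Kevin's z-score from Part A
test_takers = 2000     # people who took the ACT, including Kevin

p_at_most = NormalDist().cdf(z_kevin)   # P(z <= 1.2)
p_higher = 1 - p_at_most                # Complement Rule: P(z > 1.2)

print(round(p_at_most, 5), round(p_higher, 5))   # 0.88493 0.11507
print(round(p_higher * test_takers))             # 230
```

Multiplying the probability by the number of test takers gives the approximate number of people who scored higher than Kevin.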

Using a standard normal table, the percentage of people who scored less than or the same as Kevin can be determined. Keep in mind that the table has the percentages written as decimal numbers.

$z$ | $.0$ | $.1$ | $.2$ | $.3$ | $.4$ | $.5$ | $.6$ | $.7$ | $.8$ | $.9$ |
---|---|---|---|---|---|---|---|---|---|---|
$-3$ | $.00135$ | $.00097$ | $.00069$ | $.00048$ | $.00034$ | $.00023$ | $.00016$ | $.00011$ | $.00007$ | $.00005$ |
$-2$ | $.02275$ | $.01786$ | $.01390$ | $.01072$ | $.00820$ | $.00621$ | $.00466$ | $.00347$ | $.00256$ | $.00187$ |
$-1$ | $.15866$ | $.13567$ | $.11507$ | $.09680$ | $.08076$ | $.06681$ | $.05480$ | $.04457$ | $.03593$ | $.02872$ |
$-0$ | $.50000$ | $.46017$ | $.42074$ | $.38209$ | $.34458$ | $.30854$ | $.27425$ | $.24196$ | $.21186$ | $.18406$ |
$0$ | $.50000$ | $.53983$ | $.57926$ | $.61791$ | $.65542$ | $.69146$ | $.72575$ | $.75804$ | $.78814$ | $.81594$ |
$1$ | $.84134$ |