A normal distribution is a type of probability distribution where the mean, the median, and the mode are all equal to each other. The graph that represents a normal distribution is called a normal curve and it is a continuous, bell-shaped curve that is symmetric with respect to the mean μ of the data set.
This type of distribution is the most common continuous probability distribution observed in real life. When a normal distribution has a mean of 0 and a standard deviation of 1, it is called a standard normal distribution. Any normal distribution can be standardized by transforming each of its values into its corresponding z-score.
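As a rough illustration of standardization, the sketch below converts every value of a small data set into its z-score and checks that the result has a mean of 0 and a standard deviation of 1. The function name `standardize` and the data values are hypothetical, chosen only for this example.

```python
from statistics import mean, pstdev

def standardize(values):
    """Return the z-score of every value, using the data set's own mean and standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation of the data set
    return [(x - mu) / sigma for x in values]

# Hypothetical data, used only for illustration.
data = [295, 310, 325, 280, 340]
z_scores = standardize(data)

print(z_scores)
print(mean(z_scores), pstdev(z_scores))  # approximately 0 and 1
```

The standardized values keep their relative order and spacing; only the location and scale change.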
The total area under the normal curve is 100%, or 1. Because of this, the area under the normal curve in a certain interval represents the percentage of data within that interval or the probability of randomly selecting a value that belongs to that interval. The Empirical Rule can be used to determine the area under the normal curve at specific intervals.
Consider the weights of oranges as an example of normally distributed data. The mean weight of an orange is about 310 grams and the standard deviation is approximately 15 grams. The distribution of a sample of weights of 1000 randomly chosen oranges is described by the following histogram.
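To connect this example to areas under the curve, the following sketch models the orange weights with Python's standard-library `statistics.NormalDist`, using the mean of 310 grams and standard deviation of 15 grams stated above. The helper `probability_between` is a hypothetical name introduced here; it estimates the share of oranges weighing between 295 and 325 grams, which is one standard deviation on either side of the mean.

```python
from statistics import NormalDist

# Model of the orange weights: mean 310 g, standard deviation 15 g.
weights = NormalDist(mu=310, sigma=15)

def probability_between(dist, low, high):
    """Area under the normal curve between low and high."""
    return dist.cdf(high) - dist.cdf(low)

p = probability_between(weights, 295, 325)
print(round(p, 4))      # 0.6827
print(round(1000 * p))  # 683, the expected count in a sample of 1000 oranges
```

An actual sample of 1000 oranges would not match this count exactly, but it should be close.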
In statistics, the Empirical Rule, also known as the 68–95–99.7 rule, is a shorthand used to remember the percentage of values that lie within certain intervals in a normal distribution. The rule states the following three facts.
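- About 68% of the data lies within one standard deviation of the mean, between μ − σ and μ + σ.
- About 95% of the data lies within two standard deviations of the mean, between μ − 2σ and μ + 2σ.
- About 99.7% of the data lies within three standard deviations of the mean, between μ − 3σ and μ + 3σ.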
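The three percentages in the rule's name can be checked numerically. The sketch below uses the standard normal distribution from Python's `statistics` module and computes the area within k standard deviations of the mean for k = 1, 2, 3. The function name `area_within` is an assumption made for this example.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

def area_within(k):
    """Area under the standard normal curve between -k and k."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(100 * area_within(k), 1))
# 1 68.3
# 2 95.4
# 3 99.7
```

The rule rounds these values to 68, 95, and 99.7 so that they are easy to remember.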
The z-score, also known as the z-value, represents the number of standard deviations that a given value x is from the mean of a data set. The following formula can be used to convert any x-value into its corresponding z-score.
$$z = \frac{x - \mu}{\sigma}$$
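As a small worked example, the sketch below applies the z-score formula, the difference between x and μ divided by σ, to the orange weights mentioned earlier (mean 310 grams, standard deviation 15 grams). The function name `z_score` is hypothetical.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    return (x - mu) / sigma

print(z_score(340, 310, 15))  # 2.0, two standard deviations above the mean
print(z_score(295, 310, 15))  # -1.0, one standard deviation below the mean
```

A positive z-score means the value is above the mean, and a negative z-score means it is below.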
Here, μ represents the mean and σ the standard deviation of the distribution. The z-value corresponding to a sample mean x̄ is called a z-statistic and is calculated using a similar formula.
$$z = \frac{\bar{x} - \mu}{\dfrac{s}{\sqrt{n}}}$$
In this formula, s is the standard deviation of the sample, n is the sample size, and μ is the population mean.
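The z-statistic can be sketched in the same way: the sample mean minus the population mean, divided by s over the square root of n. The function name `z_statistic` and the sample figures (36 oranges with a sample mean of 315 grams and a sample standard deviation of 15 grams) are hypothetical; only the population mean of 310 grams comes from the orange example.

```python
import math

def z_statistic(sample_mean, mu, s, n):
    """z-value of a sample mean, given population mean mu, sample standard deviation s, and sample size n."""
    standard_error = s / math.sqrt(n)
    return (sample_mean - mu) / standard_error

# Hypothetical sample: 36 oranges, mean 315 g, standard deviation 15 g.
print(z_statistic(315, 310, 15, 36))  # 2.0
```

Because the denominator shrinks as n grows, the same difference between the sample mean and μ yields a larger z-statistic for a larger sample.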
The left-hand column gives the whole-number part of z, while the top row gives the tenths digit of z. Each entry is the area under the standard normal curve to the left of that z-score.
z | .0 | .1 | .2 | .3 | .4 | .5 | .6 | .7 | .8 | .9 |
---|---|---|---|---|---|---|---|---|---|---|
-3 | .00135 | .00097 | .00069 | .00048 | .00034 | .00023 | .00016 | .00011 | .00007 | .00005 |
-2 | .02275 | .01786 | .01390 | .01072 | .00820 | .00621 | .00466 | .00347 | .00256 | .00187 |
-1 | .15866 | .13567 | .11507 | .09680 | .08076 | .06681 | .05480 | .04457 | .03593 | .02872 |
-0 | .50000 | .46017 | .42074 | .38209 | .34458 | .30854 | .27425 | .24196 | .21186 | .18406 |
0 | .50000 | .53983 | .57926 | .61791 | .65542 | .69146 | .72575 | .75804 | .78814 | .81594 |
1 | .84134 | .86433 | .88493 | .90320 | .91924 | .93319 | .94520 | .95543 | .96407 | .97128 |
2 | .97725 | .98214 | .98610 | .98928 | .99180 | .99379 | .99534 | .99653 | .99744 | .99813 |
3 | .99865 | .99903 | .99931 | .99952 | .99966 | .99977 | .99984 | .99989 | .99993 | .99995 |
The applet calculates the area below the standard normal curve and to the left of the entered z-score. It accepts z-scores up to two decimal places.
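The same left-tail area can also be computed directly. The sketch below is not the applet's implementation; it is a minimal stand-in that uses the error function `math.erf` from Python's standard library, with a hypothetical function name `area_left_of`, and reproduces a few entries of the table.

```python
import math

def area_left_of(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

print(f"{area_left_of(-1.0):.5f}")  # 0.15866
print(f"{area_left_of(1.5):.5f}")   # 0.93319
print(f"{area_left_of(2.3):.5f}")   # 0.98928
```

Unlike the table, this function accepts any number of decimal places for z.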