Here are a few recommended readings before getting started with this lesson.
Understanding Probability
Understanding Descriptive Measures
Understanding Types of Data
Other Recommended Readings
A random variable assigns a numerical value to an outcome of a probability experiment. In many situations, it is important to know how likely it is that a random variable will take a specific value. This can be represented by listing or graphing the probability of each value of a random variable. This is called a probability distribution.
A probability distribution of a random variable X is a function that gives the probability of each outcome in the sample space. It can be represented by tables, equations, or graphs. A probability distribution needs to satisfy two conditions to be valid: the probability of each outcome is between 0 and 1, inclusive, and the probabilities of all outcomes add up to 1.
Consider the roll of a pair of standard dice. Let X be the random variable that represents the sum of the two dice. By the fundamental counting principle, since rolling each die has 6 possible outcomes, there are a total of 6⋅6=36 possible results. Additionally, the possible values of X are integers from 2 to 12.
A table that represents the theoretical probability distribution of X will now be created. Frequencies represent the number of dice roll results that add up to the given values x of the random variable X. The frequency is divided by 36 to determine the theoretical probability of each outcome.
X=Sum of Two Dice | ||
---|---|---|
x | Frequency | P(X=x) |
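2 | 1 | $\frac{1}{36} \approx 0.028$ |
3 | 2 | $\frac{2}{36} \approx 0.056$ |
4 | 3 | $\frac{3}{36} \approx 0.083$ |
5 | 4 | $\frac{4}{36} \approx 0.111$ |
6 | 5 | $\frac{5}{36} \approx 0.139$ |
7 | 6 | $\frac{6}{36} \approx 0.167$ |
8 | 5 | $\frac{5}{36} \approx 0.139$ |
9 | 4 | $\frac{4}{36} \approx 0.111$ |
10 | 3 | $\frac{3}{36} \approx 0.083$ |
11 | 2 | $\frac{2}{36} \approx 0.056$ |
12 | 1 | $\frac{1}{36} \approx 0.028$ |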
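The table can be reproduced by brute-force enumeration. The following is a minimal sketch, not part of the original lesson; the variable names `rolls`, `frequency`, and `distribution` are chosen here for illustration. Note that `Fraction` prints probabilities in lowest terms, so 2/36 appears as 1/18.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate all 6 * 6 = 36 equally likely results of rolling two dice.
rolls = list(product(range(1, 7), repeat=2))

# Frequency of each sum x, then P(X = x) = frequency / 36.
frequency = Counter(a + b for a, b in rolls)
distribution = {x: Fraction(count, len(rolls))
                for x, count in sorted(frequency.items())}

for x, p in distribution.items():
    print(f"{x:2d}  {frequency[x]}  {p}  ~ {float(p):.3f}")
```

Summing `distribution.values()` gives exactly 1, which is a quick way to confirm that no outcome was missed.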
Izabella is a big soccer fan. One weekend, she invited her friend Dylan to watch a world championship match together at her house. During the coin toss ceremony, Izabella asked Dylan about the number of heads they will obtain if they toss a fair coin four times.
Let X be a random variable that represents the number of heads in four coin flips. Help Izabella and Dylan solve the following problems and determine whether they can predict the number of times the experiment results in heads.
Number of Heads, x | Tally | Frequency |
---|---|---|
0 | ∣∣∣∣ ∣ | 6 |
1 | ∣∣∣∣ ∣∣∣∣ ∣∣∣∣ ∣∣∣∣ ∣∣ | 22 |
2 | ∣∣∣∣ ∣∣∣∣ ∣∣∣∣ ∣∣∣∣ ∣∣∣∣ ∣∣∣∣ ∣∣∣∣ ∣∣ | 37 |
3 | ∣∣∣∣ ∣∣∣∣ ∣∣∣∣ ∣∣∣∣ ∣∣∣∣ ∣∣∣ | 28 |
4 | ∣∣∣∣ ∣∣ | 7 |
Use this data to find the experimental probability of each possible value of X.
X=Number of Heads | |||||
---|---|---|---|---|---|
x | 0 | 1 | 2 | 3 | 4 |
P(X=x) | 0.0625 | 0.25 | 0.375 | 0.25 | 0.0625 |
X=Number of Heads | |||||
---|---|---|---|---|---|
x | 0 | 1 | 2 | 3 | 4 |
P(X=x) | 0.06 | 0.22 | 0.37 | 0.28 | 0.07 |
X=Number of Heads | |||||
---|---|---|---|---|---|
x | 0 | 1 | 2 | 3 | 4 |
Possible Outcomes | TTTT | TTTH, TTHT, THTT, HTTT | HHTT, HTTH, TTHH, HTHT, THTH, THHT | HHHT, HHTH, HTHH, THHH | HHHH |
Frequency | 1 | 4 | 6 | 4 | 1 |
X=Number of Heads | |||||
---|---|---|---|---|---|
x | 0 | 1 | 2 | 3 | 4 |
Frequency | 1 | 4 | 6 | 4 | 1 |
P(X=x) | $\frac{1}{16}$ | $\frac{4}{16} = \frac{1}{4}$ | $\frac{6}{16} = \frac{3}{8}$ | $\frac{4}{16} = \frac{1}{4}$ | $\frac{1}{16}$ |
This table describes the theoretical probabilities associated with tossing a fair coin four times. Since only the theoretical probability is required, the Frequency row can be skipped.
X=Number of Heads | |||||
---|---|---|---|---|---|
x | 0 | 1 | 2 | 3 | 4 |
P(X=x) | 0.0625 | 0.25 | 0.375 | 0.25 | 0.0625 |
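The same theoretical distribution can be obtained by listing all sequences of four flips programmatically. This is a small illustrative sketch; the names `sequences`, `heads_frequency`, and `theoretical` are assumptions made for this example.

```python
from collections import Counter
from itertools import product

# All 2**4 = 16 equally likely sequences of four fair coin flips.
sequences = ["".join(flips) for flips in product("HT", repeat=4)]

# Count how many sequences contain x heads, for x = 0..4.
heads_frequency = Counter(seq.count("H") for seq in sequences)
theoretical = {x: heads_frequency[x] / len(sequences) for x in range(5)}

print(theoretical)
```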
Calculate the experimental probability of each possible outcome by dividing its frequency by the total number of trials, 100.
Number of Heads, x | Tally | Frequency | P(X=x) |
---|---|---|---|
0 | ∣∣∣∣ ∣ | 6 | $\frac{6}{100}$ |
1 | ∣∣∣∣ ∣∣∣∣ ∣∣∣∣ ∣∣∣∣ ∣∣ | 22 | $\frac{22}{100}$ |
2 | ∣∣∣∣ ∣∣∣∣ ∣∣∣∣ ∣∣∣∣ ∣∣∣∣ ∣∣∣∣ ∣∣∣∣ ∣∣ | 37 | $\frac{37}{100}$ |
3 | ∣∣∣∣ ∣∣∣∣ ∣∣∣∣ ∣∣∣∣ ∣∣∣∣ ∣∣∣ | 28 | $\frac{28}{100}$ |
4 | ∣∣∣∣ ∣∣ | 7 | $\frac{7}{100}$ |
Since only the experimental probability is required, the Tally and Frequency columns can be skipped. Next, write the table horizontally.
X=Number of Heads | |||||
---|---|---|---|---|---|
x | 0 | 1 | 2 | 3 | 4 |
P(X=x) | 0.06 | 0.22 | 0.37 | 0.28 | 0.07 |
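To see how close the experiment came to the theoretical model, the relative frequencies can be computed and compared value by value. The sketch below uses only the frequencies and theoretical probabilities given in the lesson; the dictionary names are illustrative.

```python
# Observed frequencies from the tally table (100 trials of four coin flips).
observed = {0: 6, 1: 22, 2: 37, 3: 28, 4: 7}
# Theoretical probabilities from the lesson's table.
theoretical = {0: 0.0625, 1: 0.25, 2: 0.375, 3: 0.25, 4: 0.0625}

# Experimental probability = frequency / total number of trials.
trials = sum(observed.values())
experimental = {x: count / trials for x, count in observed.items()}

for x in observed:
    gap = abs(experimental[x] - theoretical[x])
    print(f"x={x}: experimental={experimental[x]:.2f}, "
          f"theoretical={theoretical[x]:.4f}, gap={gap:.4f}")
```

The gaps are small but not zero, which is expected: experimental probabilities vary from run to run, while theoretical ones do not.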
The expected value of a random variable X is the weighted average of its possible outcomes, where each outcome is weighted by its probability. It is used to describe the center of a probability distribution. For a discrete random variable, the expected value E(X) is given by the following weighted mean.
$$E(X) = \sum_{i=1}^{n} x_i \cdot P(X = x_i)$$
In this formula, $x_i$ represents a specific outcome, $P(X = x_i)$ is the probability associated with $x_i$, and $n$ is the number of possible outcomes. According to the law of large numbers, the average $S_n$ of the first $n$ random variables in a sequence tends to the expected value $\mu$ under specific conditions.
$$\lim_{n \to +\infty} S_n = \mu$$
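Both ideas can be illustrated with the sum of two dice from earlier in the lesson. The sketch below is an illustration only: `expected_value` is a helper written for this example, and the simulation uses a fixed seed so that repeated runs give the same averages. The compact expression `6 - abs(x - 7)` reproduces the frequencies 1, 2, ..., 6, ..., 2, 1 from the dice table.

```python
import random
from fractions import Fraction

def expected_value(distribution):
    """Weighted mean: sum of x * P(X = x) over all outcomes x."""
    return sum(x * p for x, p in distribution.items())

# Distribution of the sum of two dice, matching the table in the lesson.
dice_sum = {x: Fraction(6 - abs(x - 7), 36) for x in range(2, 13)}
mu = expected_value(dice_sum)  # exact value as a Fraction
print("E(X) =", mu)

# Simulate rolls; the sample average should move toward mu as n grows.
rng = random.Random(0)  # fixed seed so the run is repeatable
for n in (100, 10_000, 100_000):
    total = sum(rng.randint(1, 6) + rng.randint(1, 6) for _ in range(n))
    average = total / n
    print(n, round(average, 3))
```

The exact averages depend on the seed, but the trend toward the expected value is what the law of large numbers describes.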
The expected value is commonly used together with a measure of variation, such as the variance or standard deviation, to describe how much the outcomes differ from the expected value.
The standard deviation of a random variable is a measure of variation that describes how spread out the outcomes of a random variable X are from its expected value E(X). The standard deviation is represented by the Greek letter σ (read as "sigma") and is given by the square root of the variance of X.
$$\sigma = \sqrt{\sum_{i=1}^{n} [x_i - E(X)]^2 \cdot P(X = x_i)}$$
In this formula, $x_i$ is a specific outcome and $P(X = x_i)$ is the probability of $x_i$.
Let X be the random variable representing the number of cars sold on a given day in a car dealership. The table below shows the probability distribution of X.
x | 0 | 1 | 2 | 3 | 4 |
---|---|---|---|---|---|
P(X=x) | $\frac{1}{15}$ | $\frac{3}{15}$ | $\frac{6}{15}$ | $\frac{3}{15}$ | $\frac{2}{15}$ |
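The expected value of X is found by substituting the values from the table into the formula for E(X).
$$
\begin{aligned}
E(X) &= 0 \cdot \frac{1}{15} + 1 \cdot \frac{3}{15} + 2 \cdot \frac{6}{15} + 3 \cdot \frac{3}{15} + 4 \cdot \frac{2}{15} && \text{Substitute values} \\
&= 0 + \frac{3}{15} + \frac{12}{15} + \frac{9}{15} + \frac{8}{15} && \text{Multiply} \\
&= \frac{32}{15} && \text{Add fractions} \\
&\approx 2.13 && \text{Round to 2 decimal places}
\end{aligned}
$$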
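$x_i$ | $[x_i - E(X)]^2$ | $[x_i - E(X)]^2 \cdot P(X = x_i)$ |
---|---|---|
0 | $(0 - 2.13)^2 = 4.5369$ | $4.5369 \cdot \frac{1}{15} \approx 0.3025$ |
1 | $(1 - 2.13)^2 = 1.2769$ | $1.2769 \cdot \frac{3}{15} \approx 0.2554$ |
2 | $(2 - 2.13)^2 = 0.0169$ | $0.0169 \cdot \frac{6}{15} \approx 0.0068$ |
3 | $(3 - 2.13)^2 = 0.7569$ | $0.7569 \cdot \frac{3}{15} \approx 0.1514$ |
4 | $(4 - 2.13)^2 = 3.4969$ | $3.4969 \cdot \frac{2}{15} \approx 0.4663$ |
Variance $\sigma^2$ | | $\approx 1.1824$ |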
Finally, calculate the square root of the variance to get the standard deviation of X.
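The whole calculation for the car dealership can be checked with exact fractions. This is a sketch written for illustration; the names `cars`, `expected`, `variance`, and `std_dev` are not from the lesson. Because the table rounds E(X) to 2.13 before squaring, its variance (about 1.1824) differs slightly from the exact value 266/225 (about 1.1822); both lead to a standard deviation of about 1.09.

```python
from fractions import Fraction
from math import sqrt

# Probability distribution of X = number of cars sold, from the table.
cars = {0: Fraction(1, 15), 1: Fraction(3, 15), 2: Fraction(6, 15),
        3: Fraction(3, 15), 4: Fraction(2, 15)}

expected = sum(x * p for x, p in cars.items())                    # E(X)
variance = sum((x - expected) ** 2 * p for x, p in cars.items())  # sigma^2
std_dev = sqrt(variance)                                          # sigma

print(expected, float(expected))
print(variance, float(variance))
print(round(std_dev, 2))
```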
Izabella's aunt Magdalena owns a clothing store. She needs to increase stock in her shop and plans to invest $15000 in one of the two collections that were offered to her by well-known brands. Each brand claims that they have a great expected rate of return. Their probability distributions are described below.
Dylan and Izabella want to help Magdalena make the best decision. They decided to use their recently acquired knowledge about the expected value and standard deviation of a probability distribution to analyze the offers. Help them answer the following questions and give the best advice to Magdalena.
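For Collection I, the expected value is found by substituting each outcome and its probability into the formula for E(X).
$$
\begin{aligned}
E(X) &= 1200(0.5) + 1800(0.2) + 900(0.2) + (-150)(0.1) && \text{Substitute values} \\
&= 600 + 360 + 180 - 15 && \text{Multiply} \\
&= 1125 && \text{Add and subtract terms}
\end{aligned}
$$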
$x_i$ | $[x_i - E(X)]^2$ | $[x_i - E(X)]^2 \cdot P(X = x_i)$ |
---|---|---|
1200 | $(1200 - 1125)^2 = 5625$ | $5625 \cdot 0.5 = 2812.5$ |
1800 | $(1800 - 1125)^2 = 455625$ | $455625 \cdot 0.2 = 91125$ |
900 | $(900 - 1125)^2 = 50625$ | $50625 \cdot 0.2 = 10125$ |
-150 | $(-150 - 1125)^2 = 1625625$ | $1625625 \cdot 0.1 = 162562.5$ |
Sum of Values | | 266625 |
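For Collection II, the expected value is found in the same way.
$$
\begin{aligned}
E(X) &= 3600(0.3) + 2850(0.1) + (-300)(0.4) + (-500)(0.2) && \text{Substitute values} \\
&= 1080 + 285 - 120 - 100 && \text{Multiply} \\
&= 1145 && \text{Add and subtract terms}
\end{aligned}
$$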
$x_i$ | $[x_i - E(X)]^2$ | $[x_i - E(X)]^2 \cdot P(X = x_i)$ |
---|---|---|
3600 | $(3600 - 1145)^2 = 6027025$ | $6027025 \cdot 0.3 = 1808107.5$ |
2850 | $(2850 - 1145)^2 = 2907025$ | $2907025 \cdot 0.1 = 290702.5$ |
-300 | $(-300 - 1145)^2 = 2088025$ | $2088025 \cdot 0.4 = 835210$ |
-500 | $(-500 - 1145)^2 = 2706025$ | $2706025 \cdot 0.2 = 541205$ |
Sum of Values | | 3475225 |
The expected value and the standard deviation of each probability distribution have been calculated. The following table summarizes these measures.
Measures of the Probability Distributions | ||
---|---|---|
Expected Value of Collection I | 1125 | |
Expected Value of Collection II | 1145 | |
Standard Deviation of Collection I | ≈516.36 | |
Standard Deviation of Collection II | ≈1864.20 |
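Both measures can be recomputed directly from the outcomes and probabilities listed in the two variance tables. The sketch below is illustrative; `mean_and_std`, `collection_1`, and `collection_2` are names introduced here, and the probabilities are stored as floats, so results are subject to ordinary floating-point rounding.

```python
from math import sqrt

def mean_and_std(distribution):
    """Return (E(X), sigma) for a dict mapping outcome -> probability."""
    mean = sum(x * p for x, p in distribution.items())
    variance = sum((x - mean) ** 2 * p for x, p in distribution.items())
    return mean, sqrt(variance)

# Outcomes and probabilities as they appear in the two variance tables.
collection_1 = {1200: 0.5, 1800: 0.2, 900: 0.2, -150: 0.1}
collection_2 = {3600: 0.3, 2850: 0.1, -300: 0.4, -500: 0.2}

for name, dist in (("Collection I", collection_1),
                   ("Collection II", collection_2)):
    mean, sigma = mean_and_std(dist)
    print(f"{name}: E(X) = {mean:.2f}, sigma = {sigma:.2f}")
```

The two expected values are close, while the standard deviations differ by a large factor. A larger standard deviation means the actual return is more likely to land far from the expected value, in either direction.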
The outcomes of many experiments can be reduced to two possibilities, success or failure. If two more conditions are satisfied, these experiments can be modeled by a binomial experiment.
A binomial experiment is a probability experiment that has the following three properties.
Note that many probability experiments can be reduced so that they satisfy the conditions of a binomial experiment. In the case of rolling a die, there are six possible outcomes. However, they can be divided into two groups, such as even numbers and odd numbers.
After grouping the outcomes, there are two possible results of each trial: rolling an even number or rolling an odd number. The probability of rolling a number from either group is constant throughout the trials, and the result of rolling the die is not affected by the previous results.
Is there a fixed number of trials for each experiment? How many outcomes are possible? Does the probability of each outcome remain constant for each trial? Are trials independent?
Start by recalling the conditions that a binomial experiment should satisfy.
Analyze each situation one at a time to see if it follows all of the conditions of a binomial experiment.
This situation has eight trials, each consisting of selecting one scratch-off card at random.
Each card could win a prize or not, which means there are two possible outcomes for each trial. Moreover, the probability of success, which is winning a prize, is 25% or 0.25 for every card.
Note that height can vary for every person surveyed.
Because there are 100 possible answers, it is likely that more than 2 different outcomes will occur. This means that this situation does not represent a binomial experiment.
This situation has a fixed number of trials because it involves asking 20 people if their blood type is O.
Each trial has only two possible outcomes, blood type O or another type. Moreover, the probability of having blood type O in each trial is 0.4, which represents the probability of success.
All four situations have now been analyzed and the results are summarized in the following table.
Experiment | Is It a Binomial Experiment? |
---|---|
The Scratch-off Cards | ✓ |
Heights of 100 People | × |
Rolling a Die | × |
Blood Type O? | ✓ |
Because binomial experiments can simplify many complex situations, it is essential to determine how likely it is to obtain a specific number of successes out of n trials in a given experiment. Also, the expected value, or center of the distribution, will be presented.
The binomial distribution is the probability distribution that describes the number of successes x out of n binomial trials. The trials must satisfy three conditions.
Let X be a random variable representing the total number of successes among n trials. The possible values of X are x=0, 1, 2, …, n. The binomial probability formula can be used to determine the probability of x successes among n trials, P(X=x).
$$P(X = x) = {}_nC_x \, p^x q^{n-x}$$
In this formula, ${}_nC_x$ is the binomial coefficient and p and q are the probabilities of success and failure, respectively. Additionally, the expected value of X can be determined by the product of the number of trials n and the probability of success p.
E(X)=np
This means that the expected number of successes in n trials is given by np.
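The binomial probability formula translates almost directly into code. The following is a minimal sketch, assuming Python 3.8 or later for `math.comb`; the helper names `binomial_pmf` and `binomial_distribution` are made up for this example. It is checked against the four-coin-flip distribution from earlier in the lesson, where heads counts as a success and p = 0.5.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) = nCx * p**x * q**(n - x), with q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p**x * q**(n - x)

def binomial_distribution(n, p):
    """Probabilities of x = 0, 1, ..., n successes in n trials."""
    return [binomial_pmf(x, n, p) for x in range(n + 1)]

# Four flips of a fair coin, with "heads" as success.
coin = binomial_distribution(4, 0.5)
print(coin)

# The weighted mean of the distribution agrees with n * p.
mean = sum(x * prob for x, prob in enumerate(coin))
print(mean, 4 * 0.5)
```

Computing the mean as a weighted sum and comparing it with n * p is a simple way to see that E(X) = np is a shortcut for the general expected value formula, not a separate definition.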
Consider the experiment of drawing 1 card from a standard deck of cards with replacement.
If drawing a diamond is considered a success, let X be the number of diamonds drawn. Since there are 13 diamond cards in a standard deck, the probability of success p in each trial will be $\frac{13}{52} = \frac{1}{4}$. The probability of failure q will be $1 - \frac{1}{4} = \frac{3}{4}$. Suppose the experiment is repeated 5 times. Then, the binomial distribution with n=5 can be determined.
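As a worked check, substituting n = 5, p = 1/4, and q = 3/4 into the binomial probability formula gives the six probabilities below. The decimal values are rounded to four places, and the layout is a sketch added for illustration rather than the lesson's own table.

```latex
\begin{aligned}
P(X=0) &= {}_5C_0 \left(\tfrac{1}{4}\right)^0 \left(\tfrac{3}{4}\right)^5 = \tfrac{243}{1024} \approx 0.2373 \\
P(X=1) &= {}_5C_1 \left(\tfrac{1}{4}\right)^1 \left(\tfrac{3}{4}\right)^4 = \tfrac{405}{1024} \approx 0.3955 \\
P(X=2) &= {}_5C_2 \left(\tfrac{1}{4}\right)^2 \left(\tfrac{3}{4}\right)^3 = \tfrac{270}{1024} \approx 0.2637 \\
P(X=3) &= {}_5C_3 \left(\tfrac{1}{4}\right)^3 \left(\tfrac{3}{4}\right)^2 = \tfrac{90}{1024} \approx 0.0879 \\
P(X=4) &= {}_5C_4 \left(\tfrac{1}{4}\right)^4 \left(\tfrac{3}{4}\right)^1 = \tfrac{15}{1024} \approx 0.0146 \\
P(X=5) &= {}_5C_5 \left(\tfrac{1}{4}\right)^5 \left(\tfrac{3}{4}\right)^0 = \tfrac{1}{1024} \approx 0.0010 \\
E(X) &= np = 5 \cdot \tfrac{1}{4} = 1.25
\end{aligned}
```

The numerators 243, 405, 270, 90, 15, and 1 add up to 1024, so the probabilities sum to 1 as required.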