Here are a few recommended readings before getting started with this lesson. Background to help understand Probability
Mark's father runs a burger restaurant. The mean age of people who visit the restaurant is 24.3 years old. Mark suspects that this situation has changed during the last year. To investigate whether his suspicions were true, he surveyed 65 customers and found a sample mean of 25.5 years with a standard deviation of 5 years.
If he wants to test his results with a 10% significance, help him complete the following questions.
Inferential statistics uses data from a sample to draw conclusions or test hypotheses about a population. Conclusions made from a sample are almost never 100% accurate but can be thought of as the best guess or most probable answer. One of the main tasks of inferential statistics is to provide a confidence interval.
The maximum error of estimate, also known as the margin of error, is the maximum difference between the estimate of the population mean x̄ and its actual value. The maximum error of estimate E is calculated using the following formula.
$$E = z \cdot \frac{s}{\sqrt{n}}, \quad n \geq 30$$
In this formula, z represents the z-value of a certain confidence level, s is the standard deviation of the sample, and n is the sample size. From the formula, some conclusions can be made about the error of estimate.
The bounds of a confidence interval for the population mean are found by subtracting the maximum error of estimate E from the sample mean x̄ and by adding E to it.
$$CI = \bar{x} \pm E$$
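The two formulas above can be sketched in a few lines of Python. This is only an illustration: the function names `margin_of_error` and `confidence_interval` and the sample data are hypothetical, and the z-value is taken from the standard library's `statistics.NormalDist` rather than from a table.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(s, n, confidence):
    """Maximum error of estimate E = z * s / sqrt(n), for n >= 30."""
    if n < 30:
        raise ValueError("this z-based formula assumes n >= 30")
    # Each tail holds (1 - confidence) / 2 of the area, so z is the
    # standard normal quantile at 1 - (1 - confidence) / 2.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * s / sqrt(n)


def confidence_interval(x_bar, s, n, confidence):
    """Return the bounds (x_bar - E, x_bar + E)."""
    e = margin_of_error(s, n, confidence)
    return x_bar - e, x_bar + e


# Hypothetical data: 40 measurements, mean 100, standard deviation 8.
low, high = confidence_interval(100, 8, 40, 0.95)
print(round(low, 2), round(high, 2))  # 97.52 102.48
```

Note that a larger sample size n shrinks E, while a higher confidence level widens it.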
Mark's father owns a burger restaurant. He wants to implement changes to improve the customer experience. Recently he found that in a sample of 36 burgers, on average, a burger takes 22 minutes to be cooked and given to the customer, with a standard deviation of 6.2 minutes.
Is the sample size greater than 30?
Since the confidence level c is 90%, this portion of the area under the standard normal curve lies around the mean μ. The area in each tail of the distribution, outside the confidence interval, is (100−90)/2=5%.
Because the distribution is symmetric, the z-values limiting this area are opposites, so only one value needs to be found. Additionally, this value is given by the z-value of the upper or lower tail. One way to determine this value is to use a graphing calculator. Push 2nd, then VARS, and choose the third option, invNorm(.
Next, enter 0.05 and push ENTER to get the z-value of the lower tail.
The z-value is approximately -1.645, and because of the symmetry of the distribution, this means that its additive inverse 1.645 can be used to evaluate the formula.
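For readers without a graphing calculator, the same lookup can be sketched in Python. The assumption here is that `statistics.NormalDist().inv_cdf` from the standard library plays the role of invNorm( for the standard normal distribution; the variable names are illustrative.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

confidence = 0.90
tail_area = (1 - confidence) / 2              # 0.05 in each tail
z_lower = standard_normal.inv_cdf(tail_area)  # z-value of the lower tail
z_upper = -z_lower                            # symmetry of the distribution

print(round(z_lower, 3), round(z_upper, 3))  # -1.645 1.645
```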
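Substitute z=1.645, s=6.2, and n=36 into the formula for the maximum error of estimate and simplify, rounding to two decimal places.
$$E = 1.645 \cdot \frac{6.2}{\sqrt{36}} = \frac{1.645 \cdot 6.2}{\sqrt{36}} = \frac{10.199}{6} \approx 1.70$$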
The secret to the success of the burger restaurant is not only the flavor of the meat but also the soda included in the King's Combo. This soda follows a unique brewing process, and a soda dispensing machine fills the bottles that are later sold with the combos.
Mark wants to find the mean volume contained in the bottles that are filled by the dispensing machine. He took a sample of 50 bottles of soda and measured their volumes. He found that the mean volume of the bottles is 330 milliliters with a standard deviation of 10 milliliters. Which option corresponds to a 99% confidence interval for the population mean μ of soda volume? Begin by calculating the maximum error of estimate. Then add and subtract that from the sample mean to get the bounds of the confidence interval.
Determine a confidence interval for the population mean μ of soda volume in order to identify the right option. To do so, follow these steps.
The mean volume for the sample consisting of 50 sodas was 330 milliliters. The maximum error of estimate will be calculated next.
This value is given by the z-value of the upper or lower tail. Because the distribution is symmetric, the z-values limiting this area will be opposites of each other, so only one needs to be found. In this case, a short version of the standard normal table can be used to locate the z-value of the lower tail, which in decimal form is 0.005.
z | .0 | .1 | .2 | .3 | .4 | .5 | .6 | .7 | .8 | .9 |
---|---|---|---|---|---|---|---|---|---|---|
-3 | .00135 | .00097 | .00069 | .00048 | .00034 | .00023 | .00016 | .00011 | .00007 | .00005 |
-2 | .02275 | .01786 | .01390 | .01072 | .00820 | .00621 | .00466 | .00347 | .00256 | .00187 |
-1 | .15866 | .13567 | .11507 | .09680 | .08076 | .06681 | .05480 | .04457 | .03593 | .02872 |
-0 | .50000 | .46017 | .42074 | .38209 | .34458 | .30854 | .27425 | .24196 | .21186 | .18406 |
0 | .50000 | .53983 | .57926 | .61791 | .65542 | .69146 | .72575 | .75804 | .78814 | .81594 |
1 | .84134 | .86433 | .88493 | .90320 | .91924 | .93319 | .94520 | .95543 | .96407 | .97128 |
2 | .97725 | .98214 | .98610 | .98928 | .99180 | .99379 | .99534 | .99653 | .99744 | .99813 |
3 | .99865 | .99903 | .99931 | .99952 | .99966 | .99977 | .99984 | .99989 | .99993 | .99995 |
Substitute the z-value, s=10, and n=50 into the formula for the maximum error of estimate, evaluate the result with a calculator, and round it to two decimal places.
CI=x̄±E | |
---|---|
x̄−E | x̄+E |
330−3.60 | 330+3.60 |
326.40 | 333.60 |
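As a rough cross-check, the interval can be recomputed in Python with the unrounded z-value from `statistics.NormalDist` (about 2.576) instead of a value read from the abbreviated table. Under that assumption E comes out near 3.64, a few hundredths wider than the 3.60 used in the table above, which appears to rest on a rounded z-value. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x_bar, s, n = 330, 10, 50   # sample mean (mL), standard deviation (mL), sample size
confidence = 0.99

# Upper-tail z-value for a 99% confidence level (0.5% in each tail).
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
E = z * s / sqrt(n)
lower, upper = x_bar - E, x_bar + E

print(round(z, 3), round(E, 2), round(lower, 2), round(upper, 2))
```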
While a confidence interval helps estimate the value of a population parameter like the mean, there is another inferential method that can help evaluate a specific claim about a population parameter. Before exploring this method, two statistical hypotheses about the population need to be identified. These are the null and alternative hypotheses.
The null hypothesis and alternative hypothesis are two mutually exclusive statements about the mean of a population. The null hypothesis, denoted by H0, is a statement of equality or non-strict inequality about the population mean that is accepted as true unless strong evidence is shown against it.
H0: Null Hypothesis
Conversely, the alternative hypothesis, denoted by Ha or H1, is a strict inequality statement that contradicts the null hypothesis. It is the complement of the null hypothesis and will be accepted if there is evidence in its favor.
Ha: Alternative Hypothesis
Notice that the initial claim made by the researcher is the one that sets the null and alternative hypotheses. If the claim can be written algebraically as a strict inequality, it will be part of the alternative hypothesis. Otherwise, it will be part of the null hypothesis.
Another characteristic of the King's Combo at Mark's father's restaurant is that customers can choose between a cookie or a soft ice cream as part of their meal. They can also pay $2 more to get a piece of cake.
Null Hypothesis | Alternative Hypothesis |
---|---|
The mean is greater than or equal to 0.60. H0:μ≥0.60 | The mean is less than 0.60. (claim) Ha:μ<0.60 |
Null Hypothesis | Alternative Hypothesis |
---|---|
The mean is equal to 0.50. (claim) H0:μ=0.50 | The mean is not equal to 0.50. Ha:μ≠0.50 |
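The rule for sorting a claim into the null or alternative hypothesis can be sketched as a small lookup. The function name `state_hypotheses` and the ASCII symbols (`!=`, `<=`, `>=`) are assumptions made for this illustration only.

```python
# Each symbol is paired with its complement. Claims written with =, <=, or >=
# belong to H0; claims written with !=, <, or > belong to Ha.
COMPLEMENT = {"=": "!=", "<=": ">", ">=": "<", "!=": "=", "<": ">=", ">": "<="}
NULL_SYMBOLS = {"=", "<=", ">="}


def state_hypotheses(claim_symbol, k):
    """Return (H0, Ha, hypothesis_holding_the_claim) for the claim 'mu <symbol> k'."""
    other = COMPLEMENT[claim_symbol]
    if claim_symbol in NULL_SYMBOLS:
        return f"mu {claim_symbol} {k}", f"mu {other} {k}", "H0"
    return f"mu {other} {k}", f"mu {claim_symbol} {k}", "Ha"


print(state_hypotheses("<", 0.60))  # claim is a strict inequality, so it is Ha
print(state_hypotheses("=", 0.50))  # claim is an equality, so it is H0
```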
Once the null and alternative hypotheses have been correctly identified, they can be tested by performing a hypothesis test to see which statement is more likely true. Before the test can be performed, some information is needed.
A hypothesis test is an inferential method that uses sample data to examine a claim about the mean μ of a population. Because the population mean is almost always unknown, it is common to be suspicious about the truthfulness of any assumption about its value. The following are typical claims about the mean of a population.
Typical Claims About the Mean | ||
---|---|---|
The mean is equal to a specific value, μ=k. | The mean is greater than a specific value, μ>k. | The mean is less than a specific value, μ<k. |
Before making a hypothesis test, two hypotheses need to be specified, the null hypothesis and the alternative hypothesis. These hypotheses must be mutually exclusive. The null hypothesis H0 is assumed to be true. The hypothesis test puts the null hypothesis on trial to see if there is strong evidence against it. If so, the alternative hypothesis Ha is accepted instead.
The significance level α is the probability of rejecting the null hypothesis when it is actually true, that is, of treating a sample result that is only due to chance as real evidence. It is set in advance when making a hypothesis test. The smaller the α value, the stronger the sample evidence must be to reject the null hypothesis. These are typical values for the significance level.
Typical Significance Levels α | ||
---|---|---|
1% | 5% | 10% |
If the null hypothesis H0 were true, the standardized sample mean would fall near the center of the standard normal distribution, so a value in the tails of the distribution would be unusual. The significance level sets how far from the center the sample mean must lie for the null hypothesis to be rejected and the alternative hypothesis Ha accepted.
The critical region, determined by the significance level α, is the set of values that will lead to rejecting the null hypothesis H0. In a standard normal distribution, this region is located in the tails of the distribution. The cutoff value of the region is a critical value given by the z-value of α. The tests of significance — left, right, or two-tail — determine whether there are one or two critical regions.
Critical Values | |||
---|---|---|---|
Significance Level | Left-Tail Test Ha:μ<k | Two-Tail Test Ha:μ≠k | Right-Tail Test Ha:μ>k |
α=1% | -2.326 | ±2.576 | 2.326 |
α=5% | -1.645 | ±1.960 | 1.645 |
α=10% | -1.282 | ±1.645 | 1.282 |
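The entries of this table can be reproduced with a short Python sketch. The helper name `critical_values` and the tail labels `"left"`, `"right"`, and `"two"` are hypothetical; the quantiles come from the standard library's `statistics.NormalDist`.

```python
from statistics import NormalDist


def critical_values(alpha, tail):
    """Critical z-value(s) for a 'left', 'right', or 'two' tailed test."""
    inv = NormalDist().inv_cdf
    if tail == "left":
        return (inv(alpha),)
    if tail == "right":
        return (inv(1 - alpha),)
    if tail == "two":
        z = inv(1 - alpha / 2)   # alpha is split evenly between both tails
        return (-z, z)
    raise ValueError("tail must be 'left', 'right', or 'two'")


for alpha in (0.01, 0.05, 0.10):
    row = [tuple(round(z, 3) for z in critical_values(alpha, tail))
           for tail in ("left", "two", "right")]
    print(f"alpha={alpha:.0%}", row)
```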
In a hypothesis test, the region where the null hypothesis is rejected is known as the critical region. The location of this region depends on the significance level α and the inequality symbol of the alternative hypothesis as determined by the tests of significance. The tests of significance can be divided into the left-tailed test, the two-tailed test, and the right-tailed test.
The applet below shows how the critical regions vary depending on the tests of significance.
When making a hypothesis test, begin by identifying the claim to set the null and alternative hypotheses. Then the critical regions and the critical values are determined based on the tests of significance. Finally, the null hypothesis is rejected if the z-statistic falls within the critical region. To illustrate this process, consider the following situation.
A company says that each of their packages of ham contains exactly 20 slices.
Null Hypothesis H0 | Alternative Hypothesis Ha |
---|---|
The mean is equal to 20 slices (claim). H0:μ=20 | The mean is different than 20 slices. Ha:μ≠20 |
Because the sign of the alternative hypothesis is ≠, a two-tailed test of significance will be conducted. This means that there are two critical regions whose cutoffs will be given by the z-value of the significance level α. The following are the critical values for the most common α values.
Critical Values | |||
---|---|---|---|
Significance Level | Left-Tail Test Ha:μ<k | Right-Tail Test Ha:μ>k | Two-Tail Test Ha:μ≠k |
α=1% | -2.326 | 2.326 | ±2.576 |
α=5% | -1.645 | 1.645 | ±1.960 |
α=10% | -1.282 | 1.282 | ±1.645 |
From the table, note that the critical values for a 10% significance level are ±1.645. Now the critical regions and critical values can be labeled.
The z-statistic is then calculated from the sample data: substitute the values into the formula, subtract, rewrite the compound fraction, calculate the root, and evaluate the resulting quotient.
Next, verify if the z-statistic falls within the critical region. If so, reject the null hypothesis. To do so, plot the z-statistic jointly with the critical regions to see where it falls, outside or inside the critical region.
Because the z-statistic falls within the critical region, the null hypothesis H0 is rejected in this case.
Use the result of the previous step to make a conclusion about the initial claim.
A company says that each of their packages of ham contains exactly 20 slices.
In this case, since the initial claim is related to the null hypothesis, it can be said that there is enough evidence to reject the claim that the packages of ham contain exactly 20 slices.
The following situations need to be considered when calculating the critical values.
For the given example, each critical region will cover an area of 5%. Therefore, the z-value for the left 0.05 will be found first. To do so, push 2nd, then VARS, and choose the third option, invNorm(.
Now enter the desired value, which in this case is 0.05. Finally, push ENTER to get the result.
The z-value for the left tail is about -1.645, so the z-value for the right tail will be 1.645. A similar process is followed when performing a one-tail test.
While watching the Dinos and Dragons movie with his family, Mark decides to eat a bar of his favorite chocolate as a snack. After eating it, he feels slightly disappointed because the bar seemed a little smaller than the 150 grams listed on the package. He decides to investigate whether the brand producing the chocolate bars lied about their weight.
Null Hypothesis H0 | Alternative Hypothesis Ha |
---|---|
The mean is equal to 150 g. (claim) H0:μ=150 g | The mean is different than 150 g. Ha:μ≠150 g |
Because the sign of the alternative hypothesis is ≠, a two-tailed test of significance corresponds to this situation.
To find the critical values with a graphing calculator, push 2nd, then VARS, and choose the third option, invNorm(.
Next, given that 0.05/2=0.025, enter 0.025 and push ENTER to get the result.
This is the critical value corresponding to the critical region on the left of the standard normal distribution. Because the distribution is symmetric, the critical value for the upper tail will be the same but with the opposite sign. With this information, the critical regions can be set in the distribution.
This corresponds to option A.
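To find the z-statistic, substitute the sample data into the formula, subtract, rewrite the compound fraction, evaluate the result with a calculator, and round it to three decimal places.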
Note that the z-statistic falls outside the critical region. Therefore, the null hypothesis cannot be rejected. This means that there is not enough evidence to reject the claim about the weight of the chocolate bars. So, it is most likely true that the mean weight of the chocolate bars is 150 g.
After enjoying the Dinos and Dragons movie with his family, Mark and his father start watching sports news. The newscaster reports that, on average, teens spend at most 59 minutes a day playing sports. Mark wants to determine if what the news reported is accurate.
Using a sample of 35 teens, Mark calculates a mean of 62 minutes and a standard deviation of 6 minutes. Help Mark if he wants to test the news report with 5% significance.
Null Hypothesis H0 | Alternative Hypothesis Ha |
---|---|
The mean is less than or equal to 59 minutes. (claim) H0:μ≤59 minutes | The mean is greater than 59 minutes. Ha:μ>59 minutes |
Because the sign of the alternative hypothesis is >, a right-tailed test of significance applies to this situation.
To find the critical value with a graphing calculator, push 2nd, then VARS, and choose the third option, invNorm(.
Because the upper 5% of the distribution is desired, the value to be entered into the calculator is given by 1−0.05=0.95. Next, enter this value and push ENTER to get the result.
The critical value is about 1.645. This value will limit the critical region that will be located in the right tail of the distribution.
Therefore, this corresponds to option D.
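Substitute x̄=62, μ=59, s=6, and n=35 into the formula for the z-statistic and simplify, rounding to three decimal places.
$$z = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{62 - 59}{6/\sqrt{35}} = \frac{3}{6/\sqrt{35}} = \frac{3\sqrt{35}}{6} \approx 2.958$$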
Since the z-statistic falls in the critical region, the null hypothesis should be rejected. Additionally, because the initial claim is related to the null hypothesis, it can be said that it is more likely that the mean time spent by teens playing sports is greater than 59 minutes.
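The whole procedure, from computing the z-statistic to comparing it against the critical value, can be sketched as one Python function. The name `z_test` and its parameters are hypothetical, and the sketch assumes a sample of at least 30 so that the sample standard deviation s can stand in for the population value. It is run here on the sports data given above.

```python
from math import sqrt
from statistics import NormalDist


def z_test(x_bar, mu0, s, n, alpha, tail):
    """Return (z_statistic, reject_H0) for a 'left', 'right', or 'two' tailed test."""
    z = (x_bar - mu0) / (s / sqrt(n))
    inv = NormalDist().inv_cdf
    if tail == "left":
        reject = z < inv(alpha)            # critical region in the lower tail
    elif tail == "right":
        reject = z > inv(1 - alpha)        # critical region in the upper tail
    elif tail == "two":
        reject = abs(z) > inv(1 - alpha / 2)  # alpha split between both tails
    else:
        raise ValueError("tail must be 'left', 'right', or 'two'")
    return z, reject


# Sports example: n = 35, sample mean 62, s = 6, H0: mu <= 59, alpha = 5%.
z, reject = z_test(x_bar=62, mu0=59, s=6, n=35, alpha=0.05, tail="right")
print(round(z, 3), reject)  # 2.958 True
```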
This lesson reviewed the importance of samples when it comes to estimating population parameters. However, due to the margin of error in estimations, inferential methods are helpful when stating how confident a specific estimation is or testing a particular claim about the population mean.
Inferential Methods | |
---|---|
Confidence Interval | Hypothesis Test |
Estimates a population parameter as a range of values | Tests a claim about the mean of a population |
Now the challenge presented earlier about the average age of people at the burger restaurant can be solved.
The mean age of people who eat at Mark's father's burger restaurant used to be 24.3. Mark suspects that this has changed, so he surveyed a sample of 65 customers. He found a sample mean of 25.5 years with a standard deviation of 5 years. If he wants to conduct a test with 10% significance, help him through the hypothesis test.
Null Hypothesis H0 | Alternative Hypothesis Ha |
---|---|
The mean is equal to 24.3 years. (claim) H0:μ=24.3 | The mean is different than 24.3 years. Ha:μ≠24.3 |
Because the sign of the alternative hypothesis is ≠, a two-tailed test of significance will be needed in this case.
To find the critical values with a graphing calculator, push 2nd, then VARS, and choose the third option, invNorm(.
Mark wants to test his hypothesis at a 10% significance level, meaning each critical region will contain 5% of the distribution. Next, enter 0.05 and push ENTER to get the result.
This is the critical value corresponding to the critical region on the left of the standard normal distribution. Moreover, because the distribution is symmetric, the critical value for the upper tail will be the same but with the opposite sign. With this information, the critical regions can be set in the distribution.
This corresponds to option B.
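Substitute x̄=25.5, μ=24.3, s=5, and n=65 into the formula for the z-statistic and simplify, rounding to two decimal places.
$$z = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{25.5 - 24.3}{5/\sqrt{65}} = \frac{1.2}{5/\sqrt{65}} = \frac{1.2\sqrt{65}}{5} \approx 1.93$$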
Because the z-statistic falls in the critical region, the null hypothesis H0 should be rejected. Given that the initial claim is related to the null hypothesis, there is strong evidence to reject the claim that the mean age of customers at the restaurant is 24.3. This means that it is more likely that the mean age is different than 24.3.
A laptop manufacturing company says it takes no more than 48 minutes for the battery to fully charge. Using a sample of 38 laptops, Ali calculated a mean time of 53 minutes with a standard deviation of 10.45 minutes.
If a test with 10% significance is performed, analyze the following situations and select the ones that fit this study.
Consider the following graphs.
We will identify if a left-tailed, right-tailed, or two-tailed test fits this situation. To do so, let's begin by identifying the claim to set the null and alternative hypotheses.
It takes no more than 48 minutes for the laptop battery to charge fully.
We can represent the company's claim with a non-strict inequality, μ≤48. Because of the ≤ sign, we can relate the claim to the null hypothesis H_0. Conversely, its complement will be the alternative hypothesis H_a.
Null Hypothesis H_0 | Alternative Hypothesis H_a |
---|---|
The mean is less than or equal to 48 minutes. (claim) H_0:μ≤48 minutes | The mean is greater than 48 minutes. H_a:μ>48 minutes |
Now we can identify the inequality symbol of the alternative hypothesis to see which test of significance applies to the given situation. Since the inequality symbol is >, a right-tailed test fits this situation.
We are performing a right-tailed test at 10% significance, so the critical region will be in the upper 10% of the standard normal distribution. We will calculate the z-value of the upper 10% to find the critical value and set the critical region. To do so, we will use a graphing calculator. Push 2nd, then VARS, and select the third option, invNorm(.
Because we want the upper 10% of the distribution, the value to be entered into the calculator is given by 1−0.1=0.9. Next, we will enter this value and push ENTER to get the result.
The critical value is about 1.282. We can now set the critical region in the upper tail of the distribution by using this value.
Note that this graph corresponds to option A.
We will now calculate the z-statistic using the following formula.
$$z = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$
In this formula, x̄ is the sample mean, μ is the population mean, s is the standard deviation of the sample, and n is the sample size. In this case, we are given that x̄=53, μ=48, s=10.45, and n=38.
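Worked out from the values just listed, and rounded to three decimal places, the arithmetic would look roughly as follows.

$$
z = \frac{53 - 48}{10.45/\sqrt{38}} = \frac{5\sqrt{38}}{10.45} \approx 2.949
$$

Since 2.949 is greater than the critical value 1.282, the statistic lies in the upper tail beyond the cutoff.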
Now that we have the z-statistic, we can plot it on the graph of the critical region to see its position in the distribution. If it falls in the critical region, we can reject the null hypothesis. Let's do it!
We can see that the z-statistic is in the critical region, so we can reject the null hypothesis. Recall that the initial claim is related to the null hypothesis.
It takes no more than 48 minutes for the laptop battery to charge fully.
Therefore, we can reject the claim and say that it is more likely that the mean time it takes the laptop battery to charge fully is more than 48 minutes. This is statement B.
A famous gaming chair manufacturer claims that the chairs they produce last for at least ten years. Suppose that from a sample of 33 chairs, Davontay found that the mean time that the sample chairs lasted was 9.5 years with a standard deviation of 1.5 years.
If a hypothesis test with 1% significance is conducted using this sample data, determine the information corresponding to the following situations.
To identify the test of significance that applies to this situation, we need first to analyze the company's claim to set the null and alternative hypotheses. Then according to the inequality sign of the alternative hypothesis, we can determine the test of significance. Let's take a look at the company's claim.
Our gaming chairs last for at least 10 years.
We can represent the manufacturer's claim as the following non-strict inequality: μ≥10. We can see that the inequality symbol in this expression is ≥, which means we can relate the claim to the null hypothesis. The alternative hypothesis will be the complement of this statement.
Null Hypothesis H_0 | Alternative Hypothesis H_a |
---|---|
The mean is greater than or equal to 10 years. (claim) H_0:μ≥10 years | The mean is less than 10 years. H_a:μ<10 years |
Note that the inequality sign of the alternative hypothesis is <. Therefore, a left-tailed test is needed when making a hypothesis test to evaluate the company's claim.
Given that we will make a left-tailed test and the significance level is 1 %, we need to find the critical value that separates the 1 % of the area in the left tail of the standard normal distribution. To do so, we will use a graphing calculator. Push 2nd, then VARS, and select the third option, invNorm(.
Next, we will enter 0.01 and push ENTER to get the critical value corresponding to the 1 %.
The critical value is about -2.326. By using this value, we can draw the critical region for this hypothesis test.
Consider the formula for the z-statistic.
$$z = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$
In this formula, x̄ is the sample mean, μ is the population mean, s is the standard deviation of the sample, and n is the sample size. In this case, we are told that x̄=9.5, μ=10, s=1.5, and n=33.
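Worked out from the values just listed, and rounded to three decimal places, the arithmetic would look roughly as follows.

$$
z = \frac{9.5 - 10}{1.5/\sqrt{33}} = \frac{-0.5\sqrt{33}}{1.5} \approx -1.915
$$

Since -1.915 is greater than the critical value -2.326, the statistic does not reach the lower-tail cutoff.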
Now let's plot the z-statistic to see if it falls in the critical region. If so, we will reject the null hypothesis. If it does not, we cannot reject the null hypothesis.
Because the z-statistic falls outside the critical region, we cannot reject the null hypothesis. Let's recall the claim of the gaming chair manufacturer, which is related to the null hypothesis.
Our gaming chairs last for at least 10 years.
Since we cannot reject the claim, it means that the company statement that the gaming chairs last for at least ten years is most likely true.
A company that packages cereal wants to test if their machines are functioning well. The machines must package the correct weight in each box they sell. If the weight of the boxes is off, the company is either shorting customers or losing money.
We will make a hypothesis test to identify which statement is more likely true about the machine that packages the cereal boxes. Let's recall the five steps to perform a hypothesis test.
Let's follow these steps one at a time.
We are told that the company claims that the mean weight of cereal boxes equals 510 grams. We will represent this claim by the equality μ=510. Because this is a statement of equality, it will represent the null hypothesis. Conversely, the alternative hypothesis is that the mean weight of cereal boxes is not 510 grams.
Null Hypothesis H_0 | Alternative Hypothesis H_a |
---|---|
The mean equals 510 grams. (claim) H_0:μ=510 grams | The mean is not equal to 510 grams. H_a:μ≠510 grams |
Since the sign of the alternative hypothesis is ≠, we will conduct a two-tailed test of significance. This means that there are two critical regions whose cutoffs will be given by the z-value of the significance level α. The following are the critical values for the most common α values.
Critical Values | |||
---|---|---|---|
Significance Level | Left-Tail Test H_a:μ<k | Right-Tail Test H_a:μ>k | Two-Tail Test H_a:μ≠k |
α=1% | -2.326 | 2.326 | ±2.576 |
α=5% | -1.645 | 1.645 | ±1.960 |
α=10% | -1.282 | 1.282 | ±1.645 |
From the table, we can see that the critical values for a 10% significance level are ±1.645. Let's use these values to label the critical regions and critical values in a standard normal distribution.
Recall that the z-statistic is the z-value of the sample mean and can be calculated by the following formula.
$$z = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$
In this formula, x̄ is the sample mean, μ is the population mean, s is the standard deviation of the sample, and n is the sample size. For the given situation, we have that x̄=516, μ=510, s=20, and n=40.
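Worked out from the values just listed, and rounded to three decimal places, the arithmetic would look roughly as follows.

$$
z = \frac{516 - 510}{20/\sqrt{40}} = \frac{6\sqrt{40}}{20} \approx 1.897
$$

Since 1.897 is greater than the upper critical value 1.645, the statistic lies beyond the right-hand cutoff.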
We will now verify if the z-statistic falls in the critical region. If so, we will reject the null hypothesis. Let's plot the z-statistic into the graph of the critical regions.
Because the z-statistic is in the critical region, we must reject the null hypothesis H_0.
Knowing that there is evidence to reject the null hypothesis, let's now make a conclusion about the initial claim.
The mean weight of cereal boxes equals 510 grams.
Because the initial claim is related to the null hypothesis, we can reject the claim that the boxes weigh, on average, 510 grams. This implies that the machine packaging the boxes is malfunctioning, so statement B is true.