Zosia attends North High School in Honolulu. She asked 50 students whether they prefer a chocolate bar or a piece of fruit as a lunchtime snack and whether or not they surf. She obtained the following information.
A two-way frequency table, also known as a two-way table, displays categorical data that can be grouped into two categories. One of the categories is represented in the rows of the table, the other in the columns. For example, the table below shows the results of a survey where 100 participants were asked if they have a driver's license and if they own a car.
The two categories are "car" and "driver's license." Both have possible responses of "yes" and "no." The numbers in the table are called joint frequencies. Also, two-way frequency tables often include the totals of the rows and columns, which are called marginal frequencies.
The entry where the "Total" row and the "Total" column meet, which in this case is 100, equals the sum of all joint frequencies. This is called the grand total. A joint frequency of 43 shows that 43 people have a driver's license and own a car. A marginal frequency of 53 shows that 53 people do not have a car. The rest of the numbers from the table can be interpreted in the same way.
Organizing data in a two-way frequency table can help with visualization, which in turn makes it easier to analyze and present the data. To draw a two-way frequency table, three steps must be followed.
Suppose that 53 people took part in an online survey, where they were asked whether they prefer top hats or berets. Out of the 18 males that participated, 12 prefer berets. Also, 15 of the females chose top hats as their preference. The steps listed above will now be used to analyze and present the data.
First, the two categories of the table must be determined, after which the table can be drawn without frequencies. Here, the participants gave their hat preference and their gender, which are the two categories. Hat preference can be further divided into top hat and beret, and gender into female and male.
The total row and total column are included to write the marginal frequencies.
The given joint and marginal frequencies can now be added to the table.
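The remaining entries follow from the given counts by subtraction. As a minimal sketch in Python (the variable names are illustrative and not part of the lesson), the table for the hat survey could be completed like this:

```python
# Given data from the survey description.
grand_total = 53        # all participants
males_total = 18        # marginal frequency: males
male_beret = 12         # joint frequency: males who prefer berets
female_top_hat = 15     # joint frequency: females who prefer top hats

# Each missing entry is a difference of known entries.
females_total = grand_total - males_total       # 35
male_top_hat = males_total - male_beret         # 6
female_beret = females_total - female_top_hat   # 20

# Column totals (marginal frequencies for hat preference).
top_hat_total = male_top_hat + female_top_hat   # 21
beret_total = male_beret + female_beret         # 32

# The marginal frequencies in each direction must add up to the grand total.
assert top_hat_total + beret_total == grand_total
assert males_total + females_total == grand_total

# Print the completed two-way frequency table.
print(f"{'':8}{'Top hat':>8}{'Beret':>8}{'Total':>8}")
print(f"{'Female':8}{female_top_hat:>8}{female_beret:>8}{females_total:>8}")
print(f"{'Male':8}{male_top_hat:>8}{male_beret:>8}{males_total:>8}")
print(f"{'Total':8}{top_hat_total:>8}{beret_total:>8}{grand_total:>8}")
```

The two assertions act as a quick consistency check: if a count were copied incorrectly, the row and column totals would no longer agree with the grand total.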
Zain has a job leading backpackers on excursions in the High Sierras. To better understand what time of day to plan certain activities, Zain posed a question to 50 backpackers about their sleep patterns: Are you a night owl or an early bird?
Zain then categorized the participants by sleep pattern and age — younger than 30 and 30 or older. Here is part of what was gathered.
Zain made a two-way frequency table with the data they collected. Unfortunately, some of the data values got smudged and are unable to be read! The missing data values have been replaced with letters, for now.
Begin by finding the number of people age 30 or older who participated in the survey. To do so, calculate the difference between the grand total and the number of participants younger than 30. That is 50 - 23 = 27.
In a two-way frequency table, a joint relative frequency is the ratio of a joint frequency to the grand total. Similarly, a marginal relative frequency is the ratio of a marginal frequency to the grand total. Consider the following example of a two-way table.
Here, the grand total is 100. The joint and marginal frequencies can now be divided by 100 to obtain the joint and marginal relative frequencies.
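The same division can be sketched in Python. The example below is only an illustration: it reuses the counts from the hat survey earlier in the lesson (53 participants, of whom 18 are males, 12 males prefer berets, and 15 females prefer top hats, which leaves 6 males preferring top hats and 20 females preferring berets). The dictionary layout and names are assumptions made for this sketch.

```python
from fractions import Fraction

# Joint frequencies from the hat survey (gender, hat preference).
joint = {
    ("female", "top hat"): 15,
    ("female", "beret"): 20,
    ("male", "top hat"): 6,
    ("male", "beret"): 12,
}

grand_total = sum(joint.values())  # 53

# Joint relative frequency: joint frequency divided by the grand total.
joint_rel = {cell: Fraction(count, grand_total) for cell, count in joint.items()}

# Marginal frequencies are the row and column sums of the joint frequencies.
row_totals = {}
col_totals = {}
for (row, col), count in joint.items():
    row_totals[row] = row_totals.get(row, 0) + count
    col_totals[col] = col_totals.get(col, 0) + count

# Marginal relative frequency: marginal frequency divided by the grand total.
row_rel = {row: Fraction(total, grand_total) for row, total in row_totals.items()}
col_rel = {col: Fraction(total, grand_total) for col, total in col_totals.items()}

for cell, value in joint_rel.items():
    print(cell, f"{float(value):.2f}")
for name, value in {**row_rel, **col_rel}.items():
    print(name, f"{float(value):.2f}")
```

Exact fractions are used here so that the joint relative frequencies sum to exactly 1; rounded decimals may be off by a hundredth.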
Previously, Zain made a two-way frequency table about backpackers' sleep patterns.
Zain wants to dig deeper into the data for even clearer interpretations, so they plan to calculate the joint and marginal relative frequencies.
To calculate the joint and marginal relative frequencies, the joint and marginal frequencies must be divided by the grand total, 50.
The table below shows the joint and marginal relative frequencies.
One finding, among several, based on the joint and marginal relative frequencies is that about one-third of all participants are night owls aged 30 or older. Additionally, Zain can see that the participants are almost equally distributed among the categories, as both pairs of marginal relative frequencies have values close to 50-50.
A conditional relative frequency is the ratio of a joint frequency to either of its two corresponding marginal frequencies. Alternatively, it can be calculated using joint and marginal relative frequencies. As an example, the following data will be used.
Referring to the column totals, the left column of joint frequencies should be divided by 67 and the right column by 33. Furthermore, since the column totals are used, the sum of the conditional relative frequencies of each column is 1.
The resulting two-way frequency table can be interpreted to obtain the following information.
Using their two-way frequency table, Zain wants to continue improving the interpretation of their data by finding the conditional relative frequencies.
Zain will use the row totals to make the calculations.
Since Zain uses the row totals, the joint frequencies in the first row must be divided by 23 and the joint frequencies in the second row must be divided by 27.
The table below shows the conditional relative frequencies.
Zain interprets these findings as a reason to tailor night activities, like storytelling over a campfire, to an older audience. Interestingly, the older backpackers, as a whole, seem to prefer nights more than the younger backpackers. Zain can now plan according to these interpretations.
Zain will now consider the two-way table that shows conditional relative frequencies obtained using row totals.
They want to calculate some conditional probabilities by using the table. Help Zain find these probabilities!
Consider the fact that the conditional relative frequencies were found using row totals.
The table was created using row totals. Therefore, the first cell of the first row shows the probability of a person being a night owl given that they are younger than 30. Similarly, the second cell of the first row shows the probability of a person being an early bird given that they are younger than 30.
Likewise, the first cell of the second row shows the probability of a person being a night owl given that they are aged 30 or older. Similarly, the second cell of the second row shows the probability of a person being an early bird given that they are aged 30 or older.
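Reading conditional probabilities from row totals can be sketched in a few lines of Python. This illustration reuses the hat survey counts from earlier in the lesson (15 and 20 for females, 6 and 12 for males) rather than Zain's data; the function name `conditional_by_row` is hypothetical.

```python
# Hat survey joint frequencies, keyed by row (gender), then column (hat).
table = {
    "female": {"top hat": 15, "beret": 20},
    "male": {"top hat": 6, "beret": 12},
}

def conditional_by_row(table):
    """Divide every joint frequency by the total of its own row."""
    result = {}
    for row, cells in table.items():
        row_total = sum(cells.values())
        result[row] = {col: count / row_total for col, count in cells.items()}
    return result

cond = conditional_by_row(table)

# Each value is P(column category | row category).
for row, cells in cond.items():
    for col, p in cells.items():
        print(f"P({col} | {row}) = {p:.2f}")
```

Because the row totals are the divisors, the values in each row add up to 1, and each cell is a probability conditioned on the row category.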
Paulina conducted a survey at Washington High. She asked 170 students whether they have cable TV and whether they took a vacation last summer. She displays the results in a two-way frequency table.
Determine whether "taking a vacation" and "having cable TV" are independent events for this population of 170 students.
What is the probability that a student chosen at random took a vacation last summer? What is the probability that a random student who has cable TV took a vacation last summer?
Let A be the event that a student took a vacation last summer and B be the event that a student has cable TV. The table shows that from a total of 170 participants, 56 students took a vacation last summer.
At the beginning of the lesson, Zosia asked 50 students of North High School in Honolulu whether they prefer a chocolate bar or a piece of fruit as a lunchtime snack and whether they surf or not.
Make a two-way frequency table to display the obtained information.
A two-way frequency table can be made to organize the obtained information.
Next, the missing marginal frequencies can be calculated.
Now, two of the three missing joint frequencies can be calculated.
Finally, the last empty cell can be filled.
Now that the two-way table is complete, the desired probabilities can be found. Out of a total of 50 students, 42 surf and 28 prefer fruit as a lunch snack.
Three different communities, A, B, and C, were surveyed to find out if they were satisfied with the health care they received. The responses were summarized as joint relative frequencies in a two-way frequency table.
Are "being satisfied with the health care" and "living in Community C" independent events? Justify your reasoning.
We can find the percentage of Community A that was satisfied with their health care by calculating the conditional probability shown below.
P(satisfied | community A)
Let's go back to the given two-way frequency table and calculate the marginal relative frequencies for every column.
Now, since we are restricting our population to Community A, we are only interested in the first column of the two-way frequency table.
If we divide the joint relative frequency of people who live in Community A and are satisfied with health care by the marginal relative frequency of Community A, we can determine the required conditional probability.
This means that about 74 % of people living in Community A are satisfied with their health care.
This time, it is given that a person was not satisfied with their health care, and we want to calculate the probability that the person lives in Community B. Let's find the marginal relative frequencies for the two rows.
This time our population is restricted to the people who were not satisfied with their health care. Therefore, we are only interested in the second row.
By dividing the joint relative frequency of people living in Community B who are not satisfied with health care by the marginal relative frequency of the people that said they are not satisfied with health care, we can determine the desired conditional probability.
This means that of the people that were not satisfied with their health care, about 31 % live in Community B.
Let's put the information from the tables found in Part A and Part B together.
We can see that the probability that a randomly chosen person is satisfied with their health care is 75 %, since this is the marginal relative frequency corresponding to being satisfied.
Now let's find the probability that a random person from Community C is satisfied with their health care. To do this we will focus on the information from the third column.
If we divide the joint relative frequency of people who live in Community C and are satisfied with their health care by the marginal relative frequency of Community C, we can determine the required conditional probability.
Now, comparing both probabilities, we can see that they are approximately the same.
P(satisfied) ≈ P(satisfied | community C)
75 % ≈ 72 %
Since being in Community C has barely any effect on the probability of being satisfied with health care, it can be argued that these are independent events.
To explain two-way frequency tables to his students, Ignacio asks the boys and girls of his three groups whether they prefer math or English. Having gathered all of the data, he starts making a two-way frequency table. Before Ignacio can finish, he is pulled away for an important meeting. Below we see how far he got.
He asks his student, Izabella, to finish it for him. However, she does not have access to the data collected. Ignacio tells Izabella that she only needs to know two things, which he proceeds to write down on a piece of paper.
Let x be the number of boys who prefer math. Then the number of girls who prefer math and the number of girls who prefer English can be represented as indicated below.
Girls preferring math: 2x
Girls preferring English: x+4
From the given information in the two-way frequency table, we can see that the number of students who prefer English is 37. Since the total number of students is 100, we know that 100-37 = 63 students prefer math. Let's add all this information to the table.
With this information we can set up an equation and solve it for x, the number of boys who prefer math. The boys and girls who prefer math add up to 63.
x+2x=63
3x=63
x=21
Therefore, 21 boys prefer math. Now that we know this, we can find the number of girls who prefer math and the number of girls who prefer English by substituting x = 21.
Girls preferring math: 2x = 2(21) = 42
Girls preferring English: x+4 = 21+4 = 25
Let's add this information to the table.
Now, since the total number of students who prefer English was 37 and 25 girls prefer English, we can conclude that 37-25=12 boys prefer English. Let's add this to the table as well.
Now we can determine the marginal frequencies for "Boys" and "Girls."
As we can see, the total number of boys in the three groups that Ignacio teaches is 33.
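The whole chain of reasoning can be double-checked with a short Python sketch. The variable names are illustrative; the numbers and the two relationships (girls preferring math is 2x, girls preferring English is x+4) come from the solution above.

```python
total_students = 100
english_total = 37
math_total = total_students - english_total    # 63

# Boys preferring math: x. Girls preferring math: 2x. So x + 2x = 63.
boys_math = math_total // 3                    # x = 21
assert 3 * boys_math == math_total             # the division is exact

girls_math = 2 * boys_math                     # 42
girls_english = boys_math + 4                  # 25
boys_english = english_total - girls_english   # 12

# Marginal frequencies for boys and girls.
boys_total = boys_math + boys_english          # 33
girls_total = girls_math + girls_english       # 67
assert boys_total + girls_total == total_students

print(f"{'':8}{'Math':>8}{'English':>8}{'Total':>8}")
print(f"{'Boys':8}{boys_math:>8}{boys_english:>8}{boys_total:>8}")
print(f"{'Girls':8}{girls_math:>8}{girls_english:>8}{girls_total:>8}")
print(f"{'Total':8}{math_total:>8}{english_total:>8}{total_students:>8}")
```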
If the event "preferring math" were independent of the events "being a boy" and "being a girl," the following conditional probabilities should be equal to the probability of a random student preferring math.
P(prefers math | boys)
P(prefers math | girls)
In other words, the occurrence of the events "being a boy" and "being a girl" should have no effect on the probability of the event "preferring math." Let's have a look at the two-way frequency table from Part A.
We can find the probability of a random student preferring math, since we know that the total number of students is 100 and that 63 of them prefer math.
P(prefers math)= 63/100 = 63 %
The required conditional probabilities can be found by calculating the corresponding conditional relative frequencies. Since there are 33 boys in total and 21 preferred math, the conditional probability can be calculated.
P(prefers math | boys)=21/33≈ 64 %
Similarly, since there are 67 girls in total and 42 preferred math, the conditional probability can be found.
P(prefers math | girls)=42/67≈ 63 %
Because these conditional probabilities are almost the same as the probability that a random student prefers math, it can be argued that the events "being a boy" and "being a girl" have no effect on the event "preferring math."
Therefore, we can say that having a preference for math over English is independent of the student's gender.
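The comparison can also be sketched in Python. Note that the probabilities are only approximately equal, so the check below uses a tolerance; the value 0.02 is an arbitrary assumption chosen for this illustration, not a threshold given in the lesson.

```python
# Completed two-way frequency table from Part A.
boys = {"math": 21, "english": 12}
girls = {"math": 42, "english": 25}

total = sum(boys.values()) + sum(girls.values())            # 100
p_math = (boys["math"] + girls["math"]) / total             # 0.63
p_math_given_boy = boys["math"] / sum(boys.values())        # 21/33
p_math_given_girl = girls["math"] / sum(girls.values())     # 42/67

# Assumed tolerance for calling two probabilities "almost the same."
tolerance = 0.02
approximately_independent = (
    abs(p_math_given_boy - p_math) <= tolerance
    and abs(p_math_given_girl - p_math) <= tolerance
)

print(f"P(prefers math)         = {p_math:.0%}")
print(f"P(prefers math | boys)  = {p_math_given_boy:.0%}")
print(f"P(prefers math | girls) = {p_math_given_girl:.0%}")
print("Approximately independent:", approximately_independent)
```

With exact equality as the criterion the events would not count as independent, which is why the conclusion above is phrased as something that "can be argued" rather than as a strict result.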