| 12 Theory slides |
| 9 Exercises - Grade E - A |
| Each lesson is meant to take 1-2 classroom sessions |
Categorical data, also called qualitative data, is data that can be split into groups. Categorical data belongs to one or more categories that have a fixed number of possible outcomes or values. Human blood groups are one example of categorical data, since each person's blood can be categorized into one of these groups.
The beauty store has plenty to offer, and Tiffaniqua wonders how her aunt manages such a wide variety of products. Auntie has a database where she stores information about each individual product for sale. For example, consider one particular bottle of rose hand lotion.
What variables are not described using numbers?
The store sells different types of products in different scents. Consider the given bottle of hand lotion.
There are four variables present in the given picture. These are the type of product, the scent, the content volume, and the price. Notice that the type of product is hand lotion, and the scent is rose. These variables are described using words, rather than numbers. This means that these variables correspond to categorical data.
On the other hand, the content volume is given in fluid ounces, and the price in dollars. Since these variables are described using numbers, they correspond to numerical data.
Exploring the proportion of occurrences of a group, value, or set of values in a data set provides valuable insights. This is called the relative frequency, which is given by the ratio of an observed category or value's frequency to the total number of observations in a data set.
Relative frequency = Frequency / Number of observations
For example, suppose categorical data is explored in a classroom survey about the students' favorite colors.
Color | Frequency |
---|---|
Blue | 9 |
Red | 7 |
Green | 5 |
Yellow | 4 |
Purple | 3 |
Other | 2 |
Knowing that there are 30 students in the classroom, the relative frequency of each category can be found by dividing the frequency of each category by 30.
Color | Frequency | Relative Frequency |
---|---|---|
Blue | 9 | 9/30 = 0.3 |
Red | 7 | 7/30 ≈ 0.23 |
Green | 5 | 5/30 ≈ 0.17 |
Yellow | 4 | 4/30 ≈ 0.13 |
Purple | 3 | 3/30 = 0.1 |
Other | 2 | 2/30 ≈ 0.07 |
Relative frequencies are typically written as percentages.
Color | Frequency | Relative Frequency |
---|---|---|
Blue | 9 | 30 % |
Red | 7 | 23 % |
Green | 5 | 17 % |
Yellow | 4 | 13 % |
Purple | 3 | 10 % |
Other | 2 | 7 % |
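The division described above can also be sketched in a few lines of Python. This is only an illustration: the names `color_counts` and `relative_frequencies` are made up for this example, and the frequencies are copied from the survey table.

```python
# Frequencies copied from the favorite-color survey table.
color_counts = {
    "Blue": 9, "Red": 7, "Green": 5,
    "Yellow": 4, "Purple": 3, "Other": 2,
}


def relative_frequencies(counts):
    """Divide each category's frequency by the number of observations."""
    total = sum(counts.values())  # number of observations
    return {category: count / total for category, count in counts.items()}


color_rel = relative_frequencies(color_counts)
for color, rel in color_rel.items():
    # Show the value rounded to two decimals and as a whole percentage.
    print(f"{color:<7} {rel:.2f}  {rel:.0%}")
```

Since every observation falls into exactly one category, the relative frequencies add up to 1, which is a quick way to check the table.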
Impressively, Tiffaniqua's auntie makes the candles by hand! She buys a huge block of soy wax and uses it to make different scented candles.
To give each candle its characteristic scent, she needs to add essential oils. It is vital that she knows in advance how much of each oil to buy. Otherwise, she ends up with too much. Consider her previous month's sales sheet.
Scent | Number of Candles Sold |
---|---|
Lavender | 15 |
Citrus | 5 |
Vanilla | 30 |
Rose | 20 |
Jasmine | 10 |
Begin by finding how many candles were sold. Divide each frequency by the total number of candles sold. Write each relative frequency as a percentage.
In order to find the relative frequency of the sales of each scent of candle, begin by finding how many candles were sold last month. To do so, add the frequencies of each scent.
15 + 5 + 30 + 20 + 10 = 80
This means that 80 candles were sold last month. Next, find the relative frequency by dividing the respective frequency by this total. Since 15 lavender candles were sold, dividing 15 by 80 yields the relative frequency of lavender candles.
15/80 = 0.1875
Next, this number will be written as a percentage in order to compare it to the others.
0.1875 = 18.75 %
Do the same for the rest of the scents.
Scent | Number of Candles Sold | Divide | Relative Frequency |
---|---|---|---|
Lavender | 15 | 15/80 | 0.1875=18.75 % |
Citrus | 5 | 5/80 | 0.0625=6.25 % |
Vanilla | 30 | 30/80 | 0.375=37.5 % |
Rose | 20 | 20/80 | 0.25=25 % |
Jasmine | 10 | 10/80 | 0.125=12.5 % |
Knowing this, each scent can be paired with its relative frequency.
Scent | Relative Frequency |
---|---|
Lavender | 18.75 % |
Citrus | 6.25 % |
Vanilla | 37.5 % |
Rose | 25 % |
Jasmine | 12.5 % |
Tiffaniqua's auntie can now use these relative frequency percentages to prioritize how much of each oil to purchase!
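As a rough sketch, the same percentages can be computed and then ranked from most to least popular in Python. The variable names are hypothetical, and the sales figures are the ones from last month's sales sheet.

```python
# Number of candles sold per scent, from the sales sheet.
candle_sales = {"Lavender": 15, "Citrus": 5, "Vanilla": 30, "Rose": 20, "Jasmine": 10}

total_sold = sum(candle_sales.values())

# Relative frequency of each scent, written as a percentage.
candle_pct = {scent: 100 * sold / total_sold for scent, sold in candle_sales.items()}

# Scents ordered from the largest share of sales to the smallest.
ranked = sorted(candle_pct, key=candle_pct.get, reverse=True)

for scent in ranked:
    print(f"{scent:<9} {candle_pct[scent]} %")
```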
A two-way frequency table, also known as a two-way table, displays categorical data that can be grouped into two categories. One of the categories is represented in the rows of the table, the other in the columns. For example, the table below shows the results of a survey where 100 participants were asked if they have a driver's license and if they own a car.
Here, the two categories are car and driver's license. Both have possible responses of yes and no. The numbers in the table are called joint frequencies. Also, two-way frequency tables often include the totals of the rows and columns, which are called marginal frequencies.
The entry shared by the Total row and the Total column, which in this case is 100, equals the sum of all joint frequencies. This is called the grand total. A joint frequency of 43 shows that 43 people have a driver's license and own a car. A marginal frequency of 53 shows that 53 people do not have a car. The rest of the numbers from the table can also be interpreted.
Organizing data in a two-way frequency table can help with visualization, which in turn makes it easier to analyze and present the data. To draw a two-way frequency table, three steps must be followed.
Suppose that 53 people took part in an online survey, where they were asked whether they prefer top hats or berets. Out of the 18 males that participated, 12 prefer berets. Also, 15 of the females chose top hats as their preference. The steps listed above will now be used to analyze and present the data.
First, the two categories of the table must be determined, after which the table can be drawn without frequencies. Here, the participants gave their hat preference and their gender, which are the two categories. Hat preference can be further divided into top hat and beret, and gender into female and male.
The total row and total column are included to write the marginal frequencies.
The given joint and marginal frequencies can now be added to the table.
Using the given frequencies, more information can potentially be found by reasoning. For instance, because 12 out of the 18 males prefer berets, the number of males who prefer top hats is equal to the difference between these two values.
18 - 12 = 6
Therefore, there are 6 males who prefer top hats. Since there are 15 females who prefer top hats, the number of participants who prefer this type of hat is the sum of these two values.
6 + 15 = 21
It has been found that 21 participants prefer top hats. Continuing with this reasoning, the entire table can be completed.
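The reasoning used to complete the table can be written out as a short Python sketch. Only the four given numbers come from the survey description. Every other value is derived from them, and all variable names are chosen just for this illustration.

```python
# Given in the survey description.
grand_total = 53
male_total = 18
male_beret = 12
female_top_hat = 15

# Each missing entry follows from a row total or a column total.
male_top_hat = male_total - male_beret          # 18 - 12
top_hat_total = male_top_hat + female_top_hat   # 6 + 15
female_total = grand_total - male_total
female_beret = female_total - female_top_hat
beret_total = male_beret + female_beret

hat_table = {
    "Male": {"Top hat": male_top_hat, "Beret": male_beret, "Total": male_total},
    "Female": {"Top hat": female_top_hat, "Beret": female_beret, "Total": female_total},
    "Total": {"Top hat": top_hat_total, "Beret": beret_total, "Total": grand_total},
}

for row, cells in hat_table.items():
    print(row, cells)
```

A useful check is that the two column totals add up to the grand total, just as the two row totals do.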
The hand lotion sold at auntie's store comes in two sizes: large and small. She actually buys the lotion in bulk. Then, she pours it into the respective bottles, one by one. Auntie needs to restock on vanilla and rose hand lotions, and Tiffaniqua will help!
Out of a total of 25 vanilla bottles, they filled 18 small ones. They also filled 12 large rose bottles. In total, 60 bottles were filled.
Next, add a row and a column to include the marginal frequencies, which correspond to the totals of each individual category.
It is given that out of 25 vanilla bottles, Tiffaniqua filled 18 small ones. This means that the marginal frequency of vanilla bottles is 25 and that the joint frequency of small vanilla bottles is 18. Auntie also filled 12 large rose bottles, which is also a joint frequency. The grand total corresponds to 60 bottles. Add all this information to the table.
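There is now enough information to fill in the table. To find how many rose bottles there are, note that out of the 60 bottles, 25 are vanilla, so subtract 25 from 60.
60 - 25 = 35
There are 35 rose bottles. Since 12 of them are large, subtract 12 from 35 to find the number of small rose bottles.
35 - 12 = 23
This means they filled 23 small rose bottles. Together with the 18 small vanilla bottles, there are 18 + 23 = 41 small bottles, so subtract 41 from the grand total to find the number of large bottles.
60 - 41 = 19
Finally, while not necessary, find how many large vanilla bottles there are by subtracting 12 from 19.
19 - 12 = 7
With this information, the table can now be completed.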
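One way to double-check a completed two-way table is to confirm that every row and every column adds up to its marginal frequency. The following Python sketch does this for the lotion bottles. The function name `is_consistent` and the layout (rows for sizes, columns for scents) are assumptions made for this example.

```python
def is_consistent(joint, row_totals, col_totals, grand_total):
    """Check that the joint frequencies add up to the stated totals."""
    rows_ok = all(sum(row) == total for row, total in zip(joint, row_totals))
    # zip(*joint) transposes the table, so each tuple is one column.
    cols_ok = all(sum(col) == total for col, total in zip(zip(*joint), col_totals))
    totals_ok = sum(row_totals) == sum(col_totals) == grand_total
    return rows_ok and cols_ok and totals_ok


# Rows: small, large. Columns: vanilla, rose.
lotion_joint = [
    [18, 23],
    [7, 12],
]
lotion_ok = is_consistent(
    lotion_joint, row_totals=[41, 19], col_totals=[25, 35], grand_total=60
)
print(lotion_ok)
```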
In a two-way frequency table, a joint relative frequency is the ratio of a joint frequency to the grand total. Similarly, a marginal relative frequency is the ratio of a marginal frequency to the grand total. Consider the following example of a two-way table.
Here, the grand total is 100. The joint and marginal frequencies can now be divided by 100 to obtain the joint and marginal relative frequencies.
Relative frequency tables can also be made either by rows or by columns. When the table is made by rows, each joint frequency is divided by the marginal frequency of its corresponding row.
On the other hand, if the table is to be made by columns, each joint frequency needs to be divided by the marginal frequency of its respective column.
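The difference between the two kinds of tables can be sketched in Python. The example below reuses the hat survey, where the number of females who prefer berets, 20, is derived from the given totals rather than stated directly. The function names are hypothetical, and `Fraction` is used only to keep the ratios exact.

```python
from fractions import Fraction

# Joint frequencies of the hat survey (rows: gender, columns: hat preference).
hats = {
    "Male": {"Top hat": 6, "Beret": 12},
    "Female": {"Top hat": 15, "Beret": 20},
}


def relative_by_rows(table):
    """Divide each joint frequency by the total of its row."""
    result = {}
    for row, cells in table.items():
        row_total = sum(cells.values())
        result[row] = {col: Fraction(n, row_total) for col, n in cells.items()}
    return result


def relative_by_columns(table):
    """Divide each joint frequency by the total of its column."""
    col_totals = {}
    for cells in table.values():
        for col, n in cells.items():
            col_totals[col] = col_totals.get(col, 0) + n
    return {
        row: {col: Fraction(n, col_totals[col]) for col, n in cells.items()}
        for row, cells in table.items()
    }


by_rows = relative_by_rows(hats)
by_cols = relative_by_columns(hats)
print(f"{float(by_rows['Male']['Beret']):.0%} of the males prefer berets")
print(f"{float(by_cols['Male']['Beret']):.0%} of those who prefer berets are male")
```

The same joint frequency, 12, answers two different questions depending on which total it is divided by.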
After filling up the bottles, auntie was left with some leftover vanilla and rose lotion. She recalled that there was some leftover wax as well. She decides to give free samples to Tiffaniqua to share with friends. The following two-way table summarizes the items auntie gave to Tiffaniqua.
Begin by adding the four joint frequencies to find the grand total.
6 + 3 + 4 + 7 = 20
Auntie gave Tiffaniqua a total of 20 items. To find what percentage of the samples are rose candles, divide the corresponding joint frequency by the grand total.
7/20 = 0.35
This is the joint relative frequency of rose candles. It can now be written as a percentage.
0.35 = 35 %
A total of 10 vanilla samples were given to Tiffaniqua. Find the marginal relative frequency by dividing this number by the grand total.
10/20 = 0.5
Do not forget to write it as a percentage.
0.5 = 50 %
Tiffaniqua found that 50 % of the samples have a vanilla scent.
The marginal frequency corresponding to hand lotion bottles is 9. Find the marginal relative frequency by dividing this number by the grand total.
9/20 = 0.45
The marginal relative frequency is 0.45, or 45 %. This means that 45 % of the samples are hand lotion bottles.
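These three calculations can be sketched in Python as follows. The placement of 6, 3, and 4 in the table is an assumption: it is the only arrangement that agrees with the stated totals of 10 vanilla samples, 9 hand lotion bottles, and 7 rose candles. The helper `marginal` is a made-up name.

```python
# Joint frequencies keyed by (product, scent). The cell placement is inferred
# from the stated totals, not copied from the original table.
samples = {
    ("Hand lotion", "Vanilla"): 6,
    ("Hand lotion", "Rose"): 3,
    ("Candle", "Vanilla"): 4,
    ("Candle", "Rose"): 7,
}
grand_total = sum(samples.values())


def marginal(index, label):
    """Add the joint frequencies whose key has `label` at position `index`."""
    return sum(n for key, n in samples.items() if key[index] == label)


rose_candle_rel = samples[("Candle", "Rose")] / grand_total  # joint relative frequency
vanilla_rel = marginal(1, "Vanilla") / grand_total           # marginal, by scent
lotion_rel = marginal(0, "Hand lotion") / grand_total        # marginal, by product

print(f"{rose_candle_rel:.0%} {vanilla_rel:.0%} {lotion_rel:.0%}")
```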
Consider the two-way table given at the start of the lesson.
To determine which scent of bath bomb is the most popular, focus on the bath bomb row. Look for the bath bomb scent with the greatest joint frequency.
In order to determine which scent is the most popular in general, the marginal frequencies need to be added to the table.
From here, it can be seen that the vanilla scent has the greatest marginal frequency.
Tiffaniqua's school performs a play to celebrate the end of the school year. For the school play, 10 students will need to use a wig, and 5 will need face paint. It is also known that 2 students will need neither a wig nor face paint. A total of 14 students will participate in the play.
Let's begin by making sense of the given information. In this case we are told of two categories: face paint and wig. We can classify the students participating in the play depending on whether they use a wig or wear face paint or not. This means that we should make a two-way table with these categories.
We are told that 10 will use a wig and 5 will use face paint. These numbers correspond to marginal frequencies. We are also told that 14 students will participate in the play, which is the grand total. Let's add this information to the table!
The only joint frequency given is the number of students who will use neither a wig nor face paint. We will include this information as well.
Out of the 14 students, 10 will use a wig, which means that 14-10=4 will not. Likewise, 5 students will use face paint, so 14-5=9 will not. Let's add these marginal frequencies to the table!
Following the same reasoning, we can find the next joint frequencies. Of the 9 students who do not use face paint, 2 do not use a wig either, so 9-2=7 use a wig but no face paint. In a similar way, we can find that 4-2=2 students do not use a wig but do use face paint.
Finally, we can find how many students use both a wig and face paint. This can be seen either as 10-7=3 or as 5-2=3.
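As a cross-check, the number of students who use both can also be found without the table, by counting the students who use at least one of the two items. Note that this is a different route than the one taken above. The sketch below takes the totals as stated in the problem (10 wigs, 5 face paints, 2 with neither, 14 students), and the variable names are made up for the example.

```python
# Totals as stated in the problem.
total_students = 14
wig_total = 10
face_paint_total = 5
neither = 2

# Students who use a wig, face paint, or both.
at_least_one = total_students - neither

# Adding both totals counts the students with both items twice,
# so the overlap is the excess over at_least_one.
both = wig_total + face_paint_total - at_least_one

wig_only = wig_total - both
face_paint_only = face_paint_total - both

print(both, wig_only, face_paint_only, neither)
```

The four joint frequencies add up to the grand total of 14, which confirms the result.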
There are two different clubs in Tiffaniqua's school: a music club and a sports club. Participation in these clubs is optional, and students are allowed to join both. The following table summarizes how many students in Tiffaniqua's class are in each club, if any.
We are asked to find what percentage of the students that are in the music club are also in the sports club. The music categories are arranged by rows, so we will make a relative frequency table by rows. Since we will make the table by rows, we will remove the sports totals.
We divide each frequency by the marginal frequency in its matching row. We will also write this as a percentage!
Next, look for students that are in the music club and the sports club.
This means that 33 % of the students that are in the music club are also in the sports club.
This time we are interested in what percentage of the students of the sports club are not in the music club. Sports are arranged by column, so we will make the table by columns.
Next, we divide each frequency by the total of its corresponding column and write it as a percentage as well.
We now look for those students that are in the sports club but are not in the music club.
The corresponding relative frequency is 58 %, which means that 58 % of the students that are in the sports club are not in the music club.
Auntie Stella gave Tiffaniqua the task of making a relative frequency table by rows using the data from the two-way table they made when filling the hand lotion bottles. In the meantime, auntie will take a phone call with the essential oils provider.
In order to find Tiffaniqua's mistake, we will try to follow auntie's request ourselves. We are asked to make a relative frequency table by rows, which means that we will get rid of the scent totals.
Next, we will divide each frequency by the marginal frequency of its matching row.
Here we can already see Tiffaniqua's mistake. In order to find what percentage of the large bottles were filled with rose lotion, the following operation must be done first.
12/19
However, Tiffaniqua divided by 35 instead of 19. Tiffaniqua probably confused a table made by rows with a table made by columns! This means the answer is III.
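The effect of the mix-up can be seen by computing both ratios. This is a small Python sketch using the totals found when the lotion table was completed, with illustrative variable names.

```python
large_rose = 12   # joint frequency of large rose bottles
large_total = 19  # row total: all large bottles
rose_total = 35   # column total: all rose bottles

by_row = large_rose / large_total    # share of the large bottles that are rose
by_column = large_rose / rose_total  # share of the rose bottles that are large

print(f"by rows: {by_row:.0%}, by columns: {by_column:.0%}")
```

Dividing by the row total answers auntie's question, while dividing by the column total answers a different one.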