Categorical data, also called qualitative data, is data that can be split into groups. Categorical data belongs to one or more categories that have a fixed number of possible outcomes or values. Human blood groups are one example of categorical data, since every person can be categorized into one of these groups.
The beauty store has plenty to offer, and Tiffaniqua wonders how her aunt manages such a wide variety of products. Auntie has a database where she stores information about each individual product for sale. For example, consider one particular bottle of rose hand lotion.
What variables are not described using numbers?
The store sells different types of products in a variety of scents. Consider the given bottle of hand lotion.
There are four variables present in the given picture. These are the type of product, the scent, the content volume, and the price. Notice that the type of product is "hand lotion," and the scent is "rose." These variables are described using words, rather than numbers. This means that these variables correspond to categorical data.
On the other hand, the content volume is given in fluid ounces, and the price in dollars. Since these variables are described using numbers, they correspond to numerical data.
Exploring the proportion of occurrences of a group, value, or set of values in a data set provides valuable insights. This is called the relative frequency, which is given by the ratio of an observed category or value's frequency to the total number of observations in a data set.
$$\text{Relative frequency} = \frac{\text{Frequency}}{\text{Number of observations}}$$
For example, suppose categorical data is explored in a survey made on a classroom about the favorite color of the students.
Color | Frequency |
---|---|
Blue | 9 |
Red | 7 |
Green | 5 |
Yellow | 4 |
Purple | 3 |
Other | 2 |
Knowing that there are 30 students in the classroom, the relative frequency of each category can be found by dividing the frequency of each category by 30.
Color | Frequency | Relative Frequency |
---|---|---|
Blue | 9 | 9/30 = 0.3 |
Red | 7 | 7/30 ≈ 0.23 |
Green | 5 | 5/30 ≈ 0.17 |
Yellow | 4 | 4/30 ≈ 0.13 |
Purple | 3 | 3/30 = 0.1 |
Other | 2 | 2/30 ≈ 0.07 |
Relative frequencies are typically written as percentages.
Color | Frequency | Relative Frequency |
---|---|---|
Blue | 9 | 30% |
Red | 7 | 23% |
Green | 5 | 17% |
Yellow | 4 | 13% |
Purple | 3 | 10% |
Other | 2 | 7% |
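The same calculation can be sketched in a few lines of Python. This is only an illustration of the formula applied to the survey counts above; the variable names `color_counts`, `total`, and `relative` are chosen for this example.

```python
# Frequencies from the favorite-color survey above.
color_counts = {
    "Blue": 9,
    "Red": 7,
    "Green": 5,
    "Yellow": 4,
    "Purple": 3,
    "Other": 2,
}

# The number of observations is the sum of all frequencies.
total = sum(color_counts.values())

# Relative frequency = frequency / number of observations.
relative = {color: count / total for color, count in color_counts.items()}

# Print each relative frequency as a rounded decimal and as a percentage.
for color, share in relative.items():
    print(f"{color}: {share:.2f} ({share:.0%})")
```

Because every observation falls in exactly one category, the relative frequencies add up to 1 (or 100%), which is a quick way to check the table.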
Impressively, Tiffaniqua's auntie makes the candles by hand! She buys a huge block of soy wax and uses it to make different scented candles.
To give each candle its characteristic scent, she needs to add essential oils. It is vital that she knows in advance how much of each oil to buy. Otherwise, she may end up with too much of some oils. Consider her previous month's sales sheet.
Scent | Number of Candles Sold |
---|---|
Lavender | 15 |
Citrus | 5 |
Vanilla | 30 |
Rose | 20 |
Jasmine | 10 |
Begin by finding how many candles were sold in total: 15+5+30+20+10 = 80. Divide each frequency by the total number of candles sold. Write each relative frequency as a percentage.
Scent | Number of Candles Sold | Divide | Relative Frequency |
---|---|---|---|
Lavender | 15 | 15/80 | 0.1875 = 18.75% |
Citrus | 5 | 5/80 | 0.0625 = 6.25% |
Vanilla | 30 | 30/80 | 0.375 = 37.5% |
Rose | 20 | 20/80 | 0.25 = 25% |
Jasmine | 10 | 10/80 | 0.125 = 12.5% |
Knowing this, each scent can be paired with its relative frequency.
Scent | Relative Frequency |
---|---|
Lavender | 18.75% |
Citrus | 6.25% |
Vanilla | 37.5% |
Rose | 25% |
Jasmine | 12.5% |
Tiffaniqua's auntie can now use these relative frequency percentages to prioritize how much of each oil to purchase!
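As a rough sketch, the candle percentages can also be computed and ranked programmatically. The helper `percent` and the names `candles_sold`, `percentages`, and `ranked` are made up for this illustration; `Fraction` is used only so that the percentages come out exact.

```python
from fractions import Fraction

# Last month's sales sheet from the example above.
candles_sold = {
    "Lavender": 15,
    "Citrus": 5,
    "Vanilla": 30,
    "Rose": 20,
    "Jasmine": 10,
}

total_sold = sum(candles_sold.values())


def percent(count, total):
    """Return count / total expressed as a percentage."""
    return float(Fraction(count, total) * 100)


# Relative frequency of each scent, written as a percentage.
percentages = {scent: percent(n, total_sold) for scent, n in candles_sold.items()}

# Scents ordered from most to least popular.
ranked = sorted(percentages, key=percentages.get, reverse=True)

for scent in ranked:
    print(f"{scent}: {percentages[scent]}%")
```

Sorting by relative frequency puts the scents in the order in which the oils might be prioritized.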
A two-way frequency table, also known as a two-way table, displays categorical data that can be grouped into two categories. One of the categories is represented in the rows of the table, the other in the columns. For example, consider a two-way table showing the results of a survey where 100 participants were asked if they have a driver's license and if they own a car. Here, the two categories are "car" and "driver's license." Both have possible responses of "yes" and "no." The numbers in the table are called joint frequencies. Also, two-way frequency tables often include the totals of the rows and columns, which are called marginal frequencies.
The entry at the intersection of the "Total" row and the "Total" column, which in this case is 100, equals the sum of all joint frequencies. This is called the grand total. A joint frequency of 43 shows that 43 people have a driver's license and own a car. A marginal frequency of 53 shows that 53 people do not have a car. The rest of the numbers from the table can also be interpreted.
Organizing data in a two-way frequency table can help with visualization, which in turn makes it easier to analyze and present the data. To draw a two-way frequency table, three steps must be followed.
Suppose that 53 people took part in an online survey, where they were asked whether they prefer top hats or berets. Out of the 18 males that participated, 12 prefer berets. Also, 15 of the females chose top hats as their preference. The steps listed above will now be used to analyze and present the data.
First, the two categories of the table must be determined, after which the table can be drawn without frequencies. Here, the participants gave their hat preference and their gender, which are the two categories. Hat preference can be further divided into top hat and beret, and gender into female and male.
The total row and total column are included to write the marginal frequencies.
The given joint and marginal frequencies can now be added to the table.
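The remaining cells follow from the given numbers by addition and subtraction. The Python sketch below shows one way to fill in the table for the hat survey; the variable names are invented for this illustration, and only the four values at the top come from the survey description.

```python
# Values given in the survey description.
grand_total = 53      # people who took part
males_total = 18      # marginal frequency of males
male_beret = 12       # joint frequency: males who prefer berets
female_top_hat = 15   # joint frequency: females who prefer top hats

# Missing joint and marginal frequencies, found by subtraction and addition.
male_top_hat = males_total - male_beret
females_total = grand_total - males_total
female_beret = females_total - female_top_hat
top_hat_total = male_top_hat + female_top_hat
beret_total = male_beret + female_beret

# The completed two-way frequency table, including the total row and column.
table = {
    "Female": {"Top hat": female_top_hat, "Beret": female_beret, "Total": females_total},
    "Male": {"Top hat": male_top_hat, "Beret": male_beret, "Total": males_total},
    "Total": {"Top hat": top_hat_total, "Beret": beret_total, "Total": grand_total},
}

for row_name, cells in table.items():
    print(row_name, cells)
```

A useful check is that the column totals add up to the grand total, just as the row totals do.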
The hand lotion sold at auntie's store comes in two sizes: large and small. She actually buys the lotion in bulk. Then, she pours it into the respective bottles, one by one. Auntie needs to restock on vanilla and rose hand lotions, and Tiffaniqua will help!
Out of a total of 25 vanilla bottles, they filled 18 small ones. They also filled 12 large rose bottles. In total, 60 bottles were filled.
Next, add a row and a column to include the marginal frequencies, which correspond to the totals of each individual category.
It is given that out of 25 vanilla bottles, Tiffaniqua filled 18 small ones. This means that the marginal frequency of vanilla bottles is 25 and that the joint frequency of small vanilla bottles is 18. Auntie also filled 12 large rose bottles, which is also a joint frequency. The grand total corresponds to 60 bottles. Add all this information to the table.
In a two-way frequency table, a joint relative frequency is the ratio of a joint frequency to the grand total. Similarly, a marginal relative frequency is the ratio of a marginal frequency to the grand total. Consider the following example of a two-way table.
Here, the grand total is 100. The joint and marginal frequencies can now be divided by 100 to obtain the joint and marginal relative frequencies.
Relative frequencies can also be calculated by rows or by columns. When they are calculated by rows, each joint frequency is divided by the marginal frequency of its corresponding row.
On the other hand, when they are calculated by columns, each joint frequency is divided by the marginal frequency of its respective column.
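To illustrate the difference, here is a minimal Python sketch that computes relative frequencies by rows and by columns. It reuses the hat survey from earlier in the lesson, with the missing cells worked out from the given totals (53 participants, 18 males). The function names `by_rows` and `by_columns` are hypothetical, chosen only for this example.

```python
# Joint frequencies for the hat survey (rows: gender, columns: hat preference).
joint = {
    "Female": {"Top hat": 15, "Beret": 20},
    "Male": {"Top hat": 6, "Beret": 12},
}


def by_rows(joint):
    """Divide each joint frequency by the marginal frequency of its row."""
    result = {}
    for row, cells in joint.items():
        row_total = sum(cells.values())
        result[row] = {col: n / row_total for col, n in cells.items()}
    return result


def by_columns(joint):
    """Divide each joint frequency by the marginal frequency of its column."""
    columns = list(next(iter(joint.values())))
    col_totals = {col: sum(cells[col] for cells in joint.values()) for col in columns}
    return {
        row: {col: n / col_totals[col] for col, n in cells.items()}
        for row, cells in joint.items()
    }


print(by_rows(joint))
print(by_columns(joint))
```

In the row version, each row adds up to 1, while in the column version each column does. For instance, 12 of the 18 males prefer berets, so the row relative frequency of that cell is 12/18.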
After filling the bottles, auntie had some vanilla and rose lotion left over. She recalled that there was some leftover wax as well. She decides to give free samples to Tiffaniqua to share with friends. The following two-way table summarizes the items auntie gave to Tiffaniqua.
Consider the two-way table given at the start of the lesson.
To determine which scent of bath bomb is the most popular, focus on the bath bomb row. Look for the bath bomb scent with the greatest joint frequency.
In order to determine which scent is the most popular in general, the marginal frequencies need to be added to the table.
From here, it can be seen that the vanilla scent has the greatest marginal frequency.
Let's begin by recalling what categorical data is.
Categorical data, also called qualitative data, is data that can be split into groups. Categorical data belongs to one or more categories that have a fixed number of possible outcomes or values.
In other words, categorical data is data that can be described using words rather than numbers. The duration and the number of awards are described using numbers, so these do not correspond to categorical data. On the other hand, the title, the movie genre, and the filming studio are all names and words that describe the movie.
Variable | Categorical Data? |
---|---|
Title | ✓ |
Movie Genre | ✓ |
Duration | × |
Filming Studio | ✓ |
Number of Awards | × |
A jewelry store sells different types of necklaces depending on which metal is used for the chain and what kind of stone is used to decorate it. The following two-way table contains information about last month's sales.
Let's begin by taking a look at the given two-way frequency table.
We are asked how many gold ruby necklaces were sold. This corresponds to the joint frequency of gold and ruby. Let's use x to represent it. We will now focus on the ruby column. Note that the marginal frequency of ruby is 5.
This means that adding 3 to our missing number x equals 5. x+3 = 5 Let's find x by subtracting 3 from both sides of the above equation! x = 5-3 = 2
We found that x = 2. This means that 2 gold ruby necklaces were sold at the jewelry shop.
This time we are asked to find how many silver necklaces were sold. This corresponds to the marginal frequency of silver necklaces. Let's focus on the silver row of the given table.
At first glance, it looks like we do not have enough information to find the marginal frequency. However, note that we can find how many gold necklaces were sold. 2+4+5 = 11 Let's add this to our table.
Out of the 25 necklaces sold at the store, 11 were made of gold. This means that 25-11=14 were made of silver.
Tiffaniqua made a bunch of sandwiches to give to people in need. She used two different types of bread, white and whole grain. She also used two different types of deli meat, ham and turkey. The following table summarizes what types of sandwiches Tiffaniqua made.
Let's take a look at the given two-way frequency table.
To find how many sandwiches Tiffaniqua made, we need to add all the joint frequencies. 13+7+6+4 = 30 This means that Tiffaniqua made a total of 30 sandwiches.
We are now asked to find how many slices of white bread Tiffaniqua used. Let's begin by adding the marginal frequencies to the table.
Tiffaniqua made 20 white bread sandwiches. Keep in mind that two slices of bread are used for every sandwich, which means that Tiffaniqua actually used 40 white bread slices to make the sandwiches.
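The arithmetic in this solution can be sketched in Python as follows. The solution only states that the joint frequencies are 13, 7, 6, and 4 and that 20 sandwiches use white bread, so the assignment of each number to ham or turkey below is an assumption made for illustration.

```python
# Joint frequencies of the sandwiches (rows: bread, columns: deli meat).
# Assumption: the ham/turkey split within each row is illustrative only.
sandwiches = {
    "White": {"Ham": 13, "Turkey": 7},
    "Whole grain": {"Ham": 6, "Turkey": 4},
}

# Grand total: the sum of all joint frequencies.
total_sandwiches = sum(sum(row.values()) for row in sandwiches.values())

# Marginal frequency of white bread: the sum of the white bread row.
white_sandwiches = sum(sandwiches["White"].values())

# Each sandwich uses two slices of bread.
SLICES_PER_SANDWICH = 2
white_slices = white_sandwiches * SLICES_PER_SANDWICH

print(total_sandwiches, white_sandwiches, white_slices)
```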
A coffee shop sells two types of cappuccinos, vanilla and Irish cream. The coffee used for these beverages can be either regular or decaf. The following table summarizes last week's sales.
Consider the given two-way frequency table.
We are asked to find what percentage of the sales corresponds to regular vanilla cappuccinos. This means that we need to find its respective joint relative frequency. Begin by finding how many cappuccinos were sold. 88+67+25+20 = 200 Find the joint relative frequency of regular vanilla cappuccinos by dividing its respective joint frequency by the grand total. 88/200 = 0.44 Do not forget to write this as a percentage. 0.44 = 44 %
Since we are asked to find what percentage of the sales corresponds to decaf cappuccinos, we will begin by adding the marginal frequencies to the table.
A total of 87 decaf cappuccinos were sold. Find its corresponding marginal relative frequency by dividing it by 200. We will write it as a percentage as well. 87/200 = 0.435 = 43.5 %
Let's take a look at the two-way table including marginal frequencies.
Since one flavor needs to be removed, it is better to keep the one with the greater number of sales. The marginal frequency of vanilla cappuccinos is 155, while that of Irish cream cappuccinos is only 45. This means that the store should keep the vanilla flavor.
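The three answers of this exercise can be reproduced with a short Python sketch. The cell layout below is inferred from the totals quoted in the solution (87 decaf and 155 vanilla cappuccinos), and the names `sales`, `flavor_totals`, and `best_flavor` are chosen for this example.

```python
# Last week's sales (rows: type of coffee, columns: flavor).
sales = {
    "Regular": {"Vanilla": 88, "Irish cream": 25},
    "Decaf": {"Vanilla": 67, "Irish cream": 20},
}

# Grand total: the sum of all joint frequencies.
grand_total = sum(sum(row.values()) for row in sales.values())

# Joint relative frequency of regular vanilla cappuccinos.
regular_vanilla_share = sales["Regular"]["Vanilla"] / grand_total

# Marginal relative frequency of decaf cappuccinos.
decaf_total = sum(sales["Decaf"].values())
decaf_share = decaf_total / grand_total

# Marginal frequency of each flavor, used to decide which flavor to keep.
flavor_totals = {
    flavor: sum(row[flavor] for row in sales.values())
    for flavor in ("Vanilla", "Irish cream")
}
best_flavor = max(flavor_totals, key=flavor_totals.get)

print(f"{regular_vanilla_share:.1%}", f"{decaf_share:.1%}", best_flavor)
```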
The following two-way relative frequency table summarizes the results of a survey about pets.
Let's begin by taking a look at the given two-way table. Keep in mind that this is a relative frequency table.
Owning a pet in this context means owning either a cat, a dog, or both.
This means that we will add all the relative frequencies except the one for people who own neither a dog nor a cat. 5 % + 40 % + 35 % = 80 % We found that 80 % of the surveyed people own a pet.
We are told that 500 people participated in the survey. From the table, we can see that 5 % of these people own both a cat and a dog. To find how many people fall under this category, we will first write the percentage as a decimal.
5 % = 0.05
Next, we multiply this number by the number of people that participated in the survey.
0.05 * 500 = 25
This means that 25 people own both a cat and a dog.
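Both parts of this solution amount to a couple of lines of arithmetic, sketched below in Python. Whole-number percentages are used to avoid rounding issues, and the variable names are made up for this illustration. The two values 40 and 35 are the cells for people who own exactly one kind of pet, as used in the sum above.

```python
participants = 500

# Relative frequencies (in percent) of the pet-owning cells:
# both a cat and a dog, and the two cells with exactly one kind of pet.
percent_both = 5
percent_one_pet_cells = [40, 35]

# Percentage of surveyed people who own a pet.
percent_with_pet = percent_both + sum(percent_one_pet_cells)

# The remaining percentage owns neither a cat nor a dog.
percent_without_pet = 100 - percent_with_pet

# Number of people who own both: 5 % of 500.
people_with_both = participants * percent_both // 100

print(percent_with_pet, percent_without_pet, people_with_both)
```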