Core Connections Algebra 2, 2013
1. Section C.1

Exercise 26 Page 765

Practice makes perfect
a The shop's owner measured the rotations per minute of one of its lathes at various times over the course of a month. The measurements obtained are listed below.

250 251 253 253 253
254 257 257 259 259
261 263 265 270 291

We want to draw a combination of a histogram and a boxplot. Let's begin by drawing a histogram.

Histogram

A histogram is a graphical illustration of a data set. The data is grouped into intervals of equal size called bins, and the size of each interval is called the bin width. The intervals are marked along a horizontal axis.

Example histogram
We need to choose an appropriate bin width for the bars of our histogram. Remember that all bins must have the same width. As a rough rule of thumb, we can approximate the bin width by the square root of the number of values in the data set. Let's count how many measurements we have.

250 251 253 253 253
254 257 257 259 259
261 263 265 270 291

We can see there are a total of 15 measurements, so we can find the bin width.

sqrt(15) = 3.872983... ≈ 4

Using a bin width of 4, we can identify the observations in each interval, starting from the minimum data value 250. A value that falls on a boundary between two intervals, such as 254 or 270, is counted in the interval to its left.

Interval Observations
250-254 250, 251, 253, 253, 253, 254
254-258 257, 257
258-262 259, 259, 261
262-266 263, 265
266-270 270
270-274 -
274-278 -
278-282 -
282-286 -
286-290 -
290-294 291

The histogram is the collection of rectangles drawn above the intervals. The heights of these rectangles are proportional to the frequency of the data in the corresponding intervals. Let's find the frequency for each interval.

Interval Observations Frequency
250-254 250, 251, 253, 253, 253, 254 6
254-258 257, 257 2
258-262 259, 259, 261 3
262-266 263, 265 2
266-270 270 1
270-274 - 0
274-278 - 0
278-282 - 0
282-286 - 0
286-290 - 0
290-294 291 1
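
The frequency table above can be reproduced with a short script. This is only a sketch: the names `rpm`, `bin_width`, `bin_index`, and `frequencies` are chosen here for illustration, and the rule that a boundary value belongs to the interval on its left is inferred from the table (254 is listed under 250-254 and 270 under 266-270).

```python
import math

# Lathe measurements (rotations per minute) from the exercise.
rpm = [250, 251, 253, 253, 253, 254, 257, 257, 259, 259,
       261, 263, 265, 270, 291]

# Bin width from the square-root rule of thumb used in the text.
bin_width = round(math.sqrt(len(rpm)))


def bin_index(value, start, width):
    """Return the index of the interval that holds `value`.

    Boundary values are counted in the interval to their left,
    except the minimum, which stays in the first interval.
    """
    if value == start:
        return 0
    return math.ceil((value - start) / width) - 1


start = min(rpm)
n_bins = bin_index(max(rpm), start, bin_width) + 1

# Count how many observations fall into each interval.
frequencies = [0] * n_bins
for value in rpm:
    frequencies[bin_index(value, start, bin_width)] += 1

for i, count in enumerate(frequencies):
    low = start + i * bin_width
    print(f"{low}-{low + bin_width}: {count}")
```

The printed counts should match the Frequency column, including the five empty intervals between 270 and 290.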

Now we have all the information we need to draw the histogram.

Histogram

Boxplot

As a next step, we want to place a boxplot on top of the histogram. To create the boxplot, we need to determine the five-number summary of the data set.

Minimum value
1st Quartile
Median
3rd Quartile
Maximum value

Examining the observations, we notice that they are already ordered from least to greatest.

250 251 253 253 253
254 257 257 259 259
261 263 265 270 291

Therefore, we can immediately identify the minimum and maximum values as 250 and 291. Also, the number of values in the data set is 15, an odd number, which means the median is the 8th observation, 257.

To find the 1st and 3rd quartiles, we have to identify the middle values of the lower and upper halves. Counting the median in both halves, each half has eight values, so the 1st quartile is the average of the 4th and 5th values and the 3rd quartile is the average of the 11th and 12th values.

1st Quartile: (253 + 253)/2 = 253
3rd Quartile: (261 + 263)/2 = 262

Let's summarize what we have found.

Minimum value = 250
1st Quartile = 253
Median = 257
3rd Quartile = 262
Maximum value = 291

Notice that the measurement 291 is far from the bulk of the data, so it is an outlier. We will mark it with a dot on a modified boxplot. As a result, the right whisker ends at 270, the second-highest measurement.
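
The five-number summary can be checked with a few lines of Python. This is a minimal sketch with illustrative variable names. It takes the lower and upper halves so that both include the median, which matches the positions averaged above, and it only handles an odd number of values. The 1.5 × IQR fence used to flag the outlier is a common convention assumed here; the solution itself only says that 291 lies far from the rest of the data.

```python
import statistics

# Lathe measurements (rotations per minute), already sorted.
rpm = [250, 251, 253, 253, 253, 254, 257, 257, 259, 259,
       261, 263, 265, 270, 291]

n = len(rpm)                      # 15, an odd number
median = statistics.median(rpm)   # the 8th value

# Halves that both include the median (8 values each when n = 15).
lower_half = rpm[: n // 2 + 1]
upper_half = rpm[n // 2:]
q1 = statistics.median(lower_half)   # average of the 4th and 5th values
q3 = statistics.median(upper_half)   # average of the 11th and 12th values

# Assumption: flag values more than 1.5 * IQR beyond the quartiles.
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [v for v in rpm if v < lower_fence or v > upper_fence]

# The right whisker of a modified boxplot ends at the largest non-outlier.
whisker_max = max(v for v in rpm if v <= upper_fence)

print(min(rpm), q1, median, q3, max(rpm))
print("outliers:", outliers, "right whisker ends at:", whisker_max)
```

Under that assumed fence, 291 is the only value flagged and the right whisker ends at 270, in agreement with the boxplot described above.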

Histogram
b It takes several statistics to give a meaningful description of single-variable data. Let's recall the results obtained in Part A.

Minimum value = 250
1st Quartile = 253
Median = 257
3rd Quartile = 262
Maximum value = 291

Observing the diagram from Part A, we can conclude that the data is single-peaked and right-skewed, with an outlier at 291. Therefore, the center is best described by the median, 257. The interquartile range (IQR) is the difference between the third and first quartiles. It measures the spread of the middle fifty percent of the data.

Spread: 262 - 253 = 9

c Notice that the data set contains an outlier at 291. The mean is the sum of all observations divided by the number of observations. Because every observation contributes to that sum, the unusually large value 291 pulls the mean upward, so the mean is greater than it would be if the outlier were removed. The outlier is not representative of the lathe's typical speed, so the median is the better measure of center, as it is largely unaffected by outliers.
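
To see the effect numerically, the sketch below compares the mean and median with and without the value 291. The variable names are illustrative, and removing the outlier is done here only for comparison; it is not a step in the solution.

```python
import statistics

# Lathe measurements (rotations per minute) from the exercise.
rpm = [250, 251, 253, 253, 253, 254, 257, 257, 259, 259,
       261, 263, 265, 270, 291]

# Same data with the outlier removed, for comparison only.
without_outlier = [v for v in rpm if v != 291]

mean_all = statistics.mean(rpm)
mean_without = statistics.mean(without_outlier)
median_all = statistics.median(rpm)
median_without = statistics.median(without_outlier)

print(f"mean:   {mean_all:.2f} with 291, {mean_without:.2f} without")
print(f"median: {median_all} with 291, {median_without} without")
```

The mean shifts by more than two rotations per minute when the single outlier is dropped, while the median stays at 257.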