An important measure of the performance of a locomotive is its "adhesion," which is the locomotive's pulling force as a multiple of its weight. The adhesion of one 4400-horsepower diesel locomotive model varies in actual use according to a Normal distribution with mean μ = 0.34 and standard deviation σ = 0.049.

What proportion of adhesions (± 0.001) measured in use are higher than 0.47?
z = (0.47 - 0.34) / 0.049 = 2.653
Area to the right of z = 2.65 is 1 - 0.9960 = 0.0040

What proportion of adhesions (± 0.001) are between 0.47 and 0.49?
The new z-score is (0.49 - 0.34) / 0.049 = 3.06
To find the area between the two z-scores, we find the difference in the areas to the left of each.
Area left of 3.06 = 0.9989
Area left of 2.65 = 0.9960
Area between = 0.9989 - 0.9960 = 0.0029

What do you do if your z-value is bigger than the table values?
Use the last value on the table as a bound. We know that the area to the right of z = 3.939 is smaller than the area to the right of z = 3.49 (or whatever the last value on the table is). For a z-score as large as 6.2978, you can safely put down 0 or 1 (whichever side is appropriate) and be correct to within rounding.

CHAPTER 6 BPS - 5TH ED.

CHAPTER 6 Two-Way Tables

CATEGORICAL VARIABLES
• In this chapter we will study the relationship between two categorical variables (variables whose values fall in groups or categories).
• To analyze categorical data, use the counts or percents of individuals that fall into various categories.

TWO-WAY TABLE
• When there are two categorical variables, the data are summarized in a two-way table
  • each row in the table represents a value of the row variable
  • each column of the table represents a value of the column variable
• The number of observations falling into each combination of categories is entered into each cell of the table
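The cell-counting rule on the TWO-WAY TABLE slide can be sketched in Python. This is a minimal illustration, not part of the course material: the observations are invented for the example, and `two_way_table` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical observations (invented for illustration):
# one (row value, column value) pair per individual.
observations = [
    ("HS grad", "25-34"),
    ("HS grad", "35-54"),
    ("College", "25-34"),
    ("HS grad", "25-34"),
    ("College", "35-54"),
    ("College", "35-54"),
]


def two_way_table(pairs):
    """Count how many observations fall into each (row, column) cell."""
    cell_counts = Counter(pairs)
    row_values = sorted({row for row, _ in pairs})
    col_values = sorted({col for _, col in pairs})
    # Counter returns 0 for combinations that never occur.
    return {
        row: {col: cell_counts[(row, col)] for col in col_values}
        for row in row_values
    }


table = two_way_table(observations)
for row, cells in table.items():
    print(row, cells)
```

Each key of `table` is a value of the row variable, and each inner key is a value of the column variable, so `table["HS grad"]["25-34"]` is the count in that cell.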
MARGINAL DISTRIBUTIONS
• A distribution for a categorical variable tells how often each outcome occurred
  • totaling the values in each row of the table gives the marginal distribution of the row variable (totals are written in the right margin)
  • totaling the values in each column of the table gives the marginal distribution of the column variable (totals are written in the bottom margin)

MARGINAL DISTRIBUTIONS
• It is usually more informative to display each marginal distribution in terms of percents rather than counts
  • each marginal total is divided by the table total to give the percents
• A bar graph can be used to display marginal distributions for categorical variables

CASE STUDY
Age and Education
(Statistical Abstract of the United States, 2001)
Data from the U.S. Census Bureau for the year 2000 on the level of education reached by Americans of different ages.

CASE STUDY
Age and Education
[Two-way table not reproduced; callouts: Variables, Marginal distributions]
Marginal distributions (percent of table total):
Education level: 15.9%, 33.1%, 25.4%, 25.6%
Age (25-34, 35-54, 55 and over): 21.6%, 46.5%, 32.0%

CASE STUDY
Age and Education
Marginal Distribution for Education Level
Not HS grad      15.9%
HS grad          33.1%
College 1-3 yrs  25.4%
College ≥4 yrs   25.6%

CONDITIONAL DISTRIBUTIONS
• Relationships between categorical variables are described by calculating appropriate percents from the counts given in the table
  • using percents prevents misleading comparisons due to unequal sample sizes for different groups

CASE STUDY
Age and Education
Compare the 25-34 age group to the 35-54 age group in terms of success in completing at least 4 years of college:
Data are in thousands, so 11,071,000 persons in the 25-34 age group have completed at least 4 years of college, compared to 23,160,000 persons in the 35-54 age group.
The groups appear greatly different, but look at the group totals.

CASE STUDY
Age and Education
Compare the 25-34 age group to the 35-54 age group in terms of success in completing at least 4 years of college:
Change the counts to percents:
11,071 / 37,786 = 0.293 (29.3%) for the 25-34 age group
23,160 / 81,435 = 0.284 (28.4%) for the 35-54 age group
Now, with a fairer comparison using percents, the groups appear very similar.

CASE STUDY
Age and Education
If we compute the percent completing at least four years of college for each age group, we can compare the age groups directly. Each percent is one entry of the conditional distribution of education given that age group, so the three percents need not add to 100%:
Age:                           25-34   35-54   55 and over
Percent with ≥ 4 yrs college:  29.3%   28.4%   18.9%
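The count-to-percent conversion in the case study can be checked with a few lines of Python. This is a minimal sketch using only the counts quoted above (in thousands); the variable and function names are illustrative assumptions, and the 55-and-over group is left out because its counts are not given here.

```python
# Counts in thousands, as quoted in the case study.
college_4yr = {"25-34": 11_071, "35-54": 23_160}   # completed >= 4 years of college
group_total = {"25-34": 37_786, "35-54": 81_435}   # all persons in the age group


def percent_within_group(count, total):
    """Percent of a group that falls in one category."""
    return 100 * count / total


percents = {
    age: percent_within_group(college_4yr[age], group_total[age])
    for age in college_4yr
}

# Raw counts alone suggest a large gap between the groups.
count_ratio = college_4yr["35-54"] / college_4yr["25-34"]
print(f"Count ratio (35-54 vs 25-34): {count_ratio:.2f}")

# Percents within each group give the fairer comparison.
for age, pct in percents.items():
    print(f"{age}: {pct:.1f}%")
```

The older group has roughly twice as many college graduates in raw counts, but it is also roughly twice as large, which is why the within-group percents come out so close.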