EXCEL-lent!!! --- Descriptive Statistics Major U.S. Airport Traffic Statistics The following data represent levels of traffic among the N=135 airports that are classified as medium or large hubs by the Federal Aviation Administration (FAA). The variables included are: Airport – Column A (Nominal) City – Column B (Nominal) Scheduled Departures – Column C (Interval) Performed Departures – Column D (Interval) Enplaned Passengers – Column E (Interval) Enplaned Revenue Tons of Freight – Column F (Interval) Enplaned Revenue Tons of Mail – Column G (Interval) Obtain all relevant (and some irrelevant) summary statistics using EXCEL, by following these directions: Click on usair2.xls Choose Descriptive Statistics by selecting: Tools Data Analysis Descriptive Statistics 4) Input data range $C$1:$G$136 (The columns represent the interval scale variables measured on each airport) Grouped by Columns Click on Labels (variable names are given in row 1) Click on Summary Statistics (this will compute many statistics that would take a real long time to do by hand) Click on OK (the output will be printed to another worksheet, unless you change the output option) Print out the worksheet that contains the summary statistics. Obtain a histogram (and frequency distribution) for the variable Enplaned Passengers by following these directions (make obvious changes to repeat for any other interval scale variable). Click on usair2.xls Choose Histogram by selecting: Tools Data Analysis Histogram 4) Input data range $E$1:$E$136 (This column represent the interval scale variable Enplaned Passengers Click on Labels (variable names are given in row 1) Click on Chart Output (expand plot by clicking on plot then “stretching” from corner). Print out the worksheet that contains the frequency distribution and expanded histogram. Source: U.S. Federal Aviation Administration and Research and Special Programs Administration, Airport Activity Statistics (1990). EXCEL-lent!!! – 2 Independent Samples Household Incomes for FL and NC Mall Target Populations A small retailer is interested in determining whether household incomes differ by geographic area. She selects 12 Florida malls and 9 North Carolina malls at random, and observes the average household income for each mall’s target population. Suppose she wishes to determine whether malls’ target populations differ with respect to mean income between FL and NC. Conduct the independent-sample t-test using EXCEL, by following these directions: Click on mall.xls Choose independent sample t-test by selecting: Tools Data Analysis t-test: 2-Sample Assuming Equal Variances Input data range $C$1:$C$13 for Variable 1 (The column represent FL, there are 12 malls of data) Input data range $D$1:$D$10 for Variable 2 (The column represent NC, there are 9 malls of data) Choose 0 for the Hypothesized Mean Difference Click on Labels (variable names are given in row 1) Click on OK (output will be printed to another worksheet unless you change the output option) Print out results EXCEL-lent!!! – Paired Differences Battle of the Network Nightly News Who gets the best ratings, Peter Jennings or Dan Rather? We take a random sample of weekly mean number of households for each news program over the course of 1997. For each week, we observe the mean number of viewers as reported by Nielsen (actually, these are estimates, but ignore that for our purposes) for ABC and CBS. We wish to determine whether there are differences in the mean weekly viewership between the two networks. Conduct the paired difference test using EXCEL, by following these directions: Click on news.xls Choose paired-difference t-test by selecting: Tools Data Analysis t-test Paired Two Sample for Means Input data range $B$1:$B$11 for Variable 1 (The column represent ABC, there are 10 weeks of data) Input data range $C$1:$C$11 for Variable 2 (The column represent CBS, there are 10 weeks of data) Choose 0 for the Hypothesized Mean Difference Click on Labels (variable names are given in row 1) Click on OK (output will be printed to another worksheet unless you change the output option) Print out results Week ABC 1 2 3 4 5 6 7 8 9 10 Source: Daily Variety (1997 Editions) CBS 8.2 7.2 7.1 8.7 7.2 6.6 8.8 8.5 9.6 9.2 7.2 6.3 6.2 8.9 7.1 6.4 7.6 8.6 7.8 8.5 EXCEL-lent!!! --- 1-Way ANOVA (CRD) Television Ad Share – By Product Class Do brands within the same product class tend to have similar percentages of their advertising budgets dedicated to TV? Do brands across product classes tend to have different percentages of their advertising budgets dedicated to television? Based on data from Advertising Age, obtain the Analysis of Variance and test for differences across product classes. The raw data are given in Columns A-R, and the data are re-arranged by product class (one column for each product class that had at least two brands). The data that will be analyzed are in Columns S-AR, and represent the variable given in Column Q of the original data set. Fit the 1-Way ANOVA using EXCEL, by following these directions: Click on brandad2.xls (you may need to find it a different directory) Choose 1-Way ANOVA by selecting: Tools Data Analysis ANOVA: Single Factor Input data range $S$1:$AR$33 (The columns represent product classes, the largest sample size is nauto=32) Grouped by Columns Click on Labels (variable names are given in row 1) Click on OK (the output will be printed to another worksheet, unless you change the output option) Print out the worksheet that contains the ANOVA table. Source: Advertising Age, May 5, 1997 EXCEL-lent!!! --- Randomized Block Design Caffeine and Endurance The following data represent per capita income for 28 U.S. cities. Cites are classified by two factors: Region (Southeast/Midwest) and Size (Large/Medium). Test for interactions and main effects for these two factors on per capita income, by following these directions: Click on caffeine.xls (you may need to find it a different directory) Choose 2-Factor ANOVA by selecting: Tools Data Analysis ANOVA: Two-Factor without Replication Input data range $A$1:$J$5 (Rows are doses and Columns are Subjects). Click on Labels Click on OK (the output will be printed to another worksheet, unless you change the output option) Print out the worksheet that contains the group means and ANOVA table. EXCEL-lent!!! --- Simple Regression Therapeutic Category Promotional Expenditures Long before the drug industry got Food & Drug Administration (FDA) approval for their recent onslaught of commercial advertising, drug companies advertised directly to members of the medical profession via trade magazines and direct contacts. The author of this study reported overall industry promotional expenditures and prescription sales. There are n=10 classes of drugs in the sample. We wish to estimate the parameters of a simple linear regression model, relating prescription sales (Y) to promotional expenditures (X). Fit the model, and obtain (crude) plots using EXCEL, by following these directions: Click on drug2ex.xls Choose Regression by selecting: Tools Data Analysis Regression Input range for dependent variable (Y): $C$1:$C$11 (The dependent variable is in column C, rows 1-11, including the label) Input range for independent variable (X): $B$1:$B$11 (The independent variable is in column B, rows 1-11, including the label) Click on Labels (variable names are given in row 1) Click on Residual plots and Line fit plots Click on OK (the output will be printed to another worksheet) Print out the worksheet and plots Source: Brooke, P. (1976), “Promotional Parameters: A Preliminary Examination of Promotional Expenditures”, Journal of Drug Issues, 6:21-27. Therapeutic Category Promo (X) Sales (Y) Antibiotics 44 256 Ataractics 34 372 Non-Narcotic Analgesics 21 132 Penicillins 20 165 Oral Cold Preps 17 79 Plain Corticoids 16 80 Diuretics 14 142 Anti-spasmodics 13 84 Plain Antacids 12 11 Hypotensives 11 126 EXCEL-lent!!! --- Multiple Regression Mortgage Financing Cost Variation – By City A study in the mid 1960’s reported regional differences in mortgage costs for new homes. The sampling units were n=18 metro areas (SMSA’s) in the US. The dependent (response) variable (Y) is the average yield on a new home mortgage for the SMSA. The independent (predictor) variables (XI) are (and some hypothesized associations with cost): X1 – Average Loan Value / Mortgage Value Ratio (Higher X1 means lower down payment and higher risk to lender) X2 – Road Distance from Boston (Higher X2 means further from Northeast, where most capital was at the time, and higher costs of capital) X3 – Savings per annual dwelling unit constructed (Higher X3 means higher relative credit surplus, and lower costs of capital) X4 – Savings per Capita (does not adjust for new housing demand) X5 -- %Increase in Population 1950—1960 X6 -- % of first mortgage debt controlled by inter-regional banks Fit this full model using EXCEL, by following these directions: Click on housenew.xls (you may need to find it a different directory) Choose Regression by selecting: Tools Data Analysis Regression 4) Input range for dependent variable (Y): $B$1:$B$19 (The dependent variable is in column B, rows 1-19, including the label). 5) Input range for independent variable (X): $C$1:$H$19 (The independent variables are in columns C-H, rows 1-19, including the labels). Click on Labels (variable names are given in row 1) Click on OK (the output will be printed to another worksheet, unless you change the output option. Print out the worksheet that contains R2, Adj-R2, the ANOVA table, estimates, standard errors, and confidence intervals. Fit the reduced model, containing X1, X2, and X3 as the only independent variables. The only adjustment you need to make is in part 5), use input range: $C$1:$E$19 Source: Schaaf, A.H. (1966), “Regional Differences in Mortgage Financing Costs”, Journal of Finance, 21:85-94. SMSA Los Angeles-Long Beach Denver San Francisco-Oakland Dallas-Fort Worth Miami Atlanta Houston Seattle New York Memphis New Orleans Cleveland Chicago Detroit Minneapolis-St Paul Baltimore Philadelphia Boston MrtgYld Loan/Valu DistBostn Sav/Unit PCSave %PopInc %Interreg 6.17 78.1 3042 91.3 1738.1 45.5 33.1 6.06 77 1997 84.1 1110.4 51.8 21.9 6.04 75.7 3162 129.3 1738.1 24 46 6.04 77.4 1821 41.2 778.4 45.7 51.3 6.02 77.4 1542 119.1 1136.7 88.9 18.7 6.02 73.6 1074 32.3 582.9 39.9 26.6 5.99 76.3 1856 45.2 778.4 54.1 35.7 5.91 72.5 3024 109.7 1186 31.1 17 5.89 77.3 216 364.3 2582.4 11.9 7.3 5.87 77.4 1350 111 613.6 27.4 11.3 5.85 72.4 1544 81 636.1 27.3 8.1 5.75 67 631 202.7 1346 24.6 10 5.73 68.9 972 290.1 1626.8 20.1 9.4 5.66 70.7 699 223.4 1049.6 24.7 31.7 5.66 69.8 1377 138.4 1289.3 28.8 19.7 5.63 72.9 399 125.4 836.3 22.9 8.6 5.57 68.7 304 259.5 1315.3 18.3 18.7 5.28 67.8 0 428.2 2081 7.5 2 EXCEL-lent!!! --- 2-Factor ANOVA Per Capita Income – U.S. Cities (1985) The following data represent per capita income for 28 U.S. cities. Cites are classified by two factors: Region (Southeast/Midwest) and Size (Large/Medium). Test for interactions and main effects for these two factors on per capita income, by following these directions: Click on citypci.xls (you may need to find it a different directory) Choose 2-Factor ANOVA by selecting: Tools Data Analysis ANOVA: Two-Factor with Replication Input data range $B$1:$D$15 (Column B represents levels of the Region factor, Column C represents Large Cities and Column D represents Medium Cities). Choose 7 Rows Per Sample Click on OK (the output will be printed to another worksheet, unless you change the output option) Print out the worksheet that contains the group means and ANOVA table. Large City Atlanta New Orleans Miami Baltimore Memphis Houston Dallas Cincinnati Chicago Detroit St Louis Indianapolis Minneapolis Columbus Region Large City Med City Medium City Southeast 10341 8035 Birmingham 8975 11923 Little Rock 8904 9902 Tampa 8647 12259 Charlotte 9362 11527 Oklahoma City 12115 9129 Louisville 12816 10466 Jacksonville Midwest 10247 11764 Wichita 9642 10276 Fort Wayne 8852 8621 Dayton 8799 12134 Des Moines 10836 9625 Grand Rapids 12302 12886 Omaha 9909 11824 Madison Source: U.S. Bureau of the Census, County and City Data Book, 1988