EXCEL-lent!!! --- Descriptive Statistics Major U.S. Airport Traffic Statistics

advertisement
EXCEL-lent!!! --- Descriptive Statistics
Major U.S. Airport Traffic Statistics
The following data represent levels of traffic among the N=135 airports that are classified
as medium or large hubs by the Federal Aviation Administration (FAA). The variables
included are:







Airport – Column A (Nominal)
City – Column B (Nominal)
Scheduled Departures – Column C (Interval)
Performed Departures – Column D (Interval)
Enplaned Passengers – Column E (Interval)
Enplaned Revenue Tons of Freight – Column F (Interval)
Enplaned Revenue Tons of Mail – Column G (Interval)
Obtain all relevant (and some irrelevant) summary statistics using EXCEL, by following
these directions:













Click on usair2.xls
Choose Descriptive Statistics by selecting:
Tools
Data Analysis
Descriptive Statistics
4) Input data range $C$1:$G$136 (The columns represent the interval scale
variables measured on each airport)
Grouped by Columns
Click on Labels (variable names are given in row 1)
Click on Summary Statistics (this will compute many statistics that would
take a real long time to do by hand)
Click on OK (the output will be printed to another worksheet, unless you change
the output option)
Print out the worksheet that contains the summary statistics.
Obtain a histogram (and frequency distribution) for the variable Enplaned Passengers
by following these directions (make obvious changes to repeat for any other interval scale
variable).










Click on usair2.xls
Choose Histogram by selecting:
Tools
Data Analysis
Histogram
4) Input data range $E$1:$E$136 (This column represent the interval scale
variable
Enplaned Passengers
Click on Labels (variable names are given in row 1)
Click on Chart Output (expand plot by clicking on plot then “stretching” from
corner).
Print out the worksheet that contains the frequency distribution and expanded
histogram.
Source: U.S. Federal Aviation Administration and Research and Special Programs
Administration, Airport Activity Statistics (1990).
EXCEL-lent!!! – 2 Independent Samples
Household Incomes for FL and NC Mall Target
Populations
A small retailer is interested in determining whether household incomes differ by
geographic area. She selects 12 Florida malls and 9 North Carolina malls at random, and
observes the average household income for each mall’s target population. Suppose she
wishes to determine whether malls’ target populations differ with respect to mean income
between FL and NC.
Conduct the independent-sample t-test using EXCEL, by following these directions:











Click on mall.xls
Choose independent sample t-test by selecting:
Tools
Data Analysis
t-test: 2-Sample Assuming Equal Variances
Input data range $C$1:$C$13 for Variable 1 (The column represent FL, there are
12 malls of data)
Input data range $D$1:$D$10 for Variable 2 (The column represent NC, there are
9 malls of data)
Choose 0 for the Hypothesized Mean Difference
Click on Labels (variable names are given in row 1)
Click on OK (output will be printed to another worksheet unless you change the
output option)
Print out results
EXCEL-lent!!! – Paired Differences
Battle of the Network Nightly News
Who gets the best ratings, Peter Jennings or Dan Rather? We take a random sample of
weekly mean number of households for each news program over the course of 1997. For
each week, we observe the mean number of viewers as reported by Nielsen (actually,
these are estimates, but ignore that for our purposes) for ABC and CBS. We wish to
determine whether there are differences in the mean weekly viewership between the two
networks.
Conduct the paired difference test using EXCEL, by following these directions:











Click on news.xls
Choose paired-difference t-test by selecting:
Tools
Data Analysis
t-test Paired Two Sample for Means
Input data range $B$1:$B$11 for Variable 1 (The column represent ABC, there
are 10 weeks of data)
Input data range $C$1:$C$11 for Variable 2 (The column represent CBS, there
are 10 weeks of data)
Choose 0 for the Hypothesized Mean Difference
Click on Labels (variable names are given in row 1)
Click on OK (output will be printed to another worksheet unless you change the
output option)
Print out results
Week
ABC
1
2
3
4
5
6
7
8
9
10
Source: Daily Variety (1997 Editions)
CBS
8.2
7.2
7.1
8.7
7.2
6.6
8.8
8.5
9.6
9.2
7.2
6.3
6.2
8.9
7.1
6.4
7.6
8.6
7.8
8.5
EXCEL-lent!!! --- 1-Way ANOVA (CRD)
Television Ad Share – By Product Class
Do brands within the same product class tend to have similar percentages of their
advertising budgets dedicated to TV? Do brands across product classes tend to have
different percentages of their advertising budgets dedicated to television?
Based on data from Advertising Age, obtain the Analysis of Variance and test for
differences across product classes. The raw data are given in Columns A-R, and the data
are re-arranged by product class (one column for each product class that had at least two
brands). The data that will be analyzed are in Columns S-AR, and represent the variable
given in Column Q of the original data set.
Fit the 1-Way ANOVA using EXCEL, by following these directions:










Click on brandad2.xls (you may need to find it a different directory)
Choose 1-Way ANOVA by selecting:
Tools
Data Analysis
ANOVA: Single Factor
Input data range $S$1:$AR$33
(The columns represent product classes, the
largest sample size is nauto=32)
Grouped by Columns
Click on Labels (variable names are given in row 1)
Click on OK (the output will be printed to another worksheet, unless you change
the output option)
Print out the worksheet that contains the ANOVA table.
Source: Advertising Age, May 5, 1997
EXCEL-lent!!! --- Randomized Block Design
Caffeine and Endurance
The following data represent per capita income for 28 U.S. cities. Cites are classified by
two factors: Region (Southeast/Midwest) and Size (Large/Medium). Test for
interactions and main effects for these two factors on per capita income, by following
these directions:








Click on caffeine.xls (you may need to find it a different directory)
Choose 2-Factor ANOVA by selecting:
Tools
Data Analysis
ANOVA: Two-Factor without Replication
Input data range $A$1:$J$5 (Rows are doses and Columns are Subjects). Click
on Labels
Click on OK (the output will be printed to another worksheet, unless you change
the output option)
Print out the worksheet that contains the group means and ANOVA table.
EXCEL-lent!!! --- Simple Regression
Therapeutic Category Promotional Expenditures
Long before the drug industry got Food & Drug Administration (FDA) approval for their
recent onslaught of commercial advertising, drug companies advertised directly to
members of the medical profession via trade magazines and direct contacts. The author
of this study reported overall industry promotional expenditures and prescription sales.
There are n=10 classes of drugs in the sample. We wish to estimate the parameters of a
simple linear regression model, relating prescription sales (Y) to promotional
expenditures (X).
Fit the model, and obtain (crude) plots using EXCEL, by following these directions:




Click on drug2ex.xls Choose Regression by selecting:
Tools
Data Analysis
Regression

Input range for dependent variable (Y): $C$1:$C$11 (The dependent variable is
in column C, rows 1-11, including the label)
Input range for independent variable (X):
$B$1:$B$11 (The independent
variable is in column B, rows 1-11, including the label)
Click on Labels (variable names are given in row 1)
Click on Residual plots and Line fit plots
Click on OK (the output will be printed to another worksheet)
Print out the worksheet and plots





Source: Brooke, P. (1976), “Promotional Parameters: A Preliminary Examination of
Promotional Expenditures”, Journal of Drug Issues, 6:21-27.
Therapeutic Category
Promo (X) Sales (Y)
Antibiotics
44
256
Ataractics
34
372
Non-Narcotic Analgesics
21
132
Penicillins
20
165
Oral Cold Preps
17
79
Plain Corticoids
16
80
Diuretics
14
142
Anti-spasmodics
13
84
Plain Antacids
12
11
Hypotensives
11
126
EXCEL-lent!!! --- Multiple Regression
Mortgage Financing Cost Variation – By City
A study in the mid 1960’s reported regional differences in mortgage costs for new
homes. The sampling units were n=18 metro areas (SMSA’s) in the US. The dependent
(response) variable (Y) is the average yield on a new home mortgage for the SMSA. The
independent (predictor) variables (XI) are (and some hypothesized associations with cost):






X1 – Average Loan Value / Mortgage Value Ratio (Higher X1 means lower down payment and higher
risk to lender)
X2 – Road Distance from Boston (Higher X2 means further from Northeast, where most capital was at
the time, and higher costs of capital)
X3 – Savings per annual dwelling unit constructed (Higher X3 means higher relative credit surplus, and
lower costs of capital)
X4 – Savings per Capita (does not adjust for new housing demand)
X5 -- %Increase in Population 1950—1960
X6 -- % of first mortgage debt controlled by inter-regional banks
Fit this full model using EXCEL, by following these directions:












Click on housenew.xls (you may need to find it a different directory)
Choose Regression by selecting:
Tools
Data Analysis
Regression
4) Input range for dependent variable (Y): $B$1:$B$19
(The dependent
variable
is in column B, rows 1-19, including the label).
5) Input range for independent variable (X): $C$1:$H$19 (The independent
variables are in columns C-H, rows 1-19, including the labels).
Click on Labels (variable names are given in row 1)
Click on OK (the output will be printed to another worksheet, unless you change
the output option.
Print out the worksheet that contains R2, Adj-R2, the ANOVA table, estimates,
standard errors, and confidence intervals.
Fit the reduced model, containing X1, X2, and X3 as the only independent variables.
The only adjustment you need to make is in part 5), use input range: $C$1:$E$19
Source: Schaaf, A.H. (1966), “Regional Differences in Mortgage Financing Costs”,
Journal of Finance, 21:85-94.
SMSA
Los Angeles-Long Beach
Denver
San Francisco-Oakland
Dallas-Fort Worth
Miami
Atlanta
Houston
Seattle
New York
Memphis
New Orleans
Cleveland
Chicago
Detroit
Minneapolis-St Paul
Baltimore
Philadelphia
Boston
MrtgYld Loan/Valu DistBostn Sav/Unit PCSave %PopInc %Interreg
6.17
78.1
3042
91.3
1738.1
45.5
33.1
6.06
77
1997
84.1
1110.4
51.8
21.9
6.04
75.7
3162
129.3
1738.1
24
46
6.04
77.4
1821
41.2
778.4
45.7
51.3
6.02
77.4
1542
119.1
1136.7
88.9
18.7
6.02
73.6
1074
32.3
582.9
39.9
26.6
5.99
76.3
1856
45.2
778.4
54.1
35.7
5.91
72.5
3024
109.7
1186
31.1
17
5.89
77.3
216
364.3
2582.4
11.9
7.3
5.87
77.4
1350
111
613.6
27.4
11.3
5.85
72.4
1544
81
636.1
27.3
8.1
5.75
67
631
202.7
1346
24.6
10
5.73
68.9
972
290.1
1626.8
20.1
9.4
5.66
70.7
699
223.4
1049.6
24.7
31.7
5.66
69.8
1377
138.4
1289.3
28.8
19.7
5.63
72.9
399
125.4
836.3
22.9
8.6
5.57
68.7
304
259.5
1315.3
18.3
18.7
5.28
67.8
0
428.2
2081
7.5
2
EXCEL-lent!!! --- 2-Factor ANOVA
Per Capita Income – U.S. Cities (1985)
The following data represent per capita income for 28 U.S. cities. Cites are classified by
two factors: Region (Southeast/Midwest) and Size (Large/Medium). Test for
interactions and main effects for these two factors on per capita income, by following
these directions:









Click on citypci.xls (you may need to find it a different directory)
Choose 2-Factor ANOVA by selecting:
Tools
Data Analysis
ANOVA: Two-Factor with Replication
Input data range $B$1:$D$15
(Column B represents levels of the Region
factor, Column C represents Large Cities and Column D represents Medium
Cities).
Choose 7 Rows Per Sample
Click on OK (the output will be printed to another worksheet, unless you change
the output option)
Print out the worksheet that contains the group means and ANOVA table.
Large City
Atlanta
New Orleans
Miami
Baltimore
Memphis
Houston
Dallas
Cincinnati
Chicago
Detroit
St Louis
Indianapolis
Minneapolis
Columbus
Region
Large City Med City Medium City
Southeast
10341
8035 Birmingham
8975
11923 Little Rock
8904
9902 Tampa
8647
12259 Charlotte
9362
11527 Oklahoma City
12115
9129 Louisville
12816
10466 Jacksonville
Midwest
10247
11764 Wichita
9642
10276 Fort Wayne
8852
8621 Dayton
8799
12134 Des Moines
10836
9625 Grand Rapids
12302
12886 Omaha
9909
11824 Madison
Source: U.S. Bureau of the Census, County and City Data Book, 1988
Download