This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use
of this material constitutes acceptance of that license and the conditions of use of materials on this site.
Copyright 2009, The Johns Hopkins University and John McGready. All rights reserved. Use of these
materials permitted only in accordance with license rights granted. Materials provided “AS IS”; no
representations or warranties provided. User assumes all responsibility for use, and all liability related
thereto, and must independently review all materials for accuracy and efficacy. May contain materials
owned by others. User is responsible for obtaining permissions for use from third parties as needed.
Comparing Means among Two (or More)
Independent Populations
John McGready
Johns Hopkins University
Lecture Topics
 
CIs for mean difference between two independent populations
 
Two sample t-test
 
Non-parametric alternative, Mann Whitney (FYI, optional)
 
Comparing means amongst more than two independent populations:
ANOVA
Section A
Two Sample t-test: The Resulting Confidence Interval
Comparing Two Independent Groups
 
“A Low Carbohydrate as Compared with a Low Fat Diet in Severe
Obesity”*
-  132 severely obese subjects randomized to one of two diet
groups
-  Subjects followed for a six month period
 
At the end of study period
-  “Subjects on the low-carbohydrate diet lost more weight than
those on a low fat diet (95% confidence interval for the
difference in weight loss between groups, -1.6 to -6.2 kg;
p < .01)”
Source: * Samaha, F., et al. A low-carbohydrate as compared with a low-fat diet in severe obesity, New
England Journal of Medicine, 348: 21.
Comparing Two Independent Groups: Diet Types Study
 
Scientific question
-  Is weight change associated with diet type?
Diet Group                                          Low-Carb   Low-Fat
Number of subjects (n)                              64         68
Mean weight change (kg), post-diet less pre-diet    -5.7       -1.8
Standard deviation of weight changes (kg)           8.6        3.9
Diet Type and Weight Change
 
95% CIs for weight change by diet group
 
Low Carb:
 
Low Fat:
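A minimal sketch of how these two intervals can be computed from the summary table, assuming the large-sample rule of sample mean ± 2 standard errors. The function name mean_ci is illustrative only and does not come from the lecture.

```python
import math

def mean_ci(mean, sd, n, multiplier=2.0):
    """Large-sample CI for one mean: mean +/- multiplier * sd / sqrt(n)."""
    se = sd / math.sqrt(n)  # standard error of the sample mean
    return (mean - multiplier * se, mean + multiplier * se)

# Summary statistics from the diet study table
low_carb_ci = mean_ci(-5.7, 8.6, 64)
low_fat_ci = mean_ci(-1.8, 3.9, 68)

print("Low-carb: %.2f to %.2f kg" % low_carb_ci)
print("Low-fat:  %.2f to %.2f kg" % low_fat_ci)
```

Under this assumption the upper end of the low-carb interval falls below the lower end of the low-fat interval, so the two intervals do not overlap.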
Comparing Two Independent Groups: Diet Types Study
 
In statistical terms, is there a non-zero difference in the average
weight change for the subjects on the low-fat diet as compared to
subjects on the low-carbohydrate diet?
-  95% CIs for each diet group mean weight change do not overlap,
but how do you quantify the difference?
 
The comparison of interest is not “paired”
-  There are different subjects in each diet group
 
For each subject a change in weight (weight after diet minus weight
before diet) was computed
-  However, the authors compared the changes in weight between
two independent groups!
Comparing Two Independent Groups
 
How do we calculate
-  Confidence interval for difference?
-  p-value to determine if the difference in two groups is
“significant?”
 
Since we have large samples (both greater than 60) we know the
sampling distributions of the sample means in both groups are
approximately normal
 
It turns out that the difference of two quantities that are (approximately)
normally distributed is also (approximately) normally distributed
Sampling Distribution: Difference in Sample Means
 
So, the big news is . . .
-  The sampling distribution of the difference of two sample
means, each based on large samples, approximates a normal
distribution
-  This sampling distribution is centered at the true mean
difference, µ1 - µ2
Simulated Sampling Dist’n of Sample Mean Weight Loss
 
Simulated sampling distribution of sample mean weight change: low
carbohydrate diet group
Simulated Sampling Dist’n of Sample Mean Weight Loss
 
Simulated sampling distribution of sample mean weight change: low
fat diet group
Simulated Sampling Dist’n of Sample Mean Weight Loss
 
Side by side boxplots
95% Confidence Interval for Difference in Means
 
Our most general formula: (best estimate) ± 2 × (standard error of the estimate)
 
The best estimate of a population mean difference based on sample
means: $\bar{x}_1 - \bar{x}_2$
 
Here, $\bar{x}_1$ may represent the sample mean weight loss for the 64
subjects on the low carbohydrate diet, and $\bar{x}_2$ the mean weight
loss for the 68 subjects on the low fat diet
95% CI for Difference in Means: Diet Types Study
 
So, the formula for the 95% CI for µ1 - µ2 is:
$(\bar{x}_1 - \bar{x}_2) \pm 2 \times SE(\bar{x}_1 - \bar{x}_2)$
 
Where $SE(\bar{x}_1 - \bar{x}_2)$ = standard error of the difference of two sample
means
Two Independent (Unpaired) Groups
 
The standard error of the difference for two independent samples is
calculated differently than we did for paired designs
-  With a paired design we reduced data on two samples to one set of
differences between two groups
 
Statisticians have developed formulas for the standard error of the
difference
 
These formulas depend on sample sizes in both groups and standard
deviations in both groups
 
The $SE(\bar{x}_1 - \bar{x}_2)$ is greater than either $SE(\bar{x}_1)$ or $SE(\bar{x}_2)$
-  Why do you think this is?
Principle
 
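Variation from independent sources can be added:
$SE(\bar{x}_1 - \bar{x}_2) = \sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$
-  Why do you think this is additive?
 
Of course, we don’t know σ1 and σ2, so we estimate with s1 and s2
to get an estimated standard error:
$\widehat{SE}(\bar{x}_1 - \bar{x}_2) = \sqrt{s_1^2/n_1 + s_2^2/n_2}$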
Comparing Two Independent Groups: Diet Types Study
 
Recall the data from the weight change/diet type study
Diet Group                                          Low-Carb   Low-Fat
Number of subjects (n)                              64         68
Mean weight change (kg), post-diet less pre-diet    -5.7       -1.8
Standard deviation of weight changes (kg)           8.6        3.9
95% CI for Difference in Means: Diet Types Study
 
So in this example, the estimated 95% CI for the true mean difference
in weight change between the low-carbohydrate and low-fat diet groups is:
$-3.9 \pm 2 \times 1.17$, or about -6.2 kg to -1.6 kg
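As a check on the arithmetic, here is a short sketch that computes this interval directly from the summary statistics. It assumes the large-sample multiplier of 2; diff_ci is an illustrative name.

```python
import math

def diff_ci(mean1, sd1, n1, mean2, sd2, n2, multiplier=2.0):
    """CI for a difference in means: (mean1 - mean2) +/- multiplier * SE."""
    diff = mean1 - mean2
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return diff, diff - multiplier * se, diff + multiplier * se

# Group 1 = low-carb, group 2 = low-fat
diff, lower, upper = diff_ci(-5.7, 8.6, 64, -1.8, 3.9, 68)
print(f"difference = {diff:.1f} kg, 95% CI {lower:.1f} to {upper:.1f} kg")
```

Rounded to one decimal place, the endpoints agree with the interval reported in the article.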
From Article
 
“Subjects on the low-carbohydrate diet lost more weight than those
on a low fat diet (95% confidence interval for the difference in
weight loss between groups, -1.6 to -6.2 kg; p< .01)”
 
So those on the low carb diet lost 3.9 kg more on average; after
accounting for sampling variability, this excess average loss over the
low-fat diet group could be as small as 1.6 kg or as large as 6.2 kg
-  This confidence interval does not include 0, suggesting a real
population level association between type of diet (low-carb or
low-fat) and weight loss
Section B
Two Sample t-test: Getting a p-value
Hypothesis Test to Compare Two Independent Groups
 
Two sample (unpaired) t-test
 
Is the (mean) weight change equal in the two diet groups?
-  Ho: µ1 = µ2
-  HA: µ1 ≠ µ2
 
In other words, is the expected difference in weight change zero?
-  Ho: µ1 - µ2 = 0
-  HA: µ1 - µ2 ≠ 0
Hypothesis Test to Compare Two Independent Groups
 
Recall, general “recipe” for hypothesis testing . . .
1.  Start by assuming Ho true
2.  Measure distance of sample result from µo (here again it’s 0)
3.  Compare test statistic (distance) to appropriate distribution to
get p-value
Diet Type and Weight Loss Study
 
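In the diet types and weight loss study, recall:
$\bar{x}_1 - \bar{x}_2 = -3.9$ kg, $\widehat{SE}(\bar{x}_1 - \bar{x}_2) = 1.17$ kg
 
So in this study:
$t = (-3.9 - 0) / 1.17 = -3.3$
-  So this study result was 3.3 standard errors below the null mean
of 0 (i.e., 3.3 standard errors from the mean difference in weight
loss expected if the null was true)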
How Are p-values Calculated?
 
Is a result 3.3 standard errors below 0 unusual?
-  It depends on what kind of distribution we are dealing with
 
The p-value is the probability of getting a test statistic as extreme
as (or more extreme than) what you observed (-3.3) by chance if
Ho was true
 
The p-value comes from the sampling distribution of the difference
in two sample means
 
What is the sampling distribution of the difference in sample means?
-  If both groups are large (more than 60 subjects) then this
distribution is approximately normal
-  This sampling distribution will be centered at true difference
-  Under null hypothesis, this true difference is 0
Diet/Weight Loss Sample
 
To compute a p-value, we would need to compute the probability of
being 3.3 or more standard errors away from 0 on a standard normal
curve
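This probability can be sketched with Python's standard library. The code assumes the large-sample normal approximation described above and recomputes the test statistic from the summary table, so it is slightly more precise than the rounded -3.3.

```python
import math
from statistics import NormalDist

# Diet study summary statistics (group 1 = low-carb, group 2 = low-fat)
diff = -5.7 - (-1.8)
se = math.sqrt(8.6 ** 2 / 64 + 3.9 ** 2 / 68)

z = diff / se  # distance from the null value 0, in standard errors
# Two-sided p-value: area in both tails beyond |z| on a standard normal curve
p_value = 2 * NormalDist().cdf(-abs(z))

print(f"test statistic = {z:.2f}")
print(f"two-sided p-value = {p_value:.4f}")
```

The result is well below .01, consistent with the p < .01 reported in the article.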
How to Use Stata to Perform a 2-Sample T-Test
 
Command syntax:
-  ttesti n1 mean1 sd1 n2 mean2 sd2, unequal
   (sample size, sample mean, and sample standard deviation for each group)
Summary: Weight Loss Example
 
Statistical method
-  “We randomly assigned 132 severely obese patients . . . to a
carbohydrate restricted (low-carbohydrate) diet or a calorie- and fat-restricted diet”
-  “For comparison of continuous variables between the two
groups, we calculated the change from baseline to six months
in each subject, and compared the mean changes in the two
diet groups using an unpaired t-test”
 
Result
-  “Subjects on the low-carbohydrate diet lost more weight than
those on a low fat diet (95% confidence interval for the
difference in weight loss between groups, -1.6 to -6.2 kg;
p < .01)”
Section C
Two Sample t-test, Approach with Smaller Samples
Sampling Distribution
 
What is the sampling distribution of the difference in sample means?
-  If either (or both) sample sizes are less than 60, a t-distribution
is used with n1 + n2 - 2 degrees of freedom (the total sample size
from both groups minus two)
Two Sample t-test
 
Example
-  In a randomized design, 23 patients with hyperlipidemia were
randomized to either take Treatment A or Treatment B for 12
weeks
-  12 patients assigned to Treatment A
-  11 patients assigned to Treatment B
Two Sample t-test
 
Example
-  LDL cholesterol levels (mmol/L) measured on each subject at
baseline, and 12 weeks after start of study
-  The 12-week change in LDL cholesterol was computed for each
subject
Two Sample t-test
 
Summary of results:
Treatment Group                                    A        B
Number of subjects (n)                             12       11
Mean LDL change (mmol/L), post-trt less pre-trt    -1.41    -0.32
Standard deviation of LDL changes (mmol/L)         0.55     0.65
Two Sample t-test
 
Scientific question
-  Is there a difference in LDL change between the two treatment
groups?
 
Methods of inference
-  Confidence interval for the difference in mean LDL cholesterol
change between the two groups
-  Statistical hypothesis test
95% Confidence Interval for Difference in Means
 
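The general formula (large samples):
$(\bar{x}_1 - \bar{x}_2) \pm 2 \times \widehat{SE}(\bar{x}_1 - \bar{x}_2)$
 
The general formula (“smaller” samples):
$(\bar{x}_1 - \bar{x}_2) \pm t_{0.95,\, n_1 + n_2 - 2} \times \widehat{SE}(\bar{x}_1 - \bar{x}_2)$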
Two Sample t-test
 
Sample mean difference and estimated standard error:
Treatment Group                                    A        B
Number of subjects (n)                             12       11
Mean LDL change (mmol/L), post-trt less pre-trt    -1.41    -0.32
Standard deviation of LDL changes (mmol/L)         0.55     0.65
95% CI for Difference in Means: Hyperlipidemia Ex
 
How many standard errors to add and subtract?
-  Since sample sizes are small we will have to add slightly more
than two standard errors
 
The number we need to add and subtract for 95% confidence comes from a
t-distribution with (12 + 11 - 2 = 21) degrees of freedom
-  From t-table this value is 2.08
 
So, 95% CI for true mean difference in change in LDL cholesterol,
drug A to drug B:
$-1.09 \pm 2.08 \times 0.25$, or about -1.61 to -0.57 mmol/L
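The small-sample interval can be sketched as follows, using the t-table value 2.08 quoted on the slide. The standard error formula (squared standard errors added) is assumed to be the same one used for the diet study; variable names are illustrative.

```python
import math

# Hyperlipidemia example summary statistics
mean_a, sd_a, n_a = -1.41, 0.55, 12
mean_b, sd_b, n_b = -0.32, 0.65, 11

t_multiplier = 2.08  # t-table value for 21 degrees of freedom, as given in the lecture

diff = mean_a - mean_b
se = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
lower = diff - t_multiplier * se
upper = diff + t_multiplier * se

print(f"difference = {diff:.2f} mmol/L, SE = {se:.2f}")
print(f"95% CI: {lower:.2f} to {upper:.2f} mmol/L")
```

The interval excludes 0, so the data suggest a real difference in mean LDL change between the two treatments.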
Hypothesis Test to Compare Two Independent Groups
 
Two-sample (unpaired) t-test: getting a p-value
 
Is the change in LDL cholesterol the same in the two treatment
groups?
-  Ho: µ1 = µ2 → Ho: µ1-µ2 = 0
-  HA: µ1 ≠ µ2 → HA: µ1-µ2 ≠ 0
Hypothesis Test to Compare Two Independent Groups
 
Recall, general “recipe” for hypothesis testing . . .
1.  Start by assuming Ho true
2.  Measure distance of sample result from µo (here again it’s 0)
3.  Compare test statistic (distance) to appropriate distribution to
get p-value
Hyperlipidemia Example
 
In the hyperlipidemia treatment study, recall:
$\bar{x}_1 - \bar{x}_2 = -1.09$ mmol/L, $\widehat{SE}(\bar{x}_1 - \bar{x}_2) = 0.25$ mmol/L
 
So in this study:
$t = (-1.09 - 0) / 0.25 = -4.4$
-  So this study result was 4.4 standard errors below the null mean
of 0 (i.e., 4.4 standard errors from the mean difference in
cholesterol change between the two treatments expected if the
null was true)
How Are p-values Calculated?
 
Is a result 4.4 standard errors below 0 unusual?
-  It depends on what kind of distribution we are dealing with
 
The p-value is the probability of getting a test statistic (distance) as
extreme as or more extreme than what you observed (-4.4) by chance if Ho
was true
 
The p-value comes from the sampling distribution of the difference
in two sample means
 
What is the sampling distribution of the difference in sample means?
-  t-distribution with 12 + 11 - 2 = 21 degrees of freedom
Hyperlipidemia Example
 
To compute a p-value, we would need to compute the probability of
being 4.4 or more standard errors away from 0 on a t-distribution
with 21 degrees of freedom
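Python's standard library has no t-distribution, so the sketch below integrates the t density numerically with Simpson's rule. This is an illustrative substitute for a t-table or statistical package, not the lecture's method; t_pdf and two_sided_p are hypothetical helper names.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t_stat, df, steps=2000):
    """Two-sided p-value: 1 minus the area between -|t| and +|t|.

    The area from 0 to |t| is found with Simpson's rule (steps must be even)
    and doubled, because the density is symmetric about 0.
    """
    b = abs(t_stat)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_b = total * h / 3
    return 1 - 2 * area_0_to_b

p_value = two_sided_p(-4.4, 21)
print(f"two-sided p-value = {p_value:.5f}")
```

The p-value is below .001, which agrees with the result reported in the summary of this example.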
Using Stata
 
Command syntax:
-  ttesti n1 mean1 sd1 n2 mean2 sd2, unequal
   (sample size, sample mean, and sample standard deviation for each group)
Summary: Hyperlipidemia Example
 
Statistical method
-  Twenty-three patients with hyperlipidemia were randomly
assigned to one of two treatment groups: Treatment A or
Treatment B
-  12 patients were assigned to receive Treatment A
-  11 patients were assigned to receive Treatment B
Summary: Hyperlipidemia Example
 
Statistical method
-  Baseline LDL cholesterol measurements were taken on each
subject, and LDL was again measured after 12 weeks of
treatment
-  The change in LDL cholesterol was computed for each subject
-  The mean LDL changes in the two treatment groups were
compared using an unpaired t-test and a 95% confidence
interval was constructed for the difference in mean LDL
changes
Summary: Hyperlipidemia Example
 
Result
-  Patients on treatment A showed a decrease in LDL cholesterol
of 1.41 mmol/L and subjects on treatment B showed a decrease
of .32 mmol/L (a difference of 1.09 mmol/L, 95% CI .57 to 1.61
mmol/L)
-  The difference in LDL changes was statistically significant
(p < .001)
Section D
Two Sample t-test, Two Choices
FYI: Equal Variances Assumption
 
The “traditional” t-test assumes equal variances in the two groups
-  This can be formally tested with another hypothesis test!
-  But why not just compare observed values of s1 to s2?
 
There is a slight modification to allow for unequal variances—this
modification adjusts the degrees of freedom for the test, using
slightly different SE computation (the formula I give you)
 
If you want to be truly “safe” (desert island choice of t-test)
-  More conservative to use test that allows for unequal variance
 
Makes little to no difference in large samples
FYI: Equal Variances Assumption
 
Actually, the following occurs:
-  If underlying population level standard deviations are equal:
   Both approaches give valid confidence intervals, but the
   intervals from the approach assuming unequal standard
   deviations are slightly wider (and p-values slightly larger)
-  If underlying population level standard deviations are not equal:
   The approach assuming equal variances does not give valid
   confidence intervals and can severely under-cover the goal
   of 95%
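To make the two choices concrete, the sketch below computes the standard error and degrees of freedom both ways for the two examples in this lecture. The unequal-variance degrees of freedom use the Satterthwaite approximation, which the slides do not spell out, so treat that formula as an assumption here; the function names are hypothetical.

```python
import math

def unequal_var_se_df(sd1, n1, sd2, n2):
    """SE and Satterthwaite degrees of freedom, allowing unequal variances."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    se = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return se, df

def pooled_se_df(sd1, n1, sd2, n2):
    """SE and degrees of freedom under the equal-variance (pooled) assumption."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return se, df

diet_unequal = unequal_var_se_df(8.6, 64, 3.9, 68)
diet_pooled = pooled_se_df(8.6, 64, 3.9, 68)
ldl_unequal = unequal_var_se_df(0.55, 12, 0.65, 11)
ldl_pooled = pooled_se_df(0.55, 12, 0.65, 11)

for label, (se, df) in [("diet, unequal", diet_unequal), ("diet, pooled", diet_pooled),
                        ("LDL, unequal", ldl_unequal), ("LDL, pooled", ldl_pooled)]:
    print(f"{label:14s} SE = {se:.3f}, df = {df:.1f}")
```

In both examples the unequal-variance approach has fewer degrees of freedom than the pooled approach, which is why its intervals tend to be somewhat wider.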
Unequal SD Approach: Diet Type/Weight Loss Example
 
Command syntax:
-  ttesti 64 -5.7 8.6 68 -1.8 3.9, unequal
Equal SD Approach: Diet Type/Weight Loss Example
 
Command syntax:
-  ttesti 64 -5.7 8.6 68 -1.8 3.9
Unequal SD Approach: LDL/Treatment Example
 
Command syntax:
-  ttesti 12 -1.41 0.55 11 -0.32 0.65, unequal
Equal SD Approach: LDL/Treatment Example
 
Command syntax:
-  ttesti 12 -1.41 0.55 11 -0.32 0.65
Section E
The Unpaired t-test: More Examples
Example 1: CE Costs in Maryland
 
Random sample of 500 Carotid Endarterectomy (CE) procedures
performed in State of Maryland, 1995
 
Some results:
                         Males    Females
Mean Charges (U.S. $)    6,615    7,088
SD (U.S. $)              4,220    4,908
N                        271      229
Example 1: Boxplots!
 
We actually have luxury of individual level data here
Example 1
 
95% CIs for 1995 CE costs by patient sex
-  Females: $6,440 to $7,736
-  Males: $6,103 to $7,127
Example 1
 
Two sample t-test, unequal standard deviations assumption
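Since both groups are large, the test can be approximated from the summary table with a normal curve. This is a sketch of that approximation using Python's standard library, not a reproduction of the Stata output.

```python
import math
from statistics import NormalDist

# CE charges (U.S. $): females vs. males
mean_f, sd_f, n_f = 7088, 4908, 229
mean_m, sd_m, n_m = 6615, 4220, 271

diff = mean_f - mean_m
se = math.sqrt(sd_f ** 2 / n_f + sd_m ** 2 / n_m)
z = diff / se  # distance from 0 in standard errors
p_value = 2 * NormalDist().cdf(-abs(z))  # two-sided, normal approximation

print(f"difference = ${diff}, SE = ${se:.0f}")
print(f"test statistic = {z:.2f}, two-sided p-value = {p_value:.2f}")
```

The observed difference is only a little more than one standard error from 0, so it is not statistically significant.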
Example 1: Summary
 
In a study conducted to assess determinants of CE procedure costs
in Maryland, a random sample of 500 CE patients from 1995 was
analyzed
 
This consisted of 229 females with average costs of $7,088 (95% CI:
6,440 to 7,736), and 271 males with average costs of $6,615 (95% CI:
6,103 to 7,127)
 
While the females in the sample had average costs $473 greater
than the males in the sample, this difference in average costs is not
statistically significant (p = .25)
-  The 95% CI for the female to male average cost differential is
-$339 to $1,285
Example 2
 
The following data is taken from a 1990 study comparing (random
samples of) adolescents with bulimia to adolescents without
bulimia; both groups had similar body composition and levels of
physical activity*
 
The following table shows summary data on daily calorie intake by
bulimia status
                                      Bulimia    No Bulimia
Mean Daily Caloric Intake (kcal/kg)   22.1       29.7
SD (kcal/kg)                          4.6        6.5
N                                     23         15
Source: * Example based on data taken from Pagano, M., Gauvreau, K. (2000). Principles of biostatistics, 2nd ed.
Duxbury Press (based on research by Gwirtsman, et al. (1989). Decreased calorie intake. American Journal of Clinical
Nutrition, 49).
Example 2
 
Abstract from article:
Example 2: Boxplots
 
Again, luxury of individual level data:
Example 2
 
95% CIs for average daily calorie intake by bulimia status
- 
Bulimia:
- 
No bulimia:
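These intervals can be sketched from the summary table. Both groups are small, so the multipliers below are taken from a standard t-table with n - 1 degrees of freedom (2.074 for 22 df, 2.145 for 14 df). Those two values are assumptions introduced here and are not stated in the lecture.

```python
import math

def mean_ci(mean, sd, n, multiplier):
    """CI for one mean: mean +/- multiplier * sd / sqrt(n)."""
    se = sd / math.sqrt(n)
    return (mean - multiplier * se, mean + multiplier * se)

# t-table multipliers (assumed): 22 df for n = 23, 14 df for n = 15
bulimia_ci = mean_ci(22.1, 4.6, 23, multiplier=2.074)
no_bulimia_ci = mean_ci(29.7, 6.5, 15, multiplier=2.145)

print("Bulimia:    %.1f to %.1f kcal/kg" % bulimia_ci)
print("No bulimia: %.1f to %.1f kcal/kg" % no_bulimia_ci)
```

Under these assumptions the two intervals do not overlap.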
Example 2 in Stata
 
Two sample t-test, unequal standard deviations assumption:
Summary
 
From the article:
Section F (Optional)
Non-Parametric Analogue to the Two Sample t-test
Alternative to the Two Sample T-Test
 
“Non-parametric” refers to a class of tests that do not assume
anything about distribution of the data
 
Nonparametric test for comparing two groups
-  Mann-Whitney Rank Sum Test (Wilcoxon Rank Sum Test)
-  Also called Mann-Whitney-Wilcoxon (a mouthful)
 
Tries to answer the following question:
-  Are the two population distributions different?
Advantages
 
Does not assume populations being compared are normally
distributed
-  The two-sample t-test requires that assumption with very small
sample sizes
 
Uses only ranks
 
Not sensitive to outliers
Disadvantage of the Nonparametric Test
 
Nonparametric methods are often less sensitive (powerful) for
finding true differences because they throw away information (they
use only ranks)
 
Need full data set, not just summary statistics
 
Results do not include any confidence intervals quantifying range of
possibility for true difference between populations
Example: Health Education Study
 
Evaluate an intervention to educate high school students about
health and lifestyle over a two-month period
 
10 students randomized to “intervention” or “control” group
 
x = post test score – pre-test score is outcome to compare between
the intervention and control groups
Example: Health Education Study
 
x = post-test score minus pre-test score for both groups
 
Intervention (I):    5    0    7    2   19
Control (C):         6   -5   -6    1    4
-  Only five individuals in each sample!!!
-  We want to compare the control and intervention groups to
assess whether the “improvement” (post minus pre) in scores is
different, taking random sampling error into account
Example: Health Education Study
 
With such a small sample size, we need to be sure score
improvements are normally distributed if we want to use
t-test (BIG assumption)
 
Possible approach:
-  Mann-Whitney-Wilcoxon non-parametric test!
Example: Health Education Study
 
First step: rank the pooled data (ignore groupings)
 
Score:   -6   -5    0    1    2    4    5    6    7   19
Rank:     1    2    3    4    5    6    7    8    9   10
Example: Health Education Study
 
Second step: “reattach” group status
 
Score:   -6   -5    0    1    2    4    5    6    7   19
Rank:     1    2    3    4    5    6    7    8    9   10
Group:    C    C    I    C    I    C    I    C    I    I
Example: Health Education Study
 
Find the average rank in each of the two groups
 
Intervention group average rank: (3 + 5 + 7 + 9 + 10) / 5 = 6.8
 
Control group average rank: (1 + 2 + 4 + 6 + 8) / 5 = 4.2
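The ranking steps can be reproduced with a few lines of Python. This sketch relies on the fact that the ten scores contain no ties, so each score's rank is simply its position in the sorted pooled data; tied scores would need averaged ranks, which this code does not handle.

```python
# Score changes (post minus pre) from the health education study
intervention = [5, 0, 7, 2, 19]
control = [6, -5, -6, 1, 4]

# Step 1: rank the pooled data, ignoring group (1 = smallest)
pooled = sorted(intervention + control)
rank_of = {score: rank for rank, score in enumerate(pooled, start=1)}

# Step 2: reattach group status and average the ranks within each group
avg_rank_intervention = sum(rank_of[x] for x in intervention) / len(intervention)
avg_rank_control = sum(rank_of[x] for x in control) / len(control)

print("Intervention average rank:", avg_rank_intervention)
print("Control average rank:", avg_rank_control)
```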
Example: Health Education Study
 
Statisticians have developed formulas and tables to determine the
probability of observing such an extreme discrepancy in ranks (6.8
vs. 4.2) by chance alone
-  This is the p-value
 
In the health education study, the p-value was .17
-  The interpretation is that the Mann-Whitney test did not show
any significant difference in test score “improvement” between
the intervention and control group (p = .17)
Notes
 
The two-sample t-test would give a different answer (p = .14)
 
Different statistical procedures can give different p-values
 
If the largest observation, 19, was changed to another value that is
still the largest, the p-value based on the Mann-Whitney test would not
change, but the two-sample t-test p-value would change
Notes
 
The t-test or the nonparametric test?
-  Statisticians will not always agree, but there are some
guidelines
-  Use non-parametric test if sample size is small and you have no
reason to believe data is “well behaved” (normally distributed)
-  Only “ranks” available
Using Stata to Perform Mann-Whitney-Wilcoxon
 
Data, as entered
Using Stata to Perform Mann-Whitney-Wilcoxon
 
“ranksum” command
-  Syntax:
  ranksum varname, by(group_var)
Using Stata to Perform t-test
 
“ttest” command without “i” on end when data already in Stata
-  Syntax:
  ttest varname, by(group_var)
Summary: Educational Intervention Example
 
Statistical methods
-  10 high school students were randomized to either receive a
two-month health and lifestyle education program (or no
program)
-  Each student was administered a test regarding health and
lifestyle issues prior to randomization (and after the two-month
period)
Summary: Educational Intervention Example
 
Statistical methods
-  Differences in the two test scores (after-before) were computed
for each student
-  Mean and median test score changes were computed for each of
the two study groups
-  A Mann-Whitney rank sum test was used to determine if there
was a statistically significant difference in test score change
between the intervention and control groups at the end of the
two-month study period
Summary: Educational Intervention Example
 
Result
-  Participants randomized to the educational intervention scored
a median five points higher on the test given at the end of the
two-month study period, as compared to the test administered
prior to the intervention
-  Participants randomized to receive no educational intervention
scored a median one point higher on the test given at the end
of the two-month study period
-  The difference in test score improvements between the
intervention and control groups was not statistically significant
(p = .17)
Section G
Comparing Means between More than
Two Independent Populations
Motivating Example
 
Suppose you are interested in the relationship between smoking and
mid-expiratory flow (FEF), a measure of pulmonary health
 
Suppose you recruit study subjects and classify them into one of six
smoking categories
-  Nonsmokers (NS)
-  Passive smokers (PS)
-  Non-inhaling smokers (NI)
-  Light smokers (LS)
-  Moderate smokers (MS)
-  Heavy smokers (HS)
Motivating Example
 
You are interested in whether differences exist in mean FEF
amongst the six groups
 
Main outcome variable is mid-expiratory flow (FEF) in liters per
second
Motivating Example
 
One strategy is to perform lots of two-sample t-tests (for each
possible two-group comparison)
 
In this example, there would be 15 comparisons you would need to
do!
-  NS to PS, NS to NI, and so on . . .
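The count of 15 is the number of ways to choose 2 groups out of 6. A short sketch that enumerates the pairs:

```python
from itertools import combinations

groups = ["NS", "PS", "NI", "LS", "MS", "HS"]

# Every unordered pair of smoking groups is one possible two-sample t-test
pairs = list(combinations(groups, 2))

print(len(pairs), "pairwise comparisons")
for first, second in pairs:
    print(f"{first} to {second}")
```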
Motivating Example
 
It would be nice to have one “catch-all” test
-  Something which would tell you whether there were any
differences amongst the six groups
-  If so, you could then do group to group comparisons to look for
specific group differences
Extension of the Two-Sample t-Test
 
Analysis of variance (One-Way ANOVA)
-  The t-test compares means in two populations
-  ANOVA compares means amongst more than two populations
with one test
 
The p-value from ANOVA helps answer the question
-  “Are there any differences in the means among the
populations?”
Extension of the Two-Sample t-Test
 
General idea behind ANOVA, comparing means for k groups (k > 2):
-  Ho: µ1 = µ2 = . . . = µk
-  HA: At least one mean different
Example
 
Smoking and FEF (Forced Mid-Expiratory Flow Rate)*
-  A sample of over 3,000 persons was classified into one of six
smoking categorizations based on responses to smoking related
questions
Source: * White, J.R., Froeb, H.F. (1980). Small-airways dysfunction in non-smokers chronically exposed to tobacco
smoke, New England Journal of Medicine 302: 13.
Example 1
 
Nonsmokers (NS)
 
Passive smokers (PS)
 
Non-inhaling smokers (NI)
 
Light smokers (LS)
 
Moderate smokers (MS)
 
Heavy smokers (HS)
Example 1
 
Smoking and FEF
-  From each smoking group, a random sample of 200 men was
drawn (except for the non-inhalers, as there were only 50 male
non-inhalers in the entire sample of 3,000)
-  FEF measurements were taken on each of the subjects
Example 1—Table
 
Data summary
Group    Mean FEF (L/s)    SD FEF (L/s)    n
NS       3.78              0.79            200
PS       3.30              0.77            200
NI       3.32              0.86            50
LS       3.23              0.78            200
MS       2.73              0.81            200
HS       2.59              0.82            200
Based on a one-way analysis of variance, there are statistically
significant differences in FEF levels among the six smoking groups
(p < .001)
What’s the Rationale behind Analysis of Variance?
 
The variation in the sample means between groups is compared to
the variation within a group
 
If the between group variation is a lot bigger than the within group
variation, that suggests there are some differences among the
populations
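One common way to make this comparison concrete is the F statistic: the between-group mean square divided by the within-group mean square. The sketch below computes it from the FEF summary table. It is an illustration based on the standard one-way ANOVA formulas, which the slides do not write out, and the variable names are hypothetical.

```python
# (mean FEF, SD FEF, n) for each smoking group, from the data summary table
groups = {
    "NS": (3.78, 0.79, 200),
    "PS": (3.30, 0.77, 200),
    "NI": (3.32, 0.86, 50),
    "LS": (3.23, 0.78, 200),
    "MS": (2.73, 0.81, 200),
    "HS": (2.59, 0.82, 200),
}

k = len(groups)
total_n = sum(n for _, _, n in groups.values())
grand_mean = sum(mean * n for mean, _, n in groups.values()) / total_n

# Between-group variation: how far each group mean is from the grand mean
ss_between = sum(n * (mean - grand_mean) ** 2 for mean, _, n in groups.values())
# Within-group variation: spread of individuals around their own group mean
ss_within = sum((n - 1) * sd ** 2 for _, sd, n in groups.values())

ms_between = ss_between / (k - 1)
ms_within = ss_within / (total_n - k)
f_stat = ms_between / ms_within

print(f"between-group mean square = {ms_between:.2f}")
print(f"within-group mean square  = {ms_within:.3f}")
print(f"F statistic = {f_stat:.1f}")
```

A ratio far above 1 means the group means vary much more than within-group variation alone would explain.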
Analysis of Variance
Summary: Smoking and FEF
 
Statistical methods
-  200 men were randomly selected from each of five smoking
classification groups (non-smoker, passive smokers, light
smokers, moderate smokers, and heavy smokers), as well as 50
men classified as non-inhaling smokers for a study designed to
analyze the relationship between smoking and respiratory
function
Summary: Smoking and FEF
 
Statistical Methods
-  Analysis of variance was used to test for any differences in FEF
levels amongst the six groups of men
-  Individual group comparisons were performed with a series of
two sample t-tests, and 95% confidence intervals were
constructed for the mean difference in FEF between each
combination of groups
Summary: Smoking and FEF
 
Results
-  Analysis of variance showed statistically significant
(p < .001) differences in FEF between the six groups of smokers
-  Non-smokers had the highest mean FEF value, 3.78 L/s, and this
was statistically significantly larger than that of the five other
smoking-classification groups
-  The mean FEF value for non-smokers was 1.19 L/s higher than
the mean FEF for heavy smokers (95% CI 1.03–1.35 L/s), the
largest mean difference between any two smoking groups
-  Confidence intervals for all smoking group FEF comparisons are
in Table 1
Example 2
 
FEV1 and three medical centers*
-  Data was collected on 63 patients with coronary artery disease
at 3 different medical centers (Johns Hopkins, Rancho Los
Amigos Medical Center, St. Louis University School of Medicine)
-  Purpose of study to investigate effects of carbon monoxide
exposure on these patients
-  Prior to analyzing CO effects data, researchers wished to
compare the respiratory health of these patients across the
three medical centers
Source: * Pagano, M., Gauvreau, K. (2000). Principles of biostatistics. Duxbury Press.
Example 2
 
Snippet of data in Stata
Boxplots
 
FEV1 values by center
Example 2
 
ANOVA with Stata
-  Syntax: oneway outcome_var group_var
Example 2
 
FEV1 and 3 medical centers: 95% CIs for FEV1 by medical center