Stata 2, Bivariate
Hein Stigum
Presentation, data and programs at:
http://folk.uio.no/heins/
May-16
H.S.
Datatypes
• Categorical data
– Nominal: married/ single/ divorced
– Ordinal: small/ medium/ large
• Numerical data
– Discrete: number of children
– Continuous: weight
Data type dictates type of analysis
• Numerical data
– Normal data? Yes: means, T-test, linear regression
– Normal data? No: medians, non-parametric tests
• Categorical data
– Frequency table, crosstable with chi-square, logistic regression
Continuous symmetric outcome
Example:
Birth weight
Distribution
kdensity weight
drop if weight<2000
kdensity weight
[Figures: kernel density of weight, before (x-axis 0 to 6000) and after (x-axis 2000 to 6000) dropping weights below 2000]
Central tendency and dispersion
Mean and standard deviation:
Mean with confidence interval:
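The two summaries named above could be produced along these lines. This is a minimal sketch, assuming the birth weight data from the presentation is in memory with the variable weight; the `ci means` form is the Stata 14 and later syntax (older releases use `ci weight`).

```stata
* Mean and standard deviation of birth weight
summarize weight

* Mean with 95% confidence interval (Stata 14+ syntax)
ci means weight
```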
Compare groups, equal variance?
• Equal
• Not equal
[Figures: one plot for each case]
2 independent samples
Are birth weights the same for boys and girls?
[Figures: density plot and scatterplot of birth weight by sex (boys, girls)]
2 independent samples test
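A possible sequence for this test, sketched under the assumption that weight and sex are the variable names in the data (both appear in the commands elsewhere in the presentation). The variance check comes first because it decides which form of the t-test applies.

```stata
* Check the equal-variance assumption first
sdtest weight, by(sex)

* Two-sample t-test assuming equal variances
ttest weight, by(sex)

* Alternative if the variances differ
ttest weight, by(sex) unequal
```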
K independent samples
• Is birth weight the same over parity?
[Figures: density plot and scatterplot of birth weight (g) by parity (0, 1, 2+)]
Equal means? Linear effect?
Equal variances?
Outliers?
K independent samples test
equal means?
Equal variances?
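Both questions can be read off a single one-way ANOVA. A minimal sketch, assuming weight and parity are in memory: the F test in the output addresses equal means, and Bartlett's test printed below the ANOVA table addresses equal variances.

```stata
* One-way ANOVA of birth weight over parity
* F test: equal means?  Bartlett's test: equal variances?
oneway weight parity, tabulate
```

The `tabulate` option adds the group means and standard deviations, which helps when judging whether the effect looks linear.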
Continuous by continuous
• Does birth weight depend on gestational age?
[Figures: scatterplot of birth weight by gestational age, with and without the outlier]
Continuous by continuous tests
• Cut gestational age up in groups,
then use T-test or ANOVA
or
• Use linear regression with 1 covariate
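The two options above might look as follows. This is a sketch only: gestGr is a hypothetical name for the grouped variable, and the choice of three equal-sized groups is an assumption made for illustration.

```stata
* Option 1: cut gestational age into groups, then compare group means
* (gestGr is a hypothetical variable name; 3 groups chosen for illustration)
egen gestGr = cut(gestAge), group(3)
oneway weight gestGr, tabulate

* Option 2: linear regression with gestational age as the single covariate
regress weight gestAge
```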
Test situations
• 2 independent samples
– ttest weight, by(sex)
• K independent samples
– oneway weight parity
• By continuous
– regress weight gestAge
• 2 dependent samples (Paired)
– ttest weight_last_year == weight_today
Continuous skewed outcome
Example:
Number of sexual partners
Distribution
kdensity partners if partners<=50
[Figure: distribution of number of lifetime partners, N=394. Percentiles: 25% = 1, 50% = 4, 75% = 9, 95% = 20; x-axis ends at 50]
Central tendency and dispersion
Median and percentiles:
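For a skewed variable such as partners, the median and percentiles could be obtained as sketched below, assuming the variable partners is in memory. The percentile list mirrors the ones marked in the distribution plot.

```stata
* Median (p50) and a standard set of percentiles
summarize partners, detail

* Selected percentiles with confidence intervals
centile partners, centile(25 50 75 95)
```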
2 independent samples
Do males and females have the same number of partners?
[Figures: density plot and scatterplot of partners by gender (males, females)]
2 independent samples test
equal medians?
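Two nonparametric options for this comparison, as a sketch. The variable name gender is an assumption (the slides only show "Gender" as an axis label); substitute the actual name in the data.

```stata
* Wilcoxon rank-sum (Mann-Whitney U) test
* (gender is an assumed variable name)
ranksum partners, by(gender)

* Nonparametric test of equal medians
median partners, by(gender)
```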
K independent samples
Do partners vary with age?
[Figures: density plot and scatterplot of partners (partners<20) by age group agegr3 (18-29, 30-44, 45-60)]
K independent samples test
equal medians?
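With more than two groups, the Kruskal-Wallis test takes the place of the rank-sum test. A minimal sketch, assuming partners and the three-level age group variable agegr3 shown in the plots are in memory:

```stata
* Kruskal-Wallis test across the three age groups
kwallis partners, by(agegr3)

* K-sample nonparametric test of equal medians
median partners, by(agegr3)
```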
Table of tests
                        Numerical data, normal       Numerical data, skewed      Proportions
1 sample                One sample T-test            Kolmogorov-Smirnov          Binomial
2 independent samples   Independent sample T-test    Mann-Whitney U              Chi-square
K independent samples   ANOVA                        Kruskal-Wallis              Chi-square
2 dependent samples     Paired sample T-test         Wilcoxon signed rank test   McNemar (2x2)
Categorical ordered: use nonparametric tests
Categorical data
Example:
Being bullied
Frequency and proportion
Frequency:
Proportion with CI:
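These two summaries could be obtained as sketched below, assuming a categorical variable named bullied is in memory (the name is taken from the plotting command later in the presentation).

```stata
* Frequency table
tabulate bullied

* Proportion in each category with 95% confidence interval
proportion bullied
```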
Proportion, confidence interval
x = "disease", n = total number
proportion: $p = \frac{x}{n}$
standard error: $se(p) = \sqrt{\frac{p(1-p)}{n}}$
confidence interval: $CI(p) = p \pm 2\,se(p)$
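The formulas can be checked by hand in Stata. The counts below (x = 20, n = 100) are hypothetical and chosen only to make the arithmetic easy: they give p = 0.2, se(p) = 0.04, and an interval from 0.12 to 0.28. Local macros are used rather than scalars so the names cannot clash with variables in memory.

```stata
* Hypothetical counts, for illustration only
local x = 20
local n = 100

* Proportion and its standard error
local p  = `x'/`n'
local se = sqrt(`p'*(1-`p')/`n')

* Approximate 95% confidence interval: p +/- 2*se
display "p = " `p' "   se = " `se'
display "CI: " `p' - 2*`se' " to " `p' + 2*`se'
```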
Crosstables
Are boys bullied as much as girls?
equal proportions?
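A crosstable with a chi-square test answers this question. The sketch assumes bullied is coded 0/1 and that the grouping variable is called sex in this dataset, which is an assumption.

```stata
* Crosstable with column percentages and Pearson chi-square test
* (sex is an assumed variable name in the bullying data)
tabulate bullied sex, column chi2

* Two-sample test of equal proportions (requires bullied coded 0/1)
prtest bullied, by(sex)
```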
Ordered categories, trend
Does bullied vary with age?
twoway (fpfitci bullied agegr) ///
       (lfit bullied agegr)
[Figure: proportion bullied by age group (2-6 y, 7-12 y, 13-17 y)]
Ordered categories, trend
Trend?
equal proportions?
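One way to address both questions, sketched under the assumption that bullied is coded 0/1 and agegr holds the ordered age groups. The `tabodds` command is one choice among several and is not necessarily the one used in the original slide; its output includes a test of homogeneity and a score test for trend.

```stata
* Equal proportions across age groups: Pearson chi-square
tabulate bullied agegr, column chi2

* Homogeneity and trend in odds across the ordered groups
* (assumes bullied is coded 0/1)
tabodds bullied agegr
```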