Sudaan / Esa Virtala, Terveyden ja hyvinvoinnin laitos

SUDAAN training
KTL/TTO 2004
Research Triangle Institute
SUDAAN
Why ?
Pitfalls
PITFALLS OF USING STANDARD STATISTICAL SOFTWARE
PACKAGES FOR SAMPLE SURVEY DATA
CAUTIONS IN USING STANDARD STATISTICAL
SOFTWARE PACKAGES
Donna J. Brogan, Ph.D.
Rollins School of Public Health, Emory University, Atlanta
April 15, 1997
Standard statistical software packages generally do not take into
account four common characteristics of sample survey data: (1)
unequal probability selection of observations, (2) clustering of
observations, (3) stratification and (4) nonresponse and other
adjustments [2].
Point estimates of population parameters are impacted by the value
of the analysis weight for each observation. These weights depend
upon the selection probabilities and other survey design features
such as stratification and clustering. Hence, standard packages will
yield biased point estimates if the weights are ignored.
Estimated variance formulas for point estimates based on sample
survey data are impacted by clustering, stratification and the
weights. By ignoring these aspects, standard packages generally
underestimate the estimated variance of a point estimate, sometimes
substantially so.
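The two effects described above can be illustrated with a small sketch. It is not SUDAAN's implementation: it assumes a with-replacement design and the usual Taylor-linearization formula for a weighted mean, and the data layout (stratum, PSU, weight, y) and all values are made up for illustration.

```python
import math
from collections import defaultdict


def naive_mean_se(values):
    """Unweighted mean and simple-random-sample standard error."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(variance / n)


def design_mean_se(rows):
    """Weighted mean with a with-replacement, Taylor-linearized standard error.

    rows: list of (stratum, psu, weight, y) tuples (hypothetical layout).
    """
    weight_sum = sum(w for _, _, w, _ in rows)
    mean = sum(w * y for _, _, w, y in rows) / weight_sum

    # Linearized contribution of each PSU, grouped by stratum.
    psu_totals = defaultdict(lambda: defaultdict(float))
    for stratum, psu, w, y in rows:
        psu_totals[stratum][psu] += w * (y - mean) / weight_sum

    variance = 0.0
    for psus in psu_totals.values():
        n_h = len(psus)
        if n_h < 2:
            continue  # sketch only: single-PSU strata are skipped, not handled
        z_bar = sum(psus.values()) / n_h
        variance += n_h / (n_h - 1) * sum((z - z_bar) ** 2 for z in psus.values())
    return mean, math.sqrt(variance)


# Made-up data: one stratum, two clusters, unequal weights.
rows = [
    ("A", 1, 3.0, 10.0), ("A", 1, 3.0, 10.0),
    ("A", 2, 1.0, 20.0), ("A", 2, 1.0, 20.0),
]
naive_mean, naive_se = naive_mean_se([y for _, _, _, y in rows])
design_mean, design_se = design_mean_se(rows)
print(naive_mean, round(naive_se, 3))
print(design_mean, design_se)
```

In this toy data set the unweighted mean differs from the weighted one, and the naive standard error is smaller than the design-based one because observations within a cluster are identical.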
COPYRIGHT: This article is copyrighted and is not to be used
without proper acknowledgment and citation. It will appear as a
chapter in Encyclopedia of Biostatistics, edited by Peter Armitage
and Theodore Colton (Editors-in-Chief), to be published by John
Wiley in summer, 1998 as six volumes. The article will be in a
section titled “Design of Experiments and Sample Surveys”,
edited by Paul Levy.
AUTHOR CONTACT INFORMATION:
Donna Jean Brogan, Ph.D.
Professor of Biostatistics
Rollins School of Public Health
1518 Clifton Road N.E.—Room 324
Emory University
Atlanta, GA 30322
e-mail: brogan@sph.emory.edu
phone: 404-727-7701
fax: 404-727-1370
Most standard statistical packages can perform weighted analyses,
usually via a WEIGHT statement added to the program code. Use of
standard statistical packages with a weighting variable may yield the
same point estimates for population parameters as sample survey
software packages. However, the estimated variance often is not
correct and can be substantially wrong, depending upon the
particular program within the standard software package.
History
Platforms
Price
SUDAAN Overview
SUDAAN is a single program consisting of a family of procedures used to
analyze data from complex surveys and other observational and
experimental studies involving cluster-correlated data. A complex sample
may be multistage, stratified, or clustered. Many samples also have
unequal probabilities of selection, or are drawn from finite populations.
SUDAAN enables you to use survey data to obtain consistent estimates of
population parameters and their standard errors in accordance with the
sample design. SUDAAN also produces consistent estimates of
regression coefficients, descriptive statistics, and their associated standard
errors for cluster-correlated and repeated measures data applications in
clinical, epidemiological, toxicological, and behavioral research.
SUDAAN SUPPORT
Direct inquiries about SUDAAN to:
SUDAAN Business Coordinator
Telephone: 919-541-6602
Fax: 919-541-7431
Email: SUDAAN@rti.org
Research Triangle Institute, 3040 Cornwallis Road, Research Triangle Park, NC
27709 USA
SUDAAN Procedures
Utility Procedure
RECORDS Procedure
The RECORDS procedure is designed to print records from any ASCII, SAS, SASXPORT, SUDAAN,
SUDXPORT or SPSS record file. This is particularly useful when you wish to verify that SUDAAN is
reading your data properly.
You can also use this procedure to obtain a file contents summary or convert an input file of one type to
another. For instance, you can convert an ASCII data set to a SUDAAN data set (and vice versa).
NOTE: For SAS-Callable SUDAAN only, you can convert among the following file types: SAS, SUDAAN
and SUDXPORT.
The RECORDS procedure statements can be grouped into these categories:
• Procedure statement: PROC RECORDS
• Computation statement: SUBPOPN
• Output statements: TITLE, FOOTNOTE, SETENV, PRINT, OUTPUT
RECORDS Quick Reference
PROC RECORDS DATA=filename [SUDDATA=filename]
[FILETYPE=ASCII|SAS|SPSS|SUDAAN|SUDXPORT|SASXPORT]
[COUNTREC] [CONTENTS] [HISTORY][NOPRINT] [MAXOBS=count] [NAMEFILE=filename] [LEVFILE=filename];
/* SUDDATA=filename must be used to specify a SUDAAN input file in SAS-Callable SUDAAN */
[SUBPOPN expression / [ NAME="label" ];]
< TITLE string(s) / [ APPEND|REPLACE ]; >
/* for SAS-Callable SUDAAN use RTITLE */
< FOOTNOTE string(s); >
/* for SAS-Callable SUDAAN use RFOOTNOTE */
< SETENV{PAGEBEG=integer TABBEG=integer LINESIZE=integer PAGESIZE=integer
LINESPCE=integer ROWSPCE=integer COLSPCE=integer ROWWIDTH=integer
COLWIDTH=integer DECWIDTH=integer INDROWD=integer INDROWS=integer
MAXIND=integer TOPMGN=integer LEFTMGN=integer LABWIDTH=integer }; >
<PRINT <keyword[=label]>/<keywordFMT=format> <keywordUNT=unit>
[FILENAME=filename] [REPLACE] [STARTREC=number] [MAXREC=number] [NOHEAD] [NODATE] [NOTIME];>
<OUTPUT <keyword[=label]> / FILENAME=filename [REPLACE] [NOCOMP|NOCOMPRESS]
[FILETYPE=ASCII|SAS|SUDAAN|SUDXPORT|SPSS|SASXPORT]
[NAMEFILE=filename] [LEVFILE=filename] <keywordFMT=format> [STDTYPE=number];>
RUN;
Descriptive Procedures
CROSSTAB Procedure
The CROSSTAB procedure produces weighted frequency and percentage distributions for one-way
(univariate, single-variable) and multi-way (multivariate or multiple-variable) tabulations.
CROSSTAB also tests the hypothesis of no association between row and column variables in 2-way and
multi-way tables, and estimates odds ratios and relative risks in 2x2 tables.
CROSSTAB is primarily for descriptive analyses of categorical variables.
DESCRIPT and RATIO produce descriptive statistics for continuous variables. Although DESCRIPT allows
you to request weighted frequency counts, CROSSTAB is computationally more efficient for this purpose.
The CROSSTAB statements can be grouped into these categories:
• Procedure statement: PROC CROSSTAB
• Sample design statements: WEIGHT, NEST, TOTCNT, SAMCNT, JOINTPROB, REPWGT, IDVAR, JACKWGTS, JACKMULT
• Computation statements: SUBGROUP, LEVELS, RECODE, SUBPOPN, TABLES, TEST
• Output statements: SETENV, PRINT, TITLE, FOOTNOTE, OUTPUT, FORMAT
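As a rough illustration of the point estimates a weighted 2x2 tabulation yields, the sketch below sums analysis weights into four cells and forms an odds ratio and a relative risk. The record layout, the coding of rows and columns as 1 or 2, and the data are assumptions made for the example; it does not reproduce SUDAAN's exact definitions of its odds ratio and relative risk keywords, and it computes no standard errors.

```python
from collections import defaultdict


def weighted_two_by_two(records):
    """Sum analysis weights into the four cells of a 2x2 table.

    records: list of (weight, row, col) with row and col coded 1 or 2.
    """
    cells = defaultdict(float)
    for weight, row, col in records:
        cells[(row, col)] += weight
    return cells


def odds_ratio_and_relative_risk(cells):
    """Point estimates from weighted cell totals (row 1 vs. row 2, column 1 as the outcome)."""
    a, b = cells[(1, 1)], cells[(1, 2)]
    c, d = cells[(2, 1)], cells[(2, 2)]
    odds_ratio = (a * d) / (b * c)
    relative_risk = (a / (a + b)) / (c / (c + d))
    return odds_ratio, relative_risk


# Made-up weighted records, one per cell for brevity.
records = [
    (30.0, 1, 1), (70.0, 1, 2),
    (10.0, 2, 1), (90.0, 2, 2),
]
cells = weighted_two_by_two(records)
odds_ratio, relative_risk = odds_ratio_and_relative_risk(cells)
print(round(odds_ratio, 3), round(relative_risk, 3))
```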
CROSSTAB Quick Reference
PROC CROSSTAB DATA=filename
[ SUDDATA=filename ] [ FILETYPE=ASCII|SAS|SPSS|SUDAAN|SUDXPORT|SASXPORT ]
[ DESIGN=WR|WOR|UNEQWOR|STRWR|STRWOR|SRS|BRR|JACKKNIFE ]
[ PSUDATA=filename ] [ PSU_REC=count ] [ CONF_LIM=percent ]
[ SMALL_CELL=count ] [ ATLEVEL1=position ] [ ATLEVEL2=position ]
[ DDF=number ] [ DEFT4|DEFT1|DEFT2|DEFT3|DEFT|DEFF ] [ DISPLAY ][ INCLUDE] [ MERGEHI ]
[ NOMARG ] [ NOCOL] [ NOROW ] [ NOTOT ] [ NOPER] [ NOSE ] [ NOWGT ] [ NOPRINT] [ MAXOBS=count ]
[ REPDATA=filename ] [ REP_REC=count ]
[ EST_STR=count ] [ EST_PSU = count ] [ NAMEFILE=filename ]
[ LEVFILE=filename ];
/* SUDDATA=filename must be used to specify a SUDAAN input file in SAS-Callable SUDAAN */
[ WEIGHT variable; ]
[ REPWGT variables / [ ADJFAY=value ]; ]
[ IDVAR variable(s); ]
[ NEST variable(s) / [ PSULEV=position|FRL=position ] [ STRLEV=position ] [ MISSUNIT ] [ NOSORTCK ]; ]
[ TOTCNT variable(s); ]
[ SAMCNT variable(s); ]
[ JOINTPROB variable(s); ]
[ JACKWGTS varlist / ADJJACK=value; ]
[ JACKMULT value(s); ]
[ RECODE variable=(code_list) < variable=(code_list) >; ]
[ SUBPOPN expression / [ NAME="label" ]; ]
[ SUBGROUP variable(s); ]
[ LEVELS level(s); ]
[ TABLES table_request(s); ]
[ TEST { CHISQ LLCHISQ CMH }; ]
< TITLE string(s) / [ APPEND|REPLACE ]; >
/* for SAS-Callable SUDAAN use RTITLE */
< FOOTNOTE string(s); >
/* for SAS-Callable SUDAAN use RFOOTNOTE */
< FORMAT variable(s) format_name.; >
/* for SAS-Callable SUDAAN use RFORMAT */
<OUTPUT [ SMLCELL ] [ NSUM ] [ WSUM ] [ SEWGT ] [ DEFFWGT ]
[ ROWPER ] [ COLPER] [ TOTPER ]
[ SEROW ] [ SECOL ] [ SETOT ] [ DEFFROW ] [ DEFFCOL ] [ DEFFTOT ]
[ ATLEV1][ ATLEV2 ]
[ CHISQ ] [ CHISQP ] [ CHISQDF ] [ LLCHISQ ] [ LLCHISQP ] [ LLCHISQDF ]
[ COR ] [ LOGCOR ] [ SELOGCOR ] [ UPCOR ] [ LOWCOR ]
[ RR1 ] [ LOGRR1 ] [ SELOGRR1] [ LOWRR1 ] [ UPRR1 ] [ RR2 ] [ LOGRR2 ] [ SELOGRR2 ][ LOWRR2 ] [ UPRR2 ]
[ CMH] [ CMHDF ] [ CMHPVAL ] [ DDF ] [ COVWGT ] [ SRSWGT ] [ COVROW ]
[ SRSROW ] [ COVCOL ] [ SRSCOL ] [ COVTOT ] [ SRSTOT]
/
[TABLECELL=DEFAULT|ALL ][ RISK=DEFAULT|ALL ][ TESTS=DEFAULT|ALL ]
[ CMHTEST=DEFAULT|ALL ] [ WGTCOV=DEFAULT|ALL ][ ROWCOV=DEFAULT|ALL ]
[ COLCOV=DEFAULT|ALL ][TOTCOV=DEFAULT|ALL ] < keywordFMT=format >
FILENAME=filename [ FILETYPE=ASCII|SAS|SUDAAN|SUDXPORT|SPSS|SASXPORT ]
[ NAMEFILE=filename ] [ LEVFILE=filename ] [ REPLACE ] [ NOCOMP|NOCOMPRESS ]
[ STDTYPE=number ] ;>
...or... NSUM=label WSUM=label ... etc.
<SETENV { PAGEBEG=integer TABBEG=integer LINESIZE=integer PAGESIZE=integer
LINESPCE=integer ROWSPCE=integer COLSPCE=integer ROWWIDTH=integer
COLWIDTH=integer DECWIDTH=integer INDROWD=integer INDROWS=integer
MAXIND=integer TOPMGN=integer LEFTMGN=integer LABWIDTH=integer }; >
<PRINT [ SMLCELL ] [ NSUM ] [ WSUM ] [ SEWGT ] [ DEFFWGT ]
[ ROWPER ] [ COLPER ] [ TOTPER ]
[ SEROW ] [ SECOL ] [ SETOT ] [ DEFFROW ] [ DEFFCOL ] [ DEFFTOT ]
[ ATLEV1 ] [ ATLEV2 ]
[ CHISQ ] [ CHISQP ] [ CHISQDF ] [ LLCHISQ ] [ LLCHISQP ] [ LLCHISQDF ]
[ COR ] [ LOGCOR ] [ SELOGCOR ] [ UPCOR ] [ LOWCOR ]
[ RR1 ] [ LOGRR1] [ SELOGRR1 ] [ LOWRR1 ] [ UPRR1 ] [ RR2 ] [ LOGRR2 ] [ SELOGRR2 ] [ LOWRR2 ] [ UPRR2 ]
[CMH] [ CMHDF ] [ CMHPVAL ] [ DDF ] [ COVWGT ][ SRSWGT ][COVROW]
[ SRSROW ] [ COVCOL ] [ SRSCOL] [ COVTOT ] [ SRSTOT ]
/
[ TABLECELL=DEFAULT|ALL ] [ TESTS=DEFAULT|ALL ]
[ CMHTEST=DEFAULT|ALL] [ RISK=DEFAULT|ALL ] [WGTCOV=DEFAULT|ALL ]
[ ROWCOV=DEFAULT|ALL ] [ COLCOV=DEFAULT|ALL ] [ TOTCOV=DEFAULT|ALL ]
< keywordFMT=format > < keywordUNT=power > [STYLE=BOX|NCHS ]
[ NDIMROW=integer ] [ NDIMCOL=integer ]
[ FILENAME=filename ] [ REPLACE ] [ NOHEAD ] [ NODATE ] [ NOTIME ]; >
...or... NSUM=label WSUM=label ... etc.
RUN;
DESCRIPT Procedure
The DESCRIPT procedure produces descriptive statistics for analysis variables, including means, totals,
percentages, geometric means, medians and other quantiles, and their standard errors for sample surveys
and other clustered data applications. The analysis variables can be continuous or categorical.
DESCRIPT computes standardized means according to the method of direct standardization. The
standardizing weights are assumed to be known.
Within one call to DESCRIPT, all analysis variables must be either continuous or categorical. The analysis
of both continuous and categorical variables requires separate calls to the DESCRIPT procedure.
For continuous analysis variables, you can request estimates of totals, means, proportions, geometric
means, and quantiles.
For categorical variables, you can request estimates of totals, percentages, and their standard errors.
You can request design effects for means, totals, and percentages. Design effects are not available for
contrast statistics, standardized estimates, or post-stratified estimates.
DESCRIPT is primarily for the descriptive analysis of continuous (and sometimes discrete) variables, while
CROSSTAB is primarily for descriptive analyses of categorical variables.
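Direct standardization, as mentioned above for DESCRIPT, can be sketched as follows: a weighted mean is computed within each standardizing group, and the group means are combined using known standard proportions. The function name, the record layout, and the numbers are hypothetical; the sketch gives the point estimate only and says nothing about how SUDAAN computes its standard error.

```python
from collections import defaultdict


def standardized_mean(rows, standard_weights):
    """Directly standardized mean.

    rows: list of (group, weight, y); standard_weights: group -> known proportion
    (proportions are assumed to sum to 1).
    """
    weighted_y = defaultdict(float)
    weight_total = defaultdict(float)
    for group, w, y in rows:
        weighted_y[group] += w * y
        weight_total[group] += w
    return sum(
        share * weighted_y[group] / weight_total[group]
        for group, share in standard_weights.items()
    )


# Made-up data: the sample over-represents the "old" group relative to the standard.
rows = [
    ("young", 1.0, 110.0), ("young", 1.0, 130.0),
    ("old", 2.0, 150.0), ("old", 2.0, 150.0),
]
crude_mean = sum(w * y for _, w, y in rows) / sum(w for _, w, _ in rows)
std_mean = standardized_mean(rows, {"young": 0.6, "old": 0.4})
print(crude_mean, round(std_mean, 6))
```

Here the crude weighted mean is pulled toward the "old" group, while the standardized mean reweights the two group means to the assumed standard population.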
RATIO Procedure
The RATIO procedure produces ratio estimates and their standard errors for sample surveys and other
clustered data applications. The numerator and denominator variables can be continuous or categorical.
RATIO computes standardized means according to the method of direct standardization. The
standardizing weights are assumed to be known.
For continuous variables, RATIO computes a ratio of weighted sums. If VAR1 and VAR2 denote the
numerator and denominator variables respectively, and the variable WT denotes the weight, then the ratio
estimate is sum(WT × VAR1) / sum(WT × VAR2), where each sum runs over the analysis observations on the
input data set.
For categorical variables, RATIO computes a ratio of weighted counts of individuals falling into a specified
response category, as follows:
1) Any positive integer is a valid response category.
2) The numerator is the weighted sum of individuals who gave the specified integer response to the
numerator variable.
3) The denominator is the weighted sum of individuals who gave the specified integer response to the
denominator variable.
RATIO estimates can consist of a continuous numerator variable and a categorical denominator variable
or vice versa. However, within one call to RATIO, all numerator variables must be of the same type, and all
denominator variables must be of the same type.
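The two kinds of ratio described above can be written out as a short sketch. The tuple layout (WT, VAR1, VAR2) follows the variable names used in the text; the function names and the data are hypothetical, and only point estimates are computed.

```python
def ratio_of_weighted_sums(rows):
    """Continuous case: sum(WT * VAR1) / sum(WT * VAR2) over rows of (wt, var1, var2)."""
    numerator = sum(wt * var1 for wt, var1, _ in rows)
    denominator = sum(wt * var2 for wt, _, var2 in rows)
    return numerator / denominator


def ratio_of_weighted_counts(rows, num_level, den_level):
    """Categorical case: weighted count with VAR1 == num_level over weighted count with VAR2 == den_level."""
    numerator = sum(wt for wt, var1, _ in rows if var1 == num_level)
    denominator = sum(wt for wt, _, var2 in rows if var2 == den_level)
    return numerator / denominator


# Made-up records: (weight, numerator variable, denominator variable).
continuous_rows = [(2.0, 10.0, 5.0), (1.0, 4.0, 2.0), (3.0, 6.0, 6.0)]
categorical_rows = [(2.0, 1, 1), (1.0, 2, 1), (3.0, 1, 2)]

continuous_ratio = ratio_of_weighted_sums(continuous_rows)
categorical_ratio = ratio_of_weighted_counts(categorical_rows, num_level=1, den_level=1)
print(round(continuous_ratio, 4), round(categorical_ratio, 4))
```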
Regression Procedures
REGRESS Procedure
The REGRESS procedure fits linear models to sample survey data and other clustered data and repeated
measures applications. Estimates of the model parameters and their standard errors are computed, along
with tests of hypotheses.
REGRESS offers GEE model fitting techniques for efficient parameter estimation. For estimating variance
of the parameter estimates, REGRESS implements two robust methods described in Binder (1983) and
Zeger and Liang (1986), as well as a model-based (naive) variance estimation method. A choice of
independent or exchangeable "working" correlations is also provided.
You can specify tests for linear combinations of the model parameters, and you can output the predicted
values, residuals, parameter estimates, and their associated variance-covariance matrix for further
hypothesis testing. Also, you can estimate and test linear combinations of the adjusted group means (also
known as least squares means).
LOGISTIC (RLOGIST) Procedure
The LOGISTIC procedure fits logistic regression models to complex sample survey data and other
clustered data applications. LOGISTIC produces estimates of the model parameters and their standard
errors, and tests the null hypothesis that individual regression coefficients associated with each variable in
the model are equal to zero.
LOGISTIC also provides tests for overall model significance, model minus intercept, as well as model
main effects and interactions. In addition, you can test linear combinations of the model parameters or
output the parameter estimates and variance-covariance matrix to a data set for further hypothesis testing.
You can also estimate and test linear combinations of the conditional and predicted marginals
(generalizations of adjusted group means to non-linear models).
The LOGISTIC procedure estimates model parameters using generalized estimating equations (GEE). A
choice of independent vs. exchangeable "working" correlations is also provided. For estimating variance of
the parameter estimates, LOGISTIC implements two robust methods described in Binder (1983) and
Zeger and Liang (1986), as well as a model-based (naive) variance estimation method.
NOTE: For SAS-Callable SUDAAN, the name LOGISTIC conflicts with a SAS procedure of the same
name. Use RLOGIST to invoke the SUDAAN logistic regression procedure.
MULTILOG Procedure
The MULTILOG procedure extends the modeling capabilities of SUDAAN to include categorical outcomes with more than
two categories which may or may not have a natural ordering. These models can be viewed as generalizations of logit
models for binary outcomes already available in SUDAAN in the LOGISTIC procedure. MULTILOG analyzes data from
sample surveys as well as from randomized experiments and other observational studies involving cluster-correlated or
longitudinal responses.
Two models have been implemented in the MULTILOG procedure: the proportional odds model with cumulative logit link for
ordinal responses and a generalized multinomial logit model for nominal outcomes. Both models handle continuous as well
as discrete explanatory variables. The generalized Multinomial Logit Model produces separate parameter vectors for each of
the generalized logit equations of interest; the Proportional Odds Model produces a common slope but separate intercepts
for each of the cumulative logit equations of interest.
The MULTILOG procedure estimates model parameters using generalized estimating equations (GEE). For estimating
variance of the parameter estimates, MULTILOG implements two robust methods described in Binder (1983) and Zeger and
Liang (1986), as well as a model-based (naive) variance estimation method. All three variance estimation methods allow a
choice of independent vs exchangeable working correlations for describing the dependence of responses within clusters. By
default, the GEE iterative fitting procedure in the exchangeable case uses the one-step approach, although a multistep GEE
procedure can also be obtained.
MULTILOG produces estimates of the model parameters and their standard errors, and tests the null hypothesis that
individual regression coefficients associated with each variable in the model are equal to zero. MULTILOG also provides
tests for overall model significance, model minus intercept, as well as model main effects and interactions. In addition, you
can test linear combinations of the model parameters and output many statistics to an output data set. You can also
estimate and test linear combinations of the conditional and predicted marginals (generalizations of adjusted group means
to non-linear models).
LOGLINK Procedure
The LOGLINK procedure in SUDAAN fits log-linear regression models to cluster-correlated count data not
in the form of proportions. The counts are typically counts of events in a Poisson-like process. The
LOGLINK procedure estimates model parameters using generalized estimating equations (GEE).
LOGLINK implements two robust variance estimation methods described in Binder (1983) and Zeger and
Liang (1986), as well as a model-based (naive) variance estimation method. A choice of independent vs.
exchangeable "working" correlations is also provided. You can specify tests for linear combinations of the
model parameters, and you can output many statistics for further hypothesis testing. Also, you can
estimate and test linear combinations of the conditional and predicted marginals (generalizations of
adjusted group means to non-linear models).
Like all of SUDAAN's procedures, LOGLINK is designed to analyze data from complex sample surveys
(weighted, stratified, cluster-correlated data) as well as from randomized experiments and other
observational studies involving cluster-correlated or longitudinal responses.
SURVIVAL Procedure
SURVIVAL provides proportional hazards modeling for failure time outcomes, which may contain left- and
right-censored observations, time-dependent covariates, and multiple events per subject.
The SURVIVAL procedure fits the discrete or continuous (Cox) proportional hazards model to sample
surveys and other clustered data applications. Estimates of the model parameters and their standard
errors are computed, along with tests of hypotheses.
Enhancements in the current software release are as follows:
• Counting process style of input (Andersen and Gill, 1982) to permit left truncation, multiple events per
subject, and time-dependent covariates. A time-dependent covariate is one whose value for any given
individual can change over time during the course of a study.
• Computation of Schoenfeld residuals and Score residuals to allow users to evaluate “goodness of fit” and
the validity of the proportional hazards assumption.
• Computation of Efron's likelihood approximation for ties in addition to the current default formula by
Breslow.
• Option to allow stratified baseline hazard functions for different types of failures or different subgroups of
the population.
Descriptive statistics
with different software
TITLE1 'DESCRIPTIVE STATISTICS';

/* Unweighted analysis: ignores weights, strata and clusters */
TITLE2 'SAS MEANS';
PROC MEANS DATA=WORK.T2K_DATA
           N NMISS MEAN STDERR MAXDEC=3;
  CLASS SP2 T2K;
  VAR SYSTBP2;
  TYPES () SP2 T2K SP2*T2K;
RUN;

/* Design-based analysis: strata, clusters and weights are declared */
options nolabel;
TITLE2 'SAS SURVEYMEANS';
PROC SURVEYMEANS DATA=WORK.T2K_DATA
                 NOBS NMISS MEAN STDERR SUMWGT;
  STRATA OSITE;
  CLUSTER RYVAS;
  WEIGHT WAN_UNIONI;
  DOMAIN T2K SP2 T2K*SP2;
  VAR SYSTBP2;
RUN;
options label;

/* Weighted analysis without the design: weights only */
TITLE2 'SAS MEANS + WEIGHT';
PROC MEANS DATA=WORK.T2K_DATA
           N NMISS SUMWGT MEAN STDERR MAXDEC=3;
  WEIGHT WAN_UNIONI;
  CLASS SP2 T2K;
  VAR SYSTBP2;
  TYPES () SP2 T2K SP2*T2K;
RUN;

/* SUDAAN: NEST lists the stratum variable first, then the cluster variable */
TITLE2 'SUDAAN DESCRIPT';
PROC DESCRIPT DATA=WORK.T2K_DATA DESIGN=WR;
  SETENV COLWIDTH=12 DECWIDTH=3;
  NEST OSITE RYVAS;
  WEIGHT WAN_UNIONI;
  SUBGROUP SP2 T2K;
  LEVELS 2 2;
  VAR SYSTBP2;
  TABLES SP2*T2K;
  PRINT / STYLE=NCHS;
  * or STYLE=BOX;
RUN;
Descriptive statistics
SAS Means

Analysis Variable : systbp2  Systolic blood pressure

                               N Obs       N   N Miss      Mean   Std Error
---------------------------------------------------------------------------
                               14395   13612      783   140.412       0.202
---------------------------------------------------------------------------

Study
(1=T2K,2=MS)                   N Obs       N   N Miss      Mean   Std Error
---------------------------------------------------------------------------
1                               7178    6401      777   133.675       0.266
2                               7217    7211        6   146.393       0.282
---------------------------------------------------------------------------

Sex
(1=M,2=F)                      N Obs       N   N Miss      Mean   Std Error
---------------------------------------------------------------------------
1                               6525    6220      305   140.036       0.266
2                               7870    7392      478   140.729       0.298
---------------------------------------------------------------------------

Sex          Study
(1=M,2=F)    (1=T2K,2=MS)      N Obs       N   N Miss      Mean   Std Error
---------------------------------------------------------------------------
1            1                  3203    2899      304   134.618       0.357
1            2                  3322    3321        1   144.765       0.368
2            1                  3975    3502      473   132.893       0.385
2            2                  3895    3890        5   147.783       0.417
---------------------------------------------------------------------------
Descriptive statistics
SAS Means + Weight

Analysis Variable : systbp2  Systolic blood pressure

                               N Obs       N   N Miss    Sum Wgts      Mean   Std Error
---------------------------------------------------------------------------------------
                               14395   13612      783   13716.104   139.854       0.200
---------------------------------------------------------------------------------------

Study
(1=T2K,2=MS)                   N Obs       N   N Miss    Sum Wgts      Mean   Std Error
---------------------------------------------------------------------------------------
1                               7178    6401      777    6497.937   133.482       0.262
2                               7217    7211        6    7218.167   145.591       0.280
---------------------------------------------------------------------------------------

Sex
(1=M,2=F)                      N Obs       N   N Miss    Sum Wgts      Mean   Std Error
---------------------------------------------------------------------------------------
1                               6525    6220      305    6473.673   139.395       0.262
2                               7870    7392      478    7242.431   140.265       0.296
---------------------------------------------------------------------------------------

Sex          Study
(1=M,2=F)    (1=T2K,2=MS)      N Obs       N   N Miss    Sum Wgts      Mean   Std Error
---------------------------------------------------------------------------------------
1            1                  3203    2899      304    3102.263   134.416       0.355
1            2                  3322    3321        1    3371.411   143.976       0.363
2            1                  3975    3502      473    3395.674   132.628       0.379
2            2                  3895    3890        5    3846.757   147.007       0.417
---------------------------------------------------------------------------------------
Descriptive statistics
SAS Surveymeans

Data Summary
Number of Strata                  44
Number of Clusters              5155
Number of Observations         14395
Sum of Weights            14404.1532

Statistics

                                            Sum of                 Std Error
              Variable       N   N Miss    Weights          Mean     of Mean
----------------------------------------------------------------------------
              systbp2    13612      783      13716    139.854495    0.329956
----------------------------------------------------------------------------

                                            Sum of                 Std Error
t2k           Variable       N   N Miss    Weights          Mean     of Mean
----------------------------------------------------------------------------
1             systbp2     6401      777  6497.936845  133.481608    0.428515
2             systbp2     7211        6  7218.167255  145.591493    0.488738
----------------------------------------------------------------------------

                                            Sum of                 Std Error
sp2           Variable       N   N Miss    Weights          Mean     of Mean
----------------------------------------------------------------------------
1             systbp2     6220      305  6473.673134  139.394726    0.335483
2             systbp2     7392      478  7242.430966  140.265460    0.422606
----------------------------------------------------------------------------

                                            Sum of                 Std Error
sp2    t2k    Variable       N   N Miss    Weights          Mean     of Mean
----------------------------------------------------------------------------
1      1      systbp2     2899      304  3102.262548  134.415902    0.485048
1      2      systbp2     3321        1  3371.410585  143.976079    0.467768
2      1      systbp2     3502      473  3395.674296  132.628044    0.515660
2      2      systbp2     3890        5  3846.756670  147.007290    0.616352
----------------------------------------------------------------------------
Descriptive statistics
SUDAAN Descript

Number of observations read    : 14395    Weighted count : 14404
Denominator degrees of freedom : 5111

Variance Estimation Method: Taylor Series (WR)
by: Variable, Sex (1=M,2=F), Study (1=T2K,2=MS).
for: Variable = Systolic blood pressure.

Sex          Study                          Weighted
(1=M,2=F)    (1=T2K,2=MS)   Sample Size         Size          Total      Mean   SE Mean
---------------------------------------------------------------------------------------
Total        Total            13612.000    13716.104    1918258.807   139.854     0.330
             Missing              0.000        0.000          0.000         .         .
             1                 6401.000     6497.937     867355.059   133.482     0.429
             2                 7211.000     7218.167    1050903.748   145.591     0.489
Missing      Total                0.000        0.000          0.000         .         .
             Missing              0.000        0.000          0.000         .         .
             1                    0.000        0.000          0.000         .         .
             2                    0.000        0.000          0.000         .         .
1            Total             6220.000     6473.673     902395.895   139.395     0.336
             Missing              0.000        0.000          0.000         .         .
             1                 2899.000     3102.263     416993.419   134.416     0.485
             2                 3321.000     3371.411     485402.476   143.976     0.468
2            Total             7392.000     7242.431    1015862.912   140.265     0.423
             Missing              0.000        0.000          0.000         .         .
             1                 3502.000     3395.674     450361.640   132.628     0.516
             2                 3890.000     3846.757     565501.272   147.007     0.616
---------------------------------------------------------------------------------------
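The outputs above give the same weighted mean (139.854) from SAS MEANS with a WEIGHT statement, SAS SURVEYMEANS and SUDAAN DESCRIPT, but different standard errors. The arithmetic below compares the two reported standard errors for the overall mean. The resulting ratio is only a rough indication of how much the weights-only analysis understates the variance; it is not one of SUDAAN's design effect definitions.

```python
# Reported standard errors of the overall mean of systbp2 (copied from the outputs above).
se_weighted_means = 0.200      # SAS MEANS + WEIGHT
se_surveymeans = 0.329956      # SAS SURVEYMEANS (SUDAAN DESCRIPT reports 0.330)

se_ratio = se_surveymeans / se_weighted_means
variance_ratio = se_ratio ** 2

print(round(se_ratio, 2), round(variance_ratio, 2))  # 1.65 2.72
```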
Frequencies
with different software
TITLE1 'FREQUENCIES';

/* Unweighted frequencies, overall and by study */
TITLE2 'SAS FREQ';
PROC FREQ DATA=WORK.T2K_DATA;
  TABLE SYSTBP2_123;
RUN;
PROC FREQ DATA=WORK.T2K_DATA;
  TABLE SYSTBP2_123;
  BY T2K;
RUN;

/* Weighted frequencies without the design: weights only */
TITLE2 'SAS FREQ + WEIGHT';
PROC FREQ DATA=WORK.T2K_DATA;
  WEIGHT WAN_UNIONI;
  TABLE SYSTBP2_123;
RUN;
PROC FREQ DATA=WORK.T2K_DATA;
  WEIGHT WAN_UNIONI;
  TABLE SYSTBP2_123;
  BY T2K;
RUN;

/* Design-based proportions: CLASS makes SURVEYMEANS treat the variable as categorical */
TITLE2 'SAS SURVEYMEANS';
PROC SURVEYMEANS DATA=WORK.T2K_DATA
                 NOBS MEAN STDERR SUMWGT;
  STRATA OSITE;
  CLUSTER RYVAS;
  WEIGHT WAN_UNIONI;
  DOMAIN T2K;
  VAR SYSTBP2_123;
  CLASS SYSTBP2_123;
RUN;

/* SUDAAN: NEST lists the stratum variable first, then the cluster variable */
TITLE2 'SUDAAN CROSSTAB';
PROC CROSSTAB DATA=WORK.T2K_DATA DESIGN=WR;
  SETENV COLWIDTH=12 DECWIDTH=3;
  NEST OSITE RYVAS;
  WEIGHT WAN_UNIONI;
  SUBGROUP T2K SYSTBP2_123;
  LEVELS 2 3;
  TABLES T2K*SYSTBP2_123;
  PRINT NSUM WSUM ROWPER SEROW / STYLE=NCHS;
RUN;
Frequencies
SAS Freq

Syst. BP, 3 classes
                                         Cumulative    Cumulative
SystBP2_123    Frequency     Percent      Frequency       Percent
-----------------------------------------------------------------
1                   3663       25.45           3663         25.45
2                   8217       57.08          11880         82.53
3                   2515       17.47          14395        100.00

Study (1=T2K,2=MS)=1
Syst. BP, 3 classes
                                         Cumulative    Cumulative
SystBP2_123    Frequency     Percent      Frequency       Percent
-----------------------------------------------------------------
1                   2741       38.19           2741         38.19
2                   3735       52.03           6476         90.22
3                    702        9.78           7178        100.00

Study (1=T2K,2=MS)=2
Syst. BP, 3 classes
                                         Cumulative    Cumulative
SystBP2_123    Frequency     Percent      Frequency       Percent
-----------------------------------------------------------------
1                    922       12.78            922         12.78
2                   4482       62.10           5404         74.88
3                   1813       25.12           7217        100.00
Frequencies
SAS Freq + Weight

Syst. BP, 3 classes
                                         Cumulative    Cumulative
SystBP2_123    Frequency     Percent      Frequency       Percent
-----------------------------------------------------------------
1               3637.334       25.25       3637.334         25.25
2               8354.206       58.00       11991.54         83.25
3               2412.614       16.75       14404.15        100.00

Study (1=T2K,2=MS)=1
Syst. BP, 3 classes
                                         Cumulative    Cumulative
SystBP2_123    Frequency     Percent      Frequency       Percent
-----------------------------------------------------------------
1               2667.273       37.15       2667.273         37.15
2               3828.789       53.32       6496.062         90.47
3               684.0961        9.53       7180.158        100.00

Study (1=T2K,2=MS)=2
Syst. BP, 3 classes
                                         Cumulative    Cumulative
SystBP2_123    Frequency     Percent      Frequency       Percent
-----------------------------------------------------------------
1               970.0609       13.43       970.0609         13.43
2               4525.417       62.64       5495.478         76.07
3               1728.518       23.93       7223.995        100.00
Frequencies
SAS Surveymeans

Data Summary
Number of Strata                  44
Number of Clusters              5155
Number of Observations         14395
Sum of Weights            14404.1532

Statistics

                                                   Sum of                Std Error
                 Variable              N          Weights        Mean      of Mean
----------------------------------------------------------------------------------
                 SystBP2_123=1      3663            14404    0.252520     0.005200
                 SystBP2_123=2      8217            14404    0.579986     0.005244
                 SystBP2_123=3      2515            14404    0.167494     0.004260
----------------------------------------------------------------------------------

Study                                              Sum of                Std Error
(1=T2K,2=MS)     Variable              N          Weights        Mean      of Mean
----------------------------------------------------------------------------------
1                SystBP2_123=1      2741      7180.157895    0.371478     0.008186
                 SystBP2_123=2      3735      7180.157895    0.533246     0.007138
                 SystBP2_123=3       702      7180.157895    0.095276     0.004580
2                SystBP2_123=1       922      7223.995328    0.134283     0.005741
                 SystBP2_123=2      4482      7223.995328    0.626442     0.007495
                 SystBP2_123=3      1813      7223.995328    0.239274     0.007120
----------------------------------------------------------------------------------
Frequencies
SUDAAN Crosstab

Number of observations read    : 14395    Weighted count : 14404
Denominator degrees of freedom : 5111

Variance Estimation Method: Taylor Series (WR)
by: Study (1=T2K,2=MS), Syst. BP, 3 classes.

Study            Syst. BP,                      Weighted                     SE Row
(1=T2K,2=MS)     3 classes     Sample Size          Size    Row Percent     Percent
-----------------------------------------------------------------------------------
Total            Total           14395.000     14404.153        100.000       0.000
                 1                3663.000      3637.334         25.252       0.520
                 2                8217.000      8354.206         57.999       0.524
                 3                2515.000      2412.614         16.749       0.426
1                Total            7178.000      7180.158        100.000       0.000
                 1                2741.000      2667.273         37.148       0.819
                 2                3735.000      3828.789         53.325       0.714
                 3                 702.000       684.096          9.528       0.458
2                Total            7217.000      7223.995        100.000       0.000
                 1                 922.000       970.061         13.428       0.574
                 2                4482.000      4525.417         62.644       0.750
                 3                1813.000      1728.518         23.927       0.712
-----------------------------------------------------------------------------------
Linear model
with different software
TITLE1 'LINEAR MODEL';

/* Weighted least squares without the design: weights only */
TITLE2 'SAS GLM + WEIGHT';
PROC GLM DATA=WORK.T2K_DATA;
  WEIGHT WAN_UNIONI;
  CLASS SP2 IKA6 T2K;
  MODEL SYSTBP2 = SP2 IKA6 T2K BMI / SOLUTION;
RUN;

/* Design-based regression: strata, clusters and weights are declared */
TITLE2 'SAS SURVEYREG';
PROC SURVEYREG DATA=WORK.T2K_DATA;
  STRATA OSITE;
  CLUSTER RYVAS;
  WEIGHT WAN_UNIONI;
  CLASS SP2 IKA6 T2K;
  MODEL SYSTBP2 = SP2 IKA6 T2K BMI / SOLUTION;
RUN;

/* SUDAAN: NEST lists the stratum variable first, then the cluster variable */
TITLE2 'SUDAAN REGRESS';
PROC REGRESS DATA=WORK.T2K_DATA DESIGN=WR;
  SETENV COLWIDTH=12 DECWIDTH=3;
  NEST OSITE RYVAS;
  WEIGHT WAN_UNIONI;
  SUBGROUP SP2 IKA6 T2K;
  LEVELS 2 6 2;
  MODEL SYSTBP2 = SP2 IKA6 T2K BMI;
  PREDMARG SP2 IKA6 T2K;
RUN;
Linear model
Tests

SAS GLM + WEIGHT:
Source    DF     Type III SS    Mean Square    F Value    Pr > F
sp2        1        4598.814       4598.814      12.85    0.0003
ika6       5     1505233.917     301046.783     841.38    <.0001
t2k        1      670714.692     670714.692    1874.54    <.0001
bmi        1      197789.472     197789.472     552.79    <.0001

SAS SURVEYREG:
Effect       Num DF    F Value    Pr > F
Model             8     526.87    <.0001
Intercept         1    13313.7    <.0001
sp2               1      13.07    0.0003
ika6              5     708.03    <.0001
t2k               1     610.56    <.0001
bmi               1     495.48    <.0001

SUDAAN REGRESS:
-----------------------------------------------------------------
                         Degrees of                  P-value Wald
Contrast                    Freedom       Wald F                F
-----------------------------------------------------------------
OVERALL MODEL                 9.000    34092.890            0.000
MODEL MINUS INTERCEPT         8.000      525.793            0.000
INTERCEPT                         .            .                .
SP2                           1.000       13.075            0.000
IKA6                          5.000      704.653            0.000
T2K                           1.000      610.521            0.000
BMI                           1.000      495.911            0.000
-----------------------------------------------------------------
Linear model
Parameter estimates
SAS GLM + WEIGHT:
Parameter           Estimate        Standard Error    t Value    Pr > |t|
Intercept        141.0322272 B          1.43648255      98.18      <.0001
sp2  1             1.1842537 B          0.33032660       3.59      0.0003
sp2  2             0.0000000 B           .                .         .
ika6 1           -32.3430867 B          1.04374694     -30.99      <.0001
ika6 2           -25.5086806 B          1.04840361     -24.33      <.0001
ika6 3           -17.1008806 B          1.05129964     -16.27      <.0001
ika6 4            -7.3126631 B          1.07586068      -6.80      <.0001
ika6 5            -1.6082852 B          1.11667995      -1.44      0.1498
ika6 6             0.0000000 B           .                .         .
t2k  1           -14.3704931 B          0.33191316     -43.30      <.0001
t2k  2             0.0000000 B           .                .         .
bmi                0.9326216            0.03966666      23.51      <.0001
SAS SURVEYREG:
Parameter           Estimate        Standard Error    t Value    Pr > |t|
Intercept         141.032227            1.53280246      92.01      <.0001
sp2  1              1.184254            0.32756747       3.62      0.0003
sp2  2              0.000000            0.00000000        .         .
ika6 1            -32.343087            1.11845398     -28.92      <.0001
ika6 2            -25.508681            1.17469039     -21.72      <.0001
ika6 3            -17.100881            1.10502478     -15.48      <.0001
ika6 4             -7.312663            1.26905700      -5.76      <.0001
ika6 5             -1.608285            1.19456103      -1.35      0.1783
ika6 6              0.000000            0.00000000        .         .
t2k  1            -14.370493            0.58157558     -24.71      <.0001
t2k  2              0.000000            0.00000000        .         .
bmi                 0.932622            0.04189813      22.26      <.0001
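The two SAS listings give identical point estimates but different standard errors. As a rough way to see how much the design information changes the variances, the sketch below squares the ratio of the SURVEYREG standard error to the GLM + WEIGHT standard error for a few parameters. The helper name variance_ratio is hypothetical, and the ratio is only an informal analogue of a design effect, because the GLM + WEIGHT variance is not a true simple-random-sampling variance.

```python
# Standard errors copied from the GLM + WEIGHT and SURVEYREG listings.
SE_GLM = {"Intercept": 1.43648255, "sp2 1": 0.33032660,
          "t2k 1": 0.33191316, "bmi": 0.03966666}
SE_SURVEYREG = {"Intercept": 1.53280246, "sp2 1": 0.32756747,
                "t2k 1": 0.58157558, "bmi": 0.04189813}


def variance_ratio(se_design, se_naive):
    """Squared ratio of design-based SE to naive SE (hypothetical helper)."""
    return (se_design / se_naive) ** 2


ratios = {name: variance_ratio(SE_SURVEYREG[name], SE_GLM[name])
          for name in SE_GLM}

for name, ratio in ratios.items():
    print(f"{name:10s} variance ratio = {ratio:.2f}")
```

The survey indicator t2k shows by far the largest ratio, which fits the idea that an effect varying mostly between clusters is hit hardest when clustering is ignored.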
Linear model
Parameter estimates
SUDAAN REGRESS:
--------------------------------------------------------------------------------
Independent
Variables and                                                          P-value
Effects                     Beta Coeff.      SE Beta    T-Test B=0  T-Test B=0
--------------------------------------------------------------------------------
Intercept                       141.032        1.532        92.040       0.000
Sex (1=male, 2=female)
  1                               1.184        0.328         3.616       0.000
  2                               0.000        0.000          .           .
Age group
  1                             -32.343        1.118       -28.920       0.000
  2                             -25.509        1.174       -21.721       0.000
  3                             -17.101        1.105       -15.481       0.000
  4                              -7.313        1.269        -5.764       0.000
  5                              -1.608        1.194        -1.347       0.178
  6                               0.000        0.000          .           .
Survey (1=T2K, 2=MS)
  1                             -14.370        0.582       -24.709       0.000
  2                               0.000        0.000          .           .
BodyMass-index                    0.933        0.042        22.269       0.000
--------------------------------------------------------------------------------
Linear model
Model-based standardization
SUDAAN REGRESS:
--------------------------------------------------------------------------------
Predicted                     Predicted
Marginal                       Marginal           SE     T:Marg=0      P-value
--------------------------------------------------------------------------------
Sex (1=male, 2=female)
  1                             140.496        0.346      406.176        0.000
  2                             139.312        0.394      353.964        0.000
Age group
  1                             127.070        0.332      383.051        0.000
  2                             133.904        0.439      305.297        0.000
  3                             142.312        0.393      362.442        0.000
  4                             152.100        0.642      236.785        0.000
  5                             157.804        0.755      208.987        0.000
  6                             159.413        1.090      146.270        0.000
Survey (1=T2K, 2=MS)
  1                             132.072        0.382      345.813        0.000
  2                             146.443        0.475      308.568        0.000
--------------------------------------------------------------------------------
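A predicted marginal can be read as the weighted mean of the model predictions when every record is assigned the same level of one factor and keeps its own values for the other covariates. The sketch below illustrates that idea on made-up data; the coefficients, records and function names are hypothetical and are not the fitted model above. In a main-effects linear model the difference between two marginals equals the corresponding coefficient, which agrees with the listing: 140.496 - 139.312 = 1.184, the SP2 coefficient.

```python
# Hypothetical coefficients of a main-effects linear model (illustrative only).
BETA = {"intercept": 100.0, "male": 1.2, "bmi": 0.9}

# Toy records: (analysis weight, male indicator, BMI). All values are made up.
RECORDS = [(1.0, 1, 24.0), (2.0, 0, 27.5), (1.5, 1, 30.0), (0.5, 0, 22.0)]


def predict(male, bmi):
    """Linear predictor for one record."""
    return BETA["intercept"] + BETA["male"] * male + BETA["bmi"] * bmi


def predicted_marginal(male_level):
    """Weighted mean prediction with every record's sex set to male_level."""
    total_weight = sum(w for w, _, _ in RECORDS)
    weighted_sum = sum(w * predict(male_level, bmi) for w, _, bmi in RECORDS)
    return weighted_sum / total_weight


marg_male = predicted_marginal(1)
marg_female = predicted_marginal(0)
print(marg_male, marg_female, marg_male - marg_female)
```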
Cross-sectional study / continuous response
Regress
TITLE1 'CROSS-SECTIONAL STUDY / CONTINUOUS RESPONSE';
TITLE2 'SUDAAN REGRESS';
PROC REGRESS DATA=WORK.T2K_DATA DESIGN=WR;
    SETENV COLWIDTH=12 DECWIDTH=3;
    NEST OSITE RYVAS;
    WEIGHT WAN_UNIONI;
    SUBPOPN T2K=1 / NAME="TERVEYS 2000";
    SUBGROUP SP2 AA01 PORTAANNOUSU;
    LEVELS 2 4 2;
    MODEL SYSTBP2 = BMI SP2 SP2*T114 IKA2 AA01 PORTAANNOUSU;
    TEST WALDF SATADJF;
    REFLEVEL SP2=1;
    PREDMARG SP2;
    PRINT / TESTS=DEFAULT BETAS=ALL PRED_MRG=ALL;
RUN;
Cross-sectional study / continuous response
SUBPOPN
SUBPOPN EXAMPLE:
By including the following statements in your SUDAAN program, you can limit the
analysis to records for which the value of the RACE variable is 2 (African-Americans
in this case), the value of the SEX variable is 2 (Females in this case), and the
value of the AGE variable is either less than 18 or over 65.
    SUBGROUP RACE SEX;
    LEVELS 2 2;
    SUBPOPN RACE=2 & SEX=2 & (AGE<18 | AGE>65) / NAME='African-American Females not in Labor Force';
WARNING:
Expressions such as 18 <= AGE <= 65 are NOT appropriate on the SUBPOPN statement
and may lead to unexpected results.
To indicate all values of AGE between 18 and 65, use the expression:
    (18 <= AGE) & (AGE <= 65).
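The source does not say how SUDAAN parses a chained comparison. One plausible mechanism, assumed here purely for illustration, is strict left-to-right evaluation: 18 <= AGE is evaluated first and yields 0 or 1, and that 0 or 1 is then compared with 65, which is always true. The Python sketch below emulates that assumed behaviour with explicit parentheses (Python's own unparenthesized chained comparisons do not behave this way) and contrasts it with the explicit form recommended above.

```python
def naive_left_to_right(age):
    # Assumed evaluation order: (18 <= age) gives True/False, i.e. 1/0,
    # and that 1/0 is then compared with 65, which always holds.
    return (18 <= age) <= 65


def explicit_range(age):
    # The recommended form: two separate comparisons joined with AND.
    return (18 <= age) and (age <= 65)


ages = [10, 40, 80]
naive = [naive_left_to_right(a) for a in ages]
explicit = [explicit_range(a) for a in ages]
print(naive)
print(explicit)
```

Under this assumption the naive form selects every record, while the explicit form keeps only the ages inside the range.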
Cross-sectional study / continuous response
Parameters
Variance Estimation Method: Taylor Series (WR)
SE Method: Robust (Binder, 1983)
Working Correlations: Independent
Link Function: Identity
Response variable SYSTBP2: Systolic blood pressure
For Subpopulation: TERVEYS 2000
-----------------------------------------------------------------------------------------------
Independent
Variables and                                DEFF Beta                               P-value
Effects                     Beta Coeff.             #4      SE Beta   T-Test B=0  T-Test B=0
-----------------------------------------------------------------------------------------------
Intercept                        63.115          1.525        3.879       16.270       0.000
Sex (1=male, 2=female)
  1                               0.000           .           0.000         .           .
  2                              -4.500          1.304        3.075       -1.463       0.143
Marital status
  1                               0.105          1.319        1.121        0.094       0.925
  2                               0.735          1.168        1.321        0.557       0.578
  3                               0.075          1.488        1.418        0.053       0.958
  4                               0.000           .           0.000         .           .
Two-stair climb
  1                               2.980          1.475        1.304        2.285       0.022
  2                               0.000           .           0.000         .           .
BodyMass-index                    0.874          1.178        0.061       14.341       0.000
Age                               0.696          1.010        0.022       31.338       0.000
Sex (1=male, 2=female),
fS-Chol mmol/l
  1, 1                            1.471          1.025        0.328        4.482       0.000
  2, 1                            1.825          1.413        0.366        4.981       0.000
-----------------------------------------------------------------------------------------------
Cross-sectional study / continuous response
Tests
------------------------------------------------------------------------------------------------------------
                                                                       P-value
Contrast                 Degrees of     S_waite Adj    S_waite Adj    S_waite Adj                   P-value
                            Freedom              DF              F              F       Wald F       Wald F
------------------------------------------------------------------------------------------------------------
OVERALL MODEL                10.000           8.038      23195.089          0.000    17787.847        0.000
MODEL MINUS INTERCEPT         9.000           7.899        198.420          0.000      220.470        0.000
INTERCEPT                      .               .              .              .            .            .
SP2                           1.000           1.000          2.142          0.143        2.142        0.143
AA01                          3.000           2.835          0.204          0.884        0.288        0.834
PORTAANNOUSU                  1.000           1.000          5.220          0.022        5.220        0.022
BMI                           1.000           1.000        205.668          0.000      205.668        0.000
IKA2                          1.000           1.000        982.072          0.000      982.072        0.000
T114 * SP2                    2.000           1.912         22.602          0.000       25.972        0.000
------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Predicted                     Predicted
Marginal                       Marginal           SE     T:Marg=0      P-value
--------------------------------------------------------------------------------
Sex (1=male, 2=female)
  1                             134.691        0.481      280.090        0.000
  2                             132.297        0.504      262.703        0.000
--------------------------------------------------------------------------------
Cross-sectional study / binary response
Rlogist (Logistic)
TITLE 'CROSS-SECTIONAL STUDY / BINARY (0/1) RESPONSE';
TITLE2 'SUDAAN RLOGIST (LOGISTIC)';
PROC RLOGIST DATA=WORK.T2K_DATA DESIGN=WR;
    SETENV COLWIDTH=12 DECWIDTH=3;
    NEST OSITE RYVAS;
    WEIGHT WAN_UNIONI;
    SUBPOPN T2K=1 / NAME="TERVEYS 2000";
    SUBGROUP SP2 AA01 PORTAANNOUSU;
    LEVELS 2 4 2;
    MODEL SYSTBP2_01 = BMI SP2 SP2*T114 IKA2 AA01 PORTAANNOUSU;
    TEST WALDF SATADJF;
    REFLEVEL SP2=1;
    PREDMARG SP2;
    PRINT / TESTS=DEFAULT BETAS=ALL PRED_MRG=ALL RISK=ALL;
RUN;
Cross-sectional study / binary response
Parameters
Variance Estimation Method: Taylor Series (WR)
SE Method: Robust (Binder, 1983)
Working Correlations: Independent
Link Function: Logit
Response variable SYSTBP2_01: Syst. BP (0/1)
For Subpopulation: TERVEYS 2000
-----------------------------------------------------------------------------------------------
Independent
Variables and                                DEFF Beta                               P-value
Effects                     Beta Coeff.             #4      SE Beta   T-Test B=0  T-Test B=0
-----------------------------------------------------------------------------------------------
Intercept                        -9.214          1.436        0.561      -16.411       0.000
Sex (1=male, 2=female)
  1                               0.000           .           0.000         .           .
  2                              -0.040          1.070        0.397       -0.102       0.919
Marital status
  1                               0.175          0.945        0.118        1.484       0.138
  2                               0.207          1.097        0.173        1.196       0.232
  3                               0.215          1.308        0.171        1.263       0.207
  4                               0.000           .           0.000         .           .
Two-stair climb
  1                               0.442          1.286        0.149        2.964       0.003
  2                               0.000           .           0.000         .           .
BodyMass-index                    0.086          1.061        0.008       11.012       0.000
Age                               0.078          1.126        0.003       22.312       0.000
Sex (1=male, 2=female),
fS-Chol mmol/l
  1, 1                            0.216          1.166        0.047        4.583       0.000
  2, 1                            0.199          1.206        0.047        4.193       0.000
-----------------------------------------------------------------------------------------------
Cross-sectional study / binary response
Tests
------------------------------------------------------------------------------------------------------------
                                                                       P-value
Contrast                 Degrees of     S_waite Adj    S_waite Adj    S_waite Adj                   P-value
                            Freedom              DF              F              F       Wald F       Wald F
------------------------------------------------------------------------------------------------------------
OVERALL MODEL                10.000           8.603        106.794          0.000      142.211        0.000
MODEL MINUS INTERCEPT         9.000           8.123         89.622          0.000      111.093        0.000
INTERCEPT                      .               .              .              .            .            .
SP2                           1.000           1.000          0.010          0.919        0.010        0.919
AA01                          3.000           2.877          0.786          0.497        0.763        0.515
PORTAANNOUSU                  1.000           1.000          8.784          0.003        8.784        0.003
BMI                           1.000           1.000        121.267          0.000      121.267        0.000
IKA2                          1.000           1.000        497.811          0.000      497.811        0.000
T114 * SP2                    2.000           1.987         19.472          0.000       18.094        0.000
------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Predicted                     Predicted
Marginal                       Marginal           SE     T:Marg=0      P-value
--------------------------------------------------------------------------------
Sex (1=male, 2=female)
  1                               0.323        0.010       31.989        0.000
  2                               0.299        0.010       30.350        0.000
--------------------------------------------------------------------------------
Cross-sectional study / binary response
OR
-----------------------------------------------------------------
Independent
Variables and                              Lower 95%   Upper 95%
Effects                     Odds Ratio      Limit OR    Limit OR
-----------------------------------------------------------------
Intercept                        0.000         0.000       0.000
Sex (1=male, 2=female)
  1                              1.000         1.000       1.000
  2                              0.961         0.441       2.090
Marital status
  1                              1.191         0.945       1.500
  2                              1.230         0.876       1.726
  3                              1.240         0.888       1.733
  4                              1.000         1.000       1.000
Two-stair climb
  1                              1.555         1.161       2.083
  2                              1.000         1.000       1.000
BodyMass-index                   1.090         1.074       1.107
Age                              1.081         1.073       1.088
Sex (1=male, 2=female),
fS-Chol mmol/l
  1, 1                           1.241         1.131       1.360
  2, 1                           1.220         1.112       1.339
-----------------------------------------------------------------
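The odds ratios are the exponentiated logit coefficients, and the limits follow from exponentiating the coefficient plus or minus a multiple of its standard error. The sketch below uses the normal quantile 1.96 as an assumption; SUDAAN may use a t quantile, and the inputs are the rounded values for sex level 2 from the parameter listing, so small differences in the last digit are expected. The function name odds_ratio_ci is hypothetical.

```python
import math


def odds_ratio_ci(beta, se, z=1.96):
    """Return (odds ratio, lower limit, upper limit) for a logit coefficient.

    z = 1.96 is an assumed normal quantile for a 95% interval.
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Rounded Beta Coeff. and SE Beta for sex level 2 in the RLOGIST listing.
odds_ratio, lower, upper = odds_ratio_ci(-0.040, 0.397)
print(f"OR = {odds_ratio:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

The result is close to the printed row 0.961 (0.441, 2.090).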
Cross-sectional study / multi-category response
Multilog
TITLE 'CROSS-SECTIONAL STUDY / MULTI-CATEGORY RESPONSE';
TITLE2 'SUDAAN MULTILOG';
PROC MULTILOG DATA=WORK.T2K_DATA DESIGN=WR;
    SETENV COLWIDTH=12 DECWIDTH=3;
    NEST OSITE RYVAS;
    WEIGHT WAN_UNIONI;
    SUBPOPN T2K=1 / NAME="TERVEYS 2000";
    SUBGROUP SYSTBP2_123 SP2 AA01 PORTAANNOUSU;
    LEVELS 3 2 4 2;
    MODEL SYSTBP2_123 = BMI SP2 T114 IKA2 AA01 PORTAANNOUSU / CUMLOGIT;
    TEST WALDF SATADJF;
    REFLEVEL SP2=1;
    PREDMARG SP2;
    PRINT / TESTS=DEFAULT BETAS=ALL PRED_MRG=ALL STYLE=NCHS;
RUN;
Cross-sectional study / multi-category response
Parameters
-----------------------------------------------------------------------------------------------
SYSTBP2_123 (log-odds)
Independent Variables and                    DEFF Beta                               P-value
Effects                     Beta Coeff.             #4      SE Beta   T-Test B=0  T-Test B=0
-----------------------------------------------------------------------------------------------
1 vs 3
Intercept                        14.365          0.996        0.694       20.689       0.000
Sex (1=male, 2=female)
  1                               0.000           .           0.000         .           .
  2                               0.305          1.073        0.120        2.542       0.011
Marital status
  1                              -0.221          0.953        0.204       -1.084       0.278
  2                              -0.381          0.987        0.282       -1.350       0.177
  3                              -0.208          1.102        0.267       -0.780       0.435
  4                               0.000           .           0.000         .           .
Two-stair climb
  1                              -0.782          1.006        0.227       -3.447       0.001
  2                               0.000           .           0.000         .           .
Age                              -0.126          0.959        0.005      -23.191       0.000
BodyMass-index                   -0.148          1.049        0.013      -11.145       0.000
fS-Chol mmol/l                   -0.281          1.090        0.052       -5.356       0.000
2 vs 3
Intercept                         7.811          0.832        0.542       14.412       0.000
Sex (1=male, 2=female)
  1                               0.000           .           0.000         .           .
  2                              -0.185          0.845        0.094       -1.975       0.048
Marital status
  1                              -0.265          0.924        0.141       -1.882       0.060
  2                              -0.449          0.992        0.233       -1.930       0.054
  3                              -0.266          0.998        0.197       -1.352       0.177
  4                               0.000           .           0.000         .           .
Two-stair climb
  1                              -0.231          1.014        0.154       -1.499       0.134
  2                               0.000           .           0.000         .           .
Age                              -0.066          0.883        0.004      -14.975       0.000
BodyMass-index                   -0.032          1.086        0.011       -2.902       0.004
fS-Chol mmol/l                   -0.112          1.041        0.043       -2.592       0.010
-----------------------------------------------------------------------------------------------
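The "1 vs 3" and "2 vs 3" blocks read like log-odds of each category against reference category 3. Assuming that reading, the sketch below shows how two such linear predictors would translate into three category probabilities that sum to one. The function name and the input values are hypothetical; they are not computed from the coefficients above.

```python
import math


def category_probabilities(eta_1_vs_3, eta_2_vs_3):
    """Probabilities of categories 1, 2, 3 when category 3 is the reference.

    Each eta is a log-odds against category 3, whose own log-odds is 0.
    """
    e1 = math.exp(eta_1_vs_3)
    e2 = math.exp(eta_2_vs_3)
    denom = 1.0 + e1 + e2
    return e1 / denom, e2 / denom, 1.0 / denom


# Hypothetical linear predictors for one person.
p1, p2, p3 = category_probabilities(0.5, 1.2)
print(p1, p2, p3)
```

Taking log(p1 / p3) recovers the first linear predictor, which is what makes the coefficients interpretable as log-odds against the reference category.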
Cross-sectional study / multi-category response
Tests
------------------------------------------------------------------------------------------------------------
                                                                       P-value
Contrast                 Degrees of     S_waite Adj    S_waite Adj    S_waite Adj                   P-value
                            Freedom              DF              F              F       Wald F       Wald F
------------------------------------------------------------------------------------------------------------
OVERALL MODEL                18.000          14.913        103.271          0.000      107.345        0.000
MODEL MINUS INTERCEPT        16.000          14.485         65.961          0.000       58.829        0.000
INTERCEPT                      .               .              .              .            .            .
SP2                           2.000           1.925         25.720          0.000       25.057        0.000
AA01                          6.000           5.811          0.718          0.630        0.859        0.524
PORTAANNOUSU                  2.000           1.935          6.590          0.002        6.442        0.002
IKA2                          2.000           1.984        285.183          0.000      277.345        0.000
BMI                           2.000           1.993         94.348          0.000       99.011        0.000
T114                          2.000           1.995         17.109          0.000       16.575        0.000
------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Syst. BP, 3 classes
Predicted                     Predicted
Marginal                       Marginal           SE     T:Marg=0      P-value
--------------------------------------------------------------------------------
1
Sex (1=male, 2=female)
  1                               0.266        0.011       24.838        0.000
  2                               0.347        0.009       38.501        0.000
2
Sex (1=male, 2=female)
  1                               0.633        0.011       58.495        0.000
  2                               0.545        0.009       62.273        0.000
3
Sex (1=male, 2=female)
  1                               0.101        0.006       15.619        0.000
  2                               0.109        0.007       16.498        0.000
--------------------------------------------------------------------------------
Comparison of two samples / continuous response
Regress
TITLE1 'COMPARISON OF TWO SAMPLES / CONTINUOUS RESPONSE';
TITLE2 'SUDAAN REGRESS';
PROC REGRESS DATA=WORK.T2K_DATA DESIGN=WR;
    SETENV COLWIDTH=12 DECWIDTH=3;
    NEST OSITE RYVAS;
    WEIGHT WAN_UNIONI;
    SUBGROUP T2K SP2 AA01 PORTAANNOUSU;
    LEVELS 2 2 4 2;
    MODEL SYSTBP2 = T2K BMI SP2 IKA2 AA01 PORTAANNOUSU;
    TEST SATADJF;
    REFLEVEL T2K=1;
    PREDMARG T2K;
RUN;
Comparison of two samples / binary response
Rlogist (Logistic)
TITLE 'COMPARISON OF TWO SAMPLES / BINARY RESPONSE';
TITLE2 'SUDAAN RLOGIST (LOGISTIC)';
PROC RLOGIST DATA=WORK.T2K_DATA DESIGN=WR;
    SETENV COLWIDTH=12 DECWIDTH=3;
    NEST OSITE RYVAS;
    WEIGHT WAN_UNIONI;
    SUBGROUP T2K SP2 AA01 PORTAANNOUSU;
    LEVELS 2 2 4 2;
    MODEL SYSTBP2_01 = T2K BMI SP2 IKA2 AA01 PORTAANNOUSU;
    TEST SATADJF;
    REFLEVEL T2K=1;
    PREDMARG T2K;
RUN;
Comparison of two samples / multi-category response
Multilog
TITLE 'COMPARISON OF TWO SAMPLES / MULTI-CATEGORY RESPONSE';
TITLE2 'SUDAAN MULTILOG';
PROC MULTILOG DATA=WORK.T2K_DATA DESIGN=WR;
    SETENV COLWIDTH=12 DECWIDTH=3;
    NEST OSITE RYVAS;
    WEIGHT WAN_UNIONI;
    SUBGROUP SYSTBP2_123 T2K SP2 AA01 PORTAANNOUSU;
    LEVELS 3 2 2 4 2;
    MODEL SYSTBP2_123 = T2K BMI SP2 IKA2 AA01 PORTAANNOUSU / CUMLOGIT;
    TEST SATADJF;
    REFLEVEL T2K=1;
    PREDMARG T2K;
    PRINT / STYLE=NCHS;
RUN;
Run-time reclassification
Recode
TITLE1 'RUN-TIME RECLASSIFICATION';
TITLE2 'RECODE';
PROC CROSSTAB DATA=WORK.T2K_DATA DESIGN=WR;
    SETENV COLWIDTH=12 DECWIDTH=3;
    NEST OSITE RYVAS;
    WEIGHT WAN_UNIONI;
    RECODE SYSTBP2_01=(0 1);
    SUBGROUP T2K SYSTBP2_01;
    LEVELS 2 2;
    TABLES T2K*SYSTBP2_01;
    PRINT NSUM ROWPER / STYLE=NCHS;
RUN;
RECODE EXAMPLES:
    RECODE X = 1.5;
will recode the continuous or categorical variable X to a 0-1 variable whose
value is 0 if the input value is less than 1.5 and 1 if the input value is
greater than or equal to 1.5.
    RECODE ZERONE = (0 1);
recodes the 0-1 variable ZERONE to be a 1-2 variable suitable for use on the
SUBGROUP statement. Level 0 goes to 1; level 1 goes to 2.
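The two RECODE forms described above can be mimicked in a few lines of Python. This is only a sketch of the behaviour as stated in the text; the function names are hypothetical, and the second helper covers just the case of exact listed values such as (0 1), not every form the RECODE statement may accept.

```python
def recode_cutpoint(x, cut):
    """Mimic RECODE X = cut: 0 below the cutpoint, 1 at or above it."""
    return 0 if x < cut else 1


def recode_levels(x, listed_values):
    """Mimic RECODE V = (v1 v2 ...): the k-th listed value becomes level k."""
    return listed_values.index(x) + 1


print([recode_cutpoint(x, 1.5) for x in (1.0, 1.5, 2.0)])
print([recode_levels(x, (0, 1)) for x in (0, 1)])
```

The second call shows why the recode matters for SUBGROUP: the variable's values become 1 and 2, so both categories are valid levels under LEVELS 2.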
Run-time reclassification
Before - after
Before RECODE:
--------------------------------------------------
Survey
(1=T2K, 2=MS)   Syst. BP (0/1)   Sample Size   Row Percent
--------------------------------------------------
Total           Total               5885.000       100.000
                1                   5885.000       100.000
                2                      0.000         0.000
1               Total               2022.000       100.000
                1                   2022.000       100.000
                2                      0.000         0.000
2               Total               3863.000       100.000
                1                   3863.000       100.000
                2                      0.000         0.000
--------------------------------------------------
After RECODE:
--------------------------------------------------
Survey
(1=T2K, 2=MS)   Syst. BP (0/1)   Sample Size   Row Percent
--------------------------------------------------
Total           Total              14395.000       100.000
                1                   8510.000        59.909
                2                   5885.000        40.091
1               Total               7178.000       100.000
                1                   5156.000        71.840
                2                   2022.000        28.160
2               Total               7217.000       100.000
                1                   3354.000        48.050
                2                   3863.000        51.950
--------------------------------------------------
Direct standardization
Descript + Stdvar & Stdwgt
TITLE1 'DIRECT STANDARDIZATION';
TITLE2 'SUDAAN DESCRIPT + STDVAR & STDWGT';
PROC DESCRIPT DATA=WORK.T2K_DATA DESIGN=WR;
    NEST OSITE RYVAS;
    WEIGHT WAN_UNIONI;
    SUBGROUP IKA6 T2K;
    LEVELS 6 2;
    STDVAR IKA6;
    STDWGT .1 .1 .2 .3 .2 .1;                         * sum=1;
    * STDWGT 10 10 20 30 20 10;                       * sum=100;
    * STDWGT 20000 20000 40000 60000 40000 20000;     * the program rescales by itself;
    VAR SYSTBP2;
    TABLES T2K;
    PRINT / STYLE=NCHS;
RUN;
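The three STDWGT variants in the program differ only in scale, and the comment on the last one says the program rescales the weights itself. The sketch below shows the arithmetic this implies: a directly standardized mean is a weighted average of the group-specific means with the standard weights normalized to sum to one. The group means are made-up numbers for six age groups, and the function name is hypothetical.

```python
def directly_standardized_mean(group_means, std_weights):
    """Weighted average of group means, with weights normalized to sum to 1."""
    total = sum(std_weights)
    return sum(m * w / total for m, w in zip(group_means, std_weights))


# Hypothetical mean systolic blood pressure in six age groups (not real data).
means = [120.0, 125.0, 132.0, 140.0, 148.0, 155.0]

as_proportions = directly_standardized_mean(means, [.1, .1, .2, .3, .2, .1])
as_percents = directly_standardized_mean(means, [10, 10, 20, 30, 20, 10])
as_counts = directly_standardized_mean(
    means, [20000, 20000, 40000, 60000, 40000, 20000])
print(as_proportions, as_percents, as_counts)
```

All three weight vectors give the same standardized mean, because only the relative sizes of the weights matter after normalization.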
Direct standardization
Before - after
Before standardization:
-------------------------------------------------------------------------------------
              Survey             Sample     Weighted
Variable      (1=T2K, 2=MS)        Size         Size         Total      Mean   SE Mean
-------------------------------------------------------------------------------------
Systolic      Total               13612     13716.10    1918258.81    139.85      0.33
blood         1                    6401      6497.94     867355.06    133.48      0.43
pressure      2                    7211      7218.17    1050903.75    145.59      0.49
-------------------------------------------------------------------------------------
After standardization:
-------------------------------------------------------------------------------------
              Survey             Sample     Weighted
Variable      (1=T2K, 2=MS)        Size         Size         Total      Mean   SE Mean
-------------------------------------------------------------------------------------
Systolic      Total               13612     13716.10    1918258.81    147.73      0.37
blood         1                    6401      6497.94     867355.06    140.09      0.47
pressure      2                    7211      7218.17    1050903.75    155.13      0.60
-------------------------------------------------------------------------------------
Model-standardized means to a file
Regress + Output
TITLE1 'MODEL-STANDARDIZED MEANS TO A FILE';
TITLE2 'SUDAAN REGRESS + OUTPUT';
PROC REGRESS DATA=WORK.T2K_DATA DESIGN=WR;
    NEST OSITE RYVAS;
    WEIGHT WAN_UNIONI;
    SUBGROUP SP2 IKA6 T2K;
    LEVELS 2 6 2;
    MODEL SYSTBP2 = SP2 IKA6 T2K BMI;
    PREDMARG SP2 IKA6 T2K;
    OUTPUT / FILENAME=MARGIN FILETYPE=SAS REPLACE PRED_MRG=ALL;
RUN;
PROC PRINT DATA=MARGIN LABEL;
RUN;
Model-standardized means to a file
SUDAAN output
--------------------------------------------------------------------------------
Predicted                     Predicted
Marginal                       Marginal           SE     T:Marg=0      P-value
--------------------------------------------------------------------------------
Sex (1=male, 2=female)
  1                             140.496        0.346      406.176        0.000
  2                             139.312        0.394      353.964        0.000
Age group
  1                             127.070        0.332      383.051        0.000
  2                             133.904        0.439      305.297        0.000
  3                             142.312        0.393      362.442        0.000
  4                             152.100        0.642      236.785        0.000
  5                             157.804        0.755      208.987        0.000
  6                             159.413        1.090      146.270        0.000
Survey (1=T2K, 2=MS)
  1                             132.072        0.382      345.813        0.000
  2                             146.443        0.475      308.568        0.000
--------------------------------------------------------------------------------
Model-standardized means to a file
SAS output
       Procedure     Table                 Predicted
Obs       Number    Number    Marginal      Marginal       SE    T:Marg=0    P-value
  1            4         1           1       140.496    0.346     406.176      0.000
  2            4         1           2       139.312    0.394     353.964      0.000
  3            4         1           3       127.070    0.332     383.051      0.000
  4            4         1           4       133.904    0.439     305.297      0.000
  5            4         1           5       142.312    0.393     362.442      0.000
  6            4         1           6       152.100    0.642     236.785      0.000
  7            4         1           7       157.804    0.755     208.987      0.000
  8            4         1           8       159.413    1.090     146.270      0.000
  9            4         1           9       132.072    0.382     345.813      0.000
 10            4         1          10       146.443    0.475     308.568      0.000
Thank you!