SUDAAN training
KTL/TTO 2004
Research Triangle Institute

SUDAAN

Why?

Pitfalls

PITFALLS OF USING STANDARD STATISTICAL SOFTWARE PACKAGES FOR SAMPLE SURVEY DATA

Donna J. Brogan, Ph.D.
Rollins School of Public Health, Emory University, Atlanta
April 15, 1997

COPYRIGHT: This article is copyrighted and is not to be used without proper acknowledgment and citation. It will appear as a chapter in Encyclopedia of Biostatistics, edited by Peter Armitage and Theodore Colton (Editors-in-Chief), to be published by John Wiley in summer 1998 as six volumes. The article will be in a section titled "Design of Experiments and Sample Surveys", edited by Paul Levy.

AUTHOR CONTACT INFORMATION:
Donna Jean Brogan, Ph.D.
Professor of Biostatistics
Rollins School of Public Health
1518 Clifton Road N.E., Room 324
Emory University
Atlanta, GA 30322
e-mail: brogan@sph.emory.edu
phone: 404-727-7701
fax: 404-727-1370

CAUTIONS IN USING STANDARD STATISTICAL SOFTWARE PACKAGES

Standard statistical software packages generally do not take into account four common characteristics of sample survey data: (1) unequal probability selection of observations, (2) clustering of observations, (3) stratification and (4) nonresponse and other adjustments [2].

Point estimates of population parameters are affected by the value of the analysis weight for each observation. These weights depend upon the selection probabilities and other survey design features such as stratification and clustering. Hence, standard packages will yield biased point estimates if the weights are ignored.

Variance estimates for point estimates based on sample survey data are affected by clustering, stratification and the weights. By ignoring these aspects, standard packages generally underestimate the variance of a point estimate, sometimes substantially so.

Most standard statistical packages can perform weighted analyses, usually via a WEIGHT statement added to the program code.
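To make the role of weights, strata and clusters concrete, here is a minimal Python sketch of a weighted mean and of its standard error under a stratified, with-replacement (WR) design, using Taylor linearization of the ratio sum(w*y)/sum(w). The function names and the toy data are hypothetical and not taken from the article or from any package; the sketch only illustrates the general technique that survey software applies.

```python
from collections import defaultdict
from math import sqrt


def weighted_mean(y, w):
    """Weighted mean: sum(w*y) / sum(w)."""
    return sum(wi * yi for yi, wi in zip(y, w)) / sum(w)


def taylor_se_mean(y, w, stratum, psu):
    """Standard error of the weighted mean for a stratified WR design.

    Each observation contributes a linearized value
    z_i = w_i * (y_i - mean) / sum(w); the z_i are summed within each
    primary sampling unit (PSU), and the variance is the between-PSU
    variance of those totals, accumulated over strata.
    """
    mean = weighted_mean(y, w)
    total_w = sum(w)
    psu_totals = defaultdict(lambda: defaultdict(float))
    for yi, wi, h, j in zip(y, w, stratum, psu):
        psu_totals[h][j] += wi * (yi - mean) / total_w

    variance = 0.0
    for h, totals in psu_totals.items():
        n_h = len(totals)
        if n_h < 2:
            raise ValueError(f"stratum {h!r} needs at least two PSUs")
        z_bar = sum(totals.values()) / n_h
        variance += n_h / (n_h - 1) * sum((z - z_bar) ** 2 for z in totals.values())
    return sqrt(variance)


# Hypothetical toy data: four observations, equal weights, one stratum.
y = [1.0, 2.0, 3.0, 4.0]
w = [1.0, 1.0, 1.0, 1.0]
one_stratum = ["s"] * 4

# Every observation is its own PSU: reproduces the simple random sample SE.
se_independent = taylor_se_mean(y, w, one_stratum, ["a", "b", "c", "d"])
# Observations grouped into two PSUs: same mean, larger SE.
se_clustered = taylor_se_mean(y, w, one_stratum, ["a", "a", "b", "b"])
```

In this toy data the point estimate is the same in both calls, but the standard error grows once the observations are treated as two clusters rather than four independent units. That is the effect a plain weighted analysis misses.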
Use of a standard statistical package with a weighting variable may yield the same point estimates of population parameters as a sample survey software package. However, the estimated variance is often incorrect and can be substantially wrong, depending upon the particular program within the standard software package.

History

Platforms

Price

SUDAAN Overview

SUDAAN is a single program consisting of a family of procedures used to analyze data from complex surveys and other observational and experimental studies involving cluster-correlated data. A complex sample may be multistage, stratified, or clustered. Many samples also have unequal probabilities of selection, or are drawn from finite populations.

SUDAAN enables you to use survey data to obtain consistent estimates of population parameters and their standard errors in accordance with the sample design. SUDAAN also produces consistent estimates of regression coefficients, descriptive statistics, and their associated standard errors for cluster-correlated and repeated measures data applications in clinical, epidemiological, toxicological, and behavioral research.

SUDAAN SUPPORT
Direct inquiries about SUDAAN to:
SUDAAN Business Coordinator
Telephone: 919-541-6602
Fax: 919-541-7431
Email: SUDAAN@rti.org
Research Triangle Institute, 3040 Cornwallis Road, Research Triangle Park, NC 27709 USA

SUDAAN Procedures: Utility Procedure
RECORDS Procedure

The RECORDS procedure is designed to print records from any ASCII, SAS, SASXPORT, SUDAAN, SUDXPORT or SPSS record file. This is particularly useful when you wish to verify that SUDAAN is reading your data properly. You can also use this procedure to obtain a file contents summary or to convert an input file of one type to another. For instance, you can convert an ASCII data set to a SUDAAN data set (and vice versa).

NOTE: For SAS-Callable SUDAAN only, you can convert among the following file types: SAS, SUDAAN and SUDXPORT.
The RECORDS procedure statements can be grouped into these categories:
• Procedure statement: PROC RECORDS
• Computation statement: SUBPOPN
• Output statements: TITLE, FOOTNOTE, SETENV, PRINT, OUTPUT

SUDAAN Procedures: Utility Procedure
RECORDS Quick Reference

PROC RECORDS DATA=filename [SUDDATA=filename]
     [FILETYPE=ASCII|SAS|SPSS|SUDAAN|SUDXPORT|SASXPORT]
     [COUNTREC] [CONTENTS] [HISTORY] [NOPRINT] [MAXOBS=count]
     [NAMEFILE=filename] [LEVFILE=filename];
     /* SUDDATA=filename must be used to specify a SUDAAN input file in SAS-Callable SUDAAN */
[SUBPOPN expression / [ NAME="label" ];]
< TITLE string(s) / [ APPEND|REPLACE ]; >     /* for SAS-Callable SUDAAN use RTITLE */
< FOOTNOTE string(s); >                       /* for SAS-Callable SUDAAN use RFOOTNOTE */
< SETENV { PAGEBEG=integer TABBEG=integer LINESIZE=integer PAGESIZE=integer
     LINESPCE=integer ROWSPCE=integer COLSPCE=integer ROWWIDTH=integer
     COLWIDTH=integer DECWIDTH=integer INDROWD=integer INDROWS=integer
     MAXIND=integer TOPMGN=integer LEFTMGN=integer LABWIDTH=integer }; >
< PRINT <keyword[=label]> / <keywordFMT=format> <keywordUNT=unit>
     [FILENAME=filename] [REPLACE] [STARTREC=number] [MAXREC=number]
     [NOHEAD] [NODATE] [NOTIME]; >
< OUTPUT <keyword[=label]> / FILENAME=filename [REPLACE] [NOCOMP|NOCOMPRESS]
     [FILETYPE=ASCII|SAS|SUDAAN|SUDXPORT|SPSS|SASXPORT]
     [NAMEFILE=filename] [LEVFILE=filename]
     <keywordFMT=format> [STDTYPE=number]; >
RUN;

SUDAAN Procedures: Descriptive Procedures
CROSSTAB Procedure

The CROSSTAB procedure produces weighted frequency and percentage distributions for one-way (univariate, single-variable) and multi-way (multivariate or multiple-variable) tabulations. CROSSTAB also tests the hypothesis of no association between row and column variables in 2-way and multi-way tables, and estimates odds ratios and relative risks in 2x2 tables.

CROSSTAB is primarily for descriptive analyses of categorical variables. DESCRIPT and RATIO produce descriptive statistics for continuous variables.
Although DESCRIPT allows you to request weighted frequency counts, CROSSTAB is computationally more efficient for this purpose.

The CROSSTAB statements can be grouped into these categories:
• Procedure statement: PROC CROSSTAB
• Sample design statements: WEIGHT, NEST, TOTCNT, SAMCNT, JOINTPROB, REPWGT, IDVAR, JACKWGTS, JACKMULT
• Computation statements: SUBGROUP, LEVELS, RECODE, SUBPOPN, TABLES, TEST
• Output statements: SETENV, PRINT, TITLE, FOOTNOTE, OUTPUT, FORMAT

SUDAAN Procedures: Descriptive Procedures
CROSSTAB Quick Reference

PROC CROSSTAB DATA=filename [ SUDDATA=filename ]
     [ FILETYPE=ASCII|SAS|SPSS|SUDAAN|SUDXPORT|SASXPORT ]
     [ DESIGN=WR|WOR|UNEQWOR|STRWR|STRWOR|SRS|BRR|JACKKNIFE ]
     [ PSUDATA=filename ] [ PSU_REC=count ] [ CONF_LIM=percent ] [ SMALL_CELL=count ]
     [ ATLEVEL1=position ] [ ATLEVEL2=position ] [ DDF=number ]
     [ DEFT4|DEFT1|DEFT2|DEFT3|DEFT|DEFF ]
     [ DISPLAY ] [ INCLUDE ] [ MERGEHI ] [ NOMARG ] [ NOCOL ] [ NOROW ] [ NOTOT ]
     [ NOPER ] [ NOSE ] [ NOWGT ] [ NOPRINT ] [ MAXOBS=count ]
     [ REPDATA=filename ] [ REP_REC=count ] [ EST_STR=count ] [ EST_PSU=count ]
     [ NAMEFILE=filename ] [ LEVFILE=filename ];
     /* SUDDATA=filename must be used to specify a SUDAAN input file in SAS-Callable SUDAAN */
[ WEIGHT variable; ]
[ REPWGT variables / [ ADJFAY=value ]; ]
[ IDVAR variable(s); ]
[ NEST variable(s) / [ PSULEV=position|FRL=position ] [ STRLEV=position ] [ MISSUNIT ] [ NOSORTCK ]; ]
[ TOTCNT variable(s); ]
[ SAMCNT variable(s); ]
[ JOINTPROB variable(s); ]
[ JACKWGTS varlist / ADJJACK=value; ]
[ JACKMULT value(s); ]
[ RECODE variable=(code_list) < variable=(code_list) >; ]
[ SUBPOPN expression / [ NAME="label" ]; ]
[ SUBGROUP variable(s); ]
[ LEVELS level(s); ]
[ TABLES table_request(s); ]
[ TEST { CHISQ LLCHISQ CMH }; ]
< TITLE string(s) / [ APPEND|REPLACE ]; >     /* for SAS-Callable SUDAAN use RTITLE */
< FOOTNOTE string(s); >                       /* for SAS-Callable SUDAAN use RFOOTNOTE */
< FORMAT variable(s) format_name.; >          /* for SAS-Callable SUDAAN use RFORMAT */
< OUTPUT [ SMLCELL ] [ NSUM ] [ WSUM ] [ SEWGT ] [ DEFFWGT ]
     [ ROWPER ] [ COLPER ] [ TOTPER ] [ SEROW ] [ SECOL ] [ SETOT ]
     [ DEFFROW ] [ DEFFCOL ] [ DEFFTOT ] [ ATLEV1 ] [ ATLEV2 ]
     [ CHISQ ] [ CHISQP ] [ CHISQDF ] [ LLCHISQ ] [ LLCHISQP ] [ LLCHISQDF ]
     [ COR ] [ LOGCOR ] [ SELOGCOR ] [ UPCOR ] [ LOWCOR ]
     [ RR1 ] [ LOGRR1 ] [ SELOGRR1 ] [ LOWRR1 ] [ UPRR1 ]
     [ RR2 ] [ LOGRR2 ] [ SELOGRR2 ] [ LOWRR2 ] [ UPRR2 ]
     [ CMH ] [ CMHDF ] [ CMHPVAL ] [ DDF ]
     [ COVWGT ] [ SRSWGT ] [ COVROW ] [ SRSROW ] [ COVCOL ] [ SRSCOL ] [ COVTOT ] [ SRSTOT ]
     /* ...or... NSUM=label WSUM=label ... etc. */
     / [ TABLECELL=DEFAULT|ALL ] [ RISK=DEFAULT|ALL ] [ TESTS=DEFAULT|ALL ]
     [ CMHTEST=DEFAULT|ALL ] [ WGTCOV=DEFAULT|ALL ] [ ROWCOV=DEFAULT|ALL ]
     [ COLCOV=DEFAULT|ALL ] [ TOTCOV=DEFAULT|ALL ]
     < keywordFMT=format >
     FILENAME=filename [ FILETYPE=ASCII|SAS|SUDAAN|SUDXPORT|SPSS|SASXPORT ]
     [ NAMEFILE=filename ] [ LEVFILE=filename ] [ REPLACE ]
     [ NOCOMP|NOCOMPRESS ] [ STDTYPE=number ]; >
< SETENV { PAGEBEG=integer TABBEG=integer LINESIZE=integer PAGESIZE=integer
     LINESPCE=integer ROWSPCE=integer COLSPCE=integer ROWWIDTH=integer
     COLWIDTH=integer DECWIDTH=integer INDROWD=integer INDROWS=integer
     MAXIND=integer TOPMGN=integer LEFTMGN=integer LABWIDTH=integer }; >
< PRINT [ SMLCELL ] [ NSUM ] [ WSUM ] [ SEWGT ] [ DEFFWGT ]
     [ ROWPER ] [ COLPER ] [ TOTPER ] [ SEROW ] [ SECOL ] [ SETOT ]
     [ DEFFROW ] [ DEFFCOL ] [ DEFFTOT ] [ ATLEV1 ] [ ATLEV2 ]
     [ CHISQ ] [ CHISQP ] [ CHISQDF ] [ LLCHISQ ] [ LLCHISQP ] [ LLCHISQDF ]
     [ COR ] [ LOGCOR ] [ SELOGCOR ] [ UPCOR ] [ LOWCOR ]
     [ RR1 ] [ LOGRR1 ] [ SELOGRR1 ] [ LOWRR1 ] [ UPRR1 ]
     [ RR2 ] [ LOGRR2 ] [ SELOGRR2 ] [ LOWRR2 ] [ UPRR2 ]
     [ CMH ] [ CMHDF ] [ CMHPVAL ] [ DDF ]
     [ COVWGT ] [ SRSWGT ] [ COVROW ] [ SRSROW ] [ COVCOL ] [ SRSCOL ] [ COVTOT ] [ SRSTOT ]
     /* ...or... NSUM=label WSUM=label ... etc. */
     / [ TABLECELL=DEFAULT|ALL ] [ TESTS=DEFAULT|ALL ] [ CMHTEST=DEFAULT|ALL ]
     [ RISK=DEFAULT|ALL ] [ WGTCOV=DEFAULT|ALL ] [ ROWCOV=DEFAULT|ALL ]
     [ COLCOV=DEFAULT|ALL ] [ TOTCOV=DEFAULT|ALL ]
     < keywordFMT=format > < keywordUNT=power >
     [ STYLE=BOX|NCHS ] [ NDIMROW=integer ] [ NDIMCOL=integer ]
     [ FILENAME=filename ] [ REPLACE ] [ NOHEAD ] [ NODATE ] [ NOTIME ]; >
RUN;

SUDAAN Procedures: Descriptive Procedures
DESCRIPT Procedure

The DESCRIPT procedure produces descriptive statistics for analysis variables, including means, totals, percentages, geometric means, medians and other quantiles, and their standard errors for sample surveys and other clustered data applications. The analysis variables can be continuous or categorical.

DESCRIPT computes standardized means according to the method of direct standardization. The standardizing weights are assumed to be known.

Within one call to DESCRIPT, all analysis variables must be either continuous or categorical. The analysis of both continuous and categorical variables requires separate calls to the DESCRIPT procedure.
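As an illustration of direct standardization, the Python sketch below computes group-specific weighted means and combines them with known standardizing weights. It is a sketch of the general method only, not of how DESCRIPT is invoked or implemented; the function name, the group labels and the data are hypothetical.

```python
def direct_standardized_mean(y, w, group, std_weights):
    """Directly standardized mean.

    y, w, group : per-observation value, analysis weight and standardization group
    std_weights : known weight of each group in the standard population
                  (rescaled here so that the weights sum to 1)
    """
    # Accumulate the weighted sum and the sum of weights for each group.
    sums = {}
    for yi, wi, g in zip(y, w, group):
        num, den = sums.get(g, (0.0, 0.0))
        sums[g] = (num + wi * yi, den + wi)

    total_std = sum(std_weights.values())
    # Weighted mean within each group, combined with the standard weights.
    # A group listed in std_weights but absent from the data raises KeyError.
    return sum(
        std_weights[g] / total_std * sums[g][0] / sums[g][1]
        for g in std_weights
    )


# Hypothetical data: two age groups with group means 105 and 155.
y = [100.0, 110.0, 150.0, 160.0]
w = [1.0, 1.0, 1.0, 1.0]
group = ["young", "young", "old", "old"]

# The standard population is 75 % young and 25 % old.
standardized = direct_standardized_mean(y, w, group, {"young": 0.75, "old": 0.25})
```

Because the standardizing weights are treated as known constants, they reweight the group means but add no sampling variability of their own.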
For continuous analysis variables, you can request estimates of totals, means, proportions, geometric means, and quantiles. For categorical variables, you can request estimates of totals, percentages, and their standard errors.

You can request design effects for means, totals, and percentages. Design effects are not available for contrast statistics, standardized estimates, or post-stratified estimates.

DESCRIPT is primarily for the descriptive analysis of continuous (and sometimes discrete) variables, while CROSSTAB is primarily for descriptive analyses of categorical variables.

SUDAAN Procedures: Descriptive Procedures
RATIO Procedure

The RATIO procedure produces ratio estimates and their standard errors for sample surveys and other clustered data applications. The numerator and denominator variables can be continuous or categorical.

RATIO computes standardized means according to the method of direct standardization. The standardizing weights are assumed to be known.

For continuous variables, RATIO computes a ratio of weighted sums. If VAR1 and VAR2 denote the numerator and denominator variables respectively, and the variable WT denotes the weight, then the ratio estimate is computed by summing over the analysis observations on the input data set: R = sum(WT * VAR1) / sum(WT * VAR2).

For categorical variables, RATIO computes a ratio of weighted counts of individuals falling into a specified response category, as follows:
1) Any positive integer is a valid response category.
2) The numerator is the weighted sum of individuals who gave the specified integer response to the numerator variable.
3) The denominator is the weighted sum of individuals who gave the specified integer response to the denominator variable.

RATIO estimates can consist of a continuous numerator variable and a categorical denominator variable or vice versa. However, within one call to RATIO, all numerator variables must be of the same type, and all denominator variables must be of the same type.
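The ratio of weighted sums described above can be sketched in a few lines of Python. The function and variable names are hypothetical, and the sketch covers only the point estimate, not the design-based standard error that RATIO also reports.

```python
def ratio_estimate(numerator, denominator, weight):
    """Ratio of weighted sums: sum(w * num) / sum(w * den)."""
    top = sum(w * n for n, w in zip(numerator, weight))
    bottom = sum(w * d for d, w in zip(denominator, weight))
    return top / bottom


def indicator(values, category):
    """1 where the response equals the specified category, 0 elsewhere.

    Turning a categorical variable into this 0/1 indicator makes the
    weighted sum a weighted count of individuals in that category.
    """
    return [1 if v == category else 0 for v in values]


# Continuous numerator and denominator (hypothetical data).
var1 = [2.0, 4.0]
var2 = [1.0, 2.0]
wt = [1.0, 3.0]
continuous_ratio = ratio_estimate(var1, var2, wt)

# Categorical numerator and denominator, each using response category 1.
answer_a = [1, 2, 1, 1]
answer_b = [1, 1, 1, 2]
wt4 = [1.0, 1.0, 2.0, 2.0]
categorical_ratio = ratio_estimate(indicator(answer_a, 1), indicator(answer_b, 1), wt4)
```

A mixed ratio, for example a continuous numerator over a categorical denominator, uses the same function with an indicator on one side only.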
SUDAAN Procedures: Regression Procedures
REGRESS Procedure

The REGRESS procedure fits linear models to sample survey data and other clustered data and repeated measures applications. Estimates of the model parameters and their standard errors are computed, along with tests of hypotheses.

REGRESS offers GEE model fitting techniques for efficient parameter estimation. For estimating variance of the parameter estimates, REGRESS implements two robust methods described in Binder (1983) and Zeger and Liang (1986), as well as a model-based (naive) variance estimation method. A choice of independent or exchangeable "working" correlations is also provided.

You can specify tests for linear combinations of the model parameters, and you can output the predicted values, residuals, parameter estimates, and their associated variance-covariance matrix for further hypothesis testing. Also, you can estimate and test linear combinations of the adjusted group means (also known as least squares means).

SUDAAN Procedures: Regression Procedures
LOGISTIC (RLOGIST) Procedure

The LOGISTIC procedure fits logistic regression models to complex sample survey data and other clustered data applications.

LOGISTIC produces estimates of the model parameters and their standard errors, and tests the null hypothesis that individual regression coefficients associated with each variable in the model are equal to zero. LOGISTIC also provides tests for overall model significance, model minus intercept, as well as model main effects and interactions. In addition, you can test linear combinations of the model parameters or output the parameter estimates and variance-covariance matrix to a data set for further hypothesis testing. You can also estimate and test linear combinations of the conditional and predicted marginals (generalizations of adjusted group means to non-linear models).

The LOGISTIC procedure estimates model parameters using generalized estimating equations (GEE).
A choice of independent vs. exchangeable "working" correlations is also provided. For estimating variance of the parameter estimates, LOGISTIC implements two robust methods described in Binder (1983) and Zeger and Liang (1986), as well as a model-based (naive) variance estimation method.

NOTE: For SAS-Callable SUDAAN, the name LOGISTIC conflicts with a SAS procedure of the same name. Use RLOGIST to invoke the SUDAAN logistic regression procedure.

SUDAAN Procedures: Regression Procedures
MULTILOG Procedure

The MULTILOG procedure extends the modeling capabilities of SUDAAN to include categorical outcomes with more than two categories, which may or may not have a natural ordering. These models can be viewed as generalizations of the logit models for binary outcomes already available in SUDAAN in the LOGISTIC procedure. MULTILOG analyzes data from sample surveys as well as from randomized experiments and other observational studies involving cluster-correlated or longitudinal responses.

Two models have been implemented in the MULTILOG procedure: the proportional odds model with cumulative logit link for ordinal responses, and a generalized multinomial logit model for nominal outcomes. Both models handle continuous as well as discrete explanatory variables. The generalized multinomial logit model produces separate parameter vectors for each of the generalized logit equations of interest; the proportional odds model produces a common slope but separate intercepts for each of the cumulative logit equations of interest.

The MULTILOG procedure estimates model parameters using generalized estimating equations (GEE). For estimating variance of the parameter estimates, MULTILOG implements two robust methods described in Binder (1983) and Zeger and Liang (1986), as well as a model-based (naive) variance estimation method.
All three variance estimation methods allow a choice of independent vs. exchangeable working correlations for describing the dependence of responses within clusters. By default, the GEE iterative fitting procedure in the exchangeable case uses the one-step approach, although a multistep GEE procedure can also be obtained.

MULTILOG produces estimates of the model parameters and their standard errors, and tests the null hypothesis that individual regression coefficients associated with each variable in the model are equal to zero. MULTILOG also provides tests for overall model significance, model minus intercept, as well as model main effects and interactions. In addition, you can test linear combinations of the model parameters and output many statistics to an output data set. You can also estimate and test linear combinations of the conditional and predicted marginals (generalizations of adjusted group means to non-linear models).

SUDAAN Procedures: Regression Procedures
LOGLINK Procedure

The LOGLINK procedure in SUDAAN fits log-linear regression models to cluster-correlated count data not in the form of proportions. The counts are typically counts of events in a Poisson-like process.

The LOGLINK procedure estimates model parameters using generalized estimating equations (GEE). LOGLINK implements two robust variance estimation methods described in Binder (1983) and Zeger and Liang (1986), as well as a model-based (naive) variance estimation method. A choice of independent vs. exchangeable "working" correlations is also provided.

You can specify tests for linear combinations of the model parameters, and you can output many statistics for further hypothesis testing. Also, you can estimate and test linear combinations of the conditional and predicted marginals (generalizations of adjusted group means to non-linear models).
Like all of SUDAAN's procedures, LOGLINK is designed to analyze data from complex sample surveys (weighted, stratified, cluster-correlated data) as well as from randomized experiments and other observational studies involving cluster-correlated or longitudinal responses.

SUDAAN Procedures: Regression Procedures
SURVIVAL Procedure

SURVIVAL provides proportional hazards modeling for failure time outcomes, which may contain left- and right-censored observations, time-dependent covariates, and multiple events per subject.

The SURVIVAL procedure fits the discrete or continuous (Cox) proportional hazards model to sample surveys and other clustered data applications. Estimates of the model parameters and their standard errors are computed, along with tests of hypotheses.

Enhancements in the current software release are as follows:
• Counting process style of input (Andersen and Gill, 1982) to permit left truncation, multiple events per subject, and time-dependent covariates. A time-dependent covariate is one whose value for any given individual can change over time during the course of a study.
• Computation of Schoenfeld residuals and score residuals to allow users to evaluate "goodness of fit" and the validity of the proportional hazards assumption.
• Computation of Efron's likelihood approximation for ties in addition to the current default formula by Breslow.
• Option to allow stratified baseline hazard functions for different types of failures or different subgroups of the population.
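The counting process style of input mentioned in the first bullet represents each subject as one or more (start, stop, event) intervals, with a new interval whenever a time-dependent covariate changes. The Python sketch below shows that restructuring step under assumed conventions; the function name and the tuple layout are hypothetical and do not reproduce SUDAAN's actual input format.

```python
def split_follow_up(entry, exit_time, event, covariate_changes, initial_value):
    """Split one subject's follow-up into counting-process intervals.

    entry, exit_time  : start and end of observation (entry > 0 means left truncation)
    event             : 1 if the subject failed at exit_time, 0 if censored
    covariate_changes : list of (time, new_value) for a time-dependent covariate
    initial_value     : covariate value before the first change

    Returns a list of (start, stop, event, covariate_value) tuples. Only the
    last interval can carry the event; earlier ones end with event = 0.
    """
    rows = []
    start, value = entry, initial_value
    for time, new_value in sorted(covariate_changes):
        if time <= start:
            # Change happened at or before the start of observation.
            value = new_value
            continue
        if time >= exit_time:
            # Changes at or after the end of follow-up are irrelevant.
            break
        rows.append((start, time, 0, value))
        start, value = time, new_value
    rows.append((start, exit_time, event, value))
    return rows


# A subject followed from time 0 to 10 who fails at 10; the covariate
# switches from 0 to 1 at time 4.
intervals = split_follow_up(0, 10, 1, [(4, 1)], 0)
```

Multiple events per subject fit the same layout: each event closes one interval with event = 1 and the next interval starts at that time.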
Descriptive statistics with different software

TITLE1 'TUNNUSLUVUT';   /* descriptive statistics */

TITLE2 'SAS MEANS';
PROC MEANS DATA=WORK.T2K_DATA N NMISS MEAN STDERR MAXDEC=3;
    CLASS SP2 T2K;
    VAR SYSTBP2;
    TYPES () SP2 T2K SP2*T2K;
RUN;

options nolabel;
TITLE2 'SAS SURVEYMEANS';
PROC SURVEYMEANS DATA=WORK.T2K_DATA NOBS NMISS MEAN STDERR SUMWGT;
    STRATA OSITE;
    CLUSTER RYVAS;
    WEIGHT WAN_UNIONI;
    DOMAIN T2K SP2 T2K*SP2;
    VAR SYSTBP2;
RUN;
options label;

TITLE2 'SAS MEANS + WEIGHT';
PROC MEANS DATA=WORK.T2K_DATA N NMISS SUMWGT MEAN STDERR MAXDEC=3;
    WEIGHT WAN_UNIONI;
    CLASS SP2 T2K;
    VAR SYSTBP2;
    TYPES () SP2 T2K SP2*T2K;
RUN;

TITLE2 'SUDAAN DESCRIPT';
PROC DESCRIPT DATA=WORK.T2K_DATA DESIGN=WR;
    SETENV COLWIDTH=12 DECWIDTH=3;
    NEST OSITE RYVAS;
    WEIGHT WAN_UNIONI;
    SUBGROUP SP2 T2K;
    LEVELS 2 2;
    VAR SYSTBP2;
    TABLES SP2*T2K;
    PRINT / STYLE=NCHS;   * or STYLE=BOX;
RUN;

Descriptive statistics: SAS Means

Analysis Variable : systbp2 Systolic blood pressure

                                  N Obs       N   N Miss       Mean   Std Error
-------------------------------------------------------------------------------
                                  14395   13612      783    140.412       0.202
-------------------------------------------------------------------------------

Survey (1=T2K,2=MS)               N Obs       N   N Miss       Mean   Std Error
-------------------------------------------------------------------------------
1                                  7178    6401      777    133.675       0.266
2                                  7217    7211        6    146.393       0.282
-------------------------------------------------------------------------------

Sex (1=male,2=female)             N Obs       N   N Miss       Mean   Std Error
-------------------------------------------------------------------------------
1                                  6525    6220      305    140.036       0.266
2                                  7870    7392      478    140.729       0.298
-------------------------------------------------------------------------------

Sex            Survey
(1=male,       (1=T2K,
2=female)      2=MS)              N Obs       N   N Miss       Mean   Std Error
-------------------------------------------------------------------------------
1              1                   3203    2899      304    134.618       0.357
               2                   3322    3321        1    144.765       0.368
2              1                   3975    3502      473    132.893       0.385
               2                   3895    3890        5    147.783       0.417
-------------------------------------------------------------------------------

Descriptive statistics: SAS Means + Weight

Analysis Variable : systbp2 Systolic blood pressure

                                  N Obs       N   N Miss    Sum Wgts       Mean   Std Error
-------------------------------------------------------------------------------------------
                                  14395   13612      783   13716.104    139.854       0.200
-------------------------------------------------------------------------------------------

Survey (1=T2K,2=MS)               N Obs       N   N Miss    Sum Wgts       Mean   Std Error
-------------------------------------------------------------------------------------------
1                                  7178    6401      777    6497.937    133.482       0.262
2                                  7217    7211        6    7218.167    145.591       0.280
-------------------------------------------------------------------------------------------

Sex (1=male,2=female)             N Obs       N   N Miss    Sum Wgts       Mean   Std Error
-------------------------------------------------------------------------------------------
1                                  6525    6220      305    6473.673    139.395       0.262
2                                  7870    7392      478    7242.431    140.265       0.296
-------------------------------------------------------------------------------------------

Sex            Survey
(1=male,       (1=T2K,
2=female)      2=MS)              N Obs       N   N Miss    Sum Wgts       Mean   Std Error
-------------------------------------------------------------------------------------------
1              1                   3203    2899      304    3102.263    134.416       0.355
               2                   3322    3321        1    3371.411    143.976       0.363
2              1                   3975    3502      473    3395.674    132.628       0.379
               2                   3895    3890        5    3846.757    147.007       0.417
-------------------------------------------------------------------------------------------

Descriptive statistics: SAS Surveymeans

Data Summary
Number of Strata                  44
Number of Clusters              5155
Number of Observations         14395
Sum of Weights            14404.1532

Statistics

                                           Sum of                    Std Error
              Variable       N   N Miss    Weights           Mean      of Mean
------------------------------------------------------------------------------
              systbp2    13612      783      13716     139.854495     0.329956
------------------------------------------------------------------------------

                                           Sum of                    Std Error
t2k           Variable       N   N Miss    Weights           Mean      of Mean
------------------------------------------------------------------------------
1             systbp2     6401      777  6497.936845   133.481608     0.428515
2             systbp2     7211        6  7218.167255   145.591493     0.488738
------------------------------------------------------------------------------
                                           Sum of                    Std Error
sp2           Variable       N   N Miss    Weights           Mean      of Mean
------------------------------------------------------------------------------
1             systbp2     6220      305  6473.673134   139.394726     0.335483
2             systbp2     7392      478  7242.430966   140.265460     0.422606
------------------------------------------------------------------------------

                                           Sum of                    Std Error
sp2    t2k    Variable       N   N Miss    Weights           Mean      of Mean
------------------------------------------------------------------------------
1      1      systbp2     2899      304  3102.262548   134.415902     0.485048
       2      systbp2     3321        1  3371.410585   143.976079     0.467768
2      1      systbp2     3502      473  3395.674296   132.628044     0.515660
       2      systbp2     3890        5  3846.756670   147.007290     0.616352
------------------------------------------------------------------------------

Descriptive statistics: SUDAAN Descript

Number of observations read    :  14395     Weighted count :    14404
Denominator degrees of freedom :   5111

Variance Estimation Method: Taylor Series (WR)
by: Variable, Sex (1=male,2=female), Survey (1=T2K,2=MS).
for: Variable = Systolic blood pressure.

--------------------------------------------------------------------------------------
Sex           Survey           Sample      Weighted
(1=male,      (1=T2K,            Size          Size           Total       Mean   SE Mean
2=female)     2=MS)
--------------------------------------------------------------------------------------
Total         Total         13612.000     13716.104     1918258.807    139.854     0.330
              Missing           0.000         0.000           0.000          .         .
              1              6401.000      6497.937      867355.059    133.482     0.429
              2              7211.000      7218.167     1050903.748    145.591     0.489
Missing       Total             0.000         0.000           0.000          .         .
              Missing           0.000         0.000           0.000          .         .
              1                 0.000         0.000           0.000          .         .
              2                 0.000         0.000           0.000          .         .
1             Total          6220.000      6473.673      902395.895    139.395     0.336
              Missing           0.000         0.000           0.000          .         .
              1              2899.000      3102.263      416993.419    134.416     0.485
              2              3321.000      3371.411      485402.476    143.976     0.468
2             Total          7392.000      7242.431     1015862.912    140.265     0.423
              Missing           0.000         0.000           0.000          .         .
              1              3502.000      3395.674      450361.640    132.628     0.516
              2              3890.000      3846.757      565501.272    147.007     0.616
--------------------------------------------------------------------------------------

Frequencies with different software

TITLE1 'FREKVENSSIT';   /* frequencies */

TITLE2 'SAS FREQ';
PROC FREQ DATA=WORK.T2K_DATA;
    TABLE SYSTBP2_123;
RUN;
PROC FREQ DATA=WORK.T2K_DATA;
    TABLE SYSTBP2_123;
    BY T2K;
RUN;

TITLE2 'SAS FREQ + WEIGHT';
PROC FREQ DATA=WORK.T2K_DATA;
    WEIGHT WAN_UNIONI;
    TABLE SYSTBP2_123;
RUN;
PROC FREQ DATA=WORK.T2K_DATA;
    WEIGHT WAN_UNIONI;
    TABLE SYSTBP2_123;
    BY T2K;
RUN;

TITLE2 'SAS SURVEYMEANS';
PROC SURVEYMEANS DATA=WORK.T2K_DATA NOBS MEAN STDERR SUMWGT;
    STRATA OSITE;
    CLUSTER RYVAS;
    WEIGHT WAN_UNIONI;
    DOMAIN T2K;
    VAR SYSTBP2_123;
    CLASS SYSTBP2_123;
RUN;

TITLE2 'SUDAAN CROSSTAB';
PROC CROSSTAB DATA=WORK.T2K_DATA DESIGN=WR;
    SETENV COLWIDTH=12 DECWIDTH=3;
    NEST OSITE RYVAS;
    WEIGHT WAN_UNIONI;
    SUBGROUP T2K SYSTBP2_123;
    LEVELS 2 3;
    TABLES T2K*SYSTBP2_123;
    PRINT NSUM WSUM ROWPER SEROW / STYLE=NCHS;
RUN;

Frequencies: SAS Freq

Systolic BP, 3 classes
                                          Cumulative    Cumulative
SystBP2_123    Frequency     Percent       Frequency       Percent
------------------------------------------------------------------
          1         3663       25.45            3663         25.45
          2         8217       57.08           11880         82.53
          3         2515       17.47           14395        100.00

Survey (1=T2K,2=MS)=1

Systolic BP, 3 classes
                                          Cumulative    Cumulative
SystBP2_123    Frequency     Percent       Frequency       Percent
------------------------------------------------------------------
          1         2741       38.19            2741         38.19
          2         3735       52.03            6476         90.22
          3          702        9.78            7178        100.00

Survey (1=T2K,2=MS)=2

Systolic BP, 3 classes
                                          Cumulative    Cumulative
SystBP2_123    Frequency     Percent       Frequency       Percent
------------------------------------------------------------------
          1          922       12.78             922         12.78
          2         4482       62.10            5404         74.88
          3         1813       25.12            7217        100.00

Frequencies: SAS Freq + Weight

Systolic BP, 3 classes
                                          Cumulative    Cumulative
SystBP2_123    Frequency     Percent       Frequency       Percent
------------------------------------------------------------------
          1     3637.334       25.25        3637.334         25.25
          2     8354.206       58.00        11991.54         83.25
          3     2412.614       16.75        14404.15        100.00

Survey (1=T2K,2=MS)=1

Systolic BP, 3 classes
                                          Cumulative    Cumulative
SystBP2_123    Frequency     Percent       Frequency       Percent
------------------------------------------------------------------
          1     2667.273       37.15        2667.273         37.15
          2     3828.789       53.32        6496.062         90.47
          3     684.0961        9.53        7180.158        100.00

Survey (1=T2K,2=MS)=2

Systolic BP, 3 classes
                                          Cumulative    Cumulative
SystBP2_123    Frequency     Percent       Frequency       Percent
------------------------------------------------------------------
          1     970.0609       13.43        970.0609         13.43
          2     4525.417       62.64        5495.478         76.07
          3     1728.518       23.93        7223.995        100.00

Frequencies: SAS Surveymeans

Data Summary
Number of Strata                  44
Number of Clusters              5155
Number of Observations         14395
Sum of Weights            14404.1532

Statistics

                                   Sum of                  Std Error
Variable                 N        Weights         Mean       of Mean
--------------------------------------------------------------------
SystBP2_123=1         3663          14404     0.252520      0.005200
SystBP2_123=2         8217          14404     0.579986      0.005244
SystBP2_123=3         2515          14404     0.167494      0.004260
--------------------------------------------------------------------

Survey
(1=T2K,                                       Sum of                  Std Error
2=MS)      Variable                 N        Weights         Mean       of Mean
-------------------------------------------------------------------------------
1          SystBP2_123=1         2741    7180.157895     0.371478      0.008186
           SystBP2_123=2         3735    7180.157895     0.533246      0.007138
           SystBP2_123=3          702    7180.157895     0.095276      0.004580
2          SystBP2_123=1          922    7223.995328     0.134283      0.005741
           SystBP2_123=2         4482    7223.995328     0.626442      0.007495
           SystBP2_123=3         1813    7223.995328     0.239274      0.007120
-------------------------------------------------------------------------------

Frequencies: SUDAAN Crosstab

Number of observations read    :  14395     Weighted count :    14404
Denominator degrees of freedom :   5111

Variance Estimation Method: Taylor Series (WR)
by: Survey (1=T2K,2=MS),
Syst.vp 3-luok.. -------------------------------------------------------------------------------Tutkimus (1=T2K,2=MS) Weighted SE Row Syst.vp 3-luok. Sample Size Size Row Percent Percent -------------------------------------------------------------------------------Total Total 14395.000 14404.153 100.000 0.000 1 3663.000 3637.334 25.252 0.520 2 8217.000 8354.206 57.999 0.524 3 2515.000 2412.614 16.749 0.426 1 Total 7178.000 7180.158 100.000 0.000 1 2741.000 2667.273 37.148 0.819 2 3735.000 3828.789 53.325 0.714 3 702.000 684.096 9.528 0.458 2 Total 7217.000 7223.995 100.000 0.000 1 922.000 970.061 13.428 0.574 2 4482.000 4525.417 62.644 0.750 3 1813.000 1728.518 23.927 0.712 -------------------------------------------------------------------------------- 32 Lineaarinen malli eri ohjelmistoilla TITLE1 'LINEAARINEN MALLI'; TITLE2 'SAS GLM + WEIGHT'; PROC GLM DATA=WORK.T2K_DATA; WEIGHT WAN_UNIONI; CLASS SP2 IKA6 T2K; MODEL SYSTBP2 = SP2 IKA6 T2K BMI / SOLUTION; RUN; TITLE2 'SAS SURVEYREG'; PROC SURVEYREG DATA=WORK.T2K_DATA; STRATA OSITE; CLUSTER RYVAS; WEIGHT WAN_UNIONI; CLASS SP2 IKA6 T2K; MODEL SYSTBP2 = SP2 IKA6 T2K BMI / SOLUTION; RUN; TITLE2 'SUDAAN REGRESS'; PROC REGRESS DATA=WORK.T2K_DATA DESIGN=WR; SETENV COLWIDTH=12 DECWIDTH=3; NEST OSITE RYVAS; WEIGHT WAN_UNIONI; SUBGROUP SP2 IKA6 T2K; LEVELS 2 6 2; MODEL SYSTBP2 = SP2 IKA6 T2K BMI; PREDMARG SP2 IKA6 T2K; RUN; 33 Lineaarinen malli Testit SAS GLM + WEIGHT: Source sp2 ika6 t2k bmi DF Type III SS Mean Square F Value Pr > F 1 5 1 1 4598.814 1505233.917 670714.692 197789.472 4598.814 301046.783 670714.692 197789.472 12.85 841.38 1874.54 552.79 0.0003 <.0001 <.0001 <.0001 SAS SURVEYREG: Effect Model Intercept sp2 ika6 t2k bmi Num DF F Value Pr > F 8 1 1 5 1 1 526.87 13313.7 13.07 708.03 610.56 495.48 <.0001 <.0001 0.0003 <.0001 <.0001 <.0001 SUDAAN REGRESS: ----------------------------------------------------------------Contrast Degrees of P-value Wald Freedom Wald F F 
----------------------------------------------------------------OVERALL MODEL 9.000 34092.890 0.000 MODEL MINUS INTERCEPT 8.000 525.793 0.000 INTERCEPT . . . SP2 1.000 13.075 0.000 IKA6 5.000 704.653 0.000 T2K 1.000 610.521 0.000 BMI 1.000 495.911 0.000 ----------------------------------------------------------------- 34 Lineaarinen malli Parametriestimaatit SAS GLM + WEIGHT: Parameter Intercept sp2 sp2 ika6 ika6 ika6 ika6 ika6 ika6 t2k t2k bmi Estimate 1 2 1 2 3 4 5 6 1 2 141.0322272 1.1842537 0.0000000 -32.3430867 -25.5086806 -17.1008806 -7.3126631 -1.6082852 0.0000000 -14.3704931 0.0000000 0.9326216 SAS SURVEYREG: Parameter Intercept sp2 1 sp2 2 ika6 1 ika6 2 ika6 3 ika6 4 ika6 5 ika6 6 t2k 1 t2k 2 bmi Estimate 141.032227 1.184254 0.000000 -32.343087 -25.508681 -17.100881 -7.312663 -1.608285 0.000000 -14.370493 0.000000 0.932622 B B B B B B B B B B B Standard Error t Value Pr > |t| 1.43648255 0.33032660 . 1.04374694 1.04840361 1.05129964 1.07586068 1.11667995 . 0.33191316 . 0.03966666 98.18 3.59 . -30.99 -24.33 -16.27 -6.80 -1.44 . -43.30 . 23.51 <.0001 0.0003 . <.0001 <.0001 <.0001 <.0001 0.1498 . <.0001 . <.0001 Standard Error 1.53280246 0.32756747 0.00000000 1.11845398 1.17469039 1.10502478 1.26905700 1.19456103 0.00000000 0.58157558 0.00000000 0.04189813 t Value Pr > |t| 92.01 3.62 . -28.92 -21.72 -15.48 -5.76 -1.35 . -24.71 . 22.26 <.0001 0.0003 . <.0001 <.0001 <.0001 <.0001 0.1783 . <.0001 . <.0001 35 Lineaarinen malli Parametriestimaatit SUDAAN REGRESS: -------------------------------------------------------------------------------Independent Variables and P-value TEffects Beta Coeff. SE Beta T-Test B=0 Test B=0 -------------------------------------------------------------------------------Intercept 141.032 1.532 92.040 0.000 Sukupuoli (1=M,2=N) 1 1.184 0.328 3.616 0.000 2 0.000 0.000 . . 
Age group
  1                            -32.343       1.118     -28.920       0.000
  2                            -25.509       1.174     -21.721       0.000
  3                            -17.101       1.105     -15.481       0.000
  4                             -7.313       1.269      -5.764       0.000
  5                             -1.608       1.194      -1.347       0.178
  6                              0.000       0.000           .           .
Survey (1=T2K,2=MS)
  1                            -14.370       0.582     -24.709       0.000
  2                              0.000       0.000           .           .
Body mass index                  0.933       0.042      22.269       0.000
--------------------------------------------------------------------------

36 Linear model: Model-based standardization

SUDAAN REGRESS:
--------------------------------------------------------------------------
Predicted Marginal            Marginal          SE    T:Marg=0     P-value
--------------------------------------------------------------------------
Sex (1=M,2=F)
  1                            140.496       0.346     406.176       0.000
  2                            139.312       0.394     353.964       0.000
Age group
  1                            127.070       0.332     383.051       0.000
  2                            133.904       0.439     305.297       0.000
  3                            142.312       0.393     362.442       0.000
  4                            152.100       0.642     236.785       0.000
  5                            157.804       0.755     208.987       0.000
  6                            159.413       1.090     146.270       0.000
Survey (1=T2K,2=MS)
  1                            132.072       0.382     345.813       0.000
  2                            146.443       0.475     308.568       0.000
--------------------------------------------------------------------------

37 Cross-sectional study / continuous response: Regress

TITLE1 'CROSS-SECTIONAL STUDY / CONTINUOUS RESPONSE';
TITLE2 'SUDAAN REGRESS';
PROC REGRESS DATA=WORK.T2K_DATA DESIGN=WR;
  SETENV COLWIDTH=12 DECWIDTH=3;
  NEST OSITE RYVAS;
  WEIGHT WAN_UNIONI;
  SUBPOPN T2K=1 / NAME="TERVEYS 2000";
  SUBGROUP SP2 AA01 PORTAANNOUSU;
  LEVELS 2 4 2;
  MODEL SYSTBP2 = BMI SP2 SP2*T114 IKA2 AA01 PORTAANNOUSU;
  TEST WALDF SATADJF;
  REFLEVEL SP2=1;
  PREDMARG SP2;
  PRINT / TESTS=DEFAULT BETAS=ALL PRED_MRG=ALL;
RUN;

38 Cross-sectional study / continuous response: SUBPOPN

SUBPOPN EXAMPLE: By including the following statements in your SUDAAN program, you can limit the analysis to records for which the value of the RACE variable is 2 (African-Americans in this case), the value of the SEX variable is 2 (females in this case), and the value of the AGE variable is either less than 18 or over 65.
  SUBGROUP RACE SEX;
  LEVELS 2 2;
  SUBPOPN RACE=2 & SEX=2 & (AGE<18 | AGE>65)
          / NAME='African-American Females not in Labor Force';

WARNING: Expressions such as 18 <= AGE <= 65 are NOT appropriate on the SUBPOPN statement and may lead to unexpected results. To indicate all values of AGE between 18 and 65, use the expression (18 <= AGE) & (AGE <= 65).

39 Cross-sectional study / continuous response: Parameters

Variance Estimation Method: Taylor Series (WR)
SE Method: Robust (Binder, 1983)
Working Correlations: Independent
Link Function: Identity
Response variable SYSTBP2: Systolic blood pressure
For Subpopulation: TERVEYS 2000
-------------------------------------------------------------------------------------------
Independent Variables                                                               P-value
  and Effects               Beta Coeff. DEFF Beta #4      SE Beta   T-Test B=0   T-Test B=0
-------------------------------------------------------------------------------------------
Intercept                        63.115        1.525        3.879       16.270        0.000
Sex (1=M,2=F)
  1                               0.000            .        0.000            .            .
  2                              -4.500        1.304        3.075       -1.463        0.143
Marital status
  1                               0.105        1.319        1.121        0.094        0.925
  2                               0.735        1.168        1.321        0.557        0.578
  3                               0.075        1.488        1.418        0.053        0.958
  4                               0.000            .        0.000            .            .
Two-step climb
  1                               2.980        1.475        1.304        2.285        0.022
  2                               0.000            .        0.000            .            .
Body mass index                   0.874        1.178        0.061       14.341        0.000
Age                               0.696        1.010        0.022       31.338        0.000
Sex (1=M,2=F), fS-Chol mmol/l
  1, 1                            1.471        1.025        0.328        4.482        0.000
  2, 1                            1.825        1.413        0.366        4.981        0.000
-------------------------------------------------------------------------------------------

40 Cross-sectional study / continuous response: Tests

------------------------------------------------------------------------------------------------
Contrast                                                         P-value
                          Degrees of     S_waite     S_waite     S_waite                 P-value
                             Freedom      Adj DF       Adj F       Adj F      Wald F      Wald F
------------------------------------------------------------------------------------------------
OVERALL MODEL                 10.000       8.038   23195.089       0.000   17787.847       0.000
MODEL MINUS INTERCEPT          9.000       7.899     198.420       0.000     220.470       0.000
INTERCEPT                          .           .           .           .           .           .
SP2                            1.000       1.000       2.142       0.143       2.142       0.143
AA01                           3.000       2.835       0.204       0.884       0.288       0.834
PORTAANNOUSU                   1.000       1.000       5.220       0.022       5.220       0.022
BMI                            1.000       1.000     205.668       0.000     205.668       0.000
IKA2                           1.000       1.000     982.072       0.000     982.072       0.000
T114 * SP2                     2.000       1.912      22.602       0.000      25.972       0.000
------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------
Predicted Marginal            Marginal          SE    T:Marg=0     P-value
--------------------------------------------------------------------------
Sex (1=M,2=F)
  1                            134.691       0.481     280.090       0.000
  2                            132.297       0.504     262.703       0.000
--------------------------------------------------------------------------

41 Cross-sectional study / binary response: Rlogist (Logistic)

TITLE 'CROSS-SECTIONAL STUDY / BINARY (0/1) RESPONSE';
TITLE2 'SUDAAN RLOGIST (LOGISTIC)';
PROC RLOGIST DATA=WORK.T2K_DATA DESIGN=WR;
  SETENV COLWIDTH=12 DECWIDTH=3;
  NEST OSITE RYVAS;
  WEIGHT WAN_UNIONI;
  SUBPOPN T2K=1 / NAME="TERVEYS 2000";
  SUBGROUP SP2 AA01 PORTAANNOUSU;
  LEVELS 2 4 2;
  MODEL SYSTBP2_01 = BMI SP2 SP2*T114 IKA2 AA01 PORTAANNOUSU;
  TEST WALDF SATADJF;
  REFLEVEL SP2=1;
  PREDMARG SP2;
  PRINT
        / TESTS=DEFAULT BETAS=ALL PRED_MRG=ALL RISK=ALL;
RUN;

42 Cross-sectional study / binary response: Parameters

Variance Estimation Method: Taylor Series (WR)
SE Method: Robust (Binder, 1983)
Working Correlations: Independent
Link Function: Logit
Response variable SYSTBP2_01: Syst. BP (0/1)
For Subpopulation: TERVEYS 2000
-------------------------------------------------------------------------------------------
Independent Variables                                                               P-value
  and Effects               Beta Coeff. DEFF Beta #4      SE Beta   T-Test B=0   T-Test B=0
-------------------------------------------------------------------------------------------
Intercept                        -9.214        1.436        0.561      -16.411        0.000
Sex (1=M,2=F)
  1                               0.000            .        0.000            .            .
  2                              -0.040        1.070        0.397       -0.102        0.919
Marital status
  1                               0.175        0.945        0.118        1.484        0.138
  2                               0.207        1.097        0.173        1.196        0.232
  3                               0.215        1.308        0.171        1.263        0.207
  4                               0.000            .        0.000            .            .
Two-step climb
  1                               0.442        1.286        0.149        2.964        0.003
  2                               0.000            .        0.000            .            .
Body mass index                   0.086        1.061        0.008       11.012        0.000
Age                               0.078        1.126        0.003       22.312        0.000
Sex (1=M,2=F), fS-Chol mmol/l
  1, 1                            0.216        1.166        0.047        4.583        0.000
  2, 1                            0.199        1.206        0.047        4.193        0.000
-------------------------------------------------------------------------------------------

43 Cross-sectional study / binary response: Tests

------------------------------------------------------------------------------------------------
Contrast                                                         P-value
                          Degrees of     S_waite     S_waite     S_waite                 P-value
                             Freedom      Adj DF       Adj F       Adj F      Wald F      Wald F
------------------------------------------------------------------------------------------------
OVERALL MODEL                 10.000       8.603     106.794       0.000     142.211       0.000
MODEL MINUS INTERCEPT          9.000       8.123      89.622       0.000     111.093       0.000
INTERCEPT                          .           .           .           .           .           .
SP2                            1.000       1.000       0.010       0.919       0.010       0.919
AA01                           3.000       2.877       0.786       0.497       0.763       0.515
PORTAANNOUSU                   1.000       1.000       8.784       0.003       8.784       0.003
BMI                            1.000       1.000     121.267       0.000     121.267       0.000
IKA2                           1.000       1.000     497.811       0.000     497.811       0.000
T114 * SP2                     2.000       1.987      19.472       0.000      18.094       0.000
------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------
Predicted Marginal            Marginal          SE    T:Marg=0     P-value
--------------------------------------------------------------------------
Sex (1=M,2=F)
  1                              0.323       0.010      31.989       0.000
  2                              0.299       0.010      30.350       0.000
--------------------------------------------------------------------------

44 Cross-sectional study / binary response: OR

--------------------------------------------------------------
Independent Variables                  Lower 95%   Upper 95%
  and Effects             Odds Ratio    Limit OR    Limit OR
--------------------------------------------------------------
Intercept                      0.000       0.000       0.000
Sex (1=M,2=F)
  1                            1.000       1.000       1.000
  2                            0.961       0.441       2.090
Marital status
  1                            1.191       0.945       1.500
  2                            1.230       0.876       1.726
  3                            1.240       0.888       1.733
  4                            1.000       1.000       1.000
Two-step climb
  1                            1.555       1.161       2.083
  2                            1.000       1.000       1.000
Body mass index                1.090       1.074       1.107
Age                            1.081       1.073       1.088
Sex (1=M,2=F), fS-Chol mmol/l
  1, 1                         1.241       1.131       1.360
  2, 1                         1.220       1.112       1.339
--------------------------------------------------------------

45 Cross-sectional study / multi-category response: Multilog

TITLE 'CROSS-SECTIONAL STUDY / MULTI-CATEGORY RESPONSE';
TITLE2 'SUDAAN MULTILOG';
PROC MULTILOG DATA=WORK.T2K_DATA DESIGN=WR;
  SETENV COLWIDTH=12 DECWIDTH=3;
  NEST OSITE RYVAS;
  WEIGHT WAN_UNIONI;
  SUBPOPN T2K=1 / NAME="TERVEYS 2000";
  SUBGROUP SYSTBP2_123 SP2 AA01 PORTAANNOUSU;
  LEVELS 3 2 4 2;
  /* Note: the parameter table on the next slide lists separate "1 vs 3" and
     "2 vs 3" coefficient sets, which appears to be the layout of a
     generalized logit fit rather than a cumulative logit fit. */
  MODEL SYSTBP2_123 = BMI SP2 T114 IKA2 AA01 PORTAANNOUSU / CUMLOGIT;
  TEST WALDF SATADJF;
  REFLEVEL SP2=1;
  PREDMARG SP2;
  PRINT / TESTS=DEFAULT BETAS=ALL PRED_MRG=ALL STYLE=NCHS;
RUN;

46 Cross-sectional study / multi-category response: Parameters

-------------------------------------------------------------------------------------------
SYSTBP2_123 (log-odds)
  Independent Variables                                                             P-value
    and Effects             Beta Coeff. DEFF Beta #4      SE Beta   T-Test B=0   T-Test B=0
-------------------------------------------------------------------------------------------
1 vs 3
  Intercept                      14.365        0.996        0.694       20.689        0.000
  Sex (1=M,2=F)
    1                             0.000            .        0.000            .            .
    2                             0.305        1.073        0.120        2.542        0.011
  Marital status
    1                            -0.221        0.953        0.204       -1.084        0.278
    2                            -0.381        0.987        0.282       -1.350        0.177
    3                            -0.208        1.102        0.267       -0.780        0.435
    4                             0.000            .        0.000            .            .
  Two-step climb
    1                            -0.782        1.006        0.227       -3.447        0.001
    2                             0.000            .        0.000            .            .
  Age                            -0.126        0.959        0.005      -23.191        0.000
  Body mass index                -0.148        1.049        0.013      -11.145        0.000
  fS-Chol mmol/l                 -0.281        1.090        0.052       -5.356        0.000
2 vs 3
  Intercept                       7.811        0.832        0.542       14.412        0.000
  Sex (1=M,2=F)
    1                             0.000            .        0.000            .            .
    2                            -0.185        0.845        0.094       -1.975        0.048
  Marital status
    1                            -0.265        0.924        0.141       -1.882        0.060
    2                            -0.449        0.992        0.233       -1.930        0.054
    3                            -0.266        0.998        0.197       -1.352        0.177
    4                             0.000            .        0.000            .            .
  Two-step climb
    1                            -0.231        1.014        0.154       -1.499        0.134
    2                             0.000            .        0.000            .            .
  Age                            -0.066        0.883        0.004      -14.975        0.000
  Body mass index                -0.032        1.086        0.011       -2.902        0.004
  fS-Chol mmol/l                 -0.112        1.041        0.043       -2.592        0.010
-------------------------------------------------------------------------------------------

47 Cross-sectional study / multi-category response: Tests

------------------------------------------------------------------------------------------------
Contrast                                                         P-value
                          Degrees of     S_waite     S_waite     S_waite                 P-value
                             Freedom      Adj DF       Adj F       Adj F      Wald F      Wald F
------------------------------------------------------------------------------------------------
OVERALL MODEL                 18.000      14.913     103.271       0.000     107.345       0.000
MODEL MINUS INTERCEPT         16.000      14.485      65.961       0.000      58.829       0.000
INTERCEPT                          .           .           .           .           .           .
SP2                            2.000       1.925      25.720       0.000      25.057       0.000
AA01                           6.000       5.811       0.718       0.630       0.859       0.524
PORTAANNOUSU                   2.000       1.935       6.590       0.002       6.442       0.002
IKA2                           2.000       1.984     285.183       0.000     277.345       0.000
BMI                            2.000       1.993      94.348       0.000      99.011       0.000
T114                           2.000       1.995      17.109       0.000      16.575       0.000
------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------
Syst. BP, 3 classes
  Predicted Marginal          Marginal          SE    T:Marg=0     P-value
--------------------------------------------------------------------------
1
  Sex (1=M,2=F)
    1                            0.266       0.011      24.838       0.000
    2                            0.347       0.009      38.501       0.000
2
  Sex (1=M,2=F)
    1                            0.633       0.011      58.495       0.000
    2                            0.545       0.009      62.273       0.000
3
  Sex (1=M,2=F)
    1                            0.101       0.006      15.619       0.000
    2                            0.109       0.007      16.498       0.000
--------------------------------------------------------------------------

48 Comparison of two samples / continuous response: Regress

TITLE1 'COMPARISON OF TWO SAMPLES / CONTINUOUS RESPONSE';
TITLE2 'SUDAAN REGRESS';
PROC REGRESS DATA=WORK.T2K_DATA DESIGN=WR;
  SETENV COLWIDTH=12 DECWIDTH=3;
  NEST OSITE RYVAS;
  WEIGHT WAN_UNIONI;
  SUBGROUP T2K SP2 AA01 PORTAANNOUSU;
  LEVELS 2 2 4 2;
  MODEL SYSTBP2 = T2K BMI SP2 IKA2 AA01 PORTAANNOUSU;
  TEST SATADJF;
  REFLEVEL T2K=1;
  PREDMARG T2K;
RUN;

49 Comparison of two samples / binary response: Rlogist (Logistic)

TITLE 'COMPARISON OF TWO SAMPLES / BINARY RESPONSE';
TITLE2 'SUDAAN RLOGIST (LOGISTIC)';
PROC RLOGIST DATA=WORK.T2K_DATA DESIGN=WR;
  SETENV COLWIDTH=12 DECWIDTH=3;
  NEST OSITE RYVAS;
  WEIGHT WAN_UNIONI;
  SUBGROUP T2K SP2 AA01 PORTAANNOUSU;
  LEVELS 2 2 4 2;
  MODEL SYSTBP2_01 = T2K BMI SP2 IKA2 AA01 PORTAANNOUSU;
  TEST SATADJF;
  REFLEVEL T2K=1;
  PREDMARG T2K;
RUN;

50 Comparison of two samples / multi-category response: Multilog

TITLE 'COMPARISON OF TWO SAMPLES / MULTI-CATEGORY RESPONSE';
TITLE2 'SUDAAN MULTILOG';
PROC MULTILOG DATA=WORK.T2K_DATA DESIGN=WR;
  SETENV COLWIDTH=12 DECWIDTH=3;
  NEST OSITE RYVAS;
  WEIGHT WAN_UNIONI;
  SUBGROUP SYSTBP2_123 T2K SP2 AA01
           PORTAANNOUSU;
  LEVELS 3 2 2 4 2;
  MODEL SYSTBP2_123 = T2K BMI SP2 IKA2 AA01 PORTAANNOUSU / CUMLOGIT;
  TEST SATADJF;
  REFLEVEL T2K=1;
  PREDMARG T2K;
  PRINT / STYLE=NCHS;
RUN;

51 Run-time recoding: Recode

TITLE1 'RUN-TIME RECODING';
TITLE2 'RECODE';
PROC CROSSTAB DATA=WORK.T2K_DATA DESIGN=WR;
  SETENV COLWIDTH=12 DECWIDTH=3;
  NEST OSITE RYVAS;
  WEIGHT WAN_UNIONI;
  RECODE SYSTBP2_01=(0 1);
  SUBGROUP T2K SYSTBP2_01;
  LEVELS 2 2;
  TABLES T2K*SYSTBP2_01;
  PRINT NSUM ROWPER / STYLE=NCHS;
RUN;

RECODE EXAMPLES:

RECODE X = 1.5; recodes the continuous or categorical variable X to a 0-1 variable whose value is 0 if the input value is less than 1.5 and 1 if the input value is greater than or equal to 1.5.

RECODE ZERONE = (0 1); recodes the 0-1 variable ZERONE to a 1-2 variable suitable for use on the SUBGROUP statement. Level 0 goes to 1; level 1 goes to 2.

52 Run-time recoding: Before - after

Before:
--------------------------------------------------
Survey (1=T2K,2=MS)
  Syst. BP (0/1)         Sample Size   Row Percent
--------------------------------------------------
Total
  Total                     5885.000       100.000
  1                         5885.000       100.000
  2                            0.000         0.000
1
  Total                     2022.000       100.000
  1                         2022.000       100.000
  2                            0.000         0.000
2
  Total                     3863.000       100.000
  1                         3863.000       100.000
  2                            0.000         0.000
--------------------------------------------------

After:
--------------------------------------------------
Survey (1=T2K,2=MS)
  Syst. BP (0/1)         Sample Size   Row Percent
--------------------------------------------------
Total
  Total                    14395.000       100.000
  1                         8510.000        59.909
  2                         5885.000        40.091
1
  Total                     7178.000       100.000
  1                         5156.000        71.840
  2                         2022.000        28.160
2
  Total                     7217.000       100.000
  1                         3354.000        48.050
  2                         3863.000        51.950
--------------------------------------------------

53 Direct standardization: Descript + Stdvar & Stdwgt

TITLE1 'DIRECT STANDARDIZATION';
TITLE2 'SUDAAN DESCRIPT + STDVAR & STDWGT';
PROC DESCRIPT DATA=WORK.T2K_DATA DESIGN=WR;
  NEST OSITE RYVAS;
  WEIGHT WAN_UNIONI;
  SUBGROUP IKA6 T2K;
  LEVELS 6 2;
  STDVAR IKA6;
  STDWGT .1 .1 .2 .3 .2 .1;                       * sum = 1;
  * STDWGT 10 10 20 30 20 10;                     * sum = 100;
  * STDWGT 20000 20000 40000 60000 40000 20000;   * the program rescales the weights itself;
  VAR SYSTBP2;
  TABLES T2K;
  PRINT / STYLE=NCHS;
RUN;

54 Direct standardization: Before - after

Before:
-------------------------------------------------------------------------------------
Variable
  Survey (1=T2K,2=MS)      Sample Size   Weighted Size          Total     Mean  SE Mean
-------------------------------------------------------------------------------------
Systolic blood pressure
  Total                          13612        13716.10     1918258.81   139.85     0.33
  1                               6401         6497.94      867355.06   133.48     0.43
  2                               7211         7218.17     1050903.75   145.59     0.49
-------------------------------------------------------------------------------------

After:
-------------------------------------------------------------------------------------
Variable
  Survey (1=T2K,2=MS)      Sample Size   Weighted Size          Total     Mean  SE Mean
-------------------------------------------------------------------------------------
Systolic blood pressure
  Total                          13612        13716.10     1918258.81   147.73     0.37
  1                               6401         6497.94      867355.06   140.09     0.47
  2                               7211         7218.17     1050903.75   155.13     0.60
-------------------------------------------------------------------------------------

55 Model-standardized means to a file: Regress + Output

TITLE1 'MODEL-STANDARDIZED MEANS TO A FILE';
TITLE2 'SUDAAN REGRESS + OUTPUT';
PROC REGRESS DATA=WORK.T2K_DATA DESIGN=WR;
  NEST OSITE RYVAS;
  WEIGHT WAN_UNIONI;
  SUBGROUP SP2 IKA6 T2K;
  LEVELS 2 6 2;
  MODEL SYSTBP2 = SP2 IKA6 T2K BMI;
  PREDMARG SP2 IKA6 T2K;
  OUTPUT / FILENAME=MARGIN FILETYPE=SAS REPLACE PRED_MRG=ALL;
RUN;

PROC PRINT DATA=MARGIN LABEL;
RUN;

56 Model-standardized means to a file: SUDAAN output

--------------------------------------------------------------------------
Predicted Marginal            Marginal          SE    T:Marg=0     P-value
--------------------------------------------------------------------------
Sex (1=M,2=F)
  1                            140.496       0.346     406.176       0.000
  2                            139.312       0.394     353.964       0.000
Age group
  1                            127.070       0.332     383.051       0.000
  2                            133.904       0.439     305.297       0.000
  3                            142.312       0.393     362.442       0.000
  4                            152.100       0.642     236.785       0.000
  5                            157.804       0.755     208.987       0.000
  6                            159.413       1.090     146.270       0.000
Survey (1=T2K,2=MS)
  1                            132.072       0.382     345.813       0.000
  2                            146.443       0.475     308.568       0.000
--------------------------------------------------------------------------

57 Model-standardized means to a file: SAS output

       Procedure    Table    Predicted
Obs       Number   Number     Marginal   Marginal       SE   T:Marg=0   P-value
  1            4        1            1    140.496    0.346    406.176     0.000
  2            4        1            2    139.312    0.394    353.964     0.000
  3            4        1            3    127.070    0.332    383.051     0.000
  4            4        1            4    133.904    0.439    305.297     0.000
  5            4        1            5    142.312    0.393    362.442     0.000
  6            4        1            6    152.100    0.642    236.785     0.000
  7            4        1            7    157.804    0.755    208.987     0.000
  8            4        1            8    159.413    1.090    146.270     0.000
  9            4        1            9    132.072    0.382    345.813     0.000
 10            4        1           10    146.443    0.475    308.568     0.000

58 Thank you!
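Worked check (not part of the original slides): the odds ratios on the RLOGIST OR slide are the exponentiated beta coefficients from the RLOGIST parameter slide. The Python sketch below reproduces a few of them from the rounded betas and standard errors printed there. The helper name odds_ratio_ci is hypothetical, and the normal quantile 1.96 is an assumption; SUDAAN may use a t quantile based on the design degrees of freedom, so agreement is only expected to within rounding.

```python
import math

def odds_ratio_ci(beta, se, z=1.96):
    """Return (odds ratio, lower limit, upper limit) for a logit coefficient.

    Assumption: a normal-approximation 95% interval (z = 1.96); this is an
    illustrative choice, not necessarily what SUDAAN computes internally.
    """
    return (math.exp(beta),
            math.exp(beta - z * se),
            math.exp(beta + z * se))

# Rounded (beta, SE) pairs copied from the RLOGIST parameter table.
rows = {
    "Sex, level 2": (-0.040, 0.397),
    "Two-step climb, level 1": (0.442, 0.149),
    "Body mass index": (0.086, 0.008),
    "Age": (0.078, 0.003),
}

for name, (beta, se) in rows.items():
    odds, lower, upper = odds_ratio_ci(beta, se)
    print(f"{name:25s} OR={odds:.3f}  95% CI=({lower:.3f}, {upper:.3f})")
```

The printed values can be compared with the Odds Ratio, Lower 95% Limit OR and Upper 95% Limit OR columns; small differences in the third decimal are expected because the inputs are already rounded to three decimals.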