SAS PROC Freq

1. Introduction
Frequency tables show the distribution of variable values. Cross-tabulation tables show combined
frequency distributions for two or more variables. For one-way tables, PROC FREQ can compute chi-square tests for equal or specified proportions. For two-way tables, PROC FREQ computes tests and
measures of association. For n-way tables, PROC FREQ does stratified analysis, computing statistics
within as well as across strata.
2. Syntax
PROC FREQ options;
OUTPUT <OUT= SAS-data-set><output-statistic-list>;
TABLES requests / options;
WEIGHT variable;
EXACT statistic-keywords;
BY variable-list;
3. Details
a) The following options are available in the PROC FREQ statement:
COMPRESS
DATA= SAS-data-set
ORDER= INTERNAL|FREQ|DATA|FORMATTED
FORMCHAR(1,2,7)= 'string'
PAGE
NOPRINT
COMPRESS
The COMPRESS option includes the next one-way frequency table on the same page if there is
enough space to begin the table. By default, the next one-way table begins on the same page only if
the entire table fits on that page.
ORDER= INTERNAL | FREQ | DATA | FORMATTED
The ORDER= option specifies the order in which the variable levels are reported.
INTERNAL: Levels are ordered by their internal (unformatted) value.
FREQ: Levels are ordered by descending frequency count.
DATA: Levels are ordered as they first appear in the input SAS data set.
FORMATTED: Levels are ordered by their external formatted value.
Default: INTERNAL
Note: the ORDER= option does not apply to missing values, which are always ordered first, or to
observations with zero weights.
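As a brief illustration of ORDER=, the sketch below lists the levels of a variable by descending count. It assumes the SASHELP.CARS sample data set that ships with SAS is available; the data set and its Type variable are assumptions made for illustration only.

```sas
/* One-way table of vehicle type, most frequent level first. */
/* SASHELP.CARS is a sample data set assumed for this sketch. */
PROC FREQ DATA=sashelp.cars ORDER=FREQ;
  TABLES type;
RUN;
```

Replacing ORDER=FREQ with ORDER=FORMATTED would instead sort the levels by their formatted values.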
FORMCHAR(1,2,7)= 'string'
The FORMCHAR option defines the characters to be used for constructing the outlines and dividers
for the cells of contingency tables.
The string should be three characters long. The characters are used to denote (1) vertical divider, (2)
horizontal divider, and (7) vertical-horizontal intersection.
Default: FORMCHAR(1,2,7)= '|-+'
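A minimal sketch of FORMCHAR follows. It spells out the default string explicitly so the three positions are visible, and assumes the SASHELP.CLASS sample data set (variables Sex and Age) is available. The setting is only visible in monospace listing output.

```sas
/* Draw crosstab cell borders with '|' (vertical), '-' (horizontal), */
/* and '+' (intersection). SASHELP.CLASS is assumed for this sketch. */
PROC FREQ DATA=sashelp.class FORMCHAR(1,2,7)='|-+';
  TABLES sex*age;
RUN;
```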
PAGE
The PAGE option requests that FREQ print only one table per page.
NOPRINT
The NOPRINT option suppresses all printed output from PROC FREQ. Note that a NOPRINT
option is also available in the TABLES statement. There it suppresses printing of the tables but
still allows printing of the statistics requested by the ALL, CHISQ, CMH, EXACT, MEASURES, and
PLCORR options.
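The two placements of NOPRINT can be contrasted with a short sketch. It assumes the SASHELP.CLASS sample data set; the output data set name WORK.SEX_COUNTS is hypothetical.

```sas
/* NOPRINT in the TABLES statement: the crosstab is suppressed, */
/* but the chi-square statistics are still printed.             */
PROC FREQ DATA=sashelp.class;
  TABLES sex*age / NOPRINT CHISQ;
RUN;

/* NOPRINT in the PROC FREQ statement: nothing is printed;   */
/* the counts are saved to a data set (hypothetical name).   */
PROC FREQ DATA=sashelp.class NOPRINT;
  TABLES sex / OUT=work.sex_counts;
RUN;
```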
b) OUTPUT <OUT= SAS-data-set> <output-statistic-list>;
The OUTPUT statement creates a SAS data set containing statistics computed by PROC FREQ. The
output SAS data set can include any statistics requested in the TABLES statement. You can request
these statistics by using keywords identical to the options used to request them in the TABLES
statement: AGREE, ALL, CHISQ, CMH, CMH1, CMH2, EXACT, MEASURES, and PLCORR. Alternatively,
you can request individual statistics by specifying one or more of the keywords listed below:
AJCHI     BDCHI     CMHCOR    CMHGA     CMHRMS    CONTGY    CQ
CRAMV     EQKAPS    EQWTKAPS  EXACT     JT        KAPPA     LAMCR
LAMDAS    LAMRC     LGOR      LGRRC1    LGRRC2    LRCHI     MCNEM
MHCHI     MHOR      MHRRC1    MHRRC2    N         NMISS     PCHI
PCORR     PHI       PLCORR    RDIF1     RDIF2     RELRISK   RISKDIFF
RISKDIFF1 RISKDIFF2 RRC1      RRC2      RROR      RSK1      RSK2
RSK11     RSK12     RSK21     RSK22     SCORR     SMDCR     SMDRC
STUTC     TREND     TSYMM     U         UCR       URC       WTKAPPA
Only one OUTPUT statement is allowed for each execution of the FREQ procedure. When there are
multiple TABLES statements, the contents of the output SAS data set correspond to the last TABLES
statement; when there are multiple table requests in a TABLES statement, the contents correspond to
the last table request. For each stratum, there is one observation that contains the requested statistics.
The names for the requested statistics are the names of the keywords enclosed in underscores (for
example, _PCHI_). If a statistic has a corresponding p-value, the name for the p-value is formed by
adding P and an underscore before the keyword (for example, P_PCHI). Other variables included are
BY variables, if any, and variables that identify the stratum.
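The OUTPUT statement described above can be sketched as follows. The example assumes the SASHELP.CLASS sample data set; the output data set name WORK.CHISQ_STATS is hypothetical. CHISQ is also requested in the TABLES statement, since some SAS releases require the statistic to be requested there before it can be output.

```sas
/* Save the Pearson and likelihood-ratio chi-square statistics. */
PROC FREQ DATA=sashelp.class;
  TABLES sex*age / CHISQ;
  OUTPUT OUT=work.chisq_stats PCHI LRCHI;
RUN;

/* Inspect the result: one observation holding variables such as */
/* _PCHI_ and P_PCHI (the statistic and its p-value).            */
PROC PRINT DATA=work.chisq_stats;
RUN;
```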
c) TABLES requests / options;
The TABLES statement requests that tables be produced. Any number of TABLES statements can be
included. If no TABLES statement is given, one-way frequencies for all of the variables in the data
set are produced. To request a one-way frequency table for a variable, name the variable in a TABLES
statement. For example: PROC FREQ; TABLES a;
For a crosstabulation table of two variables, give their names separated by an asterisk. The first
variable's values form the rows of the table, and the second variable's values form the columns. For
example: PROC FREQ; TABLES a*b;
For n-way crosstabulation tables, the last variable's values form the columns; the next-to-last variable's
values form the rows. Each level (or combination of levels) of the other variables forms one stratum.
A contingency table is produced for each stratum.
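The three kinds of table request can be put side by side in one step. This is a sketch that assumes the SASHELP.CARS sample data set and its Origin, Type, and DriveTrain variables; none of these names come from the original text.

```sas
PROC FREQ DATA=sashelp.cars;
  /* One-way frequency table. */
  TABLES origin;
  /* Two-way table: Origin forms the rows, Type the columns. */
  TABLES origin*type;
  /* Three-way request: one Origin-by-Type table per DriveTrain level. */
  TABLES drivetrain*origin*type;
RUN;
```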
TABLES requests / options ;
Options that can be used in the TABLES statement:
General: LIST, MISSING, OUT=, V5FMT
Request statistical analysis: AGREE, ALL, CHISQ, CL, CMH, CMH1, CMH2, EXACT, JT, MEASURES, PLCORR, RELRISK, RISKDIFF, TESTF=, TESTP=, TREND
Statistical details: ALPHA=, CONVERGE=, MAXITER=, SCORES=
Request additional table information: CELLCHI2, CUMCOL, DEVIATION, EXPECTED, MISSPRINT, SPARSE, TOTPCT
Suppress printing: NOCOL, NOCUM, NOFREQ, NOPERCENT, NOPRINT, NOROW
Note: see the SAS online manual for more details.
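To show how TABLES options combine, here is a small sketch. It assumes the SASHELP.CARS sample data set, and the choice of options is illustrative only.

```sas
PROC FREQ DATA=sashelp.cars;
  /* Chi-square tests plus expected counts; row, column, and */
  /* overall percentages are suppressed to keep cells short. */
  TABLES origin*drivetrain / CHISQ EXPECTED NOROW NOCOL NOPERCENT;
  /* The same request printed in list form instead of as a grid. */
  TABLES origin*drivetrain / LIST;
RUN;
```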
d) WEIGHT variable;
Normally, each observation contributes a value of 1 to the frequency counts. When a WEIGHT
statement appears, each observation contributes the weighting variable's value for that observation.
The values do not have to be integers. Negative values for the specified variable are allowed. Since
negative values cannot correspond to actual frequencies, the total frequency, percentages, and
statistical calculations are undefined and, therefore, not printed when there are negative weights.
If the value of the weight variable is missing or zero, the corresponding observation is ignored.
Only one WEIGHT statement can be used, and that statement applies to counts collected for all tables.
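A typical use of WEIGHT is analyzing data that are already summarized as cell counts. The sketch below uses made-up counts and hypothetical names (WORK.CELL_COUNTS, treatment, response, count) purely for illustration.

```sas
/* Each input line is one cell of a 2x2 table; the numbers are invented. */
DATA work.cell_counts;
  INPUT treatment $ response $ count;
  DATALINES;
drug    yes 24
drug    no  16
placebo yes 12
placebo no  28
;
RUN;

/* COUNT supplies the frequency each observation contributes. */
PROC FREQ DATA=work.cell_counts;
  WEIGHT count;
  TABLES treatment*response / CHISQ;
RUN;
```

Without the WEIGHT statement, each of the four observations would contribute a frequency of 1.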
e) EXACT statistic-keywords;
The EXACT statement allows you to specify statistics for which to calculate exact p-values. You can
request exact computations for groups of statistics by specifying keywords identical to the TABLES
statement options AGREE, CHISQ, and MEASURES. You can request exact p-values for an
individual statistic by specifying the corresponding keyword in the following list. Note that
specifying the keyword RROR requests exact confidence bounds for the odds ratio for 2x2 tables.
JT        KAPPA     LRCHI     MCNEM     MHCHI     PCHI
PCORR     RROR      SCORR     TREND     WTKAP
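As a minimal sketch of the EXACT statement, the step below requests an exact p-value for the Pearson chi-square test. It assumes the SASHELP.CLASS sample data set, which is small enough for exact computation to finish quickly.

```sas
PROC FREQ DATA=sashelp.class;
  TABLES sex*age / CHISQ;
  /* Exact p-value for the Pearson chi-square statistic. */
  EXACT PCHI;
RUN;
```

Exact computations can be slow and memory-intensive for large tables or large samples, so the keyword list is best kept to the statistics actually needed.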
f) BY <DESCENDING> variables ... <NOTSORTED>;
A BY statement is used with a procedure to obtain separate analyses on observations in groups
defined by the BY variables. The data set being processed need not have been previously sorted by
the SORT procedure. However, the data set must be in the same order as though PROC SORT had
sorted it unless NOTSORTED is specified. If you have used a FORMAT or ATTRIB statement to
group a continuous variable into discrete groups, the BY statement creates BY groups based on the
formatted values. You can also ensure that variables are processed in ascending order by creating an
index for one or more variables in the SAS data set. The usage of the BY statement differs among
procedures. Please refer to the User's Guide for details.
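BY-group processing with PROC FREQ can be sketched as follows. The example assumes the SASHELP.CARS sample data set; the sorted copy WORK.CARS_SORTED is a hypothetical name.

```sas
/* Put the data in BY-variable order first. */
PROC SORT DATA=sashelp.cars OUT=work.cars_sorted;
  BY origin;
RUN;

/* One frequency table of Type for each level of Origin. */
PROC FREQ DATA=work.cars_sorted;
  BY origin;
  TABLES type;
RUN;
```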