National Health and Nutrition Examination Survey

National Health and Nutrition Examination
Survey: A Very General Overview
Taken from various NHANES sources and
Lein’s comments.
NHANES Objective
To measure and assess the health
and nutritional status of adults and
children in the United States
When did NHANES start?
• The Health Examination Survey – the
forerunner in the 1960s
• The first three National Health Examination
Surveys (NHES) were conducted between
1960 and 1970. These surveys were known
as NHES I, II, and III.
• Between 1971 and 1975, a large nutrition
component was added and the name was
changed to NHANES.
Sample
• Civilian, non-institutionalized household
population
• Residents of all states and the District of
Columbia
• All ages
• Unique in combining a home interview
with health examinations conducted in a
Mobile Examination Center (MEC)
• New survey available: 2009-2010
Six Principal Data Collection
Methods
• Household interview
• Personal interviews
• Physical examination (MEC)
• Anthropometry (MEC)
• Diagnostic screening (MEC)
• Laboratory analysis (MEC)
NHANES Mobile Exam Center
(MEC)
MEC examination components
• Dietary interviews/MEC interviews
• Phlebotomy
• Urine collection
• Blood pressure
• Physician’s exam
• Hearing
• Eye exam
• Dental exam
• DXA
• Muscle strength
• Balance
• Anthropometry
• Skin disease/Melanoma
• TB skin test
• Cognitive testing
• Cardiorespiratory fitness
• Peripheral vascular disease
• Peripheral neuropathy
The major categories of NHANES
data files
• Demographics files: survey design and
demographic variables
• Examination files: information collected
through physical exams, dental exams, and
dietary interview components
• Laboratory files: results from specimens
such as blood, urine, hair, air, tuberculosis
skin test, and household dust and water
specimens
• Questionnaire files: household interview
and mobile examination center (MEC)
interview
Examples of NHANES Findings
and Uses
Landmark findings and
public health results
• High blood lead levels: lead taken out of gasoline
• Low folate levels: mandatory food fortification
• Rising levels of obesity: public health action plan
• Racial and ethnic disparities in Hepatitis B: universal
vaccination of all infants and children
Trends in Child and Adolescent Overweight
[Figure: percent overweight (y-axis, 0 to 20) by survey period (1963-5, 1966-70, 1971-4, 1976-80, 1988-94, 1999-00, 2001-2, 2003-4), with separate series for ages 6-11 y and 12-19 y.]
Note: Overweight is defined as BMI >= gender- and age-specific 95th percentile from the 2000 CDC Growth Charts.
Source: National Health Examination Surveys II (ages 6-11) and III (ages 12-17), National Health and Nutrition Examination Surveys I, II, III and 1999-2004, NCHS, CDC.
NHANES Complex Survey Design
(sometimes called “multi-stage” survey design)
NHANES data are NOT obtained using a
simple random sample. Rather, a
“complex”, multistage, probability
sampling design is used to select
participants representative of the
civilian, non-institutionalized US
population.
In Brief
• The entire US is broken into about 40
strata.
• Each stratum is divided into many
primary sampling units (PSUs) – mostly
single counties.
• Within each stratum two PSUs are
selected.
• Within each of the selected PSUs,
individual households are selected and
then individual subjects are selected.
Household/Individual
Oversampling
In some geographic areas, certain age,
ethnic, or income groups are oversampled
so that estimates for those subgroups are
reliable.
E.g., Native American subjects
More on Individual Weights
The sample weight is assigned to
each individual subject. It is a
measure of the number of people in
the population represented by that
sample person in NHANES.
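One way to see what a sample weight means is to add the weights up: the total should be close to the size of the population the survey represents. A minimal sketch, assuming the WORK.DEMO data set read from demo_e.xpt elsewhere in these slides and the interview weight WTINT2YR from the NHANES demographics file:

```sas
/* Sketch: the sum of the sample weights approximates the size of the   */
/* civilian, non-institutionalized US population.                       */
/* Assumes WORK.DEMO holds the NHANES demographics file.                */
proc means data=demo n sum min max;
  var WTINT2YR;   /* 2-year interview sample weight */
run;
```

The minimum and maximum show how unevenly individual subjects count toward that total, which is why unweighted estimates can mislead.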
This design creates three kinds of
survey design variables
STRATA (names end with the
letters “STRA”)
PSUs (names end with “PSU”)
Individual WEIGHTs (names begin
with the prefix “WT”)
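To see these design variables in an actual file, it can help to tabulate the stratum and PSU codes together. A small sketch, assuming the WORK.DEMO data set read from demo_e.xpt elsewhere in these slides; in the public-release files the codes are, to my understanding, masked variance units rather than the real geographic units:

```sas
/* Sketch: list the stratum and PSU codes to see the design structure.  */
/* Assumes WORK.DEMO holds the NHANES demographics file.                */
proc freq data=demo;
  tables SDMVSTRA*SDMVPSU / list missing;   /* one row per stratum/PSU pair */
run;
```

Each row of the listing is one stratum and PSU combination, with the number of sampled persons in it.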
Do I have to use sample weights and
other survey design variables?
• Yes. For NHANES datasets, the use of
sampling weights and sample design
variables is recommended for all
analyses.
• If you fail to account for the
sampling parameters, you may
obtain biased estimates and
overstate significance levels.
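The difference is easy to see by running the same estimate both ways. The sketch below assumes the WORK.DEMO data set read from demo_e.xpt elsewhere in these slides, and uses age (RIDAGEYR) only as a convenient analysis variable:

```sas
/* Naive analysis: treats the data as a simple random sample. */
proc means data=demo mean stderr;
  var RIDAGEYR;              /* age in years */
run;

/* Design-based analysis: accounts for strata, PSUs, and weights. */
proc surveymeans data=demo mean stderr;
  strata  SDMVSTRA;
  cluster SDMVPSU;
  weight  WTINT2YR;          /* interview weight, since age is an interview item */
  var RIDAGEYR;
run;
```

Comparing the two outputs typically shows a different mean (oversampled groups no longer dominate) and a different standard error (clustering is taken into account).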
Selecting the Correct Weight
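To produce estimates appropriately
adjusted for survey non-response, it is
important to check all of the variables in
your analysis and select the weight that
belongs to the smallest analysis
subpopulation. For example, an analysis
that combines interview variables with
day 1 dietary recall variables should use
the dietary day 1 weight (WTDRD1).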
Using Weighting Variables in SAS
SAS allows for three survey design
statements in its survey procedures:
• STRATA Statement (for Strata Vars)
• CLUSTER Statement (for PSU Vars)
• WEIGHT Statement (for Weight Vars)
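An example of a SAS Survey
Procedure with NHANES data
(survey procedure names always begin with the prefix “survey”)
proc surveymeans;      /* uses the most recently created data set */
  strata  SDMVSTRA;    /* stratum variable */
  cluster SDMVPSU;     /* PSU variable */
  weight  WTDRD1;      /* dietary day 1 sample weight */
  var kcal;            /* analysis variable */
run;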
What kinds of NHANES
documents are available online?
• Codebook -- The codebook portion
lists all the variables in the data file.
• Data file documentation -- Provides a
brief description of the file.
• Frequency tables -- Contain the
frequency count for each item in the
data file and can be used to verify the
sample size.
Where and how can I access
NHANES data files?
• NHANES data can be downloaded from
the NHANES website:
http://www.cdc.gov/nchs/nhanes.htm
• NHANES files are in SAS transport file
format (.xpt).
• .xpt files are easily read directly by SAS or
converted to a .sas7bdat file with
StatTransfer.
How do you read .xpt files
directly with SAS?
libname demog xport 'c:\nhanes\demo_e.xpt';
data demo; set demog.demo_e;
run;
1    libname demog xport 'c:\nhanes\demo_e.xpt';
NOTE: Libref DEMOG was successfully assigned as follows:
      Engine:        XPORT
      Physical Name: c:\nhanes\demo_e.xpt
2    data demo; set demog.demo_e;
3    run;
NOTE: There were 10149 observations read from the data set DEMOG.DEMO_E.
NOTE: The data set WORK.DEMO has 10149 observations and 43 variables.
StatTransfer: another option for
using SAS .xpt files
You can also use StatTransfer to
convert SAS .xpt files into .sas7bdat
StatTransfer Screen Shot
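If StatTransfer is not at hand, SAS itself can make the same conversion. A minimal sketch using PROC COPY; the librefs xptin and nh and both paths are placeholders chosen for this example:

```sas
/* Sketch: convert a transport file to a permanent .sas7bdat file using */
/* SAS alone. Librefs and paths are placeholders.                       */
libname xptin xport 'c:\nhanes\demo_e.xpt';   /* transport file (input)  */
libname nh 'c:\nhanes';                       /* folder for the output   */

proc copy in=xptin out=nh;                    /* writes demo_e.sas7bdat  */
run;
```

Afterwards the data set can be referenced as nh.demo_e in any later step without going through the XPORT engine again.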
Do I have to format and label all
variables?
• NHANES provides variable labels
built into its data sets.
• Formats are not included, so you must
create your own formats by using
PROC FORMAT and the FORMAT
statement.
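As an illustration, the sketch below defines a format for the gender variable RIAGENDR and applies it in a frequency table. It assumes the WORK.DEMO data set read from demo_e.xpt elsewhere in these slides; the format name sexfmt is made up, and the code values should always be checked against the codebook:

```sas
/* Sketch: define a value format and attach it to a variable. */
proc format;
  value sexfmt          /* format name is made up for this example */
    1 = 'Male'
    2 = 'Female';
run;

proc freq data=demo;
  tables RIAGENDR;
  format RIAGENDR sexfmt.;   /* display labels instead of 1 and 2 */
run;
```

Putting the FORMAT statement inside a procedure applies the format only for that step; placing it in a DATA step stores it with the data set.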
Lots of Merging with NHANES
The data files remain separate by type of
measurement. This requires that you
merge files together for analysis.
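NHANES files share a respondent sequence number, SEQN, which serves as the merge key. A minimal sketch that joins the demographics file to the dietary day 1 totals file; it assumes the WORK.DEMO data set read from demo_e.xpt elsewhere in these slides, and the dietary file path, the libref diet, and the member name dr1tot_e are assumptions to adjust to your own download:

```sas
/* Sketch: merge the demographics file with the dietary day 1 totals. */
libname diet xport 'c:\nhanes\dr1tot_e.xpt';   /* assumed path */

proc sort data=demo out=demo_s;
  by SEQN;
run;

proc sort data=diet.dr1tot_e out=diet_s;
  by SEQN;
run;

data demo_diet;
  merge demo_s (in=in_demo) diet_s (in=in_diet);
  by SEQN;                 /* respondent sequence number */
  if in_demo;              /* keep everyone in the demographics file */
run;
```

Keeping every record from the demographics file, rather than only the matches, preserves the full sample design for later survey procedures.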
Things to know about Survey Analysis
• Not all software packages are
equipped to analyze complex survey
data.
• See “Summary of Survey Analysis
Software” at Harvard Med’s site for
list and limitations:
www.hcp.med.harvard.edu/statistics/survey-soft/
• All have limitations.
More…
• Stata allows for an interactive “survey
mode”.
• SPSS provides limited menu-driven
procedures.
• SAS provides limited procedures.
• SUDAAN is a program that can work with
SAS to expand its survey procedures. Not
widely available at UCB.
Analyzing Sub-Populations in
Survey Analysis
When analyzing sub-populations in complex
survey designs, it’s important NOT to subset
your data. Instead, create a sub-population
indicator variable. Correctly written
statistical software will allow for the
sub-population variable to be included in the
model specification.
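Creating the indicator is an ordinary DATA step. The sketch below flags adult women as the sub-population of interest; it assumes the WORK.DEMO data set read from demo_e.xpt elsewhere in these slides, and the variable name adult_f and the age cutoff of 20 are hypothetical choices for illustration:

```sas
/* Sketch: build a 0/1 sub-population indicator without dropping anyone. */
data demo2;
  set demo;
  /* 1 = in the sub-population, 0 = not; missing if age or sex is missing */
  if missing(RIAGENDR) or missing(RIDAGEYR) then adult_f = .;
  else adult_f = (RIAGENDR = 2 and RIDAGEYR >= 20);
run;
```

Every subject stays in demo2, so the strata, PSUs, and weights of the full sample remain available to the survey procedures.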
Quote from AJE Article
(Graubard and Korn, 1996)
“One frequently analyzes a subset of the
data collected in a survey when interest
focuses on individuals in a certain
subpopulation of the sampled population.
Although it may seem natural to eliminate
from the data set all data from individuals
outside the subpopulation before analysis,
this procedure may yield incorrect standard
errors and confidence intervals.”
Why is this? It seems
counterintuitive.
In complex (multi-stage) survey designs, the
design variables (strata, PSUs, and weights)
for all subjects are used to compute the
standard errors for sub-populations. Deleting
subjects outside the sub-population can remove
whole PSUs or strata from the data, so the
software no longer sees the full sample design.
The mathematics of this is complex; in general
terms, though, the relative weight of each
subject can only be fully accounted for by
analyzing all of the subjects.
How does SAS deal with sub-populations?
Most SAS Survey procedures use a DOMAIN
statement for sub-population analysis. This
statement identifies a sub-population
indicator variable.
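A minimal sketch of a domain analysis, assuming the WORK.DEMO data set read from demo_e.xpt elsewhere in these slides. It uses the existing gender variable RIAGENDR as the domain variable; a 0/1 indicator you create yourself would be used the same way:

```sas
/* Sketch: sub-population estimates with the DOMAIN statement. */
proc surveymeans data=demo mean stderr;
  strata  SDMVSTRA;
  cluster SDMVPSU;
  weight  WTINT2YR;
  domain  RIAGENDR;     /* one set of estimates per gender code */
  var RIDAGEYR;         /* age in years, used only as an example */
run;
```

The full data set is passed to the procedure and the DOMAIN statement does the splitting. A WHERE or BY statement would instead analyze each subset as if it were the whole sample.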
Your SAS Review Assignment will provide some
practice in using NHANES data with SAS