How to Use Survey Data From the Community Tracking Study

December 11, 2008
By Paul Leigh and Dan Tancredi
Center for Healthcare Policy and Research, University of
California, Davis
Outline
1. Description of CTS Data
2. How to obtain data from University of Michigan
3. How to up-load CTS data onto personal computer
4. Data analysis
5. References
6. Appendix # 1. Letter to Peter Granda at ICPSR to request data
7. Appendix # 2. Data Protection Plan
1. Description of CTS Data (partially drawn from the website, see references)
The Community Tracking Study (CTS) is the core research effort of the Center for Studying
Health System Change (HSC), which aims to describe how the accessibility, cost and quality of
locally delivered health care are determined by the interactions of providers, insurers and policy
makers. In addition to in-depth site visits, the CTS comprises nationally representative
surveys, including ongoing biennial national surveys of households and physicians, as well as
employer surveys and a single health insurer follow-back survey conducted in the first two
rounds. The first four rounds were conducted beginning in 1996 and were concentrated in a
nationally representative probability sample of 60 communities. The first three of these rounds
also included a “national supplement” probability sample, which surveyed respondents from
throughout the nation and which can be used by itself or in combination with the site sample.
Data from the first four rounds are semi-longitudinal, i.e., roughly 50% of households/physicians
in a later round had also been respondents in the previous round. The physician survey data in
these rounds can be used in panel data analysis of individual respondents. Because the sampling
unit for the household survey is telephone number, not household members, survey designers
decided that panel data analysis was not feasible with that survey. Household and physician
survey data collected over the first four rounds can be used to draw conclusions for the nation as
well as for individual sites, either for individual rounds or by pooling multiple rounds. Having
data from multiple surveys for a common set of sites permits analysts to relate the individual-level
measures obtained from one survey to market-level health system characteristics obtained
from the other surveys. Because of changes to the survey design, the 2007-08 Physician and
Household Surveys will support analyses on the national-level only. Note that there are no links
between any of the survey respondents in the household, physician and employer surveys (e.g.,
respondents to the Household Survey are not patients of physicians in the Physician Survey).
Household Survey
17,800 individuals in 9,400 families comprise the sample for the most recent Household Survey
in 2007, which focuses on tracking changes in health care access, utilization, insurance,
perceptions of care quality and problems paying medical bills. The response rate was 43% in
2007. Particular areas of inquiry include access, satisfaction, use of services and insurance
coverage. Information about health status, sociodemographic characteristics and employment is
also collected. Mathematica Policy Research conducts the Household Survey for HSC. The first
four household surveys were conducted in 1996-97, 1998-99, 2000-01, and 2003. The fifth
survey was conducted primarily in calendar year 2007.
Physician Survey
Physicians respond to a series of questions about problems they face in practicing medicine,
quality of care, access to services, information technology, and sources of
practice revenue and compensation, as well as questions about their practice arrangements and
care practices. More than 12,000 practicing physicians across the country
provided perspectives on how health care delivery is changing in the first three rounds of the
survey (1996-97, 1998-99, 2000-01), while more than 6,600 physicians were interviewed in
round four (2004-05). The 2008 Physician Survey is currently underway and over 4,500
physicians are expected to respond. Unlike previous rounds of the survey that were administered
over the telephone, this round uses a mail questionnaire.
2. How to obtain data from ICPSR, including precautions
CTS survey data are available through the Health and Medical Care Archive (HMCA) at the
University of Michigan’s Interuniversity Consortium for Political and Social Research (ICPSR).
UC Davis is a member of ICPSR and public-use versions of the survey data are freely available
to UC Davis researchers and can be downloaded from the web. Web downloading requires that
the user complete a simple registration form and agree to reasonable terms regarding data use.
The public-use data are very helpful for preliminary analysis. However, these files do not include
certain data fields whose inclusion could jeopardize the confidentiality of survey participants. In
particular, the public-use data do not include site identifiers and other survey design variables
used by survey data analysis procedures to account for the complex survey design, nor do they
include physician identifier codes that can be used to link data from adjacent survey rounds in
order to conduct panel data analysis.
The so-called restricted-use data contain more complete information, including the data fields
needed for design-adjusted and panel data analysis. Access to these data requires that the user
apply for permission and agree to strict terms contained in a data use agreement. Practically
speaking, the user needs to write a letter, fill out a Data Protection Plan, and then snail-mail
these to the person in charge of the CTS at the ICPSR. See the appendix, below, for the letter and
Plan we sent in 2006. In response, the requested data arrived on a single CD about two weeks
after we sent our letter.
The ICPSR staff are quite serious about safeguarding the data while it is in use and about
destroying it once you have finished. They contacted Paul in July 2008 and asked whether we had
destroyed the data. We said “no” and asked for an extension, which they granted, but only until
January 31, 2009.
3. How to up-load CTS data onto personal computer
The data products in the CTS series are similar to other titles in ICPSR’s highly regarded
archives in that they are well documented and include data definition statements that allow the
data to be easily used by a person with basic experience using any of the three major statistical
analysis applications: SAS, SPSS or Stata. Data products are identified by the combination of
version (public-use vs. restricted-use), survey type (household vs. physician vs. employer vs.
health plan) and survey round. Each product includes a rectangular ASCII data file, data
definition statements specific to SAS, SPSS or Stata, a very useful and complete user’s guide and
separate codebook (in PDF format), and a small collection of short text files with miscellaneous
notes of potential interest to the user.
For example, to read the data into SAS, one would simply need to load the accompanying SAS
data definition file into the SAS Program Editor and update the INFILE statement so that it refers
to the location of the ASCII data file. It’s also a good idea to modify the DATA statement by
supplying a dataset name (otherwise SAS will give the dataset a default name, such as DATA1).
Once the data definition statements (i.e. SAS program) have been updated, they can be submitted
in SAS for execution. Execution of the data definition statements will create user-defined
formats and a SAS dataset with variables properly labeled and formatted, ready for additional
analyses. It’s a good idea to save the statements as a program file, which can then be reused and
expanded to include additional data and proc steps needed for particular analyses.
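For readers who want to see what the data definition statements do, the same idea (applying a column layout to each line of the rectangular ASCII file) can be sketched in Python. This is an illustration only: the variable names, column positions and sample records below are hypothetical and are not taken from any CTS codebook. The real layout is the one given in the data definition statements shipped with each product.

```python
# Hypothetical fixed-width layout: (variable name, first column, last column),
# 1-based and inclusive, in the style of a SAS INPUT statement.
# These names and positions are made up; they are NOT from a CTS codebook.
LAYOUT = [
    ("PHYSID", 1, 6),
    ("ROUND", 7, 7),
    ("AGE", 8, 9),
    ("WEIGHT", 10, 16),
]


def parse_record(line, layout=LAYOUT):
    """Slice one fixed-width record into a dict of stripped strings."""
    return {name: line[first - 1:last].strip() for name, first, last in layout}


def read_fixed_width(lines, layout=LAYOUT):
    """Parse an iterable of fixed-width lines, skipping blank lines."""
    return [parse_record(line.rstrip("\n"), layout) for line in lines if line.strip()]


# Two made-up records standing in for lines of the ASCII data file.
SAMPLE = [
    "0001013451234.50\n",
    "0001023520987.25\n",
]

records = read_fixed_width(SAMPLE)
for rec in records:
    print(rec["PHYSID"], rec["AGE"], float(rec["WEIGHT"]))
```

With a real file, the SAMPLE list would be replaced by an open file handle, which plays the role of the updated INFILE statement in the SAS workflow described above.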
4. Data analysis
It is highly recommended that the user browse the user guides that accompany CTS survey data,
in order to become acquainted with important features of the CTS data. Most notably, the CTS
surveys follow complex probability sampling survey plans that involve such features as unequal
response probabilities, stratification and clustering. To account for these features when making
point and variance estimates for statistical parameters of interest, the use of SUDAAN software
is the preferred and sometimes only option, although the survey design procedures available in
SAS and Stata can provide reasonable adjustments for many analyses. Guidance on these matters is
available in the user guides and in documentation referenced therein to technical publications
available from the CTS website.
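To make concrete why the design variables matter, here is a minimal Python sketch of a design-adjusted estimate: a weighted mean with a Taylor-linearization standard error that treats clusters (primary sampling units) as sampled with replacement within strata. It is a generic textbook estimator, not the procedure prescribed by the CTS user guides, and it is no substitute for SUDAAN or the survey procedures in SAS or Stata. The stratum, cluster, weight and outcome values are invented for illustration.

```python
from collections import defaultdict


def weighted_mean_with_se(rows):
    """Return (weighted mean, standard error) for rows of (stratum, psu, weight, y).

    The variance uses Taylor linearization for a ratio estimator and assumes
    PSUs are sampled with replacement within each stratum.
    """
    rows = list(rows)
    total_w = sum(w for _, _, w, _ in rows)
    mean = sum(w * y for _, _, w, y in rows) / total_w

    # Linearized score for each observation, summed to PSU totals by stratum.
    psu_totals = defaultdict(lambda: defaultdict(float))
    for stratum, psu, w, y in rows:
        psu_totals[stratum][psu] += w * (y - mean) / total_w

    variance = 0.0
    for stratum, totals in psu_totals.items():
        n = len(totals)
        if n < 2:
            raise ValueError(f"stratum {stratum!r} needs at least two PSUs")
        centre = sum(totals.values()) / n
        variance += n / (n - 1) * sum((t - centre) ** 2 for t in totals.values())
    return mean, variance ** 0.5


# Invented data: (stratum, PSU, sampling weight, 0/1 outcome).
ROWS = [
    ("A", 1, 10.0, 1.0),
    ("A", 1, 10.0, 0.0),
    ("A", 2, 20.0, 1.0),
    ("B", 3, 5.0, 0.0),
    ("B", 4, 5.0, 1.0),
]

mean, se = weighted_mean_with_se(ROWS)
print(round(mean, 3), round(se, 4))
```

Because the standard error depends on stratum and cluster membership, it cannot be computed from files that omit those fields, which is one reason the public-use data are limited to preliminary analysis.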
In addition, the user should be aware that missing values for many key variables are replaced by
imputed values. Another notable feature about the data is that respondents in multiple rounds of a
survey are not identifiable by a single ID field. Instead, the CTS supplies adjacent-round panel
ID fields. These allow, say, the linking of round 4 data to round 3 data. In order to link round 4
data to round 2 data, though, one has to make use of this ID as well as the separate panel ID field
on the round 3 survey that allows linking to the round 2 survey.
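The two-step linkage just described can be sketched as follows. The field names ("id4", "id3", "prev_id") and the sample IDs are hypothetical stand-ins; the actual panel ID variable names should be taken from the user guide for each round.

```python
def link_round4_to_round2(round4, round3):
    """Map round 4 respondent IDs to round 2 IDs by chaining through round 3.

    round4: list of dicts with hypothetical keys "id4" and "prev_id"
            (the respondent's round 3 ID, or None if not in round 3).
    round3: list of dicts with hypothetical keys "id3" and "prev_id"
            (the respondent's round 2 ID, or None if not in round 2).
    """
    r3_to_r2 = {r["id3"]: r["prev_id"] for r in round3 if r["prev_id"] is not None}
    links = {}
    for r in round4:
        id3 = r["prev_id"]
        # Keep only respondents who appear in all three rounds.
        if id3 is not None and id3 in r3_to_r2:
            links[r["id4"]] = r3_to_r2[id3]
    return links


# Invented records: D1 responded in rounds 2, 3 and 4; D2 only in rounds 3 and 4;
# D3 only in round 4.
ROUND3 = [{"id3": "C1", "prev_id": "B1"}, {"id3": "C2", "prev_id": None}]
ROUND4 = [
    {"id4": "D1", "prev_id": "C1"},
    {"id4": "D2", "prev_id": "C2"},
    {"id4": "D3", "prev_id": None},
]

links = link_round4_to_round2(ROUND4, ROUND3)
print(links)
```

Note that a respondent who skipped round 3 cannot be linked this way, since each ID field only points to the adjacent round.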
These and similar features can present serious pitfalls to a user who ignores the information
contained in the user guides! Additional details and guidance are available in the extensive (but
somewhat formidable) collection of publications available at the HSC/CTS website
(www.hschange.org).
5. References
http://www.hschange.com/index.cgi?data=01
6. Appendix # 1. Letter to ICPSR to request data
DEPARTMENT OF PUBLIC HEALTH SCIENCES
UNIVERSITY OF CALIFORNIA
ONE SHIELDS AVENUE
DAVIS, CALIFORNIA 95616-8638
(530) 752-2793
FAX: (530) 752-3239
http://www.phs.ucdavis.edu/
August 3, 2006
Peter Granda
Health and Medical Care Archive
ICPSR
330 Packard, Room 2132
Ann Arbor, MI 48104
Dear Mr. Granda,
I am writing to ask for some Restricted Data from the Community Tracking Study Physician
Survey. I have enclosed three copies of my application. If I have forgotten anything from the
application would you please let me know.
Thank you for considering this request.
Sincerely,
Paul Leigh
Professor of Health Economics
Department of Public Health, TB168
UC Davis Medical School
One Shields Avenue
Davis CA 95616-8638
530-754-8605
pleigh@ucdavis.edu
7. Appendix # 2. Data Protection Plan
Date: August 3, 2006
Application for Community Tracking Study
Physician Survey, 2000-2001 Restricted Data
File
Contents:
1. Application
2. Vitae
3. Detailed Data Protection Plan
1. Application
For Project Entitled: “New Estimates of Career
Satisfaction Across Specialties”
Name of Principal Investigator: J. Paul Leigh
Title: Professor
Department (if applicable): Public Health Sciences
Organization: University of California Davis Medical School
Street Address: One Shields Ave, TB-168
City, State, ZIP: Davis, California, 95616-8638
Phone: 530-754-8605
Fax: 530-752-3239
Email: pleigh@ucdavis.edu
Name of Co-Principal Investigator (if applicable): None
2. Title of research project for which the CTS Physician Survey, 2000-2001
restricted data file is requested.
New Estimates of Career Satisfaction Across Physician Specialties
3. Short description of research project including research questions, primary
methodology, categories of variables to be used (attach additional sheets if
required).
The proposed project will rely on a prior study entitled:
Physician Career Satisfaction Across Specialties
And written by J. Paul Leigh, PhD; Richard L. Kravitz, MD, MSPH; Mike Schembri, MS;
Steven J. Samuels, PhD; Shanaz Mobley, BS
Arch Intern Med. 2002;162:1577-1584.
ABSTRACT
Background The career satisfaction and dissatisfaction physicians experience
likely influence the quality of medical care.
Objective To compare career satisfaction across specialties among US physicians.
Methods We analyzed data from the Community Tracking Study of 12 474
physicians (response rate, 65%) for the late 1990s. Data are cross-sectional. Two
satisfaction variables were created: very satisfied and dissatisfied. Thirty-three
specialty categories were analyzed.
Results After adjusting for control variables, the following specialties are
significantly more likely than family medicine to be very satisfying: geriatric internal
medicine (odds ratio [OR], 2.04); neonatal-perinatal medicine (OR, 1.89);
dermatology (OR, 1.48); and pediatrics (OR, 1.36). The following are significantly
more likely than family medicine to be dissatisfying: otolaryngology (OR, 1.78);
obstetrics-gynecology (OR, 1.61); ophthalmology (OR, 1.51); orthopedics (OR,
1.36); and internal medicine (OR, 1.22). Among the control variables, we also
found nonlinear relations between age and satisfaction; high satisfaction among
physicians in the west north Central and New England states and high
dissatisfaction in the south Atlantic, west south Central, Mountain, and Pacific
states; positive associations between income and satisfaction; and no differences
between women and men.
Conclusions Career satisfaction and dissatisfaction vary across specialty as well as
age, income, and region. These variations are likely to be of interest to residency
directors, managed care administrators, students selecting a specialty, and
physicians in the groups with high satisfaction and dissatisfaction.
This prior study used the first wave of the CTS data from the 1990s. We would
simply like to update the original study by using the 2000-2001 data.
4. What types of data from other sources will be merged with the CTS Physician
Survey, 2000-2001 restricted data file?
We do not plan to merge any data with the CTS restricted file.
5. State reasons why the CTS Physician Survey, 2000-2001 public use data file is
not adequate for conduct of the research project.
The public use file contains only 6-7 very broad specialties. Since the focus of our
study will be on differences across specialties, we need more than 7 specialties. Our
prior study used the restricted file for 1996-97.
6. Describe all the ways that you intend to use the results of the research, including
plans for public dissemination.
We plan to publish the paper in a good medical journal.
7. Provide names, titles, and affiliations of other members of the research team
who will have access to the restricted data or to output derived from these data. If
not all members have been selected, please list as "unassigned" and indicate the
job titles. Include individuals who are employed by different organizations.
Richard Kravitz MD, Director, Center for Health Services Research in Primary Care, and
Professor of Medicine, University of California Davis Medical Center; Dan Tancredi
PhD, Senior Research Statistician, Center for Health Services Research in
Primary Care, University of California Davis Medical Center.
8. If employed at an organization that has a current NIH Multiple Project
Assurances (MPA) Certification Number or Federal Wide Assurances (FWA)
Certification Number, please provide the number and expiration date.
FWA 00004557
9. If a member of the proposed research team, including subcontractors, is
employed at an organization that does not have an NIH Multiple Project Assurances
(MPA) Certification Number or Federal Wide Assurances (FWA) Certification
Number, please respond to the following questions:
Not applicable. Both Tancredi and Kravitz are employed by the University of
California, Davis.