Secondary Data:

Basic Issues

Secondary Data and Youth Violence Prevention Research

Aug 5, 2010 UC DATA

A quick overview

 Secondary data: what is it and where does it come from?

 Survey and Administrative data

 Secondary data: why and how would you want to use it?

 Multiple uses

 Secondary data: where can you find it?

 Sites (archives, research organizations, government agencies)

Strategies (keyword, literature, snowball)

Tools (SDA)

Secondary data: basic characteristics

Primary data

“New” data

Collected to answer specific questions or serve specific needs

Known universe/sample

Tailored data items

Secondary data

“Recycled” data

Collected by others and re-used

Often (but not always) collected for a different use

Value reliant on meta-data (information about the data)

Secondary data: basic characteristics

 Secondary data tend to emerge from two kinds of collection processes:

 Survey data: collection for research purposes, coherent research design, well-defined sampling process, intent to generalize

 Administrative data: collection for program administration or routine record-keeping

 Secondary data may be available either as:

 Microdata: individual level records for a unit of analysis

 Aggregate data: summary counts or statistics across multiple units

 Secondary data may also be either:

 Cross-sectional: data collected at a single point in time

 Longitudinal data: data collected for the same unit of observation at multiple points in time
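The distinctions above can be made concrete with a toy example. The sketch below is illustrative only: the records, the field names (`person_id`, `year`, `injured`), and all values are invented and do not come from any real data set.

```python
from collections import defaultdict

# Hypothetical microdata: one record per person per year (all values invented).
microdata = [
    {"person_id": 1, "year": 2008, "injured": True},
    {"person_id": 2, "year": 2008, "injured": False},
    {"person_id": 1, "year": 2009, "injured": False},
    {"person_id": 2, "year": 2009, "injured": True},
    {"person_id": 3, "year": 2009, "injured": True},
]

def aggregate_by_year(records):
    """Collapse microdata into aggregate data: injury counts per year."""
    counts = defaultdict(int)
    for rec in records:
        counts[rec["year"]] += int(rec["injured"])
    return dict(counts)

def cross_section(records, year):
    """Cross-sectional view: every unit observed at a single point in time."""
    return [rec for rec in records if rec["year"] == year]

def longitudinal(records):
    """Longitudinal view: each unit's observations ordered over time."""
    by_person = defaultdict(list)
    for rec in sorted(records, key=lambda r: r["year"]):
        by_person[rec["person_id"]].append((rec["year"], rec["injured"]))
    return dict(by_person)

yearly_counts = aggregate_by_year(microdata)
snapshot_2009 = cross_section(microdata, 2009)
histories = longitudinal(microdata)
```

Aggregation discards the person identifier, so the yearly counts cannot be turned back into individual histories. This is one reason microdata, when available, support more kinds of analysis than aggregate tables.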

ADMINISTRATIVE DATA ARE UBIQUITOUS

What happens:

When you get a parking ticket?

When you go to the emergency room?

When you enroll your child in school?

When you register and vote?

When you are born, marry, or die?

When you pay taxes?

An administrative record is created

Administrative records most closely tied to youth violence include:

Health/Injury records

Criminal Justice records

Educational System records

Child Welfare records

Administrative Data VS Survey Data

Administrative data characteristics

Restricted universe, but can have large amounts of data (millions of observations)

Data collected only for program administration

Other data items are often spotty, even if described in program documentation

Rarely includes participant opinion

Survey Data Characteristics

Well defined sampling process

Usually fewer observations

American Community Survey (~200K)

GSS (~1,500-6,000)

Public Opinion (~1200)

Individual opinions and characteristics often gathered

Uses of Secondary Data

Exploratory/Preliminary:

What is the ballpark you’re looking at? How much variation is there in your dependent measure? What comparison groups/causal mechanisms can be identified?
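A first pass at "how much variation is there in your dependent measure?" can be as simple as a few summary statistics. The sketch below assumes a hypothetical measure (incidents per 1,000 youth in six areas); the numbers are invented for illustration.

```python
import statistics

# Hypothetical dependent measure: incidents per 1,000 youth in six areas
# (invented values, not taken from any real source).
rates = [4.0, 6.0, 5.0, 9.0, 3.0, 3.0]

mean_rate = statistics.mean(rates)
sd_rate = statistics.stdev(rates)        # sample standard deviation
cv = sd_rate / mean_rate                 # coefficient of variation (unitless)
spread = (min(rates), max(rates))        # the "ballpark" range
```

A coefficient of variation near zero would suggest there is little to explain; a larger value suggests the measure varies enough across units to be worth modeling.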

Research Design:

 What is the sampling frame and how can it be identified/stratified/clustered?

 How did previous researchers phrase questions/ collect data items?

Uses of Secondary Data

Context:

How “important” is your research question?

How many people/areas will it impact? What are the characteristics of your study population and how does it differ from other populations?

Analysis

 Data allow a research question to be answered, in whole or in part.

 Data may be extended or linked to other secondary or primary data collection elements
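Linking usually means attaching values from one source to records from another through a shared key. The sketch below is a minimal illustration under invented assumptions: a small primary survey, a table of county-level rates standing in for secondary data, and a hypothetical `link` helper.

```python
# Hypothetical primary data: a small survey collected by the researcher (invented).
survey = [
    {"county": "A", "respondent": 101, "feels_safe": False},
    {"county": "B", "respondent": 102, "feels_safe": True},
    {"county": "C", "respondent": 103, "feels_safe": True},
]

# Hypothetical secondary data: aggregate county rates from another source (invented).
county_rates = {"A": 12.5, "B": 4.1}

def link(records, lookup, key, new_field):
    """Attach a secondary-data value to each record; None marks a failed match."""
    return [{**rec, new_field: lookup.get(rec[key])} for rec in records]

linked = link(survey, county_rates, "county", "violent_crime_rate")

# Always check which records failed to match before analyzing linked data.
unmatched = [rec["county"] for rec in linked if rec["violent_crime_rate"] is None]
```

Here county "C" has no match in the secondary source. Unmatched records of this kind are common when sources cover different geographies or time periods, and they should be counted and reported, not silently dropped.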

Advantages of Secondary Data

Cost: original data collector bears the burden

Comparability: results may be contrasted with others using same/similar sources

Time: research process can be shortened dramatically

Coverage: data may address points in time or geographies not directly available to researcher

Knowledge/Skill: data collection may use specially trained/knowledgeable staff

Disadvantages/Concerns about Secondary Data

Sample design may be unknown/ undocumented

Quality of data elements may vary dramatically

Data collection strategies/problems may be difficult to ascertain

 Data may be gathered for different purposes or coded in inappropriate ways

Data may be outdated

Cost/ Availability: proprietary or confidential data

ICPSR

(Inter-University Consortium for Political and Social Research) is a membership-based organization that collects data from individual researchers, polling agencies, and governmental and international agencies. Data sets cover areas such as political attitudes and behavior patterns, crime and criminal justice, state and national voting records, election studies, census enumerations, economic behavior, family studies, and social attitudes. Holdings at ICPSR are available to UCB subject to IP verification. (www.icpsr.umich.edu)

Basic Data Search: by keyword

Bibliographic Data Search

Specialty Archives: http://www.icpsr.umich.edu/NACJD/index.html

NACJD: Selected data

National Crime Victimization Survey http://www.icpsr.umich.edu/NACJD/NCVS/

The National Crime Victimization Survey (NCVS) series, previously called the National Crime Survey (NCS), has been collecting data on personal and household victimization since 1973. Data from 1992-2008 available as microdata; a 1979-2004 extract for MSA cores is also available.

National Juvenile Corrections Data http://www.icpsr.umich.edu/NACJD/NCVS/

Includes three series of national juvenile corrections data collections: Census of Juveniles in Residential Placement (CJRP), Juvenile Residential Facility Census (JRFC), and the predecessor to the CJRP series, Children in Custody (CIC). CIC is aggregate data, JRFC is facility-level data, and CJRP is individual-level data. Current access only to CIC.

NACJD: Selected data

National Incident-Based Reporting System http://www.icpsr.umich.edu/NACJD/NIBRS/

Incident-based reporting system for crimes known to the police; part of the Uniform Crime Reporting (UCR) Program; 1996-2005

Incident: one or more offenses committed by the same offender(s) at the same time and place.

Complex data structure:

Group "A" and Group "B" offenses.

"A" includes assault, homicide, and sex offenses; "B" tends to cover less serious crimes.

A: administrative record (ID, state, agency, related segments, date/time)

Offense (up to 10:

Property

Victim (up to 999: age, race, ethnicity, relationship to offender)

Offender (up to 99: age, sex, race)

Arrestee (age, sex, race, ethnicity, date of arrest, resident status, disposition)
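The nesting described above (one incident with several segments of varying length) can be sketched as a small data model. This is an illustration only, not the actual NIBRS file layout: the class and field names are hypothetical, offenses are reduced to plain labels, and the property and arrestee segments are omitted for brevity. Only the segment limits come from the slide.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Segment limits as stated on the slide.
MAX_OFFENSES = 10
MAX_VICTIMS = 999
MAX_OFFENDERS = 99

@dataclass
class Victim:
    age: Optional[int] = None
    race: Optional[str] = None
    ethnicity: Optional[str] = None
    relationship_to_offender: Optional[str] = None

@dataclass
class Offender:
    age: Optional[int] = None
    sex: Optional[str] = None
    race: Optional[str] = None

@dataclass
class Incident:
    # Administrative segment (hypothetical field names).
    incident_id: str
    state: str
    agency: str
    # Repeating segments; offenses are plain labels in this sketch.
    offenses: List[str] = field(default_factory=list)
    victims: List[Victim] = field(default_factory=list)
    offenders: List[Offender] = field(default_factory=list)

    def within_limits(self) -> bool:
        """Check the per-incident segment limits."""
        return (len(self.offenses) <= MAX_OFFENSES
                and len(self.victims) <= MAX_VICTIMS
                and len(self.offenders) <= MAX_OFFENDERS)

def victim_rows(incident: Incident) -> List[dict]:
    """Flatten to one row per victim, repeating the incident-level fields."""
    return [
        {"incident_id": incident.incident_id,
         "state": incident.state,
         "victim_age": v.age}
        for v in incident.victims
    ]

# Invented example: one incident, one offense, two victims, one offender.
example = Incident(
    incident_id="X1", state="CA", agency="Example PD",
    offenses=["assault"],
    victims=[Victim(age=16), Victim(age=17)],
    offenders=[Offender(age=19, sex="M")],
)
rows = victim_rows(example)
```

The flattening step shows why the unit of analysis matters: the same incident yields one row when counted by incident and two rows when counted by victim, so counts from the two views are not interchangeable.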

Example: http://www.icpsr.umich.edu/cgibin/SDA/NACJD/hsda?nacjd+04292-0007

UCR, age;

Specialty Archives http://www.icpsr.umich.edu/icpsrweb/SAMHDA/

SAMHDA: Selected data

Monitoring the Future:

A Continuing Study of the Lifestyles and Values of Youth

Approximately 125 to 140 public and private high schools and approximately 14,000 to 18,000 students are selected in order to provide a representative sample of high school seniors throughout the U.S. In addition, recent samples included 17,000 to 19,000 8th graders from about 180 schools and 14,000 to 18,000 10th graders from about 130 to 140 schools.

Monitoring the Future has been conducted every year since 1975 by researchers at the Institute for Social Research (ISR), University of Michigan. In 1991, the survey was expanded to include 8th and 10th graders.

High school senior respondents are given one of six different questionnaires, which vary by the extent of questions about drug use and by the behaviors other than drug use that are probed. Core questions are asked on every form; questions about violent/deviant behavior and victimization are asked only on specific forms.

Cross-time question indices at http://www.icpsr.umich.edu/files/SAMHDA/PDF/25382-ug.pdf (Grade 12) and http://www.icpsr.umich.edu/files/SAMHDA/PDF/25422-ug.pdf (Grade 8-10)

Health Behavior in School-Aged Children [U.S]

Since 1982, the World Health Organization (WHO) Regional Office for Europe has sponsored a cross-national, school-based study of health-related attitudes and behaviors of young people. These studies, generally known as Health Behavior in School-Aged Children (HBSC), are based on independent national surveys of school-aged children in as many as 30 participating countries. The HBSC studies have been conducted every four years since the 1985-1986 school year. The U.S. sample is roughly 15,000 students from 350 schools. ICPSR has 3 waves, from 1995/96 to 2001/02. Questions cover bullying and violent actions (as recipient and as perpetrator) and weapons at school. Data from non-U.S. countries are available for secondary research from the Norwegian Data Archive (http://www.hbsc.org/survey_data.html).

SAMHDA: Selected data

National Youth Survey (NYS) Series – 7 Waves, 1976-1987

Parents and youth were interviewed about events and behavior of the preceding year to gain a better understanding of both conventional and deviant types of behavior by youths. Data were collected on demographic and socioeconomic status of respondents, disruptive events in the home, neighborhood problems, parental aspirations for youth, labeling, integration of family and peer contexts, attitudes toward deviance in adults and juveniles, parental discipline, community involvement, drug and alcohol use, victimization, pregnancy, depression, use of outpatient services, spouse violence by respondent and partner, and sexual activity. Demographic variables include sex, ethnicity, birth date, age, marital status, and employment of the youths, and information on the marital status and employment of the parents.

http://www.cdc.gov/ViolencePrevention/youthviolence/index.html

WISQARS

WISQARS (Web-based Injury Statistics Query and Reporting System) is an interactive database that provides data about fatal and non-fatal injuries at the aggregate level. Detail by geography, age, sex, race/ethnicity, intent and cause of injury. Tabulations accessed in this system include:

National Electronic Injury Surveillance System-All Injury Program (NEISS-AIP)

NEISS-AIP provides nationally representative data about all types and causes of nonfatal injuries treated in U.S. hospital emergency departments.

National Violent Death Reporting System

Links state-level data on violent deaths for 16 states. NVDRS provides CDC and states with a more accurate understanding of violent deaths.

Fatal Injuries from Death Certificate Records

Death certificate data from the National Vital Statistics System — deaths, death rates, and years of potential life lost (a measure of premature death) by specific causes of injury mortality and common causes of death.

http://www.cdc.gov/ViolencePrevention/youthviolence/index.html

School Health Policies and Programs Study (SHPPS)

SHPPS is a national survey conducted periodically to assess school health policies and programs at state, district, school, and classroom levels.

Youth Risk Behavior Surveillance System (YRBSS)

CDC's YRBSS monitors health risk behaviors that contribute to the leading causes of death and disability among young people in the United States, including violence. Measures include carrying weapons, carrying guns, being in a physical fight, being injured in a physical fight, and being hit or physically hurt by a boyfriend/girlfriend.

Youth Risk Behavior Survey - online http://apps.nccd.cdc.gov/youthonline/App/Default.aspx

Census of Juveniles in Residential Placement

Monitoring the Future: A Continuing Study of the Lifestyles and Values of Youth

Uniform Crime Reports, Summary Reporting of Offenses and Arrests

National Incident Based Reporting System

National Longitudinal Survey of Youth 1997

The National Youth Risk Behavior Survey

National Crime Victimization Survey

National Child Abuse and Neglect Data System Child File

Aggregate data available from OJJDP http://ojjdp.ncjrs.gov/ojstatbb/dat.html

http://www.ndacan.cornell.edu/index.html

Family Research Lab http://www.unh.edu/frl/frlbroch.htm

Roper Center:

The Roper Center archives data from thousands of surveys with national adult, state, foreign, and special subpopulation samples conducted by Gallup, NORC, CBS, ABC, Harris, the LA Times, the NY Times, and many other polling organizations. Polls are available from as far back as the mid-1930s. Holdings at the Roper Center are also available via IP screening. Over ½ million questions are searchable at the Roper Center site. (www.ropercenter.uconn.edu)

http://www.ropercenter.uconn.edu/

Sociometrics

http://www.socio.com/ssedl.php