SS17.1

Data structure for a
continuous-time event history analysis
Jane E. Miller, PhD
The Chicago Guide to Writing about Multivariate Analysis, 2nd Edition.
Overview
• Structure of most survey data: one record per
respondent
• Event history analysis requires separate records for
each period at risk of the event
– One record per spell
• How to create one record per spell
– Components of the dependent variable
– Fixed characteristics
– Time-varying characteristics
Event history analysis: Continuous time data
Data preparation for an event history
• Conducting a continuous-time event history analysis
requires one record per period at risk
– Also known as a spell
• Survey data often contain one record per
respondent
• Creating an event history data set involves
generating one record per spell
– Creating the components of the dependent variable
– Mapping the fixed covariates onto each spell
– Calculating time-varying covariates for each spell
Source data:
1 record per respondent
(A period marks a missing value.)

    Date of   Date of 1st  Date of 1st  Date of 2nd  Date of 2nd  Date of   Date 1st  Date last          Date of 1st    Date of 2nd
ID  birth     marriage     divorce      marriage     divorce      death     observed  observed   Gender  child's birth  child's birth
1   2/1/52    .            .            .            .            .         7/15/85   10/1/10    F       .              .
2   7/15/69   6/22/10      .            .            .            .         9/21/85   11/5/10    M       .              .
3   3/1/65    8/1/90       1/1/97       10/1/04      .            .         10/8/85   5/1/05     M       12/5/95        .
4   3/1/42    6/1/63       .            .            .            10/1/02   12/2/85   10/2/02    F       9/21/64        5/11/67
Creating an analytic data set
for study of divorce
Roles of the columns in the source data (1 record per respondent):
• Dates of beginning of period(s) at risk: dates of 1st and 2nd marriage
• Dates of event: dates of 1st and 2nd divorce
• Dates pertaining to censoring and status at end of observation: date of death, date 1st observed, date last observed
• Dates pertaining to time-varying covariates: dates of 1st and 2nd child's birth
• Fixed covariate: gender
Example timelines for study of divorce
M = Married
D = Divorced
L = Lost to follow-up
O = Censored by end of study.
X = Died
Case 1: Never married -> no spells
Case 2: Married once, censored
by end of survey
Case 3: Married twice, lost to
follow-up before end of survey
Case 4: Married once, died
before end of survey
[Figure: one timeline per case, running from the start of the observation period to the end of the observation period and marked with the letters above. Case 2: M ... O. Case 3: M ... D ... M ... L. Case 4: M ... X. In Case 3, the gap between D and the second M is labeled: Not married -> not at risk of divorce -> not part of a spell.]
Calculating number of spells
for each respondent
• Each respondent contributes a spell for each time they are at
risk of the event under study
• If they are never at risk -> no spells
– Thus some respondents in the original data set might not be included in
the event history analysis. E.g., in an analysis of
• Getting married, anyone who was already married throughout the period of
observation is not at risk of becoming married -> no spells
• Getting divorced, the same respondent would be at risk the entire time!
• If they are at risk once -> one spell
– No more than one spell/respondent for non-repeatable events like death
• For repeatable events -> potential for multiple spells
– In an event history analysis of divorce, anyone who is observed during
two periods of marriage contributes two spells
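The counting rule above can be sketched in Python for the divorce example. This is a minimal illustration, not the author's procedure: the field name `marriages` is hypothetical, the dates are the marriage dates from the source data, and four-digit years are inferred from context (for example, 8/1/90 is read as 1990).

```python
from datetime import date

# Hypothetical layout: one entry per respondent, listing the dates of the
# 1st and 2nd marriage. None marks a missing date. Four-digit years are
# inferred from context, since the slides give two-digit years.
respondents = {
    1: {"marriages": [None, None]},
    2: {"marriages": [date(2010, 6, 22), None]},
    3: {"marriages": [date(1990, 8, 1), date(2004, 10, 1)]},
    4: {"marriages": [date(1963, 6, 1), None]},
}

def count_divorce_spells(record):
    """Each observed marriage is one period at risk of divorce, i.e. one spell."""
    return sum(1 for d in record["marriages"] if d is not None)

spell_counts = {rid: count_divorce_spells(rec) for rid, rec in respondents.items()}
print(spell_counts)  # {1: 0, 2: 1, 3: 2, 4: 1}
```

Respondent 1 never married, so she contributes no spells and drops out of the divorce analysis; respondent 3 contributes two.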
Example spells for a study of divorce
M = Married
D = Divorced
L = Lost to follow-up
O = Censored by end of study.
X = Died
Case 1: Never married ->
Contributes zero spells to the
divorce event history data set
Case 2: Married once, censored
by end of survey. Contributes
one open spell
Case 3: Married twice, lost to
follow-up before end of
survey. Contributes two total
spells: one closed and one
open (censored)
Case 4: Married once, died
before end of survey.
Contributes one open spell
[Figure: the timelines for Cases 1 to 4, with each married period (M up to D, L, O, or X) marked as a spell. Event under study = divorce.]
Event history data: Continuous time
1 record per spell
Case 1 does not contribute
ANY spells at risk of divorce
because she was never married
Cases 2 and 4 each
contribute 1 spell, because
each was married once
Case 3 contributes 2 spells
(periods at risk of divorce),
1 for each time married
    Spell #       Date spell
ID  (marriage #)  started
2   1             6/22/10
3   1             8/1/90
3   2             10/1/04
4   1             6/1/63
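The step from one record per respondent to one record per spell can be sketched as follows. The record layout and the function name `to_spells` are assumptions made for illustration, and four-digit years are inferred from the two-digit years in the source data.

```python
from datetime import date

# Wide source records: one per respondent (hypothetical field names).
# Only observed marriage dates are listed; years are inferred from context.
wide = [
    {"id": 1, "marriages": []},
    {"id": 2, "marriages": [date(2010, 6, 22)]},
    {"id": 3, "marriages": [date(1990, 8, 1), date(2004, 10, 1)]},
    {"id": 4, "marriages": [date(1963, 6, 1)]},
]

def to_spells(records):
    """Emit one dict per spell: respondent ID, spell number, spell start date."""
    spells = []
    for rec in records:
        for n, start in enumerate(rec["marriages"], start=1):
            spells.append({"id": rec["id"], "spell": n, "start": start})
    return spells

spells = to_spells(wide)
for s in spells:
    print(s["id"], s["spell"], s["start"].isoformat())
```

A respondent with no marriages simply produces no rows, which is how Case 1 leaves the analytic data set.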
Dependent variables for an event history
• Duration of spell
• Event indicator
– Can be dichotomous
• Occurred or not
– Can be multichotomous
• Differentiate between different reasons for nonevent
– Death
– Lost to follow-up
• Both components must be constructed from
information about the respondent’s timeline
Measures of duration replace most dates
• In the event history data set, measures of duration replace
– Dates of onset of risk period
– Event occurrence
– Censoring
• Calculated from distance between dates from event history

    Spell #       Date spell  Duration of    Age first       Age at start    Age last
ID  (marriage #)  started     spell (mos.)   observed (yrs)  of spell (yrs)  observed (yrs)
2   1             6/22/10     3.5            16              40              41
3   1             8/1/90      76.5           20              25              45
3   2             10/1/04     6.5            20              39              45
4   1             6/1/63      474.5          43              21              60
Duration calculations
• Unless precise dates are known, events or censoring
are assumed to have occurred half-way through the
period, yielding 0.5 person-units of exposure.
– Assuming that an event occurred in the middle of a time
period corresponds to a constant risk of the event during
that time interval (Trussell and Hammerslough, 1983)
• If exact dates are known, fractional person-time units
can be assigned accordingly.
– For instance, if a person was divorced on March 10, they
would be assigned 10/30 or 0.333 person-months at risk in
that month.
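The two conventions above can be sketched in Python. This is one possible reading, stated as an assumption: the slides do not spell out the arithmetic, so `months_mid_period` counts calendar months between the two dates and treats the last of them as half a month of exposure. Both function names are hypothetical, and the 30-day month follows the March 10 example.

```python
from datetime import date

def months_mid_period(start, end):
    """Months at risk under the mid-period assumption.

    Counts calendar months from start to end, then treats the last of those
    months as 0.5 person-months of exposure (an assumed convention).
    """
    whole = (end.year - start.year) * 12 + (end.month - start.month)
    return whole - 0.5

def fraction_of_final_month(day, days_in_month=30):
    """Exposure in the final month when the exact day is known."""
    return day / days_in_month

# Respondent 3, first marriage: 8/1/90 to 1/1/97 (years read as 1990 and 1997).
first_spell = months_mid_period(date(1990, 8, 1), date(1997, 1, 1))
march_10 = round(fraction_of_final_month(10), 3)
print(first_spell, march_10)  # 76.5 0.333
```

Under this reading, respondent 3's second marriage (10/1/04 to 5/1/05) comes to 6.5 months.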
Detailed indicator of status at end of spell
    Spell #       Date spell  Duration of    Status at end   Divorce event
ID  (marriage #)  started     spell (mos.)   of spell        indicator
2   1             6/22/10     3.5            0               0
3   1             8/1/90      76.5           1               1
3   2             10/1/04     6.5            2               0
4   1             6/1/63      474.5          3               0

Case 2: Married once, still married at end of survey
Case 3: First marriage ended in divorce
Case 3: Married second time, lost to follow-up in 2005
Case 4: Married once, died in 2002

Coding of status indicator:
0 = censored
1 = divorced
2 = lost to follow-up (LFU)
3 = died

Coding of divorce event indicator:
0 = censored, LFU, died
1 = divorced
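
Collapsing the detailed status into the dichotomous divorce indicator can be sketched as below. The codes are those listed above; the function name `divorce_indicator` is hypothetical.

```python
# Status codes as listed above.
STATUS_LABELS = {0: "censored", 1: "divorced", 2: "lost to follow-up", 3: "died"}

def divorce_indicator(status):
    """Collapse the detailed status into the dichotomous event indicator."""
    if status not in STATUS_LABELS:
        raise ValueError(f"unknown status code: {status}")
    return 1 if status == 1 else 0

# (ID, spell #, status at end of spell) for the four spells in the table.
spells = [(2, 1, 0), (3, 1, 1), (3, 2, 2), (4, 1, 3)]
indicators = [divorce_indicator(status) for _, _, status in spells]
print(indicators)  # [0, 1, 0, 0]
```

Keeping the detailed status alongside the indicator preserves the reason for each nonevent, which the dichotomous version discards.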
Dependent variable components
    Spell #       Date spell  Duration of    Status at end   Divorce event  Age first       Age at start    Age last
ID  (marriage #)  started     spell (mos.)   of spell        indicator      observed (yrs)  of spell (yrs)  observed (yrs)  Gender
2   1             6/22/10     3.5            0               0              16              40              41              male
3   1             8/1/90      76.5           1               1              20              25              45              male
3   2             10/1/04     6.5            2               0              20              39              45              male
4   1             6/1/63      474.5          3               0              43              21              60              female

Dependent variable component #1 – duration measure: Duration of spell (mos.)
Dependent variable component #2 – dichotomous event indicator: Divorce event indicator
Fixed and time-varying covariates
• Covariate = independent variable
• Fixed covariates are those that have the same value
for a given respondent throughout the spell
– E.g., except in rare cases, each person’s gender remains
constant
• Time-varying covariates are those whose values can
change for a respondent between or during spells
– E.g., number of children
• Need to map each of these correctly from the one-record-per-respondent data onto the one-record-per-spell data
Fixed covariates
Age at start of spell and gender do not
change during the course of a spell
    Spell #       Date spell  Duration of    Status at end   Divorce event  Age at start
ID  (marriage #)  started     spell (mos.)   of spell        indicator      of spell (yrs)  Gender
2   1             6/22/10     3.5            0               0              40              male
3   1             8/1/90      76.5           1               1              25              male
3   2             10/1/04     6.5            2               0              39              male
4   1             6/1/63      474.5          3               0              21              female
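
Mapping these two covariates onto each spell can be sketched as a lookup on respondent ID. The field names and the helper `completed_years` are hypothetical, four-digit years are inferred from context, and age is taken in completed years, which is an assumption about how the slide's ages were derived.

```python
from datetime import date

# Respondent-level data (hypothetical field names; four-digit years inferred).
people = {
    2: {"birth": date(1969, 7, 15), "gender": "male"},
    3: {"birth": date(1965, 3, 1), "gender": "male"},
    4: {"birth": date(1942, 3, 1), "gender": "female"},
}
spells = [
    {"id": 2, "spell": 1, "start": date(2010, 6, 22)},
    {"id": 3, "spell": 1, "start": date(1990, 8, 1)},
    {"id": 3, "spell": 2, "start": date(2004, 10, 1)},
    {"id": 4, "spell": 1, "start": date(1963, 6, 1)},
]

def completed_years(birth, on):
    """Age in completed years on a given date."""
    before_birthday = (on.month, on.day) < (birth.month, birth.day)
    return on.year - birth.year - (1 if before_birthday else 0)

for s in spells:
    person = people[s["id"]]
    s["gender"] = person["gender"]  # same value copied onto every spell
    s["age_at_start"] = completed_years(person["birth"], s["start"])

print([(s["id"], s["spell"], s["age_at_start"], s["gender"]) for s in spells])
```

Gender is copied unchanged onto both of respondent 3's spells, while age at start of spell is fixed within a spell but differs between his first and second.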
Time-varying covariate
None of these respondents had children prior to their first
marriage (though theoretically some cases could).
Respondent #3 had one child during his first marriage, so the
value of that variable changes between his first and second
marriages (first and second spells at risk of divorce).
    Spell #       Date spell  Duration of    Status at end   Divorce event  Age first       Age at start    Age last                # kids at
ID  (marriage #)  started     spell (mos.)   of spell        indicator      observed (yrs)  of spell (yrs)  observed (yrs)  Gender  start of spell
2   1             6/22/10     3.5            0               0              16              40              41              male    0
3   1             8/1/90      76.5           1               1              20              25              45              male    0
3   2             10/1/04     6.5            2               0              20              39              45              male    1
4   1             6/1/63      474.5          3               0              43              21              60              female  0
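
The number of children at the start of each spell can be sketched as a count of births dated before the spell began. The field names and the function `kids_at_start` are hypothetical, the birth dates are those in the source data, and four-digit years are inferred from context.

```python
from datetime import date

# Children's birth dates per respondent (hypothetical layout; years inferred).
child_births = {
    2: [],
    3: [date(1995, 12, 5)],
    4: [date(1964, 9, 21), date(1967, 5, 11)],
}
spells = [
    {"id": 2, "spell": 1, "start": date(2010, 6, 22)},
    {"id": 3, "spell": 1, "start": date(1990, 8, 1)},
    {"id": 3, "spell": 2, "start": date(2004, 10, 1)},
    {"id": 4, "spell": 1, "start": date(1963, 6, 1)},
]

def kids_at_start(births, spell_start):
    """Count children born strictly before the spell began."""
    return sum(1 for b in births if b < spell_start)

for s in spells:
    s["kids_at_start"] = kids_at_start(child_births[s["id"]], s["start"])

print([s["kids_at_start"] for s in spells])  # [0, 0, 1, 0]
```

Respondent 4's two children were both born after her marriage began, so the value at the start of her only spell is 0.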
Presenting information on event
history construction: Background work
• Most of the gory details of programming creation of
an event history are parts of behind-the-scenes work
– Important to do consistency checks to make sure event
histories were created correctly given
• Original data source of information for timeline construction
• Type of event under study
• Fixed covariates
• Time-varying covariates
– E.g., correct
• Number of spells for each respondent
• Duration and event indicators for each spell
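A few such consistency checks can be sketched in Python. This is an illustration under assumptions: the field names and the function `check_event_history` are hypothetical, the spell values come from the divorce example, and the expected spell counts are taken from the marriage dates in the source data.

```python
# Spell records from the divorce example (hypothetical field names).
spells = [
    {"id": 2, "spell": 1, "months": 3.5, "status": 0, "event": 0},
    {"id": 3, "spell": 1, "months": 76.5, "status": 1, "event": 1},
    {"id": 3, "spell": 2, "months": 6.5, "status": 2, "event": 0},
    {"id": 4, "spell": 1, "months": 474.5, "status": 3, "event": 0},
]
# Number of observed marriages per respondent in the source data.
expected_spells = {1: 0, 2: 1, 3: 2, 4: 1}

def check_event_history(spells, expected_spells):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    counts = {}
    for s in spells:
        counts[s["id"]] = counts.get(s["id"], 0) + 1
        label = f"ID {s['id']} spell {s['spell']}"
        if s["months"] <= 0:
            problems.append(f"{label}: duration is not positive")
        if s["event"] != (1 if s["status"] == 1 else 0):
            problems.append(f"{label}: event indicator disagrees with status")
    for rid, n in expected_spells.items():
        if counts.get(rid, 0) != n:
            problems.append(f"ID {rid}: expected {n} spells, found {counts.get(rid, 0)}")
    return problems

print(check_event_history(spells, expected_spells))  # []
```

Returning a list of problems rather than stopping at the first failure makes it easier to see every record that needs attention in one pass.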
Presenting information on event
history construction
• In the data and methods section, describe:
– Original data source of information for timeline construction
• Dates, status, duration of events
– Type of event under study
– What constitutes censoring
– Fixed covariates
– Time-varying covariates
• Source(s) of information for determining timing of changes in those
variables
• See checklist in chapter 17 of Writing about Multivariate
Analysis, 2nd Edition for more detail on what to report
Summary
• A continuous-time event history analysis requires a
separate record for each period at risk of the event
• For each spell, calculate
– Components of the dependent variable
• Duration measure
• Event indicator
– Fixed characteristics
– Time-varying characteristics
• In data and methods section, describe data sources
and variables for the event history
Suggested resources
• Allison, P. D. 2010. Survival Analysis Using the SAS
System: A Practical Guide, 2nd Edition. Cary, NC: SAS
Institute.
• Trussell, James, and Charles Hammerslough. 1983. “A
Hazards-Model Analysis of the Covariates of Infant
and Child Mortality in Sri Lanka.” Demography 20 (1):
1–26.
• Miller, J. E. 2013. The Chicago Guide to Writing about
Multivariate Analysis, 2nd Edition. University of
Chicago Press, chapter 17.
Suggested online resources
• Podcast on data structure for a discrete-time
event history analysis
Suggested exercises
• Study guide to The Chicago Guide to Writing
about Multivariate Analysis, 2nd Edition.
– Question #3a in the problem set for chapter 17
– Suggested course extensions for chapter 17
• “Reviewing” exercises #2a through 2h
• “Applying statistics and writing” exercises #1 and 2a
Contact information
Jane E. Miller, PhD
jmiller@ifh.rutgers.edu
Online materials available at
http://press.uchicago.edu/books/miller/multivariate/index.html