SS17.2

Data structure for a
discrete-time event history analysis
Jane E. Miller, PhD
The Chicago Guide to Writing about Multivariate Analysis, 2nd Edition.
Overview
• Structure of most survey data: One record per
respondent
• Discrete-time event history analysis requires
separate records for each person-time unit at risk of
the event
• Review: How to create one record per spell
• How to create one record per person-time unit
– Components of the dependent variable
– Fixed characteristics
– Time varying characteristics
Event history analysis: discrete time data
Data preparation for an event history
• Survey data often contain one record per
respondent
• Continuous-time event history data contain one
record per spell
• Discrete-time event history analysis requires one
record per person-time unit within each spell
– E.g., one record for each person-month at risk of divorce,
within each spell at risk of divorce
Source data from survey:
1 record per respondent
ID  Date of  Date of 1st  Date of 1st  Date of 2nd  Date of 2nd  Date of  Date 1st  Date last  Gender  Date of 1st    Date of 2nd
    birth    marriage     divorce      marriage     divorce      death    observed  observed           child's birth  child's birth
1   2/1/52   .            .            .            .            .        7/15/85   10/1/10    F       .              .
2   7/15/69  6/22/10      .            .            .            .        9/21/85   11/5/10    M       .              .
3   3/1/65   8/1/90       1/1/97       10/1/04      .            .        10/8/85   5/1/05     M       12/5/95        .
4   3/1/42   6/1/63       .            .            .            10/1/02  12/2/85   10/2/02    F       9/21/64        5/11/67
Example timelines for study of divorce
M = Married, D = Divorced, L = Lost to follow-up, O = Censored by end of study, X = Died
Case 1: Never married -> no spells
Case 2: Married once, censored by end of survey: M -> O
Case 3: Married twice, lost to follow-up before end of survey: M -> D -> M -> L
Case 4: Married once, died before end of survey: M -> X
Not married -> not at risk of divorce -> not part of a spell
[Timeline figure: one line per case, ending at the end of the observation period]
Continuous-time event history data
• One record for each period at risk (spell)
– Duration of overall spell
– Event indicator at end of spell
ID  Spell #       Date spell  Duration of   Status at     Divorce event  Age first       Age at start    Age last        Gender  # kids at
    (marriage #)  started     spell (mos.)  end of spell  indicator      observed (yrs)  of spell (yrs)  observed (yrs)          start of spell
2   1             6/22/10     3.5           0             0              16              40              41              male    0
3   1             8/1/90      76.5          1             1              20              25              45              male    0
3   2             10/1/04     6.5           2             0              20              39              45              male    1
4   1             6/1/63      474.5         3             0              43              21              60              female  0
Event history timeline:
Discrete time specification
Case 2, Continuous-time version: One four-month spell
Married 6/22/2010 -> Last surveyed 11/5/2010 (end of survey)
Case 2, Discrete-time version: Four person-month units
Married -> 1st person-month -> 2nd person-month -> 3rd person-month -> 4th person-month -> End of survey
Each person-month unit becomes one record -> unit of analysis. All records for each spell include respondent ID and other characteristics.
O = Censored
Discrete-time data set:
ID codes on person-time records
One record per spell
ID  Spell #       Duration of   Status at     Divorce
    (marriage #)  spell (mos.)  end of spell  indicator
2   1             4             0             0
3   1             77            1             1
3   2             7             2             0
• Each person-month record carries the respondent ID
• Each record within a given spell also includes the spell # for that respondent
One record per person-month
ID  Spell #       Record #
    (marriage #)  w/in spell
2   1             1
2   1             2
2   1             3
2   1             4
3   1             1
3   1             2
3   1             3
3   1             …
3   1             77
3   2             1
3   2             2
3   2             …
3   2             7
Record number within spell
• Each month in a spell will generate one person-month record, e.g.,
 – respondent #2 is observed for 4 months -> 4 person-month records
 – respondent #3 contributes a total of 84 records
   • 77 in his first spell
   • 7 in his second spell
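The expansion from one record per spell to one record per person-month might be sketched in Python as follows. This is a minimal illustration rather than the author's code: the field names (`id`, `spell`, `n_months`, `record`) and the function name `expand_spells` are hypothetical, and the input copies the three example spells for respondents #2 and #3.

```python
# One record per spell, copied from the example: respondent 2 has one spell,
# respondent 3 has two. "n_months" is the spell duration in whole months.
spells = [
    {"id": 2, "spell": 1, "n_months": 4},
    {"id": 3, "spell": 1, "n_months": 77},
    {"id": 3, "spell": 2, "n_months": 7},
]


def expand_spells(spell_records):
    """Return one dict per person-month, numbered 1..n_months within each spell."""
    person_months = []
    for s in spell_records:
        for record in range(1, s["n_months"] + 1):
            person_months.append(
                {"id": s["id"], "spell": s["spell"], "record": record}
            )
    return person_months


person_months = expand_spells(spells)

# Count how many person-month records each respondent contributes.
records_per_id = {}
for row in person_months:
    records_per_id[row["id"]] = records_per_id.get(row["id"], 0) + 1
print(records_per_id)  # {2: 4, 3: 84}
```

Every record carries both the respondent ID and the spell #, so the person-month file can later be linked back to the spell file.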
Month counter within spell
The “month # within spell” counter
indicates the start time of the
person-month at risk for that
record. E.g., the first record for a
given spell starts at baseline (time
point 0).
One record per person-month
ID  Spell #       Record #    month #
    (marriage #)  w/in spell  within spell
2   1             1           0
2   1             2           1
2   1             3           2
2   1             4           3
3   1             1           0
3   1             2           1
3   1             3           2
3   1             …           …
3   1             77          76
3   2             1           0
3   2             2           1
3   2             …           …
3   2             7           6
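Because the counter marks the start of each person-month and the first record starts at time point 0, it can be derived from the record number. A minimal sketch, assuming hypothetical field names `record` and `month` and a hypothetical helper `add_month_counter`:

```python
def add_month_counter(person_months):
    """Add 'month', the start time of each record: record 1 starts at month 0."""
    return [dict(row, month=row["record"] - 1) for row in person_months]


# Respondent 2, spell 1: four person-month records.
rows = [{"id": 2, "spell": 1, "record": r} for r in range(1, 5)]
for row in add_month_counter(rows):
    print(row["record"], row["month"])
```

The same rule gives month # 76 for the 77th record of respondent #3's first spell.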
Duration measure for each record within spell
The duration measure equals 1 time unit for every person-time record within a given spell, EXCEPT the last record in the spell, for which it equals 0.5
One record per person-month
ID  Spell #       Record #    month #       Person-months
    (marriage #)  w/in spell  within spell  w/in record
2   1             1           0             1
2   1             2           1             1
2   1             3           2             1
2   1             4           3             0.5
3   1             1           0             1
3   1             2           1             1
3   1             3           2             1
3   1             …           …             1
3   1             77          76            0.5
3   2             1           0             1
3   2             2           1             1
3   2             …           …             1
3   2             7           6             0.5
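One way to check this rule is to confirm that the per-record durations add up to the spell durations in the continuous-time file (3.5, 76.5, and 6.5 months for the three example spells). The sketch below assumes a hypothetical helper name, `person_months_within_record`, and takes only the number of records in a spell as input.

```python
def person_months_within_record(n_records):
    """Duration of each record: 1 month, except 0.5 for the last record in the spell."""
    if n_records < 1:
        raise ValueError("a spell needs at least one person-month record")
    return [1.0] * (n_records - 1) + [0.5]


# (number of records, duration of the spell in months) for the three example spells
examples = [(4, 3.5), (77, 76.5), (7, 6.5)]
totals = [sum(person_months_within_record(n)) for n, _ in examples]
print(totals)  # [3.5, 76.5, 6.5]
```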
Status indicator for each record within spell
The indicator for status at end of record equals 0 for every person-time record within a given spell EXCEPT the last one, because by definition those records end in censoring (the spell is not yet complete)
One record per person-month
ID  Spell #       Record #    month #       Person-months  Status at
    (marriage #)  w/in spell  within spell  w/in record    end of record
2   1             1           0             1              0
2   1             2           1             1              0
2   1             3           2             1              0
2   1             4           3             0.5            0
3   1             1           0             1              0
3   1             2           1             1              0
3   1             3           2             1              0
3   1             …           …             1              0
3   1             77          76            0.5            1
3   2             1           0             1              0
3   2             2           1             1              0
3   2             …           …             1              0
3   2             7           6             0.5            2
Status indicator for last record within spell
The indicator for status at end of record for the last person-time record within each spell takes on the value of the status indicator for the overall spell
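The two rules for status at end of record (0 for every record but the last, and the spell's own status for the last) could be combined as in the sketch below. The constant `CENSORED` and the function name `status_at_end_of_record` are hypothetical; the status values passed in are those shown for the example spells.

```python
CENSORED = 0  # status of a record after which the spell is still ongoing


def status_at_end_of_record(n_records, spell_status):
    """Status for each record: censored until the last record, which takes the spell's status."""
    if n_records < 1:
        raise ValueError("a spell needs at least one person-month record")
    return [CENSORED] * (n_records - 1) + [spell_status]


print(status_at_end_of_record(4, 0))  # [0, 0, 0, 0]
print(status_at_end_of_record(7, 2))  # [0, 0, 0, 0, 0, 0, 2]
```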
Event indicator for each record within spell
One record per person-month
ID  Spell #       Record #    month #       Divorce indicator
    (marriage #)  w/in spell  within spell  for record
2   1             1           0             0
2   1             2           1             0
2   1             3           2             0
2   1             4           3             0
3   1             1           0             0
3   1             2           1             0
3   1             3           2             0
3   1             …           …             0
3   1             77          76            1
3   2             1           0             0
3   2             2           1             0
3   2             …           …             0
3   2             7           6             0
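In the example, only the last record of a spell that ends in divorce has a divorce indicator of 1, so the person-month file should contain exactly as many events as the spell file. A hedged sketch of that construction and check, using hypothetical field names and the three example spells:

```python
# (id, spell #, number of person-month records, divorce indicator for the spell)
spells = [(2, 1, 4, 0), (3, 1, 77, 1), (3, 2, 7, 0)]

person_months = []
for person_id, spell, n_records, spell_divorce in spells:
    for record in range(1, n_records + 1):
        is_last = record == n_records
        person_months.append({
            "id": person_id,
            "spell": spell,
            "record": record,
            # Only the last record of a spell can carry the event.
            "divorce": spell_divorce if is_last else 0,
        })

n_events = sum(row["divorce"] for row in person_months)
print(len(person_months), n_events)  # 88 1
```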
Fixed covariates for each person-time record
Age at start of spell, number of children at start of spell, and gender do not change during the course of a spell, so they have the same value for each person-time record within a given spell
ID  Spell #       Record #    month #       Divorce indicator  Age at start    Gender  # children at
    (marriage #)  w/in spell  within spell  for record         of spell (yrs)          start of spell
2   1             1           0             0                  40              male    0
2   1             2           1             0                  40              male    0
2   1             3           2             0                  40              male    0
2   1             4           3             0                  40              male    0
3   1             1           0             0                  25              male    0
3   1             2           1             0                  25              male    0
3   1             3           2             0                  25              male    0
3   1             …           …             0                  25              male    0
3   1             77          76            1                  25              male    0
3   2             1           0             0                  39              male    1
3   2             2           1             0                  39              male    1
3   2             …           …             0                  39              male    1
3   2             7           6             0                  39              male    1
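Copying fixed covariates onto each person-time record amounts to a lookup by respondent ID and spell #. The following sketch assumes hypothetical field names (`age_at_start`, `gender`, `kids_at_start`) and uses the values from the example table.

```python
# Spell-level (fixed) covariates keyed by (respondent ID, spell #), from the example.
fixed = {
    (2, 1): {"age_at_start": 40, "gender": "male", "kids_at_start": 0},
    (3, 1): {"age_at_start": 25, "gender": "male", "kids_at_start": 0},
    (3, 2): {"age_at_start": 39, "gender": "male", "kids_at_start": 1},
}
n_records = {(2, 1): 4, (3, 1): 77, (3, 2): 7}

person_months = []
for (person_id, spell), n in n_records.items():
    for record in range(1, n + 1):
        row = {"id": person_id, "spell": spell, "record": record}
        row.update(fixed[(person_id, spell)])  # same values on every record of the spell
        person_months.append(row)

print(person_months[0])
print(person_months[-1])
```

Note that a covariate is fixed within a spell, not within a respondent: respondent #3's age and number of children at start of spell differ between his first and second marriages.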
Example timelines for number of children
as time-varying covariate in study of divorce
Columns reordered into chronological order
ID  Date of  Date 1st  Date of 1st  Date of 1st    Date of 2nd    Date of 1st  Date of 2nd  Date of 2nd  Date of  Date last
    birth    observed  marriage     child's birth  child's birth  divorce      marriage     divorce      death    observed
3   3/1/65   10/8/85   8/1/90       12/5/95        .              1/1/97       10/1/04      .            .        5/1/05
4   3/1/42   12/2/85   6/1/63       9/21/64        5/11/67        .            .            .            10/1/02  10/2/02
Case 3: M (no kids) -> C (one kid) -> D -> M (one kid) -> L
Case 4: M (no kids) -> C (one kid) -> C (two kids) -> X
M = Married, D = Divorced, C = Child born
L = Lost to follow-up, O = Censored by end of study, X = Died
Discrete time with
time-varying covariates
• Case 3 has his first child 64 months into his first marriage, and no additional children while observed. # kids at start of record is
 – 0 for month # 0 through 63 of spell 1 (his first 64 records)
 – 1 for month # 64 through 76 of spell 1
 – 1 for all records in spell 2
ID  Spell #  month #     Divorce indicator  # kids at       # kids at
             w/in spell  for record         start of spell  start of record
3   1        0           0                  0               0
3   1        1           0                  0               0
3   1        …           0                  0               0
3   1        64          0                  0               1
3   1        76          1                  0               1
3   2        0           0                  1               1
3   2        …           0                  1               1
3   2        6           0                  1               1
Discrete time with
time-varying covariates
• Case 4 has her first child 15 months into her marriage and a second child in month 47 after marriage. For her, the # kids at start of record is
 – 0 for month # 0 through 14 (her first 15 records)
 – 1 for month # 15 through 46
 – 2 for month # 47 or higher, all in spell 1
ID  Spell #  month #     Divorce indicator  # kids at       # kids at
             w/in spell  for record         start of spell  start of record
4   1        0           0                  0               0
4   1        …           0                  0               0
4   1        15          0                  0               1
4   1        …           0                  0               1
4   1        47          0                  0               2
4   1        …           0                  0               2
4   1        474         0                  0               2
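A time-varying covariate such as # kids at start of record can be filled in from the months in which its value changes. The sketch below is an illustration under stated assumptions: the function name and arguments are hypothetical, and it follows the example tables in switching to the new count at the month # shown there (64 for Case 3; 15 and 47 for Case 4), with records numbered from month # 0 and 475 records for Case 4 (last month # 474).

```python
def kids_at_start_of_record(n_records, kids_at_start_of_spell, birth_months):
    """Number of children for each record (month # 0 .. n_records - 1).

    birth_months lists, for each birth during the spell, the month # of the
    first record that counts the new child.
    """
    values = []
    for month in range(n_records):
        born_so_far = sum(1 for b in birth_months if b <= month)
        values.append(kids_at_start_of_spell + born_so_far)
    return values


case3_spell1 = kids_at_start_of_record(77, 0, [64])
case3_spell2 = kids_at_start_of_record(7, 1, [])
case4_spell1 = kids_at_start_of_record(475, 0, [15, 47])
print(case3_spell1[63], case3_spell1[64], case3_spell1[76])  # 0 1 1
print(case4_spell1[14], case4_spell1[15], case4_spell1[46], case4_spell1[47])  # 0 1 1 2
```

A child born before a later spell begins, as in Case 3's second marriage, enters through the starting value rather than through `birth_months`.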
Presenting information on event
history construction: Background work
• Most of the gory details of creating an event history are part of behind-the-scenes work
 – Important to do consistency checks to make sure event histories were created correctly given
   • Original data source of information for timeline construction
   • Type of event under study
   • Fixed covariates
   • Time-varying covariates
 – E.g., correct
   • Number of spells per respondent
   • Number of person-time records for each spell
   • Duration and event indicators for each person-time record
   • Values of fixed and time-varying covariates for each person-time record
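Some of these consistency checks can be automated by comparing the person-time file against the spell file. The sketch below is one possible approach, not a prescribed procedure: the function name `check_person_months` and all field names are hypothetical, and it covers only the number of records per spell and the placement of the event indicator.

```python
from collections import defaultdict


def check_person_months(spells, person_months):
    """Return a list of problems found; an empty list means every check passed."""
    by_spell = defaultdict(list)
    for row in person_months:
        by_spell[(row["id"], row["spell"])].append(row)

    problems = []
    for s in spells:
        key = (s["id"], s["spell"])
        rows = sorted(by_spell.get(key, []), key=lambda r: r["record"])
        if len(rows) != s["n_months"]:
            problems.append(f"{key}: expected {s['n_months']} records, found {len(rows)}")
            continue
        if not rows:
            continue
        if [r["record"] for r in rows] != list(range(1, len(rows) + 1)):
            problems.append(f"{key}: record numbers are not 1..{len(rows)}")
        if any(r["divorce"] for r in rows[:-1]):
            problems.append(f"{key}: event flagged before the last record")
        if rows[-1]["divorce"] != s["divorce"]:
            problems.append(f"{key}: last record does not match the spell's event indicator")
    return problems


# Example spells for respondents 2 and 3, plus a person-month file built from them.
spells = [
    {"id": 2, "spell": 1, "n_months": 4, "divorce": 0},
    {"id": 3, "spell": 1, "n_months": 77, "divorce": 1},
    {"id": 3, "spell": 2, "n_months": 7, "divorce": 0},
]
person_months = [
    {"id": s["id"], "spell": s["spell"], "record": r,
     "divorce": s["divorce"] if r == s["n_months"] else 0}
    for s in spells
    for r in range(1, s["n_months"] + 1)
]
print(check_person_months(spells, person_months))  # []
```

Similar comparisons could be added for the duration measure and for fixed and time-varying covariates.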
Presenting information on event
history construction
• In the data and methods section, describe:
 – Original data source of information for timeline construction
   • Dates, status, duration of events
 – Type of event under study
 – Unit of person-time (e.g., person-years, person-months)
 – What constitutes censoring
 – Fixed covariates
 – Time-varying covariates
   • Source(s) of information for determining timing of changes in those variables
• See checklist in chapter 17 of Writing about Multivariate Analysis, 2nd Edition for more detail on what to report
Summary
• A discrete-time event history analysis requires a separate
record for each person-time unit at risk of the event
• For each respondent, create correct number of spells
• For each spell, calculate
– Correct number of person-time units
– Components of the dependent variable
• Duration measure
• Event indicator
– Fixed characteristics
– Time-varying characteristics
• In data and methods section, describe data sources and
variables for the event history
Suggested resources
• Allison, P. D. 2010. Survival Analysis Using the
SAS System: A Practical Guide, 2nd Edition.
Cary, NC: SAS Institute.
• Miller, J. E. 2013. The Chicago Guide to Writing
about Multivariate Analysis, 2nd Edition.
University of Chicago Press, chapter 17.
Suggested online resources
• Podcast on data structure for a continuous-time event history analysis
Suggested exercises
• Study guide to The Chicago Guide to Writing
about Multivariate Analysis, 2nd Edition.
– Question #3a in the problem set for chapter 17
– Suggested course extensions for chapter 17
• “Reviewing” exercises #2a through 2h
• “Applying statistics and writing” exercises #1 and 2a
Contact information
Jane E. Miller, PhD
jmiller@ifh.rutgers.edu
Online materials available at
http://press.uchicago.edu/books/miller/multivariate/index.html