Longitudinal Workforce Analysis using Routinely Collected Data: Challenges and Possibilities
Shereen Hussein, BSc MSc PhD
King’s College London
Longitudinal Analysis
General advantages
• Can control for individual heterogeneity
• Subjects serve as their own controls
• Between-subject variation is excluded from the error
• Can assess causality better than cross-sectional data
General challenges
• Conventional statistical methods require independence between observations
• Longitudinal data are likely to violate this assumption
• Missing data due to attrition
• Data availability
29/5/2012
Workforce Data Example: NMDS-SC
• Structure
• Design
• Coverage
• Time span
• Type of information collected
• Data collection and archiving
• Size
NMDS-SC data structure
Social care providers in England complete NMDS-SC returns, which feed two databases:
• Providers’ database: aggregate information on the workforce
• Workers’ database (linkable): detailed information on all or some individual workers
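The two-level structure described above can be illustrated with a minimal Python sketch. The field names (`provider_id`, `worker_id`, `n_staff`, `job_role`) and all values are hypothetical, invented for illustration only; they are not the actual NMDS-SC fields.

```python
from collections import defaultdict

# Hypothetical provider-level records: aggregate workforce information.
providers = [
    {"provider_id": "P1", "n_staff": 40},
    {"provider_id": "P2", "n_staff": 12},
]

# Hypothetical worker-level records: detail on all or some individual workers.
workers = [
    {"worker_id": "W1", "provider_id": "P1", "job_role": "care worker"},
    {"worker_id": "W2", "provider_id": "P1", "job_role": "senior care worker"},
    {"worker_id": "W3", "provider_id": "P2", "job_role": "care worker"},
]


def link_workers_to_providers(providers, workers):
    """Attach worker records to their provider using the shared provider key."""
    by_provider = defaultdict(list)
    for w in workers:
        by_provider[w["provider_id"]].append(w)

    linked = []
    for p in providers:
        detailed = by_provider.get(p["provider_id"], [])
        linked.append({
            **p,
            "workers": detailed,
            # Providers may return detail for only some of their workers.
            "share_with_detail": len(detailed) / p["n_staff"],
        })
    return linked


linked = link_workers_to_providers(providers, workers)
```

Keeping the share of staff with worker-level detail alongside each provider makes the "all or some individual workers" caveat visible before any worker-level analysis is attempted.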
NMDS-SC longitudinal analysis: potential
• Data coverage
• Wide range of information on providers and individual workers
• Sector-specific and unique
• Hierarchical structure
• Workforce development and business sustainability
• Timely
– Demographics, austerity, unemployment
• Economics
– Care costs, including turnover costs
– Pay
• Linkable to local data characteristics
Challenges in NMDS-SC longitudinal analysis
• No sampling framework
• No regular intervals for data collection
• Irregularities in data completion by different providers
• Additions/alterations of variables and fields
• Cumulative nature and consequences on data size and structure
• Archiving
Challenges in NMDS-SC longitudinal analysis (continued)
Computational
• Data size
– Innovation in system design and architecture
• Accumulative property
– Scalability of the system
• Changes in data fields
• Variable additions and omissions
• Data over-ride and archiving
– Software and hardware issues
Methodological
• Unusual patterns of follow-up
– Censoring
• Variability in the database over time
• Unbalanced cohort design
• Missing data
– Update frequency
– Attrition
– True exit
• Other methodological issues
Providers’ level longitudinal mapping
• From December 2007 to March 2011
• Linked 18 separate databases at the providers’ level
• Each has between 13,095 and 25,266 records; 421,671 valid records were included in the construction
• Number of updates ranged from 0 to 18 per provider
• Continuous process, with more records added every 3 months
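As a rough illustration of the mapping step, the sketch below stacks a few provider-level snapshots into one event history per provider and counts updates. It assumes that an "update" is a change in a provider's recorded update date between snapshots; the data, identifiers and function names are hypothetical, and this is not the linkage procedure used in the study.

```python
from datetime import date

# Hypothetical snapshots (the study linked 18); each maps
# provider_id -> date the provider last updated its return.
snapshots = [
    {"P1": date(2007, 12, 10), "P2": date(2007, 11, 1)},
    {"P1": date(2008, 3, 5), "P2": date(2007, 11, 1), "P3": date(2008, 2, 20)},
    {"P1": date(2008, 6, 1), "P2": date(2008, 5, 15), "P3": date(2008, 2, 20)},
]


def build_provider_events(snapshots):
    """Return provider_id -> sorted list of distinct update dates."""
    events = {}
    for snap in snapshots:
        for pid, updated in snap.items():
            events.setdefault(pid, set()).add(updated)
    return {pid: sorted(dates) for pid, dates in events.items()}


def count_updates(events):
    """Updates = distinct dates minus the first (baseline) record."""
    return {pid: len(dates) - 1 for pid, dates in events.items()}


events = build_provider_events(snapshots)
updates = count_updates(events)
```

Under this assumed definition a provider that never changes its return has 0 updates, which is consistent with the range of 0 to 18 reported above.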
Meta-data analysis: providers with different number of events
[Figure: number of providers (N Provider, 0 to 12,000) by number of events (N Events, 2 to 18)]
Specific example 1: Providers with 18 updates
Specific example 2: Providers with 2 updates
Density distribution plot of providers with at least 2 updates during the period December 2007 to March 2011
[Figure: density (0.0000 to 0.0015) against update date, with axis ticks at 2008-May-01, 2009-Sep-13 and 2011-Jan-26]
Density distributions of the number of days elapsed between two updated providers’ events
[Figure: density (0.000 to 0.025) against N days between 2 update points per worker (0 to 1000), grouped by interval: (1,180], (180,360], (360,540], (540,720], (720,900], (900,1080]]
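The interval bands in the plot above can be reproduced in outline as follows. This is a minimal sketch: the update dates are invented, and the break points are read off the figure legend, with each band treated as open on the left and closed on the right.

```python
from datetime import date

# Break points taken from the figure legend: (1,180], (180,360], ...
BREAKS = [1, 180, 360, 540, 720, 900, 1080]


def days_between_updates(dates):
    """Days elapsed between each pair of consecutive update dates."""
    ordered = sorted(dates)
    return [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]


def interval_band(days, breaks=BREAKS):
    """Label of the (lo, hi] band containing `days`, or None if outside all bands."""
    for lo, hi in zip(breaks, breaks[1:]):
        if lo < days <= hi:
            return f"({lo},{hi}]"
    return None


# Hypothetical update history for one provider.
history = [date(2008, 1, 15), date(2008, 7, 13), date(2010, 1, 20)]
gaps = days_between_updates(history)
bands = [interval_band(g) for g in gaps]
```

Irregular gaps of this kind are what make the follow-up pattern unusual: the same provider can fall in a short band for one pair of updates and a long band for the next.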
Simple example using providers’ database: workforce stability over time
• Longitudinal changes in care workers’ turnover and vacancy rates
– From January 2008 to January 2010
• Changes in reasons for leaving the sector, as identified by employers
– Differentiating between providers with improved (reduced) turnover rates and those with worse (increased) turnover rates
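The comparison of turnover rates at two time points can be sketched as below. The slides do not define the turnover rate; this sketch assumes it is the number of leavers in the previous 12 months divided by the number of staff employed, and all figures are invented.

```python
def turnover_rate(leavers, employed):
    """Assumed definition: leavers in the past 12 months / staff employed."""
    if employed <= 0:
        return None
    return leavers / employed


def classify_change(rate_t1, rate_t2, tolerance=0.0):
    """Label a provider by the direction of change in turnover from T1 to T2."""
    if rate_t1 is None or rate_t2 is None:
        return "unknown"
    diff = rate_t2 - rate_t1
    if diff < -tolerance:
        return "improved"   # turnover reduced
    if diff > tolerance:
        return "worse"      # turnover increased
    return "unchanged"


# Hypothetical panel: (leavers, employed) at T1 and T2 for each provider.
panel = {
    "P1": {"t1": (10, 40), "t2": (6, 40)},
    "P2": {"t1": (2, 20), "t2": (5, 20)},
    "P3": {"t1": (3, 30), "t2": (3, 30)},
}

changes = {
    pid: classify_change(turnover_rate(*obs["t1"]), turnover_rate(*obs["t2"]))
    for pid, obs in panel.items()
}
```

The `tolerance` parameter is an illustrative addition: setting it above zero would treat very small differences as "unchanged" rather than as improvement or deterioration.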
Pre-analysis
• Selecting and constructing providers’ panel
– Including those with at least two updates within +/- 3 months of T1 and T2
– 2,953 providers with a mean coverage duration of 602 days
• Investigate sample representation
• Data quality checks
• Data manipulation/imputation
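The panel inclusion rule can be sketched as follows. Several points are assumptions rather than details from the slides: T1 and T2 are taken as 1 January 2008 and 1 January 2010, "3 months" is approximated as 91 days, and the rule is read as requiring one update near T1 and another near T2. Provider identifiers and dates are invented.

```python
from datetime import date, timedelta

T1 = date(2008, 1, 1)          # assumed exact date for "January 2008"
T2 = date(2010, 1, 1)          # assumed exact date for "January 2010"
WINDOW = timedelta(days=91)    # assumed approximation of 3 months


def nearest_in_window(dates, target, window=WINDOW):
    """Update date closest to `target` within +/- `window`, or None."""
    candidates = [d for d in dates if abs(d - target) <= window]
    if not candidates:
        return None
    return min(candidates, key=lambda d: abs(d - target))


def build_panel(events):
    """Keep providers with one update near T1 and a later one near T2."""
    panel = {}
    for pid, dates in events.items():
        d1 = nearest_in_window(dates, T1)
        d2 = nearest_in_window(dates, T2)
        if d1 is not None and d2 is not None and d1 < d2:
            panel[pid] = (d1, d2, (d2 - d1).days)  # coverage duration in days
    return panel


# Hypothetical update histories.
events = {
    "P1": [date(2008, 2, 10), date(2009, 6, 1), date(2009, 12, 15)],
    "P2": [date(2008, 1, 20), date(2008, 9, 1)],   # no update near T2
    "P3": [date(2007, 11, 15), date(2010, 3, 1)],
}

panel = build_panel(events)
mean_coverage = sum(days for _, _, days in panel.values()) / len(panel)
```

Averaging the third element of each panel entry gives a mean coverage duration comparable in kind to the 602 days reported above, although the toy data will not reproduce that figure.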
Some findings: changes in turnover rates
Reason for leaving and turnover rate changes
Analysis expansion: next steps
• Consider changes over a longer period of time
• Examine other providers’ characteristics
• Different take on panel inclusion criteria
• Link to individual workers’ longitudinal databases to examine relations with detailed workforce structure
– Pay, qualifications, profile etc.
• Build economic elements, e.g. specific turnover costs, into the longitudinal analysis models
Workers’ level longitudinal analysis
• A much larger database
– Same period of time: over 11M records
• Providers are not required to complete information for ‘all’ workers
– Structural/design missing data
– True missing data
• Linkage issues
– More data fields required for identification and linkage
• Considerably large number of variables and fields
– Careful planning; analysis-tailored data retrieval
• Changes in database
– Amendments, new variables etc.
– Programming-intensive and demanding models (may not be replicable for different databases)
[Figure: number of records (N) by data set index, in separate panels: Records available; Records cannot be used; Records with missing worker ID; Valid records; Records with no update date]
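The record categories shown in the figure above suggest a simple audit of each worker-level data set before linkage. The sketch below assumes that a record "cannot be used" when it lacks a worker ID or an update date; the slides do not state the exact rule, and the field names are hypothetical.

```python
from collections import Counter


def classify_record(record):
    """Assign one record to a usability category (assumed rules)."""
    if not record.get("worker_id"):
        return "missing worker ID"
    if record.get("update_date") is None:
        return "no update date"
    return "valid"


def audit_dataset(records):
    """Count records by category, plus totals available and unusable."""
    counts = Counter(classify_record(r) for r in records)
    summary = dict(counts)
    summary["available"] = len(records)
    summary["cannot be used"] = len(records) - counts["valid"]
    return summary


# Hypothetical records from one data set.
records = [
    {"worker_id": "W1", "update_date": "2008-03-01"},
    {"worker_id": "", "update_date": "2008-03-01"},
    {"worker_id": "W3", "update_date": None},
    {"worker_id": "W4", "update_date": "2008-04-10"},
]

summary = audit_dataset(records)
```

Running the same audit over each data set in turn would give the per-index counts that a plot of this kind summarises.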
Issues to consider
• Suitability of models
– Longitudinal structure
– Competing risks
• Measurement window
– Late entry into risk sets
• Use proxies, other variables in the dataset
• Adopt suitable approach/model
– Censoring (left and right)
• Assumptions
– Guided by:
• Sector-specific knowledge
• Intelligence from other variables in the data
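One standard way to handle late entry into risk sets together with right censoring is a product-limit (Kaplan-Meier) estimator in which a spell only counts as "at risk" after its entry time. The sketch below illustrates that general technique with invented spell data; it is not necessarily the model adopted in this research, and it does not address competing risks.

```python
def km_with_late_entry(spells):
    """Product-limit survival estimate allowing delayed entry.

    spells: list of (entry, exit, event) with entry < exit, where `event`
    is True if the exit was observed and False if right-censored.
    Returns a list of (time, survival) at each distinct event time.
    """
    event_times = sorted({exit_ for _, exit_, event in spells if event})
    survival = 1.0
    curve = []
    for t in event_times:
        # A spell is at risk at t only if it has already entered
        # observation and has not yet exited or been censored.
        at_risk = sum(1 for entry, exit_, _ in spells if entry < t <= exit_)
        exits = sum(1 for _, exit_, event in spells if event and exit_ == t)
        survival *= 1 - exits / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical job spells in months since starting in the sector:
# (month first observed, month last observed, exit observed?)
spells = [
    (0, 5, True),
    (0, 8, False),
    (3, 10, True),
    (6, 12, True),
    (6, 15, False),
]

curve = km_with_late_entry(spells)
```

Ignoring the entry times would inflate the early risk sets with workers who were not yet under observation, which is the bias the "late entry" adjustment is meant to avoid.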
Current longitudinal research
Watch this space!!
• Workforce mobility within the sector
• Occupation durations
• Characteristic-specific probabilities of exiting or remaining in the sector
• Characteristic-specific probabilities of moving employer within the sector or having multiple jobs
• Career pathways within the sector
Acknowledgments
• Thanks to the Department of Health for funding this work
• Thanks to Skills for Care for providing the data on a regular basis
• Thanks to Analytical Research Ltd for their technical and quantitative support
Further information
• Shereen.hussein@kcl.ac.uk
• 02078481669
• See: http://www.kcl.ac.uk/sspp/departments/sshm/scwru/res/knowledge/nmdslong.aspx