BUILDING A DDI-BASED (3.2) HARMONIZED DATA EXTRACT TOOL FOR MIDUS: AN UPDATE
North American DDI Conference
Overview of Presentation
Background on MIDUS
Importance of DDI for
• Harmonization
• Facilitating discovery and complex analysis
Current Project Goals
Implementation of Project Goals
• Creating MIDUS DDI 3.2 Instances
• Upgrading MIDUS-Colectica Repository
Background on MIDUS
Baseline: 1995-96
• Harvard University
• MacArthur Foundation
• N=7,108
• Ages 25-74
• Twin/Sibling samples
MIDUS: Unique Characteristics
Multiple waves (9-10 year interval)
Multiple samples/cohorts
1. National (MIDUS Core)
2. Milwaukee
3. Japan (MIDJA)
4. National (MIDUS Refresher)
5. Milwaukee (Refresher)
Multidisciplinary design
• Aging as integrated bio-psycho-social process
Result
• N=11,500
• 34,000 variables
MIDUS: Unique Characteristics
Multiple waves (9-10 year interval)
Multiple samples/cohorts
Multidisciplinary design
Wide use of MIDUS – Open Data philosophy
• #1 data download at NACDA
• Top 10 data download at ICPSR
• 530+ publications
Status of Current DDI Efforts
MIDUS Metadata Repository/Portal
http://midus.colectica.org/
Current Project Goals
Under a DDI 3.2 rubric…

1. Harmonization (internal, post-hoc)
• Clarify related nature of longitudinal and cross-cohort survey variables (RepresentedVariable)
• Provide information/procedures for reconciliation

2. Custom Data Extract (CDE)
• Allow researchers to focus on variables of interest
• Facilitate accurate merges across numerous datasets
Harmonization
Concordance table
• MIDUS P1 concordance table (Google Spreadsheet)
• Includes “Comparability notes”
• Example: Variable A1PA30 “time since last BP test”
• Comparability notes: “M1 is not directly comparable with M2, MKE, MR, MKER, M3: M1 responses were coded as number of months, while other waves broke out number and unit separately.”
• Offer code for reconciliation
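As an illustration of what reconciliation code for the A1PA30 example could look like, here is a minimal Python sketch. It assumes the later waves record the elapsed time as a number plus a unit, and converts that pair to months so it lines up with the M1 coding. The unit labels and the function name `to_months` are hypothetical; they are not taken from the MIDUS codebooks.

```python
from typing import Optional

# Hypothetical unit labels; the actual MIDUS coding may differ.
MONTHS_PER_UNIT = {
    "weeks": 12 / 52,
    "months": 1.0,
    "years": 12.0,
}

def to_months(number: Optional[float], unit: Optional[str]) -> Optional[float]:
    """Convert a (number, unit) pair to months, the unit used in M1."""
    if number is None or unit is None:
        return None  # keep missing values missing
    factor = MONTHS_PER_UNIT.get(unit.lower())
    if factor is None:
        raise ValueError(f"unknown unit: {unit!r}")
    return number * factor
```

For instance, `to_months(2, "years")` yields 24 months, which can then be compared directly with an M1 response.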
Custom Data Extract
Customized dataset
• Search variables, use shopping basket
• Include variables from across all MIDUS projects
• Merge different datasets
• Different formats (CSV, SPSS, SAS, Stata)
• Associated DDI codebook
Development Milestones
1. Metadata Quality Report
2. Harmonization
3. Web-based Discoverability
4. Data Extraction
Step 1. Metadata Quality Report
Compare the harmonization spreadsheet to the Repository
Check for:
• Missing information
• Inconsistent labels
• Inconsistent data types
Update the metadata to improve quality
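The checks in Step 1 can be sketched roughly as follows. This is an illustrative Python example, not the project's actual report: it assumes both the spreadsheet and the repository can be reduced to a mapping from variable name to a label and a data type. The variable name `B1PA30` and the type values are made up for the example.

```python
def quality_report(spreadsheet, repository):
    """Compare harmonization-spreadsheet entries with repository metadata.

    Both arguments map a variable name to a dict with "label" and "type".
    Returns a list of (variable, issue) tuples, sorted by variable name.
    """
    issues = []
    for name, expected in sorted(spreadsheet.items()):
        actual = repository.get(name)
        if actual is None:
            issues.append((name, "missing from repository"))
            continue
        if not actual.get("label"):
            issues.append((name, "missing label"))
        elif actual["label"].strip().lower() != expected["label"].strip().lower():
            issues.append((name, "inconsistent label"))
        if actual.get("type") != expected["type"]:
            issues.append((name, "inconsistent data type"))
    return issues

# Hypothetical input: B1PA30 and the type values are illustrative only.
sheet = {
    "A1PA30": {"label": "Time since last BP test", "type": "numeric"},
    "B1PA30": {"label": "Time since last BP test", "type": "numeric"},
}
repo = {
    "A1PA30": {"label": "Time since last BP test", "type": "text"},
}
for variable, issue in quality_report(sheet, repo):
    print(variable, "-", issue)
```

The resulting list of issues is what would drive the metadata updates.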
Step 2. Harmonization



Use the harmonization spreadsheet
Create a RepresentedVariable for each row
Store these in the repository
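A rough sketch of Step 2 in Python, assuming each spreadsheet row has a concept name, one column per sample (such as M1 or M2), and an optional comparability note. The `RepresentedVariable` class below is a simplified stand-in for the DDI 3.2 element, not the Colectica SDK type, and the column names `concept` and `note` are assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class RepresentedVariable:
    """Simplified stand-in for a DDI 3.2 RepresentedVariable."""
    name: str
    variables: Dict[str, str] = field(default_factory=dict)  # sample -> variable name
    comparability_note: str = ""

def build_represented_variables(rows: List[Dict[str, str]]) -> List[RepresentedVariable]:
    """Create one RepresentedVariable per concordance row."""
    result = []
    for row in rows:
        # Every non-empty column other than "concept" and "note" is a sample.
        members = {
            sample: var
            for sample, var in row.items()
            if sample not in ("concept", "note") and var
        }
        result.append(RepresentedVariable(row["concept"], members, row.get("note", "")))
    return result

# Hypothetical row layout based on the A1PA30 example.
rows = [
    {"concept": "time since last BP test", "M1": "A1PA30", "M2": "",
     "note": "M1 coded in months"},
]
represented = build_represented_variables(rows)
```

Storing these objects in the repository would go through the repository's own API, which is not shown here.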
Step 3. Web-based Discoverability
Build on top of Colectica Portal
• Searching and information retrieval out-of-the-box
Add cross-reference tables for easy discoverability
Choose variables or groups of variables to include in the data extract
Step 4. Data Extraction
Store master data in Colectica Repository
Based on a user’s selected variables, generate:
• Datasets: CSV, R, SAS, SPSS, Stata
• HTML and PDF codebooks
• DDI XML
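To make the extract step concrete, here is a minimal Python sketch of merging a user's selected variables across datasets and writing CSV. It assumes each dataset is a list of row dictionaries sharing a respondent key; the key name `id` and the variable `V2` are hypothetical, and the real tool would read master data from the repository rather than from in-memory lists.

```python
import csv
import io

def build_extract(datasets, selected, key="id"):
    """Merge rows from several datasets on a shared key, keeping selected variables."""
    merged = {}
    for rows in datasets:
        for row in rows:
            record = merged.setdefault(row[key], {key: row[key]})
            for name in selected:
                if name in row:
                    record[name] = row[name]
    return [merged[k] for k in sorted(merged)]

def to_csv(records, columns):
    """Serialize records to CSV text; variables a respondent lacks are left empty."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()

# Hypothetical data: "id", "other" and "V2" are illustrative names.
wave1 = [{"id": "1", "A1PA30": "6", "other": "x"},
         {"id": "2", "A1PA30": "12", "other": "y"}]
wave2 = [{"id": "1", "V2": "3"}]
records = build_extract([wave1, wave2], ["A1PA30", "V2"])
print(to_csv(records, ["id", "A1PA30", "V2"]))
```

Merging on a single key keeps respondents who appear in only some datasets, with their missing variables left blank in the output.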
Progress
• Complete: Metadata Quality Report
• In Progress: Harmonization
• Upcoming: Web-based Discoverability
• Upcoming: Data Extraction
Acknowledgement

This research project is supported by a grant from
the National Institute on Aging (R03-AG046312).
Thank you
Barry Radler – UW-Madison (bradler@wisc.edu)
Jeremy Iverson – Colectica (jeremy@colectica.com)
Dan Smith – Colectica (dan@colectica.com)
midus.wisc.edu
www.colectica.com