Breaking down and cutting across silos
Robert M. Goerge, Ph.D.
FCSM Statistical Policy Seminar
Data Communities Coming Together to Support the Enhanced Use of Administrative Records
Tuesday, December 4, 2012, 3:40 – 5:30 pm, Washington Convention Center

My charge for this session
• Compare the challenges facing the different data communities.
• Good news
• What we really want
• The nature of the challenges
• The communities

Good news first
• States and cities are developing their administrative data sources faster than ever
• They are even using the data
• And they are making the data public, so that data entrepreneurs are creating apps that inform the public and policymakers
• There are a number of federal initiatives that are promoting the development (not necessarily the use) of administrative data

Examples
• Given the national effort to improve our competitiveness, a focus of the federal government has been in education and workforce development.
• In June 2012, the U.S. Department of Education (ED) awarded new Statewide Longitudinal Data Systems (SLDS) grants (started in 2005) and the U.S. Department of Labor (DOL) awarded new Workforce Data Quality Initiatives (WDQI) grants (started in 2011).
• Eight states received their first SLDS grants (Delaware, Oklahoma, New Jersey, South Dakota, Vermont, West Virginia, Puerto Rico, and the U.S. Virgin Islands).
• Three states (Hawaii, New Jersey, and Rhode Island) have new SLDS grants focused on workforce linkages and WDQI grants.
• Of course, the Longitudinal Employer-Household Dynamics (LEHD) program is the premier example of linking data to provide greater intelligence around employment.
http://www.dataqualitycampaign.org/files/2012%20SLDS%20and%20WDQI%20grants.pdf

Employment in 2010 was Highest for CPS Graduates Who Enrolled in College
CPS Cohort: 47,006
• Graduated high school: 57% of cohort (26,696)
– Enrolled post-secondary (70%): employed any quarter in 2010: 73%; for those employed, avg # quarters employed: 3.4; avg quarterly earnings: $4,832
– No post-secondary (30%): employed any quarter in 2010: 65%; for those employed, avg # quarters employed: 3.3; avg quarterly earnings: $4,721
• Dropped out of high school: 37% of cohort (17,281)
– Enrolled post-secondary (28%): employed any quarter in 2010: 55%; for those employed, avg # quarters employed: 3.1; avg quarterly earnings: $3,887.76
– No post-secondary (72%): employed any quarter in 2010: 45%; for those employed, avg # quarters employed: 3.0; avg quarterly earnings: $3,826.76
• Left/transferred out of CPS: 6% of cohort (2,973)
– Employed any quarter in 2010: 56%; for those employed, avg # quarters employed: 3.1; avg quarterly earnings: $4,392.18

However …
• It’s happening to different degrees in different states, and there is wide variation in who has access to the data that is being created and in the quality of the data that is being built.
• It’s also taking many years to develop these efforts in states
• Best practices have not been disseminated
• States often rely on large corporate vendors, who will only go so far, and government agencies don’t have the skilled staff necessary to take full advantage of the efforts
• Much of this exists because of …

Silos of all kinds
• Across levels of gov’t – fed, state, county, city
• Within levels of government – agency silos
• Within agencies and across agencies – program silos
• Across domains – health, education, workforce/employment, law enforcement, anti-poverty
• Academic/professional silos – disciplines have their own interests
• Advocacy silos
• All work to the detriment of comprehensive data made available in an efficient format conducive to policy research and analysis

Characteristics of silos
• They are someone’s “turf”
• They are someone’s special interest
• They have their own set of laws, rules and regulations
• They have their own data
• They have their own specific reason for having data or not having data that does not cross into other silos
“Good luck getting the data sharing agreement through our lawyers….”

Everything is related
• Special interests want us to believe that problems can be addressed one-by-one
• But everyone knows that:
– Early nutrition and good parenting are related to learning
– Learning is related to getting a job
– A parent having a job is related to child well-being
– Lack of school success is related to criminal behavior
• This is why we believe that “integration,” or breaking down the silos, is necessary in order to make progress, however you define that.

For example
• Of all the poor people that the state serves, 23 percent of them use about 86 percent of the dollars in Medicaid, the correctional system and the child welfare system.
• We also know that the greatest school failure happens in the areas where these 23 percent live. We also know that there are high levels of family violence within these households.
• What we don’t know, given current siloed policy and practice regimes, is what to do about it at scale. (We have some evidence-based practices that have been shown to work at small scales in rather controlled environments.)
• This work, which cuts across all silos, allows us, at least, to know where we need to target our social programs.

Data collection -- For each area, different flow
Stages: Frontline collection; Data stays local; City/county; State; Federal; Researchers
• Medicaid: Claims made by providers for health care; Data stays with provider; State MMIS; Analysts; State/City/County policymakers; Federal CMS (MSIS); Policymakers; Researchers; Federal Policymakers
• UI wage data: Employer to State Agency; Employer gets back?; LEHD; Researchers from state; State; Researchers through Census; Federal? Revenue
• Elementary school data: Teacher to District MIS; Teacher gets back?; Researchers from district under FERPA; State SIS (SLDS); Federal?; Researchers

Interaction with local public sector
• 30 years ago, when there was less data, most public sector agencies had handfuls of analysts
• Now the Research Director, if there is one, has few, if any, analysts
• More of a focus on Quality Assurance/Compliance
• However, the federal government is requiring evidence-based practice in many areas of human services, which is a major challenge, given the lack of research expertise in these agencies

Interaction with public sector (cont’d)
• Data sharing agreements
– More complicated as identity theft became more prevalent
– More complicated as FERPA, HIPAA, CFR 42 …
– More complicated as leaders and their lawyers viewed information as power and potential negative media
• Contracts
– Certainly the easiest way to work with government, even though universities are concerned with academic freedom
• Evaluations
– Done more and more by private, non-university-based organizations (MDRC, Mathematica, SRI …)

New world order
• Data sits in governmental (or private*) databases, either static (Census) or continuously updated by the transactions completed by the government agency or private entity.
• When needed or periodically, data is transferred to an analytic engine that conducts a specified analysis – descriptive, multivariate, mapped …
• OR, it is posted on a data portal with API capability for anyone (?) to access and distribute the analysis
* Google, Twitter, Facebook, Utility company databases . . .

Alignment in Chicago (and a few other cities)
• New Mayor and Cook County Board President who believe in information and hired in a way that reflects that
• Human capital (researchers and programmers) who can make good use of …
• Data – “opening” it up and combining it across domains
• Public sector budget crises, which lead to the need for more information
• But, perhaps, private resources that can make up for the public sector problems: philanthropy and the corporate sector

Questions for government to address (not us, maybe)
• What data is going to be open and what isn’t?
• How do you make data available to administrators, policymakers, and researchers who need to combine data across agencies?

Skepticism about government
• Politics matter the most; policy and facts come second
• There is not enough human capital in government to link to the researchers who can help
– Can they provide enough data?
– Can they deal with the legal problems in order to share the data?

Skepticism about the data
• Most social scientists would rightly recommend the city make decisions based on evidence developed from high quality research. To them, that usually means data that they themselves collected or at least had a big hand in collecting OR is blessed by the discipline AND a research design that fits the research question at hand.
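
As a rough illustration of what "combining data across agencies" involves in the simplest case, the sketch below links a hypothetical school extract to a hypothetical UI wage extract and computes the share of each education-outcome group with any wage record, in the spirit of the CPS employment slide. Everything here is an assumption for illustration: the field names, the records, and above all the existence of a clean shared `person_id`. In practice the linkage would run under a data sharing agreement and would often need probabilistic matching on names and birth dates.

```python
from collections import defaultdict

# Hypothetical extracts; field names and values are invented for illustration.
school_records = [
    {"person_id": "A1", "outcome": "graduated"},
    {"person_id": "A2", "outcome": "graduated"},
    {"person_id": "A3", "outcome": "dropped_out"},
    {"person_id": "A4", "outcome": "dropped_out"},
]
wage_records = [
    {"person_id": "A1", "quarter": "2010Q1", "earnings": 4800.0},
    {"person_id": "A1", "quarter": "2010Q2", "earnings": 5000.0},
    {"person_id": "A3", "quarter": "2010Q3", "earnings": 3900.0},
]


def employment_rate_by_outcome(school, wages):
    """Return the share of each education-outcome group with any wage record."""
    # Anyone appearing in the wage file counts as employed in at least one quarter.
    employed_ids = {w["person_id"] for w in wages}
    totals = defaultdict(int)
    employed = defaultdict(int)
    for s in school:
        totals[s["outcome"]] += 1
        if s["person_id"] in employed_ids:
            employed[s["outcome"]] += 1
    return {outcome: employed[outcome] / totals[outcome] for outcome in totals}


rates = employment_rate_by_outcome(school_records, wage_records)
```

The point of the sketch is that the analysis itself is trivial once the two files can be joined; the barriers described in these slides are about getting to that point.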

The end
• There are real barriers that lead to data not flowing to those that need it
• The nature of these barriers varies from sector to sector and place to place, but there are common themes
• These barriers can be addressed, and the federal government has to learn how to learn from those places that have had success
• Incentives have to be put into place for all jurisdictions to use their data to get smarter about what they are doing
• Reviewing all federal research projects so that they are effectively using administrative data before placing burdens on respondents

Extra slides (if needed)

Balance between surveys and administrative data
• Surveys
– Special case of census data
• Administrative data
– Register-based data
• Combinations
– Depends on overlap of the two
– Importance of LEHD

Criteria
• Geographic coverage
• Topical coverage
– Some things you can’t ask in surveys
• Policy purpose
• Data quality

Survey vs. Administrative data
Adapted from Wallgren and Wallgren

Surveys based on data collection: sample surveys and censuses
Advantages
• Can choose which questions to ask
• Can be up-to-date
Disadvantages
• Some respondents ...
– do not understand the question
– have forgotten how it was
– do not respond (nonresponse)
– respond carelessly
• Burden on respondents can be high
• Expensive
• Low quality for estimates for small study domains (for sample surveys)

Register-based surveys
Advantages
• No further burden on the respondent for the statistics
• Low costs
• Almost complete coverage of population
• Complete coverage of time
• Respondents answer carefully to important administrative questions
• Good possibilities for reporting for small areas, regional statistics and longitudinal studies
Disadvantages
• Cannot ask questions
• Dependent on the administrative system’s population, object and variable definitions
• The reporting of administrative data can be slow; the time between the reference period and when data are available for statistical purposes can be long
• Changes in the administrative systems make comparisons difficult
• Variables that are less important for administrative work can be of lower quality