DART in large infrastructure projects

Workshop Improving Data Access and Research Transparency in Switzerland: Bern, November 2014
A case study of the European Social Survey
Michael Breen, Mary Immaculate College, University of Limerick, IRELAND
Rationale for DART (Lupia & Alter, 2014)
To more effectively and rigorously answer questions about the value
of quantitative social science, it is imperative that those of us who
conduct such research take actions that reinforce its credibility and
make it easier for others to interpret our findings accurately. This
means sharing our data whenever possible. It also means making
available a complete description of the steps that we used to convert
data about the social world into quantitative claims about how it does
and does not work. Such commitments will not only help others more
accurately assess our claims about individual events but also increase
the extent to which others will view as credible our attempts to draw
generalizations about people, policies, and institutions from a series of
numerical simplifications and logical transformations.
European Social Survey (ESS)
• The European Social Survey was established in 2001 as an
academically-driven social survey designed to chart and explain the
interaction between Europe's changing institutions and the attitudes,
beliefs and behaviour patterns of its diverse populations.
• Currently in the midst of its seventh round, this biennial cross-sectional
survey covers more than thirty nations and employs the
most rigorous methodologies.
• All data, questionnaires, an interactive analysis tool (NESSTAR) and
educational resources for use in HEIs are available online at
http://www.europeansocialsurvey.org/.
NC meeting:
- Meeting papers and minutes from previous NC
meetings
Help:
- Lists all ESS contact e-mails
- Detailed sitemap: where to find what on the
intranet
General information:
- ESS7 Project specification and Timetable
- Information on and list of ESS7 Country
contacts
Prepare for Fieldwork:
- Fieldwork Questionnaire
- Sampling guidelines, with list of assigned sampling expert
- Sampling design forms, when these are finalised
- Translation and verification
- SQP coding
- Media Claims
Fieldwork Documents:
- Source Questionnaire
- ESS7 questionnaire, showcards, rotating module consultation
information, questionnaire alerts.
- Rotating module development
- Response enhancement
- Contact forms (end of April).
- Interviewer briefings (when finalised).
- Fieldwork reporting
Prepare data:
- Data Protocol and variable definitions
- ESS 2014 Data Protocol (due June)
- SPSS/SAS variable definition programs
- Standards for post coded variables, with guidance documents
- International standards (Occupation, Industry, Country, Language)
- ESS Specific Standards (Education, Religion, Ancestry)
- ESS6 Processing Reports (due June)
- Question consultation outcomes
- Religion, Education, Marital Status and Ancestry
- UPCOMING: Alcohol
Survey Documentation – National Technical Summary
- Download National Technical Summary and
Appendices (due September);
- Education
- Income
- Marital Status
- Political Parties
Deposit Data
- Secure deposit of all data and documentation;
- Lists all deliverables – both data and documents
- Lists all files that have been deposited
View archive processing for your country
- Transparency: Gives you access to your files during processing
- Further information will be provided by your archive contact as
processing develops
- When processing is finished we will send you a Draft file in
confidence for you to validate.
Data deliverables
• Data from Main questionnaire
• Data from Supplementary questionnaires
• Data from Interviewer questionnaire
• Call record/contact form data
• Parents' occupation
• Sample design data file (SDDF)
• Raw data from main and supplementary questionnaires
• Media Claims file
Documentation deliverables
• Main questionnaire
• Supplementary questionnaire (all versions)
• Interviewer questionnaire
• Contact form
• Show cards (from the main and supplementary questionnaires)
• National Technical Summary (NTS) with appendices (education, income, political parties and
marital and relationship status)
• Population statistics
• Interviewer and fieldwork instructions
• Interviewer briefing and training material
• Advance letters, brochures etc.
• Media landscape
• Final (T)VFFs
Processing
• The processing is organised in two main steps, each leading up to
standardised reports. The reports contain a summary of the programmes,
files and output produced during the processing as well as queries that the
Archive will need feedback on to produce the national files that will later
be integrated into the international data file for Round 7.
• When the Archive has completed the processing of the national data file, a
draft file will be provided for NCs to approve of the processing carried out
by the Archive. All NCs are responsible for the validity of their national
data. All national files will be subject to further quality checks by the
HQ-CST and the QDTs when a draft international file is available.
• A complete deposit of all deliverables is a prerequisite for a country to be
included in the integrated released file.
Dire warnings to NCs
• No national data (or interpretations of such data) can
be released, published or reported in any way until
the data has been officially released by the ESS
Archive at NSD. Thereafter, the data will be available
without restriction for non-commercial use, scientific
research, knowledge and policy making in all
participating countries and beyond to quarry at will.
Compliance 1
• The first group of compliance issues are particularly central.
Therefore, all members and observer countries are asked to ensure
that they:
- field the complete ESS Round 7 questionnaires,
- deliver a Sample Design Data File (SDDF) which allows the
calculation of inclusion probabilities,
- make a complete delivery of ESS Round 7 data (including
the contact form data) and documentation to the ESS
Archive at NSD before 1 September 2016.
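To illustrate why the SDDF requirement matters, here is a minimal sketch of how inclusion probabilities and design weights could be derived once the selection probability at each sampling stage is recorded for every sampled unit. The field name stage_probs and the values below are hypothetical and are not taken from the ESS SDDF specification.

```python
from math import prod

def inclusion_probability(stage_probs):
    """Overall inclusion probability: product of the per-stage selection probabilities."""
    if not stage_probs or any(not (0 < p <= 1) for p in stage_probs):
        raise ValueError("each stage probability must lie in (0, 1]")
    return prod(stage_probs)

def design_weight(stage_probs):
    """Design weight: inverse of the overall inclusion probability."""
    return 1.0 / inclusion_probability(stage_probs)

# Hypothetical two-stage sample: (area selected, person selected within area).
sample = [
    {"id": 1, "stage_probs": [0.01, 0.5]},
    {"id": 2, "stage_probs": [0.01, 0.25]},
    {"id": 3, "stage_probs": [0.02, 1.0]},
]

weights = {r["id"]: design_weight(r["stage_probs"]) for r in sample}
for unit_id, w in weights.items():
    print(unit_id, round(w, 2))
```

Without the per-stage probabilities, neither quantity can be reconstructed after the fact, which is presumably why the SDDF is treated as a central deliverable.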
Compliance 2
• The second group of compliance issues relates to the quality assurance
procedures imposed by the HQ-CST. This means in particular that a
country has to have finalised the following before fieldwork starts:
- the translation, verification and SQP procedures for the ESS
Round 7 questionnaire,
- the sign off procedure for the sampling design,
- the sign off procedure of the fieldwork questionnaire (FWQ).
Compliance 3
• The third set of compliance issues arises if quality control analyses
performed by the HQ-CST (or other parties) reveal serious doubts as
regards data quality. This may, for instance, include indications of
- very high design or interviewer effects,
- very large nonresponse bias, or
- very low measurement quality (reliability/validity) of the
data (including large amounts of missing data).
• Respondent substitution and interviewer fraud are also serious
threats to data quality.
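As a rough sketch of what a design or interviewer effect measures, the two textbook approximations below (Kish's weighting design effect and the clustering design effect 1 + (m - 1) * rho) show how unequal weights and within-interviewer correlation shrink the effective sample size. The numbers are invented for illustration. The slides do not say which thresholds or estimators the HQ-CST applies, so this should not be read as the ESS procedure.

```python
def weighting_deff(weights):
    """Kish approximation: n * sum(w^2) / (sum(w))^2."""
    n = len(weights)
    total = sum(weights)
    return n * sum(w * w for w in weights) / (total * total)

def clustering_deff(avg_cluster_size, rho):
    """Design effect from clustering: 1 + (m - 1) * rho.

    avg_cluster_size: mean number of interviews per interviewer (or per cluster).
    rho: intra-class correlation of the survey variable within clusters.
    """
    return 1.0 + (avg_cluster_size - 1.0) * rho

def effective_sample_size(n, deff):
    """Sample size of a simple random sample with the same precision."""
    return n / deff

# Hypothetical figures, for illustration only.
deff_w = weighting_deff([1.0, 1.0, 2.0, 2.0])
deff_int = clustering_deff(avg_cluster_size=20, rho=0.05)
n_eff = effective_sample_size(1500, deff_int)
print(round(deff_w, 3), round(deff_int, 2), round(n_eff))
```

In this invented example an intra-interviewer correlation of only 0.05 with 20 interviews per interviewer nearly halves the effective sample size, which shows why such effects are treated as a data quality issue.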
Compliance 4
• The fourth area of compliance relates to data release. ESS data is a
public good. NCs must ensure that no national data is released until
the official data release via the ESS archive. This allows the data to be
properly checked prior to release and ensures equal access to the
data for all.
• In the event of a breach of any of these four key compliance
considerations, the HQ-CST reserves the right not to include the
country data in the integrated file. In these cases, the representative
for that country in the ESS ERIC General Assembly will be informed of
this decision.
Post release issues
• Individual usage of the data
• No control on how the data are used
• Require registration of all users
• Create ongoing citation database of research using ESS data
• Need better resources to review published papers
• Considering mentoring process for novice researchers using ESS
Best practice in closed data: LIS
• Luxembourg Income Study
• LIS provides access to the LIS and LWS Databases in three ways: LISSY,
the Web Tabulator, and the LIS Key Figures. Access through LISSY or
the Web Tabulator requires registration. The LIS Key Figures are publicly
accessible and provide standard statistics based on the LIS Database.
• LISSY is a remote-execution data access system for the LIS and LWS
microdata. LISSY allows registered users to submit programs using common
statistical software packages, while respecting the confidentiality
restrictions imposed by certain countries.
• Remote execution is enabled through two submission paths, a Job
Submission Interface (JSI) or email.
The scale of the problem
• In a recent large study (Daniele Fanelli, PLoS ONE, May 2009):
• 1.97 per cent of scientists admitted to outright falsification of data
• 33.7 per cent admitted poor practices
- dropping data based on a "gut feeling"
- selectively reporting results that supported their hypotheses
• 70 per cent said they had seen colleagues doing this
• PubMed had 788 papers withdrawn after publication between 2000 and 2010,
70 per cent because of error and 30 per cent because of fraud.
• Japanese anesthesiologist Yoshitaka Fujii falsified data in 172 of 212 of his papers
published between 1993 and 2011.
• Shigeaki Kato has notched his 26th, 27th, and 28th retractions, all in Nature Cell
Biology. The three papers have been cited a total of 677 times.
The appeal of using partial data (De Vries, 2014)
• … you could “simplify”. After all, most of your results are in line with
your predictions, so your theory is probably right. Why not leave
those “aberrant” results out of the paper? There is probably a good
reason why they turned out like that. Some anomaly. Nothing to do
with your theory really.
• Nowhere in this process do you feel like you are being deceptive. You
just know what type of papers are easiest to publish, so you chip off
the “boring” complications to achieve a clearer, more interesting
picture. Sadly, the complications are probably closer to messy reality.
The picture you publish, while clearer, is much more likely to be
wrong.
Core problem
• Lack of clarity about methodology
• Cf. Shoemaker, Tankard and Lasorsa (2004)
• No access to the code/syntax files
• No clarity regarding data recoding
• Ambiguity over interpretation
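One way to address the recoding and syntax-file problems listed above is to have every recode write its own audit record, which can then be published alongside the analysis code. The sketch below is a minimal illustration under assumed names: the variable, the category labels and the code 77 for "refusal" are hypothetical and do not come from the ESS data protocol.

```python
import json
from collections import Counter

def recode(values, mapping, missing=None):
    """Recode values and return them with a log of exactly what was done."""
    recoded = [mapping.get(v, missing) for v in values]
    log = {
        "mapping": {str(k): v for k, v in mapping.items()},
        "unmapped_to": missing,
        "counts_before": {str(k): c for k, c in Counter(values).items()},
        "counts_after": {str(k): c for k, c in Counter(recoded).items()},
    }
    return recoded, log

# Hypothetical 0-10 trust item collapsed into three bands; 77 stands in for a refusal code.
trust = [0, 3, 5, 7, 10, 77]
mapping = {
    **{v: "low" for v in range(0, 4)},
    **{v: "medium" for v in range(4, 7)},
    **{v: "high" for v in range(7, 11)},
}

recoded, log = recode(trust, mapping)
print(json.dumps(log, indent=2, sort_keys=True))
```

Publishing the resulting log with the paper lets a reader see which original codes went into each category and how many cases were set to missing, without having to guess from the text.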
Proposed criteria for best practice in DART
(ICPSR, Michigan, 2014)
• Cites all the evidence and methods upon which published claims rely
• Makes available all evidence and methods upon which published claims
rely, including numeric data, code, and all other materials necessary to
replicate findings;
• Ensures that cited objects are available at the time of publication, subject
to any ethical or legal limitations, through institutions with demonstrated
capacity to provide long-term access;
• Recognizes that full access to data may not be possible when data are
under external restriction (e.g., the data are classified, require
confidentiality protections, or were obtained under a non-disclosure
agreement);
• Upon request, provides data to editors and reviewers prior to publication
for assessment only and under a strict assurance of confidentiality.
References
• Fanelli D. (2009). How Many Scientists Fabricate and Falsify Research? A Systematic
Review and Meta-Analysis of Survey Data. PLoS ONE 4(5): e5738.
doi:10.1371/journal.pone.0005738
• De Vries, R. (2014). When ‘exciting’ trumps ‘honest’, traditional academic journals
encourage bad science. http://theconversation.com/when-exciting-trumps-honest-traditional-academic-journals-encourage-bad-science-29804
• ICPSR (2014). Research Transparency, Data Access, and Data Citation: A Call to Action for
Scholarly Publications. http://datacommunity.icpsr.umich.edu/research-transparency-data-access-and-data-citation-call-action-scholarly-publications
• Lupia, A. and Alter, G. (2014). Data Access and Research Transparency in the
Quantitative Tradition. PS: Political Science & Politics, 47, pp 54-59.
doi:10.1017/S1049096513001728.
• Shoemaker, P. J., Tankard, J. W., & Lasorsa, D. L. (2004). How to build social science
theories. Thousand Oaks, CA: Sage
Contact Details
• Prof. Michael Breen
Dean of Arts
Mary Immaculate College, University of Limerick
• Address
Mary Immaculate College
South Circular Road
Limerick
Ireland
• Email
michael.breen@mic.ul.ie
• Phone
+353 61 204972