Using Systematic Literature Reviews to build a foundation for Original Research
Galway Seminar: 18th November 2010
Sarah Beecham & John Noll
Lero - The Irish Software Engineering Research Centre, Univ of Limerick, Ireland

Will cover
• The Full Systematic Literature Review (Sarah)
• Exercise (All)
• The Focussed Literature Review (John)
• Break!
• A Comparison (All)

SLR Aims
• To synthesize all available research relevant to a particular research question or area of work.
• To present an evaluation of the literature relative to a research topic by using a rigorous and auditable methodology [Kitchenham 2004].
Use example of how we applied SLR guidelines to summarise evidence on ‘what’ motivates Software Engineers [Beecham et al 2008].

All you need to know
http://www.dur.ac.uk/ebse/

Full Systematic Literature Review
Plan Review
– Identify Need for Review
– Specify Research Questions
– Develop Review Protocol
– Evaluate Review Protocol
Conduct Review
– Identify Relevant Research
– Select Primary Studies
– Assess Study Quality
– Extract and Monitor Data
– Synthesise Data
Report Results
– Specify Dissemination Mechanisms
– Format Main Report
– Evaluate Report and Publish

Plan the review

Identify a need
The starting point of any SLR

RQ Step 1: Free Format
– Assess effect of an SE technology
– Assess frequency or rate of a project development factor
  • E.g.
Rate of project failures
– Identify cost and risk factors
– Identify impact of technology on reliability, performance, cost
– Identify best practices
– Identify problem areas in SE

RQ Step 2: Structure Question
• Population
• Intervention
• Comparison
• Outcomes
• Context

Example breakdown of RQ
Does the use of an object-oriented programming language [the intervention] produce better quality code [outcome] compared to a structured programming language [comparison/baseline] when used by experienced programmers [population] when developing a web-database system in a short time-frame with a small development team [context]?
OR
Does the Telelogic DOORS Requirements Management Tool [intervention] more effectively trace requirements [outcome], when used by experienced requirements analysts [population] in the development of a large software system [context], when compared to the use of no formal requirements management tool [comparison]?

Document the Process

Develop a Review Protocol
What is a Protocol?
– A set of rules outlining the entire process
– Records a tailored, auditable, rigorous methodology
– Pilot studies required; iterative process
– It takes forever! (3 months)

Protocol Contents
– Rationale
– RQ
– Construct Search Terms (incl synonyms)
– Search Strings
– Search Strategy/Resources to search
– Search Documentation (lookup, Endnote, forms)
– Inclusion/Exclusion Criteria
– Quality Assessment
– Schedule

Search string
COMPENDEX SEARCH TERMS LOOKUP TABLE – 21st March 2006
Researcher Name:
Researchers use lookup table 1 to
A. cut and paste the search string into the database search window, and
B. place the search identifier into the Endnote ‘Search Terms’ field.

Add Rigour
– Protocol externally validated by an expert
– A PhD supervisor can substitute

2.
Conduct the review

Identify relevant research
• Databases and search strings
• Authors
• Journals
• Conferences and workshops
• Grey literature

Search Engines
• IEEE Xplore
• ACM Digital library
• Google scholar (scholar.google.com)
• UH University’s electronic library (voyager.herts.ac.uk)
• Inspec (www.iee.org/Publish/INSPEC/)
• ScienceDirect (www.sciencedirect.com)
• EI Compendex (www.engineeringvillage2.org/Controller/Servlet/Athens Service
WSESE 2nd July 2007

Results of the iterative search process
Number of papers reviewed and validated

Selection Process | # Papers
Papers/references extracted from databases | <2,000
Sift based on title and abstract | 519
Papers – full versions available [519-19] | 500
Papers accepted (by primary researchers) | 94
Papers rejected in validation 1 | 93
Papers added in validation 1 | 95
Post validation 2 (final list of ‘accepted’ papers) | 92

We validated each phase of our selection process.
6 researchers were involved.

2. Conduct the review: Select Primary Studies

2. Conduct the review: Assess Study Quality

Assess Study Quality

Item | Assessment criteria | Response options for Score
1 | Does study report clear, unambiguous findings based on evidence & argument? | Yes = 1 / No = 0
For empirical studies:
2 | Is sample unbiased? | Random sample = 1; Non-random sample = .5; Not representative = 0
3 | Could you replicate study? | Yes = 1 / No = 0
4 | Number of participants? | See scores in Table 2 (sample size)
5 | For a questionnaire, what is the response rate? | No response rate given = 0; Over 80% = 1; Under 20% = 0; Between = .5
For theoretical studies:
6 | Is the paper well/appropriately referenced? | Yes = 1; Moderately = .5; No = 0
Total Quality Score | Enter the % score in the Endnote Quality assessment field | %

Table 2: Sample size scores
Data collection method | Score (Sample No)
Questionnaire/Survey | Unit = 1 person. <=5 = 0; >5<50 = .5; >50 = 1
Face to face interviews | Unit = 1 person. Depends on depth of interview. <3 = 0; ≥3 ≤5 = .5; >5 = 1
Observation | Unit = 1 person. Depends on depth and time spent. <3 = 0; ≥3 ≤5 = .5; >5 = 1
Focus Groups | Unit = Group. Depends on depth and time spent. <3 = 0; ≥3 ≤5 = .5; >5 = 1

Quality results
Quality scores of Accepted Papers

QUALITY (scores) | Poor (<26%) | Fair (26%-45%) | Good (46%-65%) | Very Good (66%-85%) | Excellent (>86%) | Total
Number of studies | 6 | 10 | 32 | 32 | 12 | 92
Percentage of studies | ~6.5% | ~10.5% | ~35% | ~35% | ~13% | 100%

2. Conduct the review: Extract and Monitor Data

Extract and Monitor Data
Endnote Reference Example:
• Reference Type: Journal/Conference/Report/etc
• Record Number: 5
• Author: Almstrum, V. L.
• Year: 2003
• Title: What is the attraction to computing?
• Paper ID (AAYYYYTTT): AL2003WHA
• Journal/Conference/Report: Communications of the ACM
• Publisher: ACM, USA.
• Volume: 46
• Issue: 9
• Pages: 51-5
• Researcher: Sarah
• Date of Search: 21 5 2006
• Search String Lookup Table Ref: INSPEC 1
• Exclusion Criteria (a): Is study based on cognitive behaviour? No

Endnote continued
• Exclusion Criteria (b): Is study external to software engineering? No
• Exclusion Criteria (c): Is study personal opinion piece or viewpoint? No
• +Inclusion Criteria (a): Research Question answered? RQ4 (RQ5 was rejected at synthesis stage: AL2003WHA is not a model of motivation because it’s a table of factors).
• +Inclusion Criteria (b): Acceptable source?
yes
• ++Quality Criteria (Score) - (Appendix A 3.3) see quality assessment form: 40% (short paper doesn't give response rate)
• +Type of Study (empirical/theoretical/both/based on secondary data, Literature review): Empirical
• *Type of Empirical study (Questionnaire/survey (self completed); Face to face interviews; Observation; Focus Groups; Other (state)): Questionnaire
• Decision Based on (Keywords/Abstract/Introduction/Conclusion/Methodology/Results/Whole Paper/Peer Review/Arbitration): Whole Paper
• +++Repeated study (check for each accepted study): no

Extract and Monitor Data
Paper study results/findings Form (used only for ACCEPTED papers)
Reviewer Name:
Title of Paper:
Paper ID:

The following refer to our RQs | Recorded in paper
1. Software engineer characteristics (RQ1) |
2. Software Engineer motivators (RQ2) |
3. Software Engineer ‘de-motivators’ (RQ2) |
4. External signs or outcomes of motivated engineers (RQ3) |
5. External signs or outcomes of de-motivated engineers (RQ3) |
6. SW Engineering as a motivator (RQ4) |
7. Models that reflect how software engineers are motivated (RQ5) |
8. Other observations |

Presenting results
• A sensitivity analysis helps highlight possible bias in sample.
• Each research question is examined separately on results form.

Data Synthesis Form 1: Research Question 4
# of papers accepted that relate to this question (completed at end):
RQ4: What aspects of Software Engineering motivate Software Engineers?

Paper ID | Quality | Population location | Study year | Study Type | MOTIVATE: SW Engin’ing is .. (list)
Paper ID | | | | |
etc | | | | |

2. Conduct the review: Synthesise Data

Synthesise Data
RQ1: What are the characteristics of Software Engineers?
Characteristic | # of papers | Paper Ids
SW Engineer Characteristic A (identified in Form 1) | |
SW Engineer Characteristic B (identified in Form 1) | |
etc | |

3. Report Results

Last section, report results
• TRs, PhDs, Conferences, Journals.
• Good way to validate (external reviewers)
• Lots of scope. Not necessarily an end in itself.
• You have it slightly easier – lots of examples to refer to.

Review took 8 months to complete
Table 1: Schedule of Activities for Conducting a Systematic Literature Review on SE Motivation

Activity | Start Date | People involved | Completion
Planning and Preparation
Agree to conduct SLR | Jan 2006 | All team | 2.2.2006
Pre-pilot (1) & (2) (record data and construct RQs) | 21.2.2006 | TH, SB, DJ | 8.3.2006
Pilot (test the process developed in pre-pilots) | 16.3.2006 | HS, SB, DJ | 16.3.2006
Protocol developed v1 (circulated, revised, forms drawn up) | 20.2.2006 | SB, NB, HS, HR | 1.4.2006
Prot. v2 circulated for comment (revised accordingly) | 4.4.2006 | All team | 6.4.2006
Protocol v3 sent for independent review (Kitchenham) | 7.4.2006 | SB, BK | 20.4.2006
Produce final v4 of protocol (incorporate feedback from all) | 21.4.2006 | SB | 10.5.2006
Conduct Review
Stage 1: Download references based on face value papers | 11.5.2006 | SB (519 papers) | 17.5.2006
Stage 2: Check Excl/inclusion criteria | 18.5.2006 | SB (500 full papers) | 5.6.2006
Stage 3 (quality assessment & results forms completed) | 6.6.2006 | Allocated to team | 11.6.2006
Stage 4 (secondary studies) Check results forms | 3.7.2006 | SB | 10.7.2006
Validate 1 - review process (accepted and rejected papers) | 11.7.2006 | SB/DJ | 12.7.2006
Arbitration (2) (4 papers went to arbitration) | 28.7.2006 | HR | 29.7.2006
Validate 2 - Results (ALL 95 accepted papers) | 12.7.2006 | NB/SB | 30.7.2006
Arbitration (3) | 30.7.2006 | HR | 1.8.2006
Synthesise Data (all 92 studies with no repeated studies) | Aug ‘06 | SB | Aug ‘06
Publish Results
Report the review | Sept 06 | SB et al | Sept 06
Report findings | Oct 06 | SB et al | On-going

Disadvantages
• Require more effort than informal reviews
• Difficult for lone researchers
  – Best-practice standards advise two researchers, to minimise individual bias
• Incompatible with requirements for short papers
  – (exceptions, e.g. Voice of Evidence)
• Dependent on ‘quality’ of the studies
• Not ideally suited to theoretical studies, aimed more at empirical studies

SLR - Benefits
• SLRs synthesise existing research
  – Fairly (without bias)
  – Rigorously (according to a defined procedure)
  – Openly (ensuring that the review procedure is visible to other researchers - protocol).
• Sampling problem overcome through analysis of phenomenon across wide range of settings
  – Inconsistent results allow sources of variation to be studied
  – Consistent results provide evidence that phenomena are robust/transferable new knowledge
• Can contradict “common knowledge”
  – Jørgensen and Moløkken reviewed surveys of project overruns
  – Standish CHAOS report is out of step with other research
    » May have used inappropriate methodology
  – Jørgensen reviewed evidence about expert opinion estimates
    • No consistent support for view that models are better than human estimators
• You will know your field in great depth: if you didn’t start off an expert, you will end up one.
• You will have the opportunity to publish your results. Citations!

Decision time
• Create a firm foundation for future research
• Position your own research in the context of existing research
• Close areas where no further research is necessary
• Uncover areas where research is necessary
• Help the development of new theories
  – Identify common underlying trends
  – Identify explanations for conflicting results
But could you do this through an ad hoc review or focused review?

Exercise
1. Write down a research question.
2. Break down RQ into components: population, intervention, comparison, outcome, context.
3. Create a search string.
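
One way to approach step 3 of the exercise is sketched below. It assumes the common convention of joining synonyms for one component with OR and then joining the components with AND; the slides do not spell out the exact string format, and each database has its own syntax. The function name `build_search_string` and all the example terms are hypothetical, chosen only to illustrate the idea.

```python
def build_search_string(components):
    """Build a Boolean search string from RQ components.

    `components` maps a component name (e.g. "population") to a list of
    synonyms. Synonyms are OR-ed together; components are AND-ed together.
    This is an illustrative convention, not a specific database's syntax.
    """
    groups = []
    for terms in components.values():
        if not terms:
            continue  # skip components with no search terms
        # Quote multi-word terms so they are searched as phrases.
        quoted = [f'"{t}"' if " " in t else t for t in terms]
        groups.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(groups)


# Hypothetical terms for a question about what motivates software engineers.
example_components = {
    "population": ["software engineer", "programmer", "developer"],
    "intervention": ["motivation", "motivator"],
    "outcome": ["retention", "productivity"],
}

search_string = build_search_string(example_components)
print(search_string)
```

For the hypothetical terms above this prints `("software engineer" OR programmer OR developer) AND (motivation OR motivator) AND (retention OR productivity)`. Keeping the terms in a table like this also gives you a record of each string to store in the search lookup table.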
Over to John

Advantages and Disadvantages
Full SLR compared to Focused SLR

Final thoughts about SLRs
Takes a lot of knowledge and skill to abstract what matters…

Questions?