Using Systematic Literature Reviews to build a foundation for Original Research

Galway Seminar: 18th November 2010
Sarah Beecham & John Noll
Lero-The Irish Software Engineering Research Centre, Univ of Limerick, Ireland
Will cover
• The Full Systematic Literature Review (Sarah)
• Exercise (All)
• The Focussed Literature Review (John)
• Break!
• A Comparison (All)
SLR Aims
• To synthesize all available research relevant to a
particular research question or area of work.
• To present an evaluation of the literature
relative to a research topic by using a rigorous
and auditable methodology [Kitchenham 2004].
We use an example of how we applied the SLR guidelines to summarise
evidence on 'what' motivates software engineers [Beecham et al. 2008].
All you need to know
http://www.dur.ac.uk/ebse/
Full Systematic Literature Review
1. Plan Review
– Identify Need for Review
– Specify Research Questions
– Develop Review Protocol
– Evaluate Review Protocol
2. Conduct Review
– Identify Relevant Research
– Select Primary Studies
– Assess Study Quality
– Extract and Monitor Data
– Synthesise Data
3. Report Results
– Specify Dissemination Mechanisms
– Format Main Report
– Evaluate Report and Publish
Plan the review
Identify a need
The Starting point of any SLR
RQ Step 1: Free Format
– Assess effect of an SE technology
– Assess frequency or rate of a project development factor
  • E.g. rate of project failures
– Identify cost and risk factors
– Identify impact of technology on reliability, performance, cost
– Identify best practices
– Identify problem areas in SE
RQ Step 2: Structure Question
Population
Intervention
Comparison
Outcomes
Context
Example breakdown of RQ
Does the use of an object-oriented programming language [the intervention]
produce better quality code [outcome]
compared to a structured programming language [comparison/baseline]
when used by experienced programmers [population]
when developing a web-database system in a short time-frame with a small
development team [context]?
OR
Does the Telelogic DOORs Requirements Management Tool [intervention]
more effectively trace requirements [outcome],
when used by experienced requirements analysts [population]
in the development of a large software system [context],
when compared to the use of no formal requirements management tool
[comparison]?
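A PICOC breakdown like the ones above can be captured as a simple data structure; the following is a minimal sketch, where the class name and `as_sentence` method are illustrative (not part of the SLR guidelines) and the field values come from the first example:

```python
# Representing a structured research question (PICOC) as a record.
# Class and method names are illustrative, not from the guidelines.
from dataclasses import dataclass

@dataclass
class ResearchQuestion:
    population: str
    intervention: str
    comparison: str
    outcome: str
    context: str

    def as_sentence(self) -> str:
        # Reassemble the components into a readable question.
        return (f"Does {self.intervention} produce {self.outcome} "
                f"compared to {self.comparison}, when used by "
                f"{self.population}, {self.context}?")

rq = ResearchQuestion(
    population="experienced programmers",
    intervention="an object-oriented programming language",
    comparison="a structured programming language",
    outcome="better quality code",
    context="when developing a web-database system in a short "
            "time-frame with a small development team",
)
print(rq.as_sentence())
```

Making each component an explicit field forces you to notice when one (often the comparison or context) is missing from a draft question.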
Document the Process
Plan Review
Identify Need for Review
Specify Research Questions
Develop Review Protocol
Evaluate Review Protocol
Develop a Review Protocol
What is a Protocol?
– A set of rules outlining the entire process
– Records a tailored, auditable, rigorous methodology
– Pilot studies required; iterative process
– It takes forever! (3 months)
Protocol Contents
– Rationale
– RQ
– Construct Search Terms (incl synonyms)
– Search Strings
– Search Strategy/Resources to search
– Search Documentation (lookup, Endnote, forms)
– Inclusion/Exclusion Criteria
– Quality Assessment
– Schedule
Search string
COMPENDEX SEARCH TERMS LOOKUP TABLE – 21st March 2006
Researcher Name:
Researchers use lookup table 1 to:
A. cut and paste the search string into the database search window, and
B. place the search identifier into the Endnote 'Search Terms' field.
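Search strings of this kind are typically built by OR-ing the synonyms within each concept and AND-ing the concept groups together. A minimal sketch of that construction, using hypothetical synonym lists rather than the terms from the actual protocol:

```python
# Build a Boolean search string from concept groups:
# synonyms within a group are OR-ed, groups are AND-ed.
# The synonym lists below are invented examples only.
def build_search_string(concepts):
    """concepts: list of synonym lists -> Boolean search string."""
    groups = []
    for synonyms in concepts:
        quoted = " OR ".join(f'"{s}"' for s in synonyms)
        groups.append(f"({quoted})")
    return " AND ".join(groups)

search = build_search_string([
    ["software engineer", "software developer", "programmer"],
    ["motivation", "job satisfaction", "incentive"],
])
print(search)
```

The generated string is what gets pasted into the database search window, while its identifier (e.g. a row number in the lookup table) is what gets recorded in Endnote.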
Add Rigour
Externally validated by an expert (a PhD supervisor can substitute).
2. Conduct the review
Identify Relevant Research
Conduct Review
Select Primary Studies
Assess Study Quality
Extract and Monitor Data
Synthesise Data
Identify relevant research
• Databases and search strings
• Authors
• Journals
• Conferences and workshops
• Grey literature
Search Engines
• IEEE Xplore
• ACM Digital Library
• Google Scholar (scholar.google.com)
• UH University's electronic library (voyager.herts.ac.uk)
• Inspec (www.iee.org/Publish/INSPEC/)
• ScienceDirect (www.sciencedirect.com)
• EI Compendex (www.engineeringvillage2.org/Controller/Servlet/AthensService)
WSESE 2nd July 2007
Results of the iterative search process
Number of papers reviewed and validated:

Selection Process | # Papers
Papers/references extracted from databases | <2,000
Sift based on title and abstract | 519
Papers – full versions available [519-19] | 500
Papers accepted (by primary researchers) | 94
Papers rejected in validation 1 | 93
Papers added in validation 1 | 95
Post validation 2 (final list of 'accepted' papers) | 92

We validated each phase of our selection process; 6 researchers were involved.
Select Primary Studies
Assess Study Quality

Item | Assessment criteria | Response options for score
1 | Does study report clear, unambiguous findings based on evidence & argument? | Yes = 1 / No = 0
For empirical studies:
2 | Is sample unbiased? | Random sample = 1; non-random sample = .5; not representative = 0
3 | Could you replicate study? | Yes = 1 / No = 0
4 | Number of participants? | See sample-size scores, Table 2
5 | For a questionnaire, what is the response rate? | No response rate given = 0; over 80% = 1; under 20% = 0; between = .5
For theoretical studies:
6 | Is the paper well/appropriately referenced? | Yes = 1; moderately = .5; no = 0
Total Quality Score: enter the % score in the Endnote quality assessment field.

Table 2: Sample-size scores
Data collection method | Score (sample no.)
1. Questionnaire/Survey | Unit = 1 person; <=5 = 0; >5 <50 = .5; >50 = 1
2. Face-to-face interviews | Unit = 1 person; depends on depth of interview: <3 = 0; ≥3 ≤5 = .5; >5 = 1
3. Observation | Unit = 1 person; depends on depth and time spent: <3 = 0; ≥3 ≤5 = .5; >5 = 1
4. Focus Groups | Unit = group; depends on depth and time spent: <3 = 0; ≥3 ≤5 = .5; >5 = 1
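The total quality score is the sum of the item scores expressed as a percentage of the items applicable to the study type. A minimal sketch, assuming each applicable item scores 0, .5, or 1; the function name and criterion labels are illustrative:

```python
# Per-paper quality score: sum of item scores as a percentage of
# the applicable items. Criterion names below are illustrative.
def quality_percentage(item_scores):
    """item_scores: dict of criterion -> score in {0, .5, 1}."""
    if not item_scores:
        return 0.0
    return 100.0 * sum(item_scores.values()) / len(item_scores)

# Example: an empirical questionnaire study scored on items 1-5.
scores = {
    "clear findings": 1,
    "unbiased sample": 0.5,   # non-random sample
    "replicable": 1,
    "participants": 0.5,      # >5 and <50 respondents
    "response rate": 0,       # no response rate given
}
print(f"{quality_percentage(scores):.0f}%")  # 3.0 of 5 items -> 60%
```

The resulting percentage is the figure recorded in the Endnote quality assessment field and later binned into the poor/fair/good/very good/excellent bands.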
Quality results
Quality scores of accepted papers:

QUALITY (scores) | Poor (<26%) | Fair (26%-45%) | Good (46%-65%) | Very Good (66%-85%) | Excellent (>86%) | Total
Number of studies | 6 | 10 | 32 | 32 | 12 | 92
Percentage of studies | ~6.5% | ~10.5% | ~35% | ~35% | ~13% | 100%
Extract and Monitor Data
Endnote Reference Example:
• Reference Type: Journal/Conference/Report/etc
• Record Number: 5
• Author: Almstrum, V. L.
• Year: 2003
• Title: What is the attraction to computing?
• Paper ID (AAYYYYTTT): AL2003WHA
• Journal/Conference/Report: Communications of the ACM
• Publisher: ACM, USA.
• Volume: 46
• Issue: 9
• Pages: 51-5
• Researcher: Sarah
• Date of Search: 21 5 2006
• Search String Lookup Table Ref: INSPEC 1
• Exclusion Criteria (a): Is study based on cognitive behaviour? No
Endnote continued
• Exclusion Criteria (b): Is study external to software engineering? No
• Exclusion Criteria (c): Is study a personal opinion piece or viewpoint? No
• Inclusion Criteria (a): Research question answered? RQ4 (RQ5* *rejected at synthesis stage: AL2003WHA is not a model of motivation because it's a table of factors).
• Inclusion Criteria (b): Acceptable source? Yes
• Quality Criteria (score) (Appendix A 3.3; see quality assessment form): 40% (short paper doesn't give response rate)
• Type of Study (empirical/theoretical/both/based on secondary data/literature review): Empirical
• Type of Empirical Study (questionnaire/survey (self-completed); face-to-face interviews; observation; focus groups; other (state)): Questionnaire
• Decision Based on (keywords/abstract/introduction/conclusion/methodology/results/whole paper/peer review/arbitration): Whole paper
• Repeated study (check for each accepted study): No
Extract and Monitor Data
Paper study results/findings form (used only for ACCEPTED papers)
Reviewer Name
Title of Paper
Paper ID
THE FOLLOWING REFER TO OUR RQs:
1. Software engineer characteristics (RQ1)
2. Software engineer motivators (RQ2)
3. Software engineer 'de-motivators' (RQ2)
4. External signs or outcomes of motivated engineers (RQ3)
5. External signs or outcomes of de-motivated engineers (RQ3)
6. SW engineering as a motivator (RQ4)
7. Models that reflect how software engineers are motivated (RQ5)
8. Other observations
RECORDED IN PAPER
Presenting results
• A sensitivity analysis helps highlight possible bias in the sample.
• Each research question is examined separately on the results form.
Data Synthesis Form 1: Research Question 4
# of papers accepted that relate to this question (completed at end):
RQ4: What aspects of Software Engineering motivate Software Engineers?

Paper ID | Quality | Population | Location | Study year | Study type | MOTIVATE: SW Engin'ing is .. (list)
Paper ID | Quality | Population | Location | Study year | Study type | MOTIVATE: SW Engin'ing is .. (list)
etc.
Synthesise Data
RQ1: What are the characteristics of Software Engineers?

SW Engineer Characteristic A (identified in Form 1) | # of papers | Paper IDs
SW Engineer Characteristic B (identified in Form 1) | # of papers | Paper IDs
etc.
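The tallying behind this synthesis form can be sketched as follows; the characteristics and all paper IDs except AL2003WHA are invented placeholders:

```python
# Tally, per characteristic from Form 1, how many papers report it
# and which paper IDs. Rows below are invented placeholders, except
# paper ID AL2003WHA which appears in the worked Endnote example.
from collections import defaultdict

def synthesise(form1_rows):
    """form1_rows: iterable of (paper_id, characteristic) pairs."""
    by_characteristic = defaultdict(list)
    for paper_id, characteristic in form1_rows:
        by_characteristic[characteristic].append(paper_id)
    return {c: {"# of papers": len(ids), "Paper IDs": ids}
            for c, ids in by_characteristic.items()}

rows = [
    ("AL2003WHA", "need for growth"),
    ("XX2004ABC", "need for growth"),
    ("YY2005DEF", "introverted"),
]
result = synthesise(rows)
for characteristic, cell in result.items():
    print(characteristic, cell)
```

Keeping the paper IDs alongside the counts preserves the audit trail back to the primary studies, which is the point of the form.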
3. Report Results
Specify Dissemination Mechanisms
Report Results
Format Main Report
Evaluate Report and Publish
Last section: report results
• TRs, PhDs, conferences, journals.
• A good way to validate (external reviewers).
• Lots of scope. Not necessarily an end in itself.
• You have it slightly easier – lots of examples to refer to.
Review took 8 months to complete
Table 1: Schedule of Activities for Conducting a Systematic Literature Review on SE Motivation

Activity | Start Date | People involved | Completion
Planning and Preparation
Agree to conduct SLR | Jan 2006 | All team | 2.2.2006
Pre-pilot (1) & (2) (record data and construct RQs) | 21.2.2006 | TH, SB, DJ | 8.3.2006
Pilot (test the process developed in pre-pilots) | 16.3.2006 | HS, SB, DJ | 16.3.2006
Protocol developed v1 (circulated, revised, forms drawn up) | 20.2.2006 | SB, NB, HS, HR | 1.4.2006
Prot. v2 circulated for comment (revised accordingly) | 4.4.2006 | All team | 6.4.2006
Protocol v3 sent for independent review (Kitchenham) | 7.4.2006 | SB, BK | 20.4.2006
Produce final v4 of protocol (incorporate feedback from all) | 21.4.2006 | SB | 10.5.2006
Conduct Review
Stage 1: Download references based on face value of papers | 11.5.2006 | SB (519 papers) | 17.5.2006
Stage 2: Check exclusion/inclusion criteria | 18.5.2006 | SB (500 full papers) | 5.6.2006
Stage 3: Quality assessment & results forms completed | 6.6.2006 | Allocated to team | 11.6.2006
Stage 4: (secondary studies) Check results forms | 3.7.2006 | SB | 10.7.2006
Validate 1 – review process (accepted and rejected papers) | 11.7.2006 | SB/DJ | 12.7.2006
Arbitration (2) (4 papers went to arbitration) | 28.7.2006 | HR | 29.7.2006
Validate 2 – results (ALL 95 accepted papers) | 12.7.2006 | NB/SB | 30.7.2006
Arbitration (3) | 30.7.2006 | HR | 1.8.2006
Synthesise Data (all 92 studies, with no repeated studies) | Aug '06 | SB | Aug '06
Publish Results
Report the review | Sept 06 | SB et al | Sept 06
Report findings | Oct 06 | SB et al | On-going
Disadvantages
• Requires more effort than informal reviews
• Difficult for lone researchers
– Best practice standards advise two researchers, to minimise individual bias
• Incompatible with requirements for short papers
– (exceptions, e.g. Voice of Evidence)
• Dependent on 'quality' of the studies
• Not ideally suited to theoretical studies; aimed more at empirical studies
SLR - Benefits
• SLRs synthesise existing research
– Fairly (without bias)
– Rigorously (according to a defined procedure)
– Openly (ensuring that the review procedure is visible to other researchers – the protocol).
• Sampling problem overcome through analysis of a phenomenon across a wide range of settings
– Inconsistent results allow sources of variation to be studied
– Consistent results provide evidence that phenomena are robust/transferable
New knowledge
• Can contradict "common knowledge"
– Jørgensen and Moløkken reviewed surveys of project overruns
– Standish CHAOS report is out of step with other research
» May have used inappropriate methodology
– Jørgensen reviewed evidence about expert opinion estimates
• No consistent support for the view that models are better than human estimators
• You will know your field in great depth; if you didn't start off an expert, you will end up one.
• You will have the opportunity to publish your results – citations!
Decision time
• Create a firm foundation for future research
• Position your own research in the context of existing research
• Close areas where no further research is necessary
• Uncover areas where research is necessary
• Help the development of new theories
– Identify common underlying trends
– Identify explanations for conflicting results
But could you do this through an ad hoc review or a focused review?
Exercise
1. Write down a research question.
2. Break down RQ into components.
Population, intervention, comparison, outcome,
context.
3. Create a search string.
Over to John
Advantages and Disadvantages
Full SLR compared to Focused SLR
Final thoughts about SLRs
Takes a lot of knowledge and
skill to abstract what matters…
Questions?