Introduction - Darko Marinov

Automated Test Generation
and Repair
Darko Marinov
Escuela de Verano de
Ciencias Informáticas RÍO 2011
Rio Cuarto, Argentina
February 14-19, 2011
Why Testing?
• Goal: Increase software reliability
– Software bugs cost US economy $60B/year [NIST’02]
• Approach: Find bugs using testing
– Estimated savings from better testing $22B/year
• Challenge: Manual testing is problematic
– Time-consuming, error-prone, expensive
• Research: Automate testing
– Reduce cost, increase benefit
Topics to Cover
• Introduction: about bugs
• Randoop: random generation of OO tests
• Pex: dynamic symbolic generation of inputs
• UDITA: generation of complex data inputs
• ReAssert: repair of OO unit tests
• JPF: systematic testing of Java code
Introduction
• Why look for bugs?
• What are bugs?
• Where do they come from?
• How to find them?
Some Costly “Bugs”
• NASA Mars space missions
– Priority inversion (2004)
– Different metric systems (1999)
• BMW airbag problems (1999)
– Recall of >15000 cars
• Ariane 5 crash (1996)
– Uncaught exception of numerical overflow
– Sample Video
• Your own favorite example?
Some “Bugging” Bugs
• An example bug on my laptop
– “Jumping” file after changing properties
• Put a read-only file on the desktop
• Change properties: rename and make not read-only
• Your own favorite example?
• What is important about software for you?
– Correctness, performance, functionality
Terminology
• Anomaly
• Bug
• Crash
• Defect
• Error
• Failure, fault
• Glitch
• Hangup
• Incorrectness
• J…
Dynamic vs. Static
• Incorrect (observed) behavior
– Failure, fault
• Incorrect (unobserved) state
– Error, latent error
• Incorrect lines of code
– Fault, error
“Bugs” in IEEE 610.12-1990
• Fault
– Incorrect lines of code
• Error
– Faults cause incorrect (unobserved) state
• Failure
– Errors cause incorrect (observed) behavior
• Not used consistently in literature!
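To make the fault/error/failure distinction concrete, here is a minimal hand-written Java sketch (class and method names are hypothetical): the fault is the incorrect line of code, the error is the incorrect in-memory state it produces, and the failure is the incorrect observable output.

public class Average {
  // Fault: incorrect line of code (should divide by values.length)
  static double average(int[] values) {
    int sum = 0;
    for (int v : values) {
      sum += v;
    }
    return sum / 2.0;
  }

  public static void main(String[] args) {
    // Error: incorrect (so far unobserved) state; result is 3.0, not 2.0
    double result = average(new int[] { 1, 2, 3 });
    // Failure: the incorrect state becomes observed behavior
    System.out.println("average = " + result); // prints 3.0 instead of 2.0
  }
}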
Correctness
• Common (partial) properties
– Segfaults, uncaught exceptions
– Resource leaks
– Data races, deadlocks
– Statistics based
• Specific properties
– Requirements
– Specification
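As a small hand-written Java illustration of two common partial properties from the list above (hypothetical code, not from the slides), the method below can leak a resource and the main method can throw an uncaught exception:

import java.io.FileInputStream;
import java.io.IOException;

public class CommonProperties {
  // Resource leak: if read() throws, the stream is never closed
  // (try-with-resources would fix this)
  static int firstByte(String path) throws IOException {
    FileInputStream in = new FileInputStream(path);
    int b = in.read();
    in.close();
    return b;
  }

  public static void main(String[] args) throws IOException {
    // Uncaught exception: args[0] throws ArrayIndexOutOfBoundsException
    // when the program is run without arguments
    System.out.println(firstByte(args[0]));
  }
}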
Traditional Waterfall Model
• Requirements – Analysis
• Design – Checking
• Implementation – Unit Testing
• Integration – System Testing
• Maintenance – Verification
Phases (1)
• Requirements
– Specify what the software should do
– Analysis: eliminate/reduce ambiguities,
inconsistencies, and incompleteness
• Design
– Specify how the software should work
– Split software into modules, write specifications
– Checking: check conformance to requirements
Phases (2)
• Implementation
– Specify how the modules work
– Unit testing: test each module in isolation
• Integration
– Specify how the modules interact
– System testing: test module interactions
• Maintenance
– Evolve software as requirements change
– Verification: test changes, regression testing
Testing Effort
• Reported to be >50% of development cost
[e.g., Beizer 1990]
• Microsoft: 75% time spent testing
– 50% testers who spend all time testing
– 50% developers who spend half time testing
When to Test
• The later a bug is found, the higher the cost
– Orders of magnitude increase in later phases
– Also the smaller chance of a proper fix
• Old saying: test often, test early
• New methodology: test-driven development
(write tests before code)
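A minimal sketch of test-driven development with JUnit 4 (the BoundedStack class is hypothetical): the test is written first and fails, or does not even compile, until the production code is implemented to satisfy it.

import static org.junit.Assert.assertEquals;
import org.junit.Test;

public class BoundedStackTest {
  // Written before BoundedStack exists; drives the implementation
  @Test
  public void pushIncreasesSize() {
    BoundedStack stack = new BoundedStack(10);
    stack.push(42);
    assertEquals(1, stack.size());
  }
}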
Software is Complex
• Malleable
• Intangible
• Abstract
• Solves complex problems
• Interacts with other software and hardware
• Not continuous
Software Still Buggy
• Folklore: 1-10 (residual) bugs per 1000
non-blank, non-comment (NBNC) lines of code (after testing)
• Consensus: total correctness impossible
to achieve for (complex) software
– Risk-driven finding/elimination of bugs
– Focus on specific correctness properties
Approaches for Finding Bugs
• Software testing
• Model checking
• (Static) program analysis
Software Testing
• Dynamic approach
• Run code for some inputs, check outputs
• Checks correctness for some executions
• Main questions
– Test-input generation
– Test-suite adequacy
– Test oracles
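To tie these questions to code: in a unit test, the concrete arguments are the test input and the assertion is the test oracle; test-suite adequacy asks whether a collection of such tests exercises enough behavior (e.g., branch coverage). A minimal JUnit 4 sketch using a standard library method:

import static org.junit.Assert.assertEquals;
import org.junit.Test;

public class MaxTest {
  @Test
  public void maxOfTwoNumbers() {
    // Test input: the concrete arguments passed to the code under test
    int result = Math.max(3, 7);
    // Test oracle: the assertion that decides pass or fail
    assertEquals(7, result);
  }
}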
Other Testing Questions
• Maintenance
• Selection
• Minimization
• Prioritization
• Augmentation
• Evaluation
• Fault Characterization
• …
Model Checking
• Typically hybrid dynamic/static approach
• Checks correctness for “all” executions
• Some techniques
– Explicit-state model checking
– Symbolic model checking
– Abstraction-based model checking
Static Analysis
• Static approach
• Checks correctness for “all” executions
• Some techniques
– Abstract interpretation
– Dataflow analysis
– Verification-condition generation
Comparison
• Level of automation
– Push-button vs. manual
• Type of bugs found
– Hard vs. easy to reproduce
– High vs. low probability
– Common vs. specific properties
• Type of bugs (not) found
Soundness and Completeness
• Do we find all bugs?
– Impossible for dynamic analysis
• Are reported bugs real bugs?
– Easy for dynamic analysis
• Most practical techniques and tools are
both unsound and incomplete!
– False positives
– False negatives
Analysis for Performance
• Static compiler analysis, profiling
• Must be sound
– Correctness of transformation: equivalence
• Improves execution time
• Programmer time is more important
• Programmer productivity
– Not only finding bugs
Combining Dynamic and Static
• Dynamic and static analyses equal in limit
– Dynamic: try exhaustively all possible inputs
– Static: model precisely every possible state
• Synergistic opportunities
– Static analysis can optimize dynamic analysis
– Dynamic analysis can focus static analysis
– More discussions than results
Current Status
• Testing remains the most widely used
approach for finding bugs
• A lot of recent progress (within last decade)
on model checking and static analysis
– Model checking: from hardware to software
– Static analysis: from sound to practical
• Vibrant research in the area
• Gap between research and practice
Topics Related to Finding Bugs
• How to eliminate bugs?
– Debugging
• How to prevent bugs?
– Programming language design
– Software development processes
• How to show absence of bugs?
– Theorem proving
– Model checking, program analysis
Our Focus: Testing
• More precisely, recent research on automated
test generation and repair
– More info at CS527 from Fall 2010
• Recommended general reading for research
– How to Read an Engineering Research Paper
by William G. Griswold
– Writing Good Software Engineering Research
Papers
by Mary Shaw (ICSE 2003)
• If you have already read that paper, read one on another area
Writing Good SE Papers Overview
• Motivation
– Guidelines for writing papers for ICSE
• Approach
– Analysis of papers submitted to ICSE 2002
– Distribution across three dimensions
• Question (problem)
• Result (solution)
• Validation (evaluation)
• Results
– Writing matters, know your conferences!
Randoop
• Feedback-directed random test generation
by Carlos Pacheco, Shuvendu K. Lahiri,
Michael D. Ernst, and Thomas Ball
(ICSE 2007)
– (optional) Finding Errors in .NET with Feedback-Directed Random Testing
by Carlos Pacheco, Shuvendu K. Lahiri, and Thomas Ball (ISSTA 2008)
– Website: Randoop
• Slides courtesy of Carlos Pacheco
Randoop Paper Overview
• Problem (Question)
– Generate unit tests (with high coverage?)
• Solution (Result)
– Generate sequences of method calls
– Random choice of methods and parameters
– Publicly available tool for Java (Randoop)
• Evaluation (Validation)
– Data structures (JPF is next lecture)
– Checking API contracts
– Regression testing (lecture next week)
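An illustrative, hand-written approximation of the kind of test Randoop generates (not actual tool output): a random sequence of constructor and method calls over library classes, where earlier results feed later calls, with assertions recording observed values and basic API contracts.

import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertTrue;
import java.util.LinkedList;
import java.util.TreeSet;
import org.junit.Test;

public class RandoopStyleTest {
  @Test
  public void generatedSequence() {
    // Sequence of calls; feedback from earlier calls shapes later ones
    LinkedList<Integer> list = new LinkedList<>();
    boolean added = list.add(5);
    assertTrue(added);
    TreeSet<Integer> set = new TreeSet<>(list);
    assertEquals(1, set.size());
    // Contract-style check: an object must equal itself
    assertTrue(set.equals(set));
  }
}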
Pex
• Pex – White Box Test Generation for .NET
by Nikolai Tillmann and Jonathan de Halleux
(TAP 2008)
– (optional) Moles: Tool-Assisted Environment
Isolation with Closures by Jonathan de Halleux
and Nikolai Tillmann (TOOLS 2010)
– Websites: Pex, TeachPex
• Slides courtesy of Tao Xie (and Nikolai
Tillmann, Peli de Halleux, Wolfram Schulte)
Pex Paper Overview
• Problem (Question)
– Generate unit tests (with high coverage)
• Solution (Result)
– Describe test scenarios with parameterized unit
tests (PUTs)
– Dynamic symbolic execution
– Tool for .NET (Pex)
• Evaluation (Validation)
– Found some issues in a “core .NET component”
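Pex targets .NET/C#, but the shape of a parameterized unit test (PUT) can be sketched in Java (all names here are hypothetical): the test takes arguments instead of hard-coding them, and a dynamic-symbolic-execution engine searches for concrete argument values that cover the branches of the code under test.

import static org.junit.Assert.assertTrue;

public class ClampPut {
  // Hypothetical code under test
  static int clamp(int x, int lo, int hi) {
    if (x < lo) return lo;
    if (x > hi) return hi;
    return x;
  }

  // Parameterized unit test: written once for all inputs; a DSE engine
  // would pick concrete (x, lo, hi) triples that reach each branch
  public static void clampStaysInRange(int x, int lo, int hi) {
    if (lo > hi) return; // assumption on inputs
    int r = clamp(x, lo, hi);
    assertTrue(r >= lo && r <= hi);
  }
}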
UDITA
• Test Generation through Programming in
UDITA
by Milos Gligoric, Tihomir Gvero, Vilas
Jagannath, Sarfraz Khurshid, Viktor Kuncak,
and Darko Marinov (ICSE 2010)
– (optional) Automated testing of refactoring
engines by Brett Daniel, Danny Dig, Kely Garcia,
and Darko Marinov (ESEC/FSE 2007)
– Websites: UDITA, ASTGen
• Slides partially prepared by Milos Gligoric
UDITA Paper Overview
• Problem (Question)
– Generate complex test inputs
• Solution (Result)
– Combines filtering approach (check validity) and
generating approaches (valid by construction)
– Java-based language with non-determinism
– Tool for Java (UDITA)
• Evaluation (Validation)
– Found bugs in Eclipse, NetBeans, javac, JPF...
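A hedged illustration of the filtering idea in plain Java (not the actual UDITA API or language): exhaustively enumerate candidate structures and keep only the valid ones; a generating approach would instead construct only valid structures in the first place.

import java.util.ArrayList;
import java.util.List;

public class FilteringGeneration {
  // Filtering: enumerate all candidate arrays, keep the valid ones
  static List<int[]> sortedDistinctArrays(int length, int maxValue) {
    List<int[]> valid = new ArrayList<>();
    int candidates = (int) Math.pow(maxValue + 1, length);
    for (int code = 0; code < candidates; code++) {
      int[] a = decode(code, length, maxValue + 1);
      if (isStrictlySorted(a)) {
        valid.add(a);
      }
    }
    return valid;
  }

  // Decode a number into an array of digits in the given base
  static int[] decode(int code, int length, int base) {
    int[] a = new int[length];
    for (int i = 0; i < length; i++) {
      a[i] = code % base;
      code /= base;
    }
    return a;
  }

  // Validity check: strictly increasing, hence sorted and duplicate-free
  static boolean isStrictlySorted(int[] a) {
    for (int i = 1; i < a.length; i++) {
      if (a[i - 1] >= a[i]) return false;
    }
    return true;
  }

  public static void main(String[] args) {
    // 10 valid arrays out of 125 candidates for length 3, values 0..4;
    // a generating approach would never build the 115 invalid ones
    System.out.println(sortedDistinctArrays(3, 4).size());
  }
}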
ReAssert
• ReAssert: Suggesting repairs for broken unit
tests
by Brett Daniel, Vilas Jagannath, Danny Dig,
and Darko Marinov (ASE 2009)
– (optional) On Test Repair using Symbolic
Execution by Brett Daniel, Tihomir Gvero, and
Darko Marinov (ISSTA 2010)
– Website: ReAssert
• Slides courtesy of Brett Daniel
ReAssert Paper Overview
• Problem (Question)
– When code evolves, passing tests may fail
– How to repair tests that should be updated?
• Solution (Result)
– Find small changes that make tests pass
– Ask the user to confirm proposed changes
– Tool for Java/Eclipse (ReAssert)
• Evaluation (Validation)
– Case studies, user study, open-source evolution
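A hand-made before/after sketch of the kind of repair ReAssert suggests (illustrative, not actual tool output; the greet method is hypothetical): after the code under test changed its output, the smallest repair is to update the stale expected literal, and the developer is asked to confirm it.

import static org.junit.Assert.assertEquals;
import org.junit.Test;

public class GreetingTest {
  // Hypothetical code under test, after an evolution that changed
  // the greeting from "Hello, Ana" to "Hello, Ana!"
  static String greet(String name) {
    return "Hello, " + name + "!";
  }

  // Before repair (failing): assertEquals("Hello, Ana", greet("Ana"));
  // Suggested repair below: update the expected literal
  @Test
  public void greetsByName() {
    assertEquals("Hello, Ana!", greet("Ana"));
  }
}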
Java PathFinder (JPF)
• Model Checking Programs
by W. Visser, K. Havelund, G. Brat, S. Park
and F. Lerda (J-ASE, vol. 10, no. 2, April
2003)
– Note: this is a journal paper, so feel free to
skip/skim some sections (3.2, 3.3, 4)
– Website: JPF
• Slides courtesy of Peter Mehlitz and Willem
Visser
JPF Paper Overview
• Problem
– Model checking of real code
• Terminology: Systematic testing, state-space exploration
• Solution
– Specialized Java Virtual Machine
• Supports backtracking, state comparison
• Many optimizations to make it scale
– Publicly available tool (Java PathFinder)
• Evaluation/applications
– Remote Agent Spacecraft Controller
– DEOS Avionics Operating System
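A small hand-written Java example of the kind of program a systematic explorer like JPF checks (class name hypothetical): two threads race on a shared counter, so some interleavings violate the final assertion; exploring all thread schedules, rather than the one schedule an ordinary JVM run happens to pick, is what exposes the bug.

public class RacyCounter {
  static int counter = 0;

  public static void main(String[] args) throws InterruptedException {
    Runnable increment = () -> {
      int tmp = counter;   // read
      counter = tmp + 1;   // write: lost update under some schedules
    };
    Thread t1 = new Thread(increment);
    Thread t2 = new Thread(increment);
    t1.start();
    t2.start();
    t1.join();
    t2.join();
    // Passes on most ordinary runs (with -ea), but systematic exploration
    // of schedules finds interleavings where counter == 1
    assert counter == 2 : "lost update: counter = " + counter;
  }
}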