Course 3
EA applications to SE
Evolutionary Computing in Search-Based Software Engineering
Leo Rela
Overview: Analysis
• Record and elicit customer requirements
• Understand customer requirements
• No technical decisions about the system’s
• Includes feasibility study
Overview: Design
• Translate requirements into a representation of
• Focuses on
– Data structures
– Algorithmic details
• Also includes
– Resource and task allocation in a distributed system
Overview: Implementation
• The software design is translated into a
computer program
• Two facets
– Produce computer programs
– Support work of computer programmers
• Typically, genetic programming (GP) falls into this class
Overview: Testing
• Validation and Verification
• Tackles problems like:
– Test case generation
– Find inputs that cause failures
– Find inputs that cause long running time
Applications: Analysis
• Prediction of software failures
• Exploring difficulty of the problem
• Software project effort prediction
• Project management
Applications: Design
• Multiprocessor scheduling
• Task and resource allocation in distributed systems
• Hardware/software co-design in embedded systems
• Protocol construction
• Architecture design
Applications: Implementation
• Automatic programming
• N-version programming
• Search for compiler optimization
Applications: Testing
• Structural (white-box) testing
• Functional (black-box) testing
• Integration test design
• Testing based on mutation analysis
• Search for response time extremes
Prediction of Software Failures
• Fixing failures is expensive
– In testing → extra coding effort
– After deployment → even more expensive
• Applying reliability techniques is expensive
• Quality prediction methods identify which
parts of the system need reliability
GP-based software quality
• Modules are classified as error-prone or not error-prone
• GP is used to predict the expected number of faults, but
only the resulting ranking is used as the basis for decisions
• For each module, a series of metrics is recorded
• For some modules, the reliability has been measured
• Can we use the metrics to predict the reliability of an
as-yet untested program?
• Metrics used: number of operators/operands, LOC, LOEC,
cyclomatic complexity
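The ranking idea above can be sketched in a few lines of Python. This is only an illustration: `example_model` stands in for an expression that GP might evolve, and the module names and metric values are made up, not taken from any study.

```python
from typing import Callable, Dict, List

Metrics = Dict[str, float]

def rank_modules(modules: Dict[str, Metrics],
                 model: Callable[[Metrics], float]) -> List[str]:
    """Order module names from most to least predicted faults."""
    return sorted(modules, key=lambda name: model(modules[name]), reverse=True)

def example_model(m: Metrics) -> float:
    # Hypothetical stand-in for an evolved expression over the metrics.
    return 0.01 * m["loc"] + 0.5 * m["cyclomatic"]

# Made-up metric values, for illustration only.
modules = {
    "parser":    {"loc": 1200, "cyclomatic": 40},
    "logger":    {"loc": 150,  "cyclomatic": 4},
    "scheduler": {"loc": 600,  "cyclomatic": 55},
}

ranking = rank_modules(modules, example_model)
print(ranking)
```

Only the order matters for the decision: the modules at the top of the ranking are the ones to which reliability techniques would be applied first.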
Genetic programming model for
software quality classification
• Metrics
– number of times the source code was
– number of LOC for different production
– final number of commented code.
• Implementation note: not LISP, but
pointers to functions in C
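The implementation note can be illustrated with a Python analogue: instead of LISP expressions, each tree node stores a direct reference to the function it applies, much as a C node would store a function pointer. The tuple layout and the names `evaluate`, `tree` and `env` are assumptions made for this sketch, not the layout used in the cited work.

```python
import operator
from typing import Dict

def evaluate(node, env: Dict[str, float]) -> float:
    """Evaluate a tree whose inner nodes are (function, child, child, ...)."""
    if isinstance(node, tuple):
        func, *children = node          # func is a function reference
        return func(*(evaluate(child, env) for child in children))
    if isinstance(node, str):           # terminal: a named metric
        return env[node]
    return node                         # terminal: a numeric constant

# Hypothetical tree for the expression 0.01 * loc + cyclomatic.
tree = (operator.add, (operator.mul, 0.01, "loc"), "cyclomatic")
value = evaluate(tree, {"loc": 200.0, "cyclomatic": 7.0})
print(value)
```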
◙ Automated Knowledge Acquisition and
Application for Software
Development Projects
• Fuzzy system which classifies modules by
probability of containing errors
◙ Combining Software Quality
Predictive Models: An Evolutionary
• Various studies have used various
• Combine the resulting experts into one
which is able to work with partial input
• AdaBoost: assigns a weight to each data instance
(how hard it is) and to each expert (how good
it is).
• GP: Combines decision trees
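The two kinds of AdaBoost weights can be made concrete with one round of the textbook discrete AdaBoost update. This is a sketch of the standard algorithm, and the cited study may use a different variant; the function name `adaboost_round` and the four-instance example are assumptions.

```python
import math
from typing import List, Tuple

def adaboost_round(weights: List[float],
                   correct: List[bool]) -> Tuple[float, List[float]]:
    """One discrete AdaBoost round.

    weights: current instance weights (summing to 1).
    correct: whether the current expert classified each instance correctly.
    Returns the expert weight (alpha) and the updated instance weights.
    """
    err = sum(w for w, ok in zip(weights, correct) if not ok)
    err = min(max(err, 1e-10), 1 - 1e-10)      # avoid log(0) and division by 0
    alpha = 0.5 * math.log((1 - err) / err)    # good experts get large alpha
    updated = [w * math.exp(-alpha if ok else alpha)
               for w, ok in zip(weights, correct)]
    total = sum(updated)
    return alpha, [w / total for w in updated]  # renormalise to sum to 1

# Four equally weighted instances; the expert gets the last one wrong.
alpha, weights = adaboost_round([0.25] * 4, [True, True, True, False])
print(alpha, weights)
```

After the round, the misclassified instance carries half of the total weight, so the next expert is pushed to get it right.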
Neural Networks / GA
Using the genetic algorithm to build
optimal neural networks for fault-prone
module detection
• The GA generates the NN (structure, weights)
Evolutionary neural networks: a robust
approach to software reliability problems
Exploring difficulty of the problem
• In early phases developers are not aware of
potential (hard) problems ahead
• Uncertainty about the real nature of the task
• Bad decisions, made early, are difficult to fix
• Possible solution: collect more knowledge about
the problem to be solved
• Software problem exploration using genetic
programming (SPE-GP)
◙ Genetic Programming as an Explorative
Tool in Early Software Development Phases
• GP is used to try to solve the problem
• For each input, it is recorded how often the
resulting programs fail to work correctly
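The bookkeeping described above can be sketched as follows. The candidate programs are hand-written stand-ins for GP-evolved programs, and the target behaviour (absolute value) and the name `failure_rates` are assumptions chosen only to keep the example small.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

def failure_rates(programs: List[Callable[[int], int]],
                  cases: List[Tuple[int, int]]) -> Dict[int, float]:
    """For each input, the fraction of programs that give a wrong answer."""
    failures = Counter()
    for program in programs:
        for x, expected in cases:
            try:
                ok = program(x) == expected
            except Exception:
                ok = False              # a crash counts as a failure
            if not ok:
                failures[x] += 1
    return {x: failures[x] / len(programs) for x, _ in cases}

# Stand-ins for evolved programs that try to compute abs(x).
programs = [
    lambda x: x,
    lambda x: -x,
    lambda x: x if x > 0 else -x,
    lambda x: max(x, 0),
]
cases = [(-2, 2), (0, 0), (3, 3)]       # (input, expected output)

rates = failure_rates(programs, cases)
print(rates)
```

Inputs with a high failure rate point at the parts of the problem that are hard to get right, which is the knowledge the exploration is after.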
Software project effort prediction
• Software is the most complex part of the
• It is also the most expensive to obtain
• Goal: estimate cost and effort for a given
• Related: COCOMO (2)
Limits of the Methods in Software
Cost Estimation
• Regression, GP, NN for software cost estimation
• The estimates are not very accurate!
– Poor methods / need new methods?
– Incomplete/Inconsistent input data?
◙ Can genetic programming improve
software effort estimation? A
comparative evaluation
• Data from 81 Canadian software
• Developer/manager experience, year of
completion, attributes regarding size and
complexity, effort (person-hours).
• ANN and GP performed better.
Software Project Effort Estimation
Using Genetic Programming
• Grammar Guided Genetic Programming
• Classic GP requires “closure”: any non-terminal
must accept as an argument any data type and
value returned by a terminal or another
non-terminal.
• Another alternative: Strongly Typed GP
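A common way to satisfy the closure requirement in classic GP is to "protect" operators that would otherwise fail on some inputs. The sketch below shows protected division in the usual convention of returning 1 for a zero divisor; the threshold `1e-9` is an assumption.

```python
def protected_div(a: float, b: float) -> float:
    """Division that is defined for every pair of numeric arguments.

    Returning 1.0 for a (near-)zero divisor keeps any randomly generated
    expression tree evaluable, which is what closure demands.
    """
    return a / b if abs(b) > 1e-9 else 1.0

print(protected_div(6.0, 3.0))   # ordinary division
print(protected_div(1.0, 0.0))   # protected case, no ZeroDivisionError
```

Grammar-guided and strongly typed GP take the other route: instead of making every operator accept everything, they restrict which trees can be generated in the first place.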
Other articles
An evolutionary approach to estimating
software development projects
• Combination of a Software Project Simulator (SPS)
and GAs. The SPS generates input data, which is
then used by the GA for learning and
A validation of the component-based
method for software size estimation.
Project management
• Manager has resources (time, budget,
team etc.) and goal.
• Task: meet the goal with the given resources
• Or: which resources are needed to meet
the goal?
• Or: what can be accomplished with the
given resources?
• Gantt charts
Tools (2)
• TPG (Task Precedence Graph)
– MM: Man Month
– SR: Skill Required
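A task graph annotated with MM and SR can be represented directly as data. The sketch below is one possible encoding, with made-up tasks and values; it uses the standard-library `graphlib` module (Python 3.9+) to obtain an order that respects the dependencies.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter
from typing import FrozenSet

@dataclass(frozen=True)
class Task:
    man_months: float            # MM: estimated effort
    skills: FrozenSet[str]       # SR: skills required

# Hypothetical tasks, for illustration only.
tasks = {
    "design": Task(2.0, frozenset({"modelling"})),
    "code":   Task(4.0, frozenset({"programming"})),
    "docs":   Task(1.0, frozenset({"writing"})),
    "test":   Task(2.0, frozenset({"testing"})),
}

# Each task maps to the set of tasks that must finish before it starts.
predecessors = {
    "design": set(),
    "code":   {"design"},
    "docs":   {"design"},
    "test":   {"code"},
}

order = list(TopologicalSorter(predecessors).static_order())
total_effort = sum(task.man_months for task in tasks.values())
print(order, total_effort)
```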
Software Project Management Net
• Automatic allocation / scheduling based on
• Input
– Employee/skill database
• Output
– Schedule
Genetic Algorithms for Project Management
• Many-to-many relation between task and
• Partial commitment
• Objectives
– Validity of job assignment
– Minimum overtime
– Minimum cost
– Minimum time span
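One simple way to combine the four objectives is a weighted sum of penalties that the GA minimises. The sketch below is an assumption-laden illustration, not the fitness function of the cited work: the weights are arbitrary, all tasks are treated as running concurrently, precedence constraints are ignored, and every name in it is hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Set, Tuple

@dataclass
class Employee:
    skills: Set[str]
    monthly_salary: float

def fitness(tasks: Dict[str, Tuple[float, Set[str]]],
            employees: Dict[str, Employee],
            commitment: Dict[Tuple[str, str], float],
            weights=(1000.0, 100.0, 0.001, 1.0)) -> float:
    """Lower is better.

    tasks: task name -> (man-months, required skills).
    commitment: (employee, task) -> fraction of the employee's time,
    which models partial commitment.
    """
    invalid, cost, span = 0, 0.0, 0.0
    load = {name: 0.0 for name in employees}
    for task, (man_months, required) in tasks.items():
        staff = {e: f for (e, t), f in commitment.items() if t == task and f > 0}
        covered = set().union(*(employees[e].skills for e in staff))
        if not staff or not required <= covered:
            invalid += 1                      # unstaffed or missing skills
            continue
        duration = man_months / sum(staff.values())
        span = max(span, duration)            # tasks assumed concurrent
        for e, f in staff.items():
            load[e] += f
            cost += employees[e].monthly_salary * f * duration
    overtime = sum(max(0.0, total - 1.0) for total in load.values())
    w_valid, w_over, w_cost, w_span = weights
    return w_valid * invalid + w_over * overtime + w_cost * cost + w_span * span

employees = {"ann": Employee({"design", "code"}, 5000.0),
             "bob": Employee({"test"}, 4000.0)}
tasks = {"code": (2.0, {"code"}), "test": (1.0, {"test"})}

good = fitness(tasks, employees, {("ann", "code"): 1.0, ("bob", "test"): 0.5})
bad = fitness(tasks, employees, {("bob", "code"): 1.0})
print(good, bad)
```

The large weight on invalid assignments makes validity behave almost like a hard constraint, while the remaining weights trade cost against time span.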
• Read the survey
• Skim over the articles
• Like one? Choose it!
• You are not supposed to like a 2-page
article, unless you can implement the
techniques described in it.
• Don’t like any? Find your own SBSE
article on the net and talk to me about it.