slides - Department of Computing Science and Mathematics

advertisement
The Necessity of Metabias in
Metaheuristics.
John.woodward@nottingham.edu.cn
www.cs.nott.ac.uk/~jrw
Abstract
Bias is necessary for learning, and is a probability over a search space. This
is usually introduced implicitly. Each time a metaheuristic is executed it
will give a different solution. However, if executed repeated it will give the
same solution on average. In other words, the bias is static (even if we
include a self adaptive component to the search algorithm). One desirable
property of metaheuristics is that they converge. This means that there is
a non-zero probability of visiting each item in the search space. Search
algorithms are intended to be reused on many instances of a problem.
These instances can be consider to be drawn from a probability
distribution. In other words, a search algorithm and problem class can
both be viewed as probability distributions over the search space. If the
bias of a search algorithm does not match the bias of a problem class, it
will under perform, if however, they do match, it will perform well.
Therefore we need some mechanism of altering the initial bias of the
search algorithm to coincide with that of the problem class. This
mechanism can be realized by a meta level which alters the bias of the
base level. In other words, if a search algorithm is to be applied to many
instances of a problem, then meta bias is necessary. This implies that
convergence at the meta level means a search algorithm shift its bias to
any probability distribution. Additionally, shifting bias is equivalent to
automating the design of search algorithms.
Outline
• Need and application of meta heuristics.
• Preliminaries (search space, problem instance,
problem class, meta heuristic)
• Generate-and-Test and Convergence
• Bias and Probability
• Many problem instances.
• Bias and Problem Class
• Summary and Conclusions
The Need for Heuristics
• Many computational problems of industrial
interest are intractable to search.
• The combinatorial explosion associated with
many problems means the search spaces
grows too rapidly for practical purposes.
• For example, with the travelling salesman
problem, the size of the search space grows as
O(n!) where n is the number of cities.
• Therefore we need “non-exact” heuristics.
Examples of MetaHeuristics
•
•
•
•
•
•
Hill Climbing
Simulated Annealing
Genetic Algorithms
Ant colony algorithms
Swam particle optimization
…the list continues to grow…
Applications of Metaheuristis
• Function Optimization
• Function Regression
• Combinatorial Problems
– Bin Packing
– Knapsack
– Travelling Salesman
• Program Induction
• Reinforcement Learning
• …
Preliminaries
• A search space is a finite set of objects.
• A cost function assigns a value to each object
in the search space.
• A problem instance is a search space and cost
function.
• A problem class is a probability distribution
over a set of problem instances with the same
search space.
• A metaheuristic samples the search space in
order to find a “good quality” solution.
Generate-and-Test/Search
• A candidate solution is generated
and tested.
• This is repeated until some
generate
termination condition is met
(usually a fixed number of
iterations).
test
• Generate-and-test is also called
“search”.
• In other words we are sampling the
search space.
Property of Convergence
Convergence: given enough time they will
eventually reach (sample/hit) the global
optima.
There is a non-zero probability of visiting all
points in the search space.
This criteria is usually easy to meet.
“premature convergence” and “getting stuck in
a local optima” are major issues for many
metaheuristics.
Bias and Probability
• In the literature, “the bias of a metaheuristic”.
• Bias is the probability distribution over the
search space (simplified assumption).
• A metaheuristic is a random variable.
• Let us call this “base bias”.
• Sources of bias: Any design decision! E.g.
– Cooling schedule, crossover operator, parameters.
• It is unlikely the designer makes correct
choices or that the choices have much affect
(i.e. tuning one parameter maybe more
effective than tuning a different parameter).
One problem instance to the next one
MetaheuristicA
Problem
instance1
metaheuristicA
Problem
instance2
Metaheuritic A operates in the same way regardless of the
underlying problem instance (1 then 2 then 1). There in no mechanism to alter
the bias from one instance to the next (except case based reasoning).
Mechanisms do exist which alter the bias during the application of the
metaheuristic on a single problem instance, but rarely across problem
instances. We argue this is essential.
The metaheuristic may learn on one problem instance, but is not learning
across problem instances, which is essential if it is applied to many instances.
One problem instance to the next 2
Meta bias
metaheuristicA metaheuristicA metaheuristicA
Problem
instance1
Problem
instance2
Problem
instance1
Meta bias can affect how a metaheuristic operates on different instances.
As a special case, if we return to problem instance 1 (or something similar),
now at least have a mechanism to allow improved performance.
Bias and Problem Class
We want, over a number of
instances, for the algorithm to
learn which are the promising
areas of the search space.
Distribution of quality solutions
over a problem class
0.6
0.5
Probability
If no quality solutions exist in a
certain part of the search space,
there is no need in sampling it,
and therefore we do not require
the property of convergence.
0.4
0.3
0.2
0.1
0
1
2
3
4
5
Items in search space
0.6
Probability
Convergence at the meta level
means for the probability of
sampling to change from an
arbitrary initial distribution for the
best for the class. This cannot be
achieved by parameter tuning
alone.
0.5
0.4
0.3
0.2
0.1
0
1
2
3
4
5
Distribution of sampling
By a metaheuristic
Summarizing our contributions
1. If we apply our metaheuristics to many problem
instances, then meta bias is necessary so the base bias
of the metaheuristic converges towards the global
optima. Altering bias at the meta level is equivalent to
automatically designing metaheuristics.
2. At the meta level, convergence means that the bias can
shift to any base bias (i.e. any probability distribution).
3. Problem classes defined as probability distributions are
an essential part of the machine learning methodology.
A problem class defines a niche in which a suitable
metaheuristic can fit. Therefore algorithms should be
tested on problem instances which are drawn from this
distribution and NOT randomly selected benchmark
instances.
REFERENCES 1
[1] T.M. Mitchell, The Need for Biases in Learning Generalizations, Rutgers
Computer Science Department Technical Report CBM-TR-117, May, 1980.
Reprinted in Readings in Machine Learning, J. Shavlik and T. Dietterich,
eds., Morgan Kaufmann, 1990.
[2] S. Thrun and L. Pratt, Learning To Learn, S. Thrun and L. Pratt, ed., Kluwer
Academic Publishers, 1998, 354 pages.
[3] E. K. Burke, J. Woodward, M. Hyde, G. Kendall, Automatic heuristic
generation with genetic programming: Evolving a Jack of all trades or a
master of one. Genetic and Evolutionary Computation Conference, GECCO
2007.
[4] C. Schumacher, M. D. Vose, and L. D. Whitley. The no free lunch and
problem description length. In proceedings of the Genetic and
Evolutionary Computation Conference, 565-570, California, USA, 7-11 July
2001. Morgan Kaufmann.
[5] T. M. Mitchell. Machine Learning. McGraw-Hill 1997.
[6] Riccardo Poli, William B. Langdon and Nicholas Freitag McPhee, A Field
Guide to Genetic Programming, Lulu.com, freely available under Creative
Commons Licence from www.gp-field-guide.org.uk, March 2008.
REFERENCES 2
[7] William B. Langdon: Scaling of Program Fitness Spaces. Evolutionary Computation
7(4): 399-428 (1999)
[8] Woodward J. Computable and Incomputable Search Algorithms and Functions. IEEE
International Conference on Intelligent Computing and Intelligent Systems (IEEE
ICIS 2009) November 20-22,2009 Shanghai, China.
[9] Woodward, J., Evans A., Dempster, P. 2008, A Syntactic Justification of Occam’s
Razor. October 31 to November 2, 2008 Midwest, A New Kind of Science
Conference Indiana University Bloomington, Indiana
[10] Marcus Hutter, ”Universal Artificial Intelligence: Sequential Decisions based on
Algorithmic Probability” Springer,2004, http://www.hutter1.net/ai/uaibook.htm.
[11] Hartley Rogers, Theory of Recursive Functions and Effective Computability, The
MIT Press (April 22, 1987)
[12] Edmund K. Burke and Graham Kendall (editors), Search Methodologies:
Introductory Tutorials in Optimization and Decision Support Techniques, Springer
2005.
[13] Stefan Droste, Thomas Jansen, Ingo Wegener: Perhaps Not a Free Lunch But At
Least a Free Appetizer, 13-17 July 1999: 833-839 Proceedings of the Genetic and
Evolutionary Computation Conference, Orlando, Florida, USA, Morgan Kaufmann,
The End
•
•
•
•
Thank you
I would be glad to take any questions.
John.woodward@nottingham.edu.cn
www.cs.nott.ac.uk/~jrw/
Download