CS B553 Homework 5: Bayesian Network Modeling and Inference
Due date: 3/8/2012
Your boss holds a meeting and assigns you to a project that requires you to apply numerical
optimization. Because she is an overpaid MBA type, her instructions and her description of the project’s scope are
exceedingly vague: “optimize our strategic partnership throughput to maximize synergies” or some
other nonsense. Consider devising a Bayesian network for recommending the scope of the project and
an optimization algorithm to solve it.
1. Draw a Bayesian network with the variables:
• Algorithm with domain {GD,QN,SA,GA} indicating gradient descent, quasi-Newton,
simulated annealing, and genetic algorithms.
• ProblemSize with domain {Small,Medium,Large}.
• ObjectiveType with domain {Convex,Nonconvex,Nonsmooth}.
• A binary variable Derivatives indicating whether derivatives of the objective
function are available.
• LocalMinima with domain {One,Few,Many}.
• Speed with domain {Slow,Medium,Fast} indicating the speed of the algorithm.
• A binary variable Optimality indicating whether a global optimum will be reached.
ProblemSize, ObjectiveType, and Algorithm should have no parents. Derivatives should have the
parent ObjectiveType. LocalMinima should have the parent ProblemSize and ObjectiveType,
because convex problems have one minimum, and nonconvex/nonsmooth problems in high-dimensional
spaces often have more minima than in low-dimensional ones. Speed should have
parents Algorithm, Derivatives, and ProblemSize. Optimality should have parents Algorithm,
LocalMinima, and ObjectiveType.
List the CPTs that need to be entered. How many free parameters are there in this network?
How many free parameters would be needed in a joint distribution table?
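One way to check the counts asked for in question 1 is to tabulate them mechanically. The sketch below assumes the usual convention that a variable with k values contributes k − 1 free parameters for each configuration of its parents. The dictionary and function names are illustrative and not part of the assignment.

```python
from math import prod

# Domain sizes taken from the variable list in question 1.
domain_size = {
    "Algorithm": 4,      # GD, QN, SA, GA
    "ProblemSize": 3,    # Small, Medium, Large
    "ObjectiveType": 3,  # Convex, Nonconvex, Nonsmooth
    "Derivatives": 2,    # binary
    "LocalMinima": 3,    # One, Few, Many
    "Speed": 3,          # Slow, Medium, Fast
    "Optimality": 2,     # binary
}

# Parent sets as stated in question 1.
parents = {
    "Algorithm": [],
    "ProblemSize": [],
    "ObjectiveType": [],
    "Derivatives": ["ObjectiveType"],
    "LocalMinima": ["ProblemSize", "ObjectiveType"],
    "Speed": ["Algorithm", "Derivatives", "ProblemSize"],
    "Optimality": ["Algorithm", "LocalMinima", "ObjectiveType"],
}


def cpt_free_parameters(var):
    """(k - 1) free entries per row, one row per parent configuration."""
    rows = prod(domain_size[p] for p in parents[var])  # empty product is 1
    return (domain_size[var] - 1) * rows


def network_free_parameters():
    """Total over all CPTs in the network."""
    return sum(cpt_free_parameters(v) for v in domain_size)


def joint_free_parameters():
    """A full joint table has one entry per assignment, minus one for normalization."""
    return prod(domain_size.values()) - 1
```

Calling `network_free_parameters()` and `joint_free_parameters()` gives the two totals to compare. This count treats every CPT entry as free; rows that question 2 fixes deterministically could arguably be excluded.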
2. Enter some reasonable values for the CPTs given your knowledge of optimization algorithms.
These values should be consistent with facts learned in class. For example, without derivative
information, GD and QN methods are slower than when derivative information is available,
because finite differencing is somewhat costly.
Deterministic quantities should be specified with certainty. For example, nonsmooth objectives
will not have derivatives available, and convex problems will have one local minimum.
When you have no reasonable background information to determine a variable’s value, set its
conditional distribution to be uniform.
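The two rules above (deterministic rows stated with certainty, uniform rows where nothing is known) can be sketched as follows. Only the two deterministic facts named in the text are encoded. Every other row is a uniform placeholder that you would replace with your own values, and the helper names are hypothetical.

```python
PROBLEM_SIZES = ["Small", "Medium", "Large"]
OBJECTIVE_TYPES = ["Convex", "Nonconvex", "Nonsmooth"]
LOCAL_MINIMA = ["One", "Few", "Many"]


def uniform(values):
    """Row used when no background knowledge is available."""
    p = 1.0 / len(values)
    return {v: p for v in values}


def deterministic(values, certain):
    """Row that puts all probability mass on a single value."""
    return {v: (1.0 if v == certain else 0.0) for v in values}


# P(Derivatives | ObjectiveType): nonsmooth objectives never have derivatives.
derivatives_cpt = {}
for ot in OBJECTIVE_TYPES:
    if ot == "Nonsmooth":
        derivatives_cpt[ot] = deterministic([True, False], False)
    else:
        derivatives_cpt[ot] = uniform([True, False])  # placeholder, not a recommendation

# P(LocalMinima | ProblemSize, ObjectiveType): convex problems have one minimum.
local_minima_cpt = {}
for ps in PROBLEM_SIZES:
    for ot in OBJECTIVE_TYPES:
        if ot == "Convex":
            local_minima_cpt[(ps, ot)] = deterministic(LOCAL_MINIMA, "One")
        else:
            local_minima_cpt[(ps, ot)] = uniform(LOCAL_MINIMA)  # placeholder


def is_normalized(cpt, tol=1e-9):
    """Check that every row of a CPT sums to one."""
    return all(abs(sum(row.values()) - 1.0) < tol for row in cpt.values())
```

Running `is_normalized` on each table before doing inference is a cheap way to catch rows that do not sum to one.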
3. Show the steps that you would take to calculate the unconditional distribution over Speed by
reasoning with the joint distribution. Show the steps that you would take to calculate the
unconditional distribution over Algorithm. (Do this symbolically; do not substitute the numeric
values you supplied in question 2.)
[Note: it is ambiguous whether “the steps that you would take” means steps worked by hand or the steps an
algorithm would take.]
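As a hedged illustration of the kind of symbolic manipulation question 3 asks for, the derivation below starts from the factored joint distribution implied by the parent sets in question 1. The abbreviations A (Algorithm), PS (ProblemSize), OT (ObjectiveType), D (Derivatives), LM (LocalMinima), S (Speed), and O (Optimality) are introduced here for brevity and are not part of the assignment.

```latex
\begin{align*}
P(S) &= \sum_{A}\sum_{PS}\sum_{OT}\sum_{D}\sum_{LM}\sum_{O}
        P(A)\,P(PS)\,P(OT)\,P(D \mid OT)\,P(LM \mid PS, OT)\,
        P(S \mid A, D, PS)\,P(O \mid A, LM, OT) \\
     &= \sum_{A}\sum_{PS}\sum_{OT}\sum_{D}
        P(A)\,P(PS)\,P(OT)\,P(D \mid OT)\,P(S \mid A, D, PS)
        \sum_{LM} P(LM \mid PS, OT) \sum_{O} P(O \mid A, LM, OT) \\
     &= \sum_{A} P(A) \sum_{PS} P(PS) \sum_{D} P(S \mid A, D, PS)
        \sum_{OT} P(OT)\,P(D \mid OT)
\end{align*}
```

The last step uses the fact that each conditional distribution sums to one over its own variable, so the sums over O and then LM drop out. The same argument applied to Algorithm removes every other factor in turn (O, S, LM, D, then the priors on OT and PS), leaving only its prior P(A), as expected for a node with no parents.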
4. Suppose you ask your boss about the size of the problem. She replies that it’s a small problem –
no more than a dozen variables. Furthermore she overheard you mentioning genetic
algorithms, and she asks you to use one – it sure does sound good. Show the symbolic steps that
variable elimination would take in order to calculate the distribution over Speed given this new
information. (Use the best variable ordering that you can find).
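The elimination in question 4 can be sketched in code. This version assumes the ordering Optimality, LocalMinima, ObjectiveType, Derivatives. The first two are unobserved leaves or become leaves once Optimality is removed, so their factors sum to one and never need to be touched. The function name and the dictionary layout of the CPTs are assumptions made for illustration.

```python
def speed_given_evidence(p_ot, p_d_given_ot, p_s_given_a_d_ps, algorithm, problem_size):
    """Return P(Speed | Algorithm=algorithm, ProblemSize=problem_size).

    p_ot:              {objective_type: probability}
    p_d_given_ot:      {objective_type: {derivatives: probability}}
    p_s_given_a_d_ps:  {(algorithm, derivatives, problem_size): {speed: probability}}
    """
    # Step 1: eliminate ObjectiveType.
    # tau1(D) = sum over OT of P(OT) * P(D | OT)
    tau_d = {}
    for ot, p in p_ot.items():
        for d, q in p_d_given_ot[ot].items():
            tau_d[d] = tau_d.get(d, 0.0) + p * q

    # Step 2: eliminate Derivatives, with the evidence already plugged in.
    # tau2(S) = sum over D of tau1(D) * P(S | algorithm, D, problem_size)
    tau_s = {}
    for d, weight in tau_d.items():
        row = p_s_given_a_d_ps[(algorithm, d, problem_size)]
        for s, q in row.items():
            tau_s[s] = tau_s.get(s, 0.0) + weight * q

    # Step 3: normalize. The priors P(algorithm) and P(problem_size) are
    # constants that cancel here, so they were never multiplied in.
    z = sum(tau_s.values())
    return {s: v / z for s, v in tau_s.items()}
```

Because both evidence variables are roots, the intermediate factors never grow beyond a single variable under this ordering.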
5. Your boss indicates that it is more important to find a solution quickly than to achieve global
optimality. Propose a method for selecting the algorithm that is most likely to solve the
problem with Speed=Fast. You also bring up the point that it may be important to spend a
week’s worth of effort to calculate the derivatives of the objective function, and your boss asks
you, “How likely is this work to improve the optimization outcomes?” Describe how you would
calculate the increase in probability that Speed=Fast supposing that derivative information were
available.
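One possible reading of question 5 is sketched below: score each algorithm by P(Speed=Fast | Algorithm, ProblemSize) with Derivatives summed out, pick the maximizer, and measure the benefit of derivatives as the difference between the Derivatives=True row and that marginal. The sketch assumes the marginal P(Derivatives) has already been computed from ObjectiveType, and all function names are hypothetical.

```python
def p_fast(p_derivs, speed_cpt, algorithm, problem_size):
    """P(Speed=Fast | algorithm, problem_size), summing out Derivatives.

    p_derivs:  {derivatives: probability}, the marginal over Derivatives
    speed_cpt: {(algorithm, derivatives, problem_size): {speed: probability}}
    """
    return sum(
        p_d * speed_cpt[(algorithm, d, problem_size)]["Fast"]
        for d, p_d in p_derivs.items()
    )


def best_algorithm(p_derivs, speed_cpt, algorithms, problem_size):
    """Algorithm with the highest probability of Speed=Fast."""
    return max(algorithms, key=lambda a: p_fast(p_derivs, speed_cpt, a, problem_size))


def gain_from_derivatives(p_derivs, speed_cpt, algorithm, problem_size):
    """Increase in P(Speed=Fast) when Derivatives is known to be True."""
    with_derivs = speed_cpt[(algorithm, True, problem_size)]["Fast"]
    return with_derivs - p_fast(p_derivs, speed_cpt, algorithm, problem_size)
```

Using the marginal P(Derivatives) here relies on Derivatives being independent of Algorithm and ProblemSize when none of their common descendants is observed, which holds for the structure given in question 1.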
6. Since your boss is known to have only an imperfect notion of technical matters, you suspect that
the information she tells you during this meeting may be incorrect, or the requirements may
change halfway through the project. Describe how you would change the network in order to
represent the imperfect knowledge about ProblemSize and Speed given in problems 4 and 5.
(The resulting model should still be a rigorous probabilistic representation.)
[Note: this question is ambiguous – what about just changing the CPTs?]
[Also, many students forget Speed. Make a note that you may not need to have it run fast]
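One common way to represent an unreliable report, offered here only as a sketch of a possible answer, is to add a child node that records what the boss said and to condition on that node instead of on the true variable. The node name ReportedSize, its CPT P(ReportedSize | ProblemSize), and the function name are all hypothetical.

```python
def posterior_true_size(prior, report_cpt, reported):
    """P(ProblemSize | ReportedSize=reported) by Bayes' rule.

    prior:      {true_size: probability}, the original P(ProblemSize)
    report_cpt: {true_size: {reported_size: probability}}, i.e.
                P(ReportedSize | ProblemSize) for the hypothetical new node
    """
    # Unnormalized posterior: prior times the likelihood of the report.
    unnormalized = {
        size: prior[size] * report_cpt[size][reported] for size in prior
    }
    z = sum(unnormalized.values())
    return {size: weight / z for size, weight in unnormalized.items()}
```

The same pattern could be applied to the speed requirement by adding a node for the stated requirement whose parent is the requirement that actually holds. In both cases the original CPTs stay unchanged and the model remains a proper joint distribution.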