Finding Optimal Bayesian Network Structures
with Constraints Learned from Data
Xiannian Fan¹, Brandon Malone² and Changhe Yuan¹
¹City University of New York
²University of Helsinki
Overview
• Structure Learning of Bayesian Networks
• Graph Search Formulation
• Reduce Search Space Using Constraints
• Experiments
• Summary
Learning optimal Bayesian networks
• Very often we have data sets available.
• From these data we can extract knowledge: the network structure and its numerical parameters.
(Figure: data yields the network structure and its numerical parameters.)
Score-based learning
• Goal: Find a Bayesian network that optimizes a scoring
function, e.g., MDL, BIC.
• NP-hard
• Decomposability
๐’
๐’Š=๐Ÿ ๐’”๐’Š (๐‘ท๐‘จ๐’Š )
where PAi is the parent set of Xi in N
๐’” ๐ =
Graph Search Formulation
• Formulate the learning task as a shortest path finding problem
– The shortest path solution to a graph search problem corresponds to an
optimal Bayesian network
[Yuan, Malone, Wu, IJCAI-11]
Search graph (Order graph)
(Figure: the order graph over variables {1, 2, 3, 4} — a lattice of all subsets from ϕ up to {1, 2, 3, 4}.)
Formulation:
• Search space: variable subsets
• Start node: the empty set
• Goal node: the complete set
• Edges: select parents
• Edge cost: BestScore(X, U) for edge U → U ∪ {X}
• Task: find the shortest path between the start and goal nodes
[Yuan, Malone, Wu, IJCAI-11]
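The formulation above can be sketched as a sweep over the order graph by increasing subset size, which is equivalent to computing the shortest path from the empty set to the complete set. A minimal sketch, assuming local scores are given as a dict from each variable to a map of candidate parent sets to costs (lower is better); the input format and helper names are illustrative assumptions.

```python
def best_score(x, candidates, local_scores):
    """BestScore(X, U): cheapest local score for X among parent sets drawn from U."""
    return min(cost for pa, cost in local_scores[x].items() if pa <= candidates)

def shortest_path_order_graph(variables, local_scores):
    """Shortest path from the empty set to the complete set in the order graph.

    Each edge U -> U ∪ {X} costs BestScore(X, U); the cost of the shortest
    path equals the optimal network score.
    """
    variables = frozenset(variables)
    dist = {frozenset(): 0.0}
    # relax nodes in order of increasing subset size (a topological order)
    for size in range(len(variables)):
        for u in list(dist):
            if len(u) != size:
                continue
            for x in variables - u:
                v = u | {x}
                cand = dist[u] + best_score(x, u, local_scores)
                if cand < dist.get(v, float('inf')):
                    dist[v] = cand
    return dist[variables]
```

This plain sweep visits all 2^n subsets; the heuristic-search versions cited on these slides expand far fewer nodes by pruning with admissible bounds.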
Pruning Based on Bounds
• Improved search heuristic [Yuan and Malone, UAI-12]
• Tightening Bounds [Fan, Yuan and Malone, AAAI-14]
(Figure: order graph with some subset nodes pruned using the bounds.)
Search Space
• Size: O(2^n)
(Figure: the complete order graph over four variables.)
[Yuan, Malone, Wu, IJCAI-11]
Motivation for our work
• Observation: most local scores are not optimal under any variable ordering.
• Sparse Parent Graph [Yuan and Malone, UAI-12]
  – Bounds on parent sets for MDL [Tian, UAI-00]
  – Properties of decomposable score functions [de Campos and Ji, JMLR-11]
• Most local scores can therefore be pruned.
Sparse Parent Graph
• Potentially Optimal Parent Sets for X1: propagate the best scores.
(Figure: panels (a) and (b) of the local-score lattice s1(PA1), before and after propagation.)
Sparse Parent Graph
• Potentially Optimal Parent Sets for X1: prune useless local scores.
(Figure: panels (b) and (c) of the local-score lattice s1(PA1), before and after pruning.)
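The pruning idea — a parent set is useless if some subset of it scores at least as well, since that subset would be preferred in any ordering — can be sketched as follows. A minimal sketch assuming minimized, MDL-style costs and a dict input format; not the authors' implementation.

```python
def potentially_optimal(scores):
    """Keep only the Potentially Optimal Parent Sets (POPS).

    scores: dict mapping frozenset(parent set) -> cost, lower is better.
    A parent set PA is pruned when some proper subset costs no more:
    that subset would always be chosen instead of PA.
    """
    pops = {}
    for pa, cost in scores.items():
        dominated = any(sub < pa and sub_cost <= cost  # '<' is proper subset
                        for sub, sub_cost in scores.items())
        if not dominated:
            pops[pa] = cost
    return pops
```

The quadratic scan is for clarity; the sparse parent graph cited above achieves the same effect by propagating best scores through the score lattice.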
Motivation for our work
• Potentially Optimal Parent Sets (POPS)
(Figure: an example set of POPS for six variables.)
• Observation: not all variables can possibly be ancestors of the others.
  – E.g., no variable in {X3, X4, X5, X6} can be an ancestor of X1 or X2.
POPS Constraints
• Parent-Child Graph
  – Formed by aggregating the POPS.
• Component Graph
  – Formed by the strongly connected components (SCCs).
  – Gives the ancestor constraints.
(Figure: component graph with SCCs {X1, X2} and {X3, X4, X5, X6}.)
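The construction can be sketched in two steps: aggregate the POPS into a parent-relation graph, then find its SCCs. The POPS input format and helper names are assumptions, and Kosaraju's algorithm stands in here for whatever SCC routine is actually used.

```python
from collections import defaultdict

def parent_relation_graph(pops):
    """Aggregate POPS into edges parent -> child.

    pops: dict mapping each child variable to a list of candidate parent sets.
    """
    edges = defaultdict(set)
    for child, parent_sets in pops.items():
        for pa in parent_sets:
            for p in pa:
                edges[p].add(child)
        edges.setdefault(child, set())
    return edges

def sccs(edges):
    """Strongly connected components via Kosaraju's two-pass algorithm."""
    order, seen = [], set()

    def dfs(graph, v, out):
        seen.add(v)
        for w in graph[v]:
            if w not in seen:
                dfs(graph, w, out)
        out.append(v)

    for v in list(edges):              # pass 1: finish order on forward graph
        if v not in seen:
            dfs(edges, v, order)
    rev = defaultdict(set)             # build the reversed graph
    for v, ws in edges.items():
        for w in ws:
            rev[w].add(v)
        rev.setdefault(v, set())
    seen.clear()
    comps = []
    for v in reversed(order):          # pass 2: DFS in reverse finish order
        if v not in seen:
            comp = []
            dfs(rev, v, comp)
            comps.append(frozenset(comp))
    return comps
```

Each SCC becomes one node of the component graph, and edges between SCCs give the ancestor constraints.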
POPS Constraints
• Decompose the problem
  – Each SCC corresponds to a smaller subproblem.
  – Each subproblem can be solved independently.
(Figure: subproblems for SCCs {X1, X2} and {X3, X4, X5, X6}.)
POPS Constraints
• Recursive POPS Constraints
  – Selecting the parents for one of the variables has the effect of removing that variable from the parent relation graph.
Top-p POPS Constraints
• Create the parent relation graph using only the best p POPS of each variable.
  – Tradeoff between optimality and efficiency.
  – Lossy score pruning.
• Sub-optimality bound
  – The most loss incurred by excluding the pruned POPS for X_i:
    δ_i = max(0, s_i(PA_i) − s_i(PA'_i))
    where s_i(PA_i): X_i selects parents PA_i using the Top-p constraints,
    and s_i(PA'_i): X_i selects parents without constraints.
  – Error bound for the score s(N) learned using the POPS constraints:
    ε = s(N) / ∑_i (s_i(PA_i) − δ_i)
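Under these definitions the bound is a few lines of arithmetic. A minimal sketch assuming minimized costs and two dict inputs (each variable's local score in the network learned under Top-p constraints, and its best unconstrained local score); the input format is an assumption for illustration.

```python
def deltas(learned_local, unconstrained_best):
    """delta_i = max(0, s_i(PA_i) - s_i(PA'_i)): the extra cost variable X_i
    pays under the Top-p constraints versus its unconstrained best score."""
    return {x: max(0.0, learned_local[x] - unconstrained_best[x])
            for x in learned_local}

def error_bound(learned_local, unconstrained_best):
    """The slide's bound: epsilon = s(N) / sum_i (s_i(PA_i) - delta_i)."""
    d = deltas(learned_local, unconstrained_best)
    return (sum(learned_local.values())
            / sum(learned_local[x] - d[x] for x in learned_local))
```

When no POPS are excluded every δ_i is 0 and ε = 1, i.e., the learned network is provably optimal; larger ε quantifies how far the lossy pruning can be from optimal.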
Experiment: POPS and Recursive POPS Constraints
(Bar charts: Alarm, 37 variables — # expanded nodes (millions) and running time (seconds) for No Constraint, POPS, and Recursive POPS.)
Experiment: POPS and Recursive POPS Constraints
(Bar charts: Barley, 48 variables — # expanded nodes (millions) and running time (seconds) for No Constraint, POPS, and Recursive POPS.)
Experiment: POPS and Recursive POPS Constraints
(Bar charts: Soybean, 36 variables — running time (seconds) and # expanded nodes (millions) for No Constraint, POPS, and Recursive POPS.)
Experiment: Top-p POPS Constraints on Hailfinder
(Chart: sub-optimality bound ε − 1 for the Top-p POPS constraints.)
Summary
• Graph Search Formulation
• This Work: Reduce Search Space Using POPS Constraints