SRL Approaches: Frame-based
Probabilistic models
February 11, 2005
Today’s Outline
• Finish w/ Graphical Models Introduction
• Families of SRL Approaches
• Frame-based Probabilistic approaches
– Probabilistic Relational Models (PRMs)
– Probabilistic Entity Relation (PERs)
SRL History
• In general, SRL combines logic and probabilities
• Historically, there are two general threads of
research
– The first takes graphical models or hierarchical
Bayesian models and adds in some form of
relational/logical representation
• examples: Probabilistic Relational Models (PRMs),
Probabilistic Entity Relation Models (PERs), Object
Oriented Bayesian Networks (OOBNs)
• comes largely from the Uncertainty in AI (UAI)
community
– The second takes a logical representation (first-order
logic, horn clauses, etc) and adds in some form of
probabilities
• examples: Bayesian Logic Programs (BLPs), Stochastic
Logic Programs (SLPs)
• comes largely from the Inductive Logic Programming
(ILP) community
Families of SRL Approaches
1. Frame-based Probabilistic Models
• Probabilistic Relational Models (PRMs)
• Probabilistic Entity Relation Models (PERs)
• Object Oriented Bayesian Networks (OOBNs)
2. First Order Probabilistic Logic (FOPL)
• BLOGs
• Relational Markov Logic (RML)
3. Stochastic Functional Programs
• PRISM
• Stochastic Logic Programs (SLPs)
• IBAL
SRL Dimensions
• Syntax – ‘logic-based’ vs. ‘schema-based’
• Logical Semantics –
– relational vs. first-order
– domain closure/closed world vs. open world
• Probabilistic Semantics –
– ‘possible worlds’ vs. ‘domain frequencies’
– directed vs. undirected models
• … others?
Today: PRMs
• Developed by Daphne Koller’s group at
Stanford
– representation: Avi Pfeffer
• builds on work in KBMC (knowledge-based model
construction) by Haddawy, Poole, Wellman and
others…
• Object Oriented Bayesian Networks
• Relational Probability Models
– learning: myself, Nir Friedman, Avi
• Attribute Uncertainty
• Structural Uncertainty
• Class Uncertainty
• Identity Uncertainty
– undirected models: Ben Taskar
Motivation: Discovering Patterns
in Structured Data
[Figure: Patient, Contact, Strain and Treatment data]
Learning Statistical Models
Traditional approaches
– work well with flat representations
– fixed length attribute-value vectors
– assume independent (IID) sample
Problems:
– introduces statistical skew
– loses relational structure
• incapable of detecting link-based patterns
– must fix attributes in advance
[Figure: Patient and Contact tables flattened into a single table]
Roadmap
• Background:
» Bayesian Networks (BNs) [Pearl, 1988]
– Probabilistic Relational Models (PRMs)
• Learning PRMs w/ Attribute Uncertainty
• PRMs w/ Structural Uncertainty
• PRMs w/ Class Hierarchies
Bayesian Networks
nodes = random variables
edges = direct probabilistic influence
[Figure: network over Pneumonia, Tuberculosis, Lung Infiltrates, XRay, Sputum Smear]
Network structure encodes independence assumptions:
XRay conditionally independent of Pneumonia given Infiltrates
Bayesian Networks
[Figure: network over Pneumonia, Tuberculosis, Lung Infiltrates, XRay, Sputum Smear, with the CPD for Lung Infiltrates]
P, T | P(I | P, T)
p, t | 0.8 0.2
p, ¬t | 0.6 0.4
¬p, t | 0.2 0.8
¬p, ¬t | 0.01 0.99
• Associated with each node X_i there is a conditional
probability distribution P(X_i | Pa_i): a distribution over X_i
for each assignment to its parents
– If variables are discrete, P is usually multinomial
– P can be linear Gaussian, mixture of Gaussians, …
BN Semantics
conditional independencies in BN structure + local probability models = full joint distribution over domain
[Figure: network with nodes P, T, I, X, S]
$P(p, t, i, x, s) = P(p)\,P(t)\,P(i \mid p, t)\,P(x \mid i)\,P(s \mid t)$
• Compact & natural representation:
– nodes have ≤ k parents ⇒ 2^k n vs. 2^n params
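The factorization above can be checked numerically with a small sketch. Only the Lung Infiltrates table reuses numbers from the slides, and its row order and column meaning are assumptions; every other probability is made up for illustration.

```python
from itertools import product

# Only P_I reuses the slide's numbers (row order and "first column = True"
# are assumptions); all other probabilities are hypothetical.
P_P = 0.05                       # hypothetical P(pneumonia = True)
P_T = 0.01                       # hypothetical P(tuberculosis = True)
P_I = {(True, True): 0.8, (True, False): 0.6,
       (False, True): 0.2, (False, False): 0.01}   # P(i = True | p, t)
P_X = {True: 0.9, False: 0.1}    # hypothetical P(x = True | i)
P_S = {True: 0.7, False: 0.02}   # hypothetical P(s = True | t)

def bern(p_true, value):
    """Probability of a binary value given P(value = True)."""
    return p_true if value else 1.0 - p_true

def joint(p, t, i, x, s):
    """P(p, t, i, x, s) = P(p) P(t) P(i | p, t) P(x | i) P(s | t)."""
    return (bern(P_P, p) * bern(P_T, t) * bern(P_I[(p, t)], i)
            * bern(P_X[i], x) * bern(P_S[t], s))

# The factored joint sums to 1 over all 2^5 assignments.
total = sum(joint(*a) for a in product([True, False], repeat=5))

def n_params_bn(n, k):
    """Parameter count on the slide for n binary nodes with k parents each."""
    return n * 2 ** k

def n_params_full(n):
    """Parameter count on the slide for the unfactored joint."""
    return 2 ** n
```

With n = 5 and k = 2 the slide's bound gives 20 entries for the network against 32 for the full joint table; the gap widens quickly as n grows.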
Roadmap
• Background:
• Bayesian Networks (BNs)
» Probabilistic Relational Models (PRMs)
• Learning PRMs w/ Attribute Uncertainty
• PRMs w/ Structural Uncertainty
• PRMs w/ Class Hierarchies
Probabilistic Relational Models
• Combine advantages of relational logic & Bayesian
networks:
– natural domain modeling: objects, properties,
relations;
– generalization over a variety of situations;
– compact, natural probability models.
• Integrate uncertainty with relational model:
– properties of domain entities can depend on
properties of related entities;
– uncertainty over relational structure of domain.
Relational Schema
[Figure: schema with classes Strain, Patient and Contact, linked by the relationships "Infected with" and "Interacted with"]
Classes and attributes:
– Strain: Unique, Infectivity
– Patient: Homeless, HIV-Result, Ethnicity, Disease-Site
– Contact: Contact-Type, Close-Contact, Skin-Test, Age
• Describes the types of objects and relations in
the database
Probabilistic Relational Model
[Figure: dependency structure over Strain (Infectivity, Unique), Patient (POB, Homeless, HIV-Result, Disease Site) and Contact (Age, Contact-Type, Close-Contact, Transmitted)]
P(Cont.Transmitted | Cont.Close-Contact, Cont.Contactor.HIV):
H, C | P(T | H, C)
f, f | 0.9 0.1
f, t | 0.8 0.2
t, f | 0.7 0.3
t, t | 0.6 0.4
Relational Skeleton
[Figure: skeleton with Strain objects s1, s2; Patient objects p1, p2, p3; Contact objects c1, c2, c3]
Fixed relational skeleton σ
– set of objects in each class
– relations between them
Uncertainty over assignment of values to attributes
PRM defines distribution over instantiations of
attributes
A Portion of the BN
[Figure: ground network over P1.POB, P1.Homeless, P1.HIV-Result, P1.Disease Site, C1.Age, C1.Contact-Type, C1.Close-Contact, C1.Transmitted, C2.Age, C2.Contact-Type, C2.Close-Contact, C2.Transmitted; some nodes carry observed values (true, false)]
The same CPD is attached to C1.Transmitted and C2.Transmitted:
H, C | P(T | H, C)
f, f | 0.9 0.1
f, t | 0.8 0.2
t, f | 0.7 0.3
t, t | 0.6 0.4
PRM: Aggregate Dependencies
[Figure: Patient (POB, Homeless, HIV-Result, Age, Disease Site) and Contact (Contact-Type, Close-Contact, Age, Transmitted); the patient's Age depends on the mode of the contacts' Age]
Patient Jane Doe: POB = US, Homeless = no, HIV-Result = negative, Age = ???, Disease Site = pulmonary
Contact #5077: Contact-Type = coworker, Close-Contact = no, Age = middle-aged, Transmitted = false
Contact #5076: Contact-Type = spouse, Close-Contact = yes, Age = middle-aged, Transmitted = true
Contact #5075: Contact-Type = friend, Close-Contact = no, Age = middle-aged, Transmitted = false
P(A | mode):
mode | y | m | o
y | 0.4 | 0.4 | 0.2
m | 0.2 | 0.6 | 0.2
o | 0.1 | 0.3 | 0.6
Aggregates: sum, min, max,
avg, mode, count
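A minimal sketch of an aggregate dependency, using the Jane Doe example: the multiset of contact ages is collapsed to its mode, and the mode indexes a CPD for the patient's age. The table values are read off the slide, assuming rows are the mode and columns are the patient's age (y, m, o); the function names are hypothetical.

```python
from collections import Counter

def mode(values):
    """Most common value; ties go to the value seen first."""
    return Counter(values).most_common(1)[0][0]

# P(Patient.Age | mode of contacts' Age); row/column reading is an assumption.
AGE_GIVEN_MODE = {
    "y": {"y": 0.4, "m": 0.4, "o": 0.2},
    "m": {"y": 0.2, "m": 0.6, "o": 0.2},
    "o": {"y": 0.1, "m": 0.3, "o": 0.6},
}

# Ages of contacts #5077, #5076, #5075 (all middle-aged on the slide).
contact_ages = ["m", "m", "m"]

# The aggregate turns a variable-size parent set into a single parent value.
age_distribution = AGE_GIVEN_MODE[mode(contact_ages)]
```

The same pattern applies to the other aggregates listed on the slide (sum, min, max, avg, count): each maps a variable number of related values to one value that a fixed-size CPD can condition on.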
PRM with AU Semantics
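[Figure: PRM over Strain, Patient, Contact + relational skeleton σ (Strain s1, s2; Patient p1, p2, p3; Contact c1, c2, c3)]
PRM + relational skeleton σ = probability distribution over completions I:
$P(I \mid \sigma, S, \theta) = \prod_{x \in \sigma} \prod_{x.A} P(x.A \mid \mathrm{parents}_{S,\sigma}(x.A))$
where the first product ranges over the objects x in the skeleton and the second over the attributes x.A of each object.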
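The product over objects and attributes can be sketched for a tiny skeleton. The skeleton, the priors for HIV and Close-Contact, and the reading of the first CPD column as T = false are all assumptions made for illustration; only the four Transmitted rows reuse numbers from the slides.

```python
import math

# Hypothetical relational skeleton: two patients and the Contactor link
# of three contacts.
patients = ["p1", "p2"]
contactor = {"c1": "p1", "c2": "p1", "c3": "p2"}

# Class-level parameters shared by every object of the class.
P_HIV_TRUE = 0.1      # hypothetical prior
P_CLOSE_TRUE = 0.3    # hypothetical prior
# Keys are (H, C); values are taken from the slide's first column,
# assumed here to be P(T = false | H, C).
P_T_FALSE = {(False, False): 0.9, (False, True): 0.8,
             (True, False): 0.7, (True, True): 0.6}

def bern(p_true, value):
    return p_true if value else 1.0 - p_true

def log_prob(instance):
    """log P(I | sigma, S, theta): one term per object attribute."""
    lp = 0.0
    for p in patients:
        lp += math.log(bern(P_HIV_TRUE, instance[(p, "HIV")]))
    for c, p in contactor.items():
        close = instance[(c, "Close")]
        hiv = instance[(p, "HIV")]   # parent reached via Contactor.HIV
        lp += math.log(bern(P_CLOSE_TRUE, close))
        p_t_true = 1.0 - P_T_FALSE[(hiv, close)]
        lp += math.log(bern(p_t_true, instance[(c, "Transmitted")]))
    return lp
```

Because c1 and c2 share the contactor p1, their Transmitted attributes share the parent p1.HIV in the ground network, while all three contacts reuse the same class-level CPD.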
Next Time 2/18
• Structural Uncertainty
• Class Uncertainty
• PERs
• Background on Learning Graphical Models available in CS Library
– Learning Probabilistic Models of Link Structure, L. Getoor, N. Friedman,
D. Koller, B. Taskar. Journal of Machine Learning Research, 2002.
– http://www.cs.umd.edu/class/spring2005/cmsc828g/Readings/jmlr02.pdf
– PRMs with Class Hierarchies, chapter 5 of Learning Statistical Models
of Relational Data, Lise Getoor, PhD Thesis, Stanford University, 2001.
– http://www.cs.umd.edu/class/spring2005/cmsc828g/Readings/thesisch5.pdf
– Probabilistic Models for Relational Data, David Heckerman, Christopher
Meek and Daphne Koller
– ftp://ftp.research.microsoft.com/pub/tr/TR-2004-30.pdf
Today’s Outline 2/18
• Frame-based Probabilistic approaches
– Probabilistic Relational Models (PRMs)
• Learning PRMs
• PRMs w/ Structural Uncertainty
• PRMs w/ Class Hierarchies
– Probabilistic Entity Relation (PERs)
Learning PRMs w/ AU
[Figure: inputs to learning: a database and a relational schema, both over Strain, Patient and Contact]
• Parameter estimation
• Structure selection
Parameter Estimation in PRMs
• Assume known dependency structure S
• Goal: estimate PRM parameters θ
– entries in local probability models, θ_{x.A | parents(x.A)}
• θ is good if it is likely to generate the observed
data, instance I.
$l(\theta : I, S) = \log P(I \mid S, \theta)$
• MLE Principle: Choose θ* so as to maximize l
As in Bayesian network learning,
crucial property: decomposition into
separate terms for different X.A
ML Parameter Estimation
[Figure: Patient (HIV, DiseaseSite) and Contact (CloseContact, Transmitted)]
$\theta^{*} = \dfrac{N(C.T = f,\ P.H = f,\ C.C = t)}{N(P.H = f,\ C.C = t)}$
CPD to be estimated, P(Cont.Transmitted | Cont.Close-Contact, Cont.Contactor.HIV):
H, C | P(T | H, C)
f, f | ? ?
f, t | ? ?
t, f | ? ?
t, t | ? ?
Query for counts:
count over C.Transmitted, P.HIV, C.CloseContact
from the join of the Patient table and the Contact table
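The count query can be sketched as a join followed by two counters. The table contents and field names below are hypothetical; the estimate is the ratio of counts N(T, H, C) / N(H, C) described on the slide.

```python
from collections import Counter

# Hypothetical Patient and Contact tables.
patients = {"p1": {"HIV": False}, "p2": {"HIV": True}}
contacts = [
    {"contactor": "p1", "close": True, "transmitted": False},
    {"contactor": "p1", "close": True, "transmitted": False},
    {"contactor": "p1", "close": True, "transmitted": False},
    {"contactor": "p1", "close": True, "transmitted": True},
    {"contactor": "p2", "close": False, "transmitted": True},
]

def estimate_cpt(patients, contacts):
    """Maximum-likelihood estimate of P(T | H, C) from joined counts."""
    joint = Counter()    # N(T, H, C)
    parent = Counter()   # N(H, C)
    for c in contacts:
        h = patients[c["contactor"]]["HIV"]   # join Contact with Patient
        key = (h, c["close"])
        parent[key] += 1
        joint[(c["transmitted"],) + key] += 1
    return {k: n / parent[k[1:]] for k, n in joint.items()}

cpt = estimate_cpt(patients, contacts)
# cpt[(False, False, True)] is the estimate for T = f given H = f, C = t.
```

Parent configurations that never occur in the data get no entry here; a practical estimator would smooth the counts or fall back to a prior.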
Structure Selection
• Idea:
– define scoring function
– do local search over legal structures
• Key Components:
– legal models
– scoring models
– searching model space
Structure Selection
• Idea:
– define scoring function
– do local search over legal structures
• Key Components:
» legal models
– scoring models
– searching model space
Legal Models
• PRM defines a coherent probability model over
a skeleton σ if the dependencies between
object attributes are acyclic
[Figure: Researcher Prof. Gump (Reputation: high) is linked by author-of to Papers P1 and P2 (Accepted: yes for both); a sum aggregate connects the papers' Accepted values and the researcher's Reputation]
How do we guarantee that a PRM is acyclic
for every skeleton?
Attribute Stratification
PRM dependency structure S ⇒ dependency graph
Edge Paper.Accepted → Researcher.Reputation
if Researcher.Reputation
depends directly on Paper.Accepted
Attribute stratification:
dependency graph acyclic ⇒ acyclic for any σ
Algorithm more flexible; allows certain
cycles along guaranteed acyclic relations
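The basic stratification test is a cycle check on the class-level dependency graph. A minimal sketch with depth-first search follows; it covers only the simple case and not the more flexible variant that allows cycles along guaranteed acyclic relations. The edge list format is an assumption.

```python
def is_acyclic(edges):
    """True if the directed graph given as (parent, child) pairs has no cycle."""
    graph = {}
    for parent, child in edges:
        graph.setdefault(parent, []).append(child)
        graph.setdefault(child, [])
    WHITE, GRAY, BLACK = 0, 1, 2          # unvisited, on stack, finished
    color = {node: WHITE for node in graph}

    def visit(node):
        color[node] = GRAY
        for nxt in graph[node]:
            if color[nxt] == GRAY:        # back edge: cycle found
                return False
            if color[nxt] == WHITE and not visit(nxt):
                return False
        color[node] = BLACK
        return True

    return all(visit(n) for n in graph if color[n] == WHITE)

# Class-level dependency graph from the slide's example.
legal = [("Paper.Accepted", "Researcher.Reputation")]
illegal = legal + [("Researcher.Reputation", "Paper.Accepted")]
```

If the class-level graph passes this test, every ground network built from it is acyclic regardless of the skeleton, which is why the check can be done once at the class level.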
Structure Selection
• Idea:
– define scoring function
– do local search over legal structures
• Key Components:
– legal models
» scoring models
– searching model space
Scoring Models
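• Bayesian approach:
$\mathrm{Score}(S : I) = \log P(S \mid I) = \log[P(I \mid S)\,P(S)] - \log P(I)$
where P(I | S) is the marginal likelihood and P(S) is the prior; log P(I) does not depend on S, so it can be dropped when comparing structures
• Standard approach to scoring models;
used in Bayesian network learning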
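A sketch of how such a score is commonly computed for multinomial CPDs, assuming a symmetric Dirichlet prior on each CPD row. The slides do not state the prior, so both the prior and the `alpha` hyperparameter are assumptions; with them, the marginal likelihood has a closed form in terms of counts.

```python
from math import lgamma, log

def family_log_marginal(counts, alpha=1.0):
    """Log marginal likelihood of one attribute given its parents.

    counts maps each parent configuration to a list of counts, one per
    value of the child attribute. alpha is the (assumed) symmetric
    Dirichlet hyperparameter for every cell.
    """
    total = 0.0
    for n_jk in counts.values():
        a_j = alpha * len(n_jk)
        total += lgamma(a_j) - lgamma(a_j + sum(n_jk))
        total += sum(lgamma(alpha + n) - lgamma(alpha) for n in n_jk)
    return total

def score(families, log_prior=0.0):
    """log P(I | S) + log P(S), with P(I | S) decomposed by family."""
    return sum(family_log_marginal(c) for c in families) + log_prior
```

The score decomposes into one term per attribute, so a local change to the structure only requires rescoring the families it touches.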
Structure Selection
• Idea:
– define scoring function
– do local search over legal structures
• Key Components:
– legal models
– scoring models
» searching model space
Searching Model Space
Phase 0: consider only dependencies within a class
[Figure: Strain, Patient and Contact, with candidate dependencies inside each class]
Phased Structure Search
Phase 1: consider dependencies from “neighboring”
classes, via schema relations
[Figure: Strain, Patient and Contact, with candidate dependencies between neighboring classes]
Phased Structure Search
Phase 2: consider dependencies from “further”
classes, via relation chains
[Figure: Strain, Patient and Contact, with candidate dependencies along longer relation chains]
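The phases can be read as widening the set of classes whose attributes are candidate parents. A minimal sketch, assuming the schema is stored as an undirected adjacency map between classes (a hypothetical representation):

```python
# Schema relations from the TB example: Patient is linked to Strain
# ("Infected with") and Contact is linked to Patient ("Interacted with").
schema = {
    "Strain": {"Patient"},
    "Patient": {"Strain", "Contact"},
    "Contact": {"Patient"},
}

def classes_within(schema, start, phase):
    """Classes reachable from start by relation chains of length <= phase."""
    reached = {start}
    frontier = {start}
    for _ in range(phase):
        frontier = {n for c in frontier for n in schema[c]} - reached
        reached |= frontier
    return reached
```

Phase 0 yields only the class itself, phase 1 adds its schema neighbors, and phase 2 adds classes two relations away; a local search would propose parents only from the classes returned for the current phase.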
Experimental Evaluation
Synthetic Data
• Simple ‘genetic’ domain
• Construct training set of various sizes
• Compare the log-likelihood of test set of
size 100,000
– ‘gold’ standard model
– Learn parameters (model structure given)
– Learn model (learn both structure and
parameters)
[Figure: genetic domain. Person (M-chromosome, P-chromosome, Blood Type) with links to a Father Person and a Mother Person; Blood Test (Contaminated, Result)]
Error on Test Set
[Plot: avg log-likelihood (0 to -3) vs. dataset size (0 to 4000) for Gold, Learned Parameters, Learned Models]
Error Variance
[Plot: avg error (0 to 2.5) vs. dataset size (0 to 4000) for Learned Parameters, Learned Models]
Errors in Learned Structure
[Plot: number of learned models (0 to 12) that are too simple, correct, or too complex vs. dataset size (500 to 4300)]
TB Cases in SF
Patient (2300)
Contact (20000)
Ethnicity
Contact-type
Homeless
Age
Age @ diagnosis
Care
HIV result
Infected
Disease-site
X-ray
Strain (1000)
Unique
Drug-Resistance
TB PRM
[Figure: learned PRM over Strain, Patient, Contact and Subcase; attribute labels in the figure: infectivity, # infected, % infected, hh_oohh, contype, closecont, contage, ethnic, homeless, hivres, care, xray, pob, result, ageatdx, smrpos, disease site, # contacts, transmitted, gender]
SEC PRM
[Figure: learned PRM over Person, Company, Role and PrevRole; sizes shown in the figure: 40,000, 20,000, 120,000; attribute labels: rtn assets, rtn earn assets, total_assets, # employees, age, retired, fired, salary, # roles, top_role]
Your turn…
• Describe your focus problem
• What would a PRM for (an aspect of) your
focus problem look like?
Roadmap
• Motivation and Background
• PRMs w/ Attribute Uncertainty
» PRMs w/ Structural Uncertainty
• PRMs w/ Class Hierarchies
An Example
[Figure: a Scientific Paper whose Topic (Theory or AI) is to be predicted, with links to Theory papers and to other objects such as Cornell and Agent]
• Attributes of object
• Attributes of linked objects
• Attributes of heterogeneous linked objects
• Collective Classification
Structural Uncertainty
• Motivation: relational structure provides
useful information for density estimation
and prediction
• Construct probabilistic models of relational
structure that capture structural
uncertainty
• Two new mechanisms:
– Reference uncertainty
– Existence uncertainty
PRMs w/ AU: another example
[Figure: Person (Gender, Age, Income), Movie (Genre), Vote (Rank)]
PRM consists of:
Relational Schema
Dependency Structure
Local Probability Models
P(Vote.Rank | Vote.Movie.Genre, Vote.Person.Gender, Vote.Person.Age)
PRM w/ Attribute Uncertainty
[Figure: Movie m1, Movie m2, Person p1, Person p2 (primary keys); Vote v1 (Movie: m1, Person: p1), Vote v2 (Movie: m1, Person: p2), Vote v3 (Movie: m2, Person: p2) (foreign keys)]
Fixed relational skeleton σ:
– set of objects in each class
– relations between them
Uncertainty over assignment of values to
attributes
PRM w/ AU Semantics
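[Figure: PRM over Person, Movie and Vote + relational skeleton σ]
PRM + relational skeleton σ = ground BN defining distribution over
complete instantiations of attributes I:
$P(I \mid \sigma, S, \theta) = \prod_{x \in \sigma} \prod_{x.A} P(x.A \mid \mathrm{parents}_{S,\sigma}(x.A))$
where the first product ranges over the objects x in the skeleton and the second over the attributes x.A of each object.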
Issue
• PRM w/ AU applicable only in domains
where we have full knowledge of the
relational structure
Next we introduce PRMs which allow
uncertainty over relational structure…
PRMs w/ Structural Uncertainty
Advantages:
– Applicable in cases where we do not have full
knowledge of relational structure
– Incorporating uncertainty over relational structure
into probabilistic model can improve predictive
accuracy
Two approaches:
– Reference uncertainty
– Existence uncertainty
• Different probabilistic models; varying amount
of background knowledge required for each
Citation Relational Schema
Author
Institution
Research Area
Wrote
Paper
Paper
Topic
Word1
Word2
…
WordN
Cites
Citing
Paper
Count
Cited
Paper
Topic
Word1
Word2
…
WordN
Attribute Uncertainty
Author
Institution
P( Institution |
Research Area)
Research Area
Wrote
P( Topic |
Paper.Author.Research Area)
Paper
Topic
P( WordN | Topic)
Word1
...
WordN
Reference Uncertainty
Bibliography
1. ----- ?
`
2. ----- ?
3. ----- ?
Scientific Paper
Document Collection
PRM w/ Reference Uncertainty
Paper
Topic
Words
Paper
Cites
Citing
Cited
Topic
Words
Dependency model for foreign keys
Naïve Approach: multinomial over primary key
• noncompact
• limits ability to generalize
Reference Uncertainty Example
[Figure: papers P1 to P5, each with a Topic (AI or Theory), grouped into two partitions defined by Paper.Topic = AI and Paper.Topic = Theory; Paper (Topic, Words) and Cites (Citing, Cited)]
Distribution over the partition (P1 or P2) from which the cited paper is drawn:
without parents: P1 0.3, P2 0.7
given the citing paper's Topic:
Topic | P1 | P2
Theory | 0.1 | 0.9
AI | 0.99 | 0.01
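A sketch of the partition-based alternative to a multinomial over primary keys: first choose a partition given the citing paper's topic, then choose uniformly inside it. The paper-to-topic assignment and the mapping of the slide's two table columns to the AI and Theory partitions are assumptions.

```python
# Hypothetical topics for five papers.
papers = {"P1": "Theory", "P2": "Theory", "P3": "AI", "P4": "AI", "P5": "AI"}

# P(partition of cited paper | topic of citing paper); numbers from the
# slide's table, with the column-to-partition mapping assumed.
selector = {
    "Theory": {"AI": 0.1, "Theory": 0.9},
    "AI": {"AI": 0.99, "Theory": 0.01},
}

def cited_distribution(citing_topic):
    """P(Cited = paper | Citing.Topic): pick a partition, then uniform inside."""
    dist = {}
    for part, p_part in selector[citing_topic].items():
        members = [p for p, topic in papers.items() if topic == part]
        for m in members:
            dist[m] = p_part / len(members)
    return dist
```

The model has one parameter per partition rather than one per paper, so it stays compact and carries over to collections with a different set of papers.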
PRMs w/ RU Semantics
[Figure: PRM-RU over Paper (Topic, Words) and Cites (Citing, Cited) + entity skeleton σ with papers P1 to P5 and Cites objects whose references are unknown (???)]
PRM-RU + entity skeleton σ
⇒ probability distribution over full instantiations I
Structure Search: New Operators
[Figure: the set of candidate cited papers split into partitions, using Paper.Topic = AI and Paper.Author.Institution = MIT; Paper (Topic, Words), Cites (Citing, Cited), Author (Institution)]
PRMs w/ RU Summary
• Define semantics for uncertainty over
foreign-key values
• Search now includes operators Refine and
Abstract for constructing foreign-key
dependency model
• Provides one simple mechanism for link
uncertainty
Existence Uncertainty
??
?
Document Collection
Document Collection
PRM w/ Exists Uncertainty
Paper
Paper
Topic
Words
Topic
Words
Cites
Exists
Dependency model for existence of relationship
Exists Uncertainty Example
[Figure: Paper (Topic, Words) on both sides of Cites (Exists)]
P(Exists | Citer.Topic, Cited.Topic):
Citer.Topic | Cited.Topic | False | True
Theory | Theory | 0.995 | 0.005
Theory | AI | 0.999 | 0.001
AI | Theory | 0.997 | 0.003
AI | AI | 0.993 | 0.008
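A sketch of how an Exists CPD scores a citation graph: every ordered pair of papers is a potential link, and each contributes either P(Exists = true) or its complement. The True-column values are read off the slide with the decimal point restored and the row order assumed; the False probability is derived as 1 - P(true) in this sketch.

```python
import math
from itertools import permutations

# P(Exists = true | Citer.Topic, Cited.Topic); row order is an assumption.
P_EXISTS = {
    ("Theory", "Theory"): 0.005,
    ("Theory", "AI"): 0.001,
    ("AI", "Theory"): 0.003,
    ("AI", "AI"): 0.008,
}

def log_likelihood(topics, citations):
    """Log probability of a set of (citer, cited) links given paper topics."""
    ll = 0.0
    for citer, cited in permutations(topics, 2):
        p = P_EXISTS[(topics[citer], topics[cited])]
        ll += math.log(p if (citer, cited) in citations else 1.0 - p)
    return ll

topics = {"a": "AI", "b": "AI", "c": "Theory"}
ll = log_likelihood(topics, {("a", "b")})
```

This loop visits every potential link, which is quadratic in the number of papers; it is written this way only to make the semantics explicit.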
PRMs w/ EU Semantics
[Figure: PRM-EU over Paper (Topic, Words) and Cites (Exists) + object skeleton σ with papers P1 to P5, where the existence of each potential citation is unknown (???)]
PRM-EU + object skeleton σ
⇒ probability distribution over full instantiations I
Learning PRMs w/ EU
• Idea: just like in PRMs w/ AU
– define scoring function
– do greedy local structure search
• Issues:
– efficiency
• Computation of sufficient statistics
for exists attribute
• Do not explicitly consider relations
that do not exist
Experiment I: EachMovie+
[Figure: EachMovie+ schema with classes MOVIE, ACTOR, ROLE, PERSON and VOTE; sizes shown in the figure: 35,000, 50,000, 1600, 300,000, 25,000]
– MOVIE: action, animation, art_foreign, classic, thriller, comedy, family, horror, drama, romance, theater_status, video_status
– ACTOR: gender
– ROLE: Movie, Actor
– PERSON: age, gender, education, personal_income, household_income
– VOTE: Movie, Person, rank
* © 1999-2000 Internet Movie Database Limited
† http://www.research.digital.com/SRC/EachMovie
EachMovie+ PRM-RU
[Figure: learned PRM-RU over MOVIE, ACTOR, ROLE, PERSON and VOTE, with a CPD over gender (M 0.8 / F 0.2 and M 0.7 / F 0.3) conditioned on Action (true, false)]
Typical Voter: male, young adult,
college w/o degree, middle income
EachMovie+ PRM-EU
[Figure: learned PRM-EU over MOVIE, ACTOR, ROLE, PERSON and VOTE, with exists attributes on ROLE and VOTE]
Men much more likely to vote on
action movies
Experiment II: Prediction
[Figure: paper P506 with words w1 ... wN and unknown Topic (??); citing papers P134 and P1067 (Topic: Reinforcement Learning); cited papers P516 and P289 (Topic: Reinforcement Learning) and P1309 (Topic: Probabilistic Reasoning)]
Domains
Cora Dataset, McCallum et al.
[Figure: Paper (Topic, w1 ... wN) as cited paper and as citing paper, linked by Cites (Exists)]
WebKB, Craven et al.
[Figure: Web Page (Category, w1 ... wN) as From Page and as To Page, linked by Link (Exists)]
Prediction Accuracy
[Bar chart: accuracy (0.65 to 0.9) on Cora and WebKB for Naive-Bayes, RU Citing, RU Cited and Exists; data labels in extraction order: 0.75, 0.74, 0.81, 0.78, 0.79, 0.77, 0.85, 0.82]
Experiment III: Collective
Classification
[Figure: unrolled network for Author#1, Author#2 (Area, Inst) and Paper#1, Paper#2, Paper#3 (Topic, Word1 ... WordN), with an Exists node for each ordered pair of papers: #1-#2, #1-#3, #2-#1, #2-#3, #3-#1, #3-#2]
Inference in Unrolled BN
• Prediction requires inference in “unrolled” network
– Infeasible for large networks
– Use approximate inference for E-step
• Loopy belief propagation (Pearl, 88; McEliece, 98)
– Scales linearly with size of network
– Guaranteed to converge only for polytrees
– Empirically, often converges in general nets (Murphy,99)
• Local message passing
– Belief messages transferred between related instances
– Induces a natural “influence” propagation behavior
• Instances give information about related instances
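A minimal sketch of sum-product message passing on a pairwise model over discrete variables. It is not the course's implementation: the unrolled BN is assumed to have been converted into unary and pairwise potentials, and the potentials used in the usage note are made up. On a tree the result is exact; on graphs with loops the same updates give an approximation.

```python
def loopy_bp(unary, pairwise, iters=50):
    """Approximate marginals by synchronous sum-product updates.

    unary: {node: [phi(x)]}; pairwise: {(i, j): [[psi(x_i, x_j)]]},
    one entry per undirected edge.
    """
    nbrs = {n: set() for n in unary}
    for i, j in pairwise:
        nbrs[i].add(j)
        nbrs[j].add(i)

    def psi(i, j, xi, xj):
        if (i, j) in pairwise:
            return pairwise[(i, j)][xi][xj]
        return pairwise[(j, i)][xj][xi]

    # msgs[(i, j)] is the message from i to j, a vector over values of j.
    msgs = {(i, j): [1.0] * len(unary[j]) for i in nbrs for j in nbrs[i]}
    for _ in range(iters):
        new = {}
        for (i, j) in msgs:
            out = []
            for xj in range(len(unary[j])):
                total = 0.0
                for xi in range(len(unary[i])):
                    incoming = 1.0
                    for k in nbrs[i] - {j}:
                        incoming *= msgs[(k, i)][xi]
                    total += unary[i][xi] * psi(i, j, xi, xj) * incoming
                out.append(total)
            z = sum(out)
            new[(i, j)] = [v / z for v in out]
        msgs = new

    beliefs = {}
    for n in unary:
        b = list(unary[n])
        for k in nbrs[n]:
            b = [bv * mv for bv, mv in zip(b, msgs[(k, n)])]
        z = sum(b)
        beliefs[n] = [v / z for v in b]
    return beliefs
```

For example, `loopy_bp({"A": [0.6, 0.4], "B": [0.5, 0.5], "C": [0.2, 0.8]}, {("A", "B"): [[0.9, 0.1], [0.1, 0.9]], ("B", "C"): [[0.7, 0.3], [0.3, 0.7]]})` runs on a three-node chain, where the beliefs match the exact marginals. Each update touches only a node's neighbors, which is the local message passing between related instances described above.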
Web Domain
[Figure: web domain model with From-Page and To-Page (Category, Hub, Word1 ... WordN), Link (From, To, Exists) and Anchor (Has Word)]
WebKB Results*
[Bar chart: accuracy (0.54 to 0.7) by School (cornell, texas, wisconsin, washington) for Naive-Bayes, Exists and Ex+Hubs+Anchors]
* from “Probabilistic Models of Text and Link Structure for Hypertext
Classification”, Getoor, Segal, Taskar and Koller in IJCAI 01 Workshop
Text Learning: Beyond Classification
Roadmap
• Motivation and Background
• PRMs w/ Attribute Uncertainty
• PRMs w/ Structural Uncertainty
» PRMs w/ Class Hierarchies
From Instances to Classes in
Probabilistic Relational Models
• Compare two approaches
– Probabilistic Relational Models (PRMs)
– Bayesian Networks (BNs)
• PRMs with Class Hierarchies (PRM-CH)
– bridge gap between BNs and PRMs
• Learning PRM-CHs
– hierarchy supplied
– discovering hierarchy
PRM for Collaborative Filtering
[Figure: TV-Program (Genre, Budget, Time-slot, Network), Person (Age, Gender, Education, Income), Vote (Program, Voter, Ranking)]
Relational Schema
+ Dependency Model
G, E | l | m | h
doc, hs | 0.5 | 0.4 | 0.1
doc, bs | 0.1 | 0.5 | 0.4
sitcom, hs | 0.1 | 0.4 | 0.5
sitcom, bs | 0.3 | 0.6 | 0.1
BN for Collaborative filtering
Law & Order
Frasier
Beverly Hills 90210
Mad about you
NBC Monday
Night Movies
Breese, et al. UAI-98
Seinfeld
Models Inc.
Melrose Place
Friends
Limitations of PRMs
• In PRM, all instances of the same class
must use the same dependency model;
it cannot distinguish:
– documentaries and sitcoms
– “60 Minutes” and Seinfeld
• PRM cannot have dependencies that are
“cyclic”
– ranking for Frasier depends on ranking for Friends
Limitations of BNs
• In BN, each instance has its own dependency
model, cannot generalize over instances
– If John tends to like sitcoms, he will probably like
next season’s offerings
– whether a person enjoys sitcom reruns depends
on whether they watch primetime sitcoms
• BN can only model relationships between at
most one class of instances at a time
– In previous model, cannot model relationships
between people
– if my roommate watches Seinfeld I am more
likely to join in
Desired Model
Allows both class and instance dependencies
[Figure: TV-Program (Genre, Budget, Time-slot, Network) with subclasses Soap and Documentary, each with its own Genre, Budget, Time-slot and Network; Vote (Program, Voter, Ranking) with subclasses Sitcom-Vote and Doc-Vote; Person (Age, Gender, Education, Income); instance label WWWF]
PRMs w/ Class Hierarchies
Allows us to:
• Refine a “heterogeneous” class into more
coherent subclasses
• Refine probabilistic model along class
hierarchy
– Can specialize/inherit CPDs
– Construct new dependencies that were
originally “cyclic”
Provides bridge from class-based model
to instance-based model
PRM-CH
Relational Schema
[Figure: TV-Program (Genre, Budget, Time-slot, Network), Person (Age, Gender, Education, Income), Vote (Program, Voter, Ranking)]
Class Hierarchy
– TV-Program
  – SitCom
  – Drama
    – Legal-Drama
    – Medical-Drama
    – SoapOpera
  – Documentary
Dependency Model
[Figure: one Budget CPD per class: Budget_TV-Program, Budget_SitCom, Budget_Drama, Budget_Documentary, Budget_Legal-Drama, Budget_Medical-Drama, Budget_SoapOpera]
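Specialize/inherit can be sketched as a lookup that walks up the class hierarchy until it finds a class that defines its own CPD. The parent map assumes that Legal-Drama, Medical-Drama and SoapOpera sit under Drama; the CPD values are placeholder strings, and `cpd_for` is a hypothetical helper.

```python
# Assumed class hierarchy: child class -> parent class.
PARENT = {
    "SitCom": "TV-Program",
    "Drama": "TV-Program",
    "Documentary": "TV-Program",
    "Legal-Drama": "Drama",
    "Medical-Drama": "Drama",
    "SoapOpera": "Drama",
}

def cpd_for(cls, attr, cpds):
    """Return the most specific CPD for attr, inheriting from superclasses."""
    while cls is not None:
        if (cls, attr) in cpds:
            return cpds[(cls, attr)]
        cls = PARENT.get(cls)   # None once the root has been passed
    raise KeyError(attr)

# Drama specializes Budget; every other class inherits the root's CPD.
cpds = {("TV-Program", "Budget"): "root CPD", ("Drama", "Budget"): "drama CPD"}
```

Here `cpd_for("Legal-Drama", "Budget", cpds)` returns the Drama CPD and `cpd_for("SitCom", "Budget", cpds)` falls back to the root; adding an entry for a subclass is the specialize step.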
Learning PRM-CHs
[Figure: inputs to learning: Database: Instance I and Relational Schema, both over Vote, TVProgram and Person]
• Class hierarchy provided
• Learn class hierarchy
Structure Selection
PRM w/ CHs
• Idea:
– define scoring function
– do phased local search over legal
structures
• Key Components:
– scoring models: unchanged
– searching model space: new operators
Learning PRM-CH
• Scenario 1: Class hierarchy is provided
• New Operators
– Specialize/Inherit
[Figure: Budget CPDs along the hierarchy: Budget_TV-Program, Budget_SitCom, Budget_Drama, Budget_Documentary, Budget_Legal-Drama, Budget_Medical-Drama, Budget_SoapOpera]
Learning Class Hierarchy
• Issue: partially observable data set
• Construct decision tree for class defined over
attributes observed in training set
• New operator
– Split on class attribute
– Related class attribute
[Figure: decision tree with a split on TV-Program.Genre (documentary, drama, sitcom) and a split on TV-.Network.Nationality (English, French, American); leaves class1 to class6]
EachMovie+ PRM
1400 Movies
5000 People
240,000 Votes
[Figure: MOVIE (Theater Status, Video Status, Classic, Romance, Action, Art/Foreign, Comedy, Animation, Family, Drama, Horror, Thriller), VOTE (Rating), PERSON (Age, Gender, Household Income, Personal Income, Education)]
http://www.research.digital.com/SRC/EachMovie
PRM-CH
[Figure: learned PRM-CH with MOVIE subclasses ACTION-MOVIE, COMEDY-MOVIE, ROMANCE-MOVIE and OTHER-MOVIE (each with its own genre and status attributes), VOTE subclasses ACTION-VOTE, COMEDY-VOTE, ROMANCE-VOTE and OTHER-VOTE (each with its own Rating), and PERSON (Age, Gender, Household Income, Personal Income, Education)]
Comparison
• 5 Test Sets: 1000 votes, ~100 movies, ~115
people
– PRM Mean LL: -12,079, std 475.68
– PRM-CH Mean LL: -10,558, std 433.10
• Using standard t-test, PRM-CH model
outperforms PRM model with over 99%
confidence
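The slide does not say which t-test was used. As a rough check, the sketch below computes an unpaired (Welch) t statistic from the reported summary values, assuming the standard deviations are sample standard deviations over the 5 test sets; a paired test on the per-set differences would need the raw scores.

```python
import math

def welch_t(mean_a, std_a, mean_b, std_b, n):
    """Unpaired t statistic for two samples of equal size n."""
    return (mean_a - mean_b) / math.sqrt(std_a ** 2 / n + std_b ** 2 / n)

# PRM-CH vs. PRM mean log-likelihood over 5 test sets (values from the slide).
t = welch_t(-10558, 433.10, -12079, 475.68, 5)
```

Under these assumptions the statistic comes out a little above 5, in line with the direction of the slide's conclusion.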
PRM-CH Summary
• PRMs with class hierarchies are a natural
extension of PRMs:
– Specialization/Inheritance of CPDs
– Allows new dependency structures
• Provide bridge from class-based to
instance-based models
• Learning techniques proposed
– Need efficient heuristics
– Empirical validation on real-world domains
Roadmap
• Motivation and Background
• PRMs w/ Attribute Uncertainty
• PRMs w/ Structural Uncertainty
• PRMs w/ Class Hierarchies
Next Time 2/25
• Focus Problems
– Please add your focus problem to the class wiki
• Give a PRM for the problem
• Give a PER for the problem
• Give at least one of the logical-based methods (BLP, LPRM, LBN)
– For each representation, discuss some modeling issue, or some
novelty you used – e.g. structural uncertainty, constraints, etc.
• Readings for next three weeks
– 2/28 – Logic-based approaches
– 3/4 – Advanced Logic-based approaches
– 3/11 – Undirected Models
• Please sign up to lead the discussion for one of the papers
2/28 – 3/11
• Please post your comments on each paper on the wiki by
midnight on the Wednesday before the class in which it is
assigned to be discussed. This gives the discussion
leader some time to synthesize the comments.