Turning Probabilistic
Reasoning into Programming
Avi Pfeffer
Harvard University
Uncertainty
Uncertainty is ubiquitous
Partial information
Noisy sensors
Non-deterministic actions
Exogenous events
Reasoning under uncertainty is a central challenge for building intelligent systems
Probability
Probability provides a mathematically sound basis for dealing with uncertainty
Combined with utilities, provides a basis for decision-making under uncertainty
Probabilistic Reasoning
Representation: creating a probabilistic model of the world
Inference: conditioning the model on observations and computing probabilities of interest
Learning: estimating the model from training data
The Challenge
How do we build probabilistic models of large, complex systems that are
easy to construct and understand
support efficient inference
can be learned from data
(The Programming Challenge)
How do we build programs for interesting problems that are
easy to construct and maintain
do the right thing
run efficiently
Lots of Representations
Plethora of existing models
Bayesian networks, hidden Markov models, stochastic context-free grammars, etc.
Lots of new models
Object-oriented Bayesian networks, probabilistic relational models, etc.
Goal
A probabilistic representation language that
captures many existing models
allows many new models
provides programming-language-like solutions to building and maintaining models
IBAL
A high-level “probabilistic programming” language for representing
Probabilistic models
Decision problems
Bayesian learning
Implemented and publicly available
Outline
Motivation
The IBAL Language
Inference Goals
Probabilistic Inference Algorithm
Lessons Learned
Stochastic Experiments
A programming language expression describes a process that generates a value
An IBAL expression describes a process that stochastically generates a value
Meaning of expression is probability distribution over generated value
Evaluating an expression = computing the probability distribution
Simple expressions
Constants
Variables
Conditionals
Stochastic Choice

x = 'hello
y = x
z = if x == 'bye then 1 else 2
w = dist [ 0.4: 'hello, 0.6: 'world ]
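To make these semantics concrete, here is a minimal Python sketch (a stand-in, not IBAL itself) in which a stochastic choice evaluates directly to a distribution over values:

    # Evaluate a stochastic choice to its distribution (a dict value -> prob).
    def dist(choices):
        d = {}
        for p, v in choices:
            d[v] = d.get(v, 0.0) + p  # merge duplicate values
        return d

    w = dist([(0.4, 'hello'), (0.6, 'world')])
    # w == {'hello': 0.4, 'world': 0.6}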
Functions

fair() = dist [0.5: 'heads, 0.5: 'tails]
x = fair()
y = fair()

x and y are independent tosses of a fair coin
Higher-order Functions

fair() = dist [0.5: 'heads, 0.5: 'tails]
biased() = dist [0.9: 'heads, 0.1: 'tails]
pick() = dist [0.5: fair, 0.5: biased]
coin = pick()
x = coin()
y = coin()

x and y are conditionally independent given coin
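A Python sketch of the same program, treating functions as first-class sampled values (random.random is the only primitive assumed):

    import random

    def fair():
        return 'heads' if random.random() < 0.5 else 'tails'

    def biased():
        return 'heads' if random.random() < 0.9 else 'tails'

    def pick():
        # a stochastic choice whose value is itself a function
        return fair if random.random() < 0.5 else biased

    coin = pick()  # sample which coin we have
    x = coin()     # both tosses use the same (unknown) coin,
    y = coin()     # so x and y are dependent until coin is observed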
Data Structures and Types
IBAL provides a rich type system
tuples and records
algebraic data types
IBAL is strongly typed
automatic ML-style type inference
Bayesian Networks

[Figure: Bayesian network with nodes Smart, Diligent, Understands, Good Test Taker, Exam Grade, HW Grade]

P(U | S, D):
S   D   | u     ¬u
s   d   | 0.9   0.1
s   ¬d  | 0.6   0.4
¬s  d   | 0.3   0.7
¬s  ¬d  | 0.01  0.99

nodes = domain variables
edges = direct causal influence
Network structure encodes conditional independencies:
I( HW-Grade, Smart | Understands )
BNs in IBAL

smart = flip 0.8
diligent = flip 0.4
understands =
  case <smart, diligent> of
    # <true,true> : flip 0.9
    # <true,false> : flip 0.6
    …
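Read as a Python sampling program (the two case branches elided on the slide are filled in from the CPT above, so treat those numbers as reconstructed):

    import random

    def flip(p):
        return random.random() < p

    smart = flip(0.8)
    diligent = flip(0.4)
    if smart and diligent:
        understands = flip(0.9)
    elif smart:
        understands = flip(0.6)
    elif diligent:
        understands = flip(0.3)    # reconstructed from the CPT above
    else:
        understands = flip(0.01)   # reconstructed from the CPT above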
First-Order HMMs

[Figure: chain of hidden states H_1, H_2, ..., H_{t-1}, H_t, each emitting an observation O_1, O_2, ..., O_{t-1}, O_t]

Initial distribution P(H_1)
Transition model P(H_i | H_{i-1})
Observation model P(O_i | H_i)

What if the hidden state is an arbitrary data structure?
HMMs in IBAL

init : () -> state
trans : state -> state
obs : state -> obsrv

sequence(current) = {
  state = current
  observation = obs(state)
  future = sequence(trans(state))
}
hmm() = sequence(init())
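A Python sketch of the same process, with placeholder init/trans/obs models; a generator stands in for IBAL's lazy recursion, so the conceptually infinite sequence is unrolled only as far as a query demands:

    import random

    def init():                 # placeholder initial-state model
        return 0

    def trans(state):           # placeholder transition model
        return state + random.choice([-1, 1])

    def obs(state):             # placeholder observation model
        return state + random.gauss(0.0, 1.0)

    def sequence(current):
        state = current
        while True:             # infinite, but lazily unrolled
            yield state, obs(state)
            state = trans(state)

    hmm = sequence(init())
    first_three = [next(hmm) for _ in range(3)]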
SCFGs
S -> AB (0.6)
S -> BA (0.4)
A -> a (0.7)
A -> AA (0.3)
B -> b (0.8)
B -> BB (0.2)
Non-terminals are data generating functions
SCFGs in IBAL

append(x,y) =
  if null(x) then y
  else cons(first(x), append(rest(x), y))
production(x,y) = append(x(), y())
terminal(x) = cons(x, nil)
s() = dist [0.6: production(a,b), 0.4: production(b,a)]
a() = dist [0.7: terminal('a), …
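The same grammar as a Python sampler, with each non-terminal a function returning a list of terminal symbols (the helper choose is illustrative, not IBAL's dist):

    import random

    def choose(choices):
        # choices: list of (probability, thunk); sample one branch
        r, acc = random.random(), 0.0
        for p, thunk in choices:
            acc += p
            if r < acc:
                return thunk()
        return choices[-1][1]()  # guard against rounding

    def a():
        return choose([(0.7, lambda: ['a']),
                       (0.3, lambda: a() + a())])

    def b():
        return choose([(0.8, lambda: ['b']),
                       (0.2, lambda: b() + b())])

    def s():
        return choose([(0.6, lambda: a() + b()),
                       (0.4, lambda: b() + a())])

    # s() might return ['a', 'b'], or ['b', 'b', 'a'], etc.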
Probabilistic Relational Models
[Figure: PRM schema with classes Actor (attribute Gender), Movie (attribute Genre), and Appearance (attribute Role-Type), where Role-Type depends on Actor.Gender and Movie.Genre. Example instances: actor Chaplin, movie Modern Times]
PRMs in IBAL

movie() = { genre = dist ... }
actor() = { gender = dist ... }
appearance(a,m) = {
  role_type = case (a.gender, m.genre) of
    (male, western) : dist ...
}
mod_times = movie()
chaplin = actor()
a1 = appearance(chaplin, mod_times)
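A Python sketch of the same skeleton; the slide elides the dist ... bodies, so the concrete value sets and probabilities below are stand-ins:

    import random

    def movie():
        return {'genre': random.choice(['western', 'comedy'])}   # stand-in dist

    def actor():
        return {'gender': random.choice(['male', 'female'])}     # stand-in dist

    def appearance(a, m):
        if (a['gender'], m['genre']) == ('male', 'western'):
            role_type = random.choice(['cowboy', 'villain'])     # stand-in dist
        else:
            role_type = random.choice(['lead', 'supporting'])    # stand-in dist
        return {'role_type': role_type}

    mod_times = movie()
    chaplin = actor()
    a1 = appearance(chaplin, mod_times)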
Other IBAL Features
Observations can be inserted into programs
they condition the probability distribution over values
Probabilities in programs can be learnable parameters, with Bayesian priors
Utilities can be associated with different outcomes
Decision variables can be specified
influence diagrams, MDPs
Outline
Motivation
The IBAL Language
Inference Goals
Probabilistic Inference Algorithm
Lessons Learned
Goals
Generalize many standard frameworks for inference
e.g. Bayes nets, HMMs, probabilistic CFGs
Support parameter estimation
Support decision making
Take advantage of language structure
Avoid unnecessary computation
Desideratum #1: Exploit Independence

[Figure: the student Bayesian network again — Smart, Diligent, Understands, Good Test Taker, Exam Grade, HW Grade]
Use Bayes net-like inference algorithm
Desideratum #2: Exploit Low-Level Structure

Causal independence (noisy-or)

x = f()
y = g()
z = x & flip(0.9) | y & flip(0.8)
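The noisy-or structure means P(z) never needs a full four-row conditional table; a Python sketch of the closed form:

    # z = x & flip(0.9) | y & flip(0.8): each true parent independently
    # fails to trigger z with probability 0.1 and 0.2 respectively.
    def p_z_true(x, y):
        p_not_z = 1.0
        if x:
            p_not_z *= 0.1
        if y:
            p_not_z *= 0.2
        return 1.0 - p_not_z

    # p_z_true(True, True) -> 0.98; two parameters describe the whole CPT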
Desideratum #2: Exploit Low-Level Structure

Context-specific independence

x = f()
y = g()
z = case <x,y> of
  <false,false> : flip 0.4
  <false,true> : flip 0.6
  <true> : flip 0.7
Desideratum #3: Exploit Object Structure

A complex domain often consists of weakly interacting objects
Objects share a small interface
Objects are conditionally independent given the interface
[Figure: Student 1 and Student 2 interacting only through a shared Course Difficulty node]
Desideratum #4: Exploit Repetition
Domain often consists of many of the same kinds of objects
Can inference be shared between them?
f() = complex
x1 = f()
x2 = f()
…
x100 = f()
Desideratum #5: Use the Query
Only evaluate required parts of model
Can allow finite computation on an infinite model

f() = f()
x = let y = f() in true
A query on x does not require f
Lazy evaluation is required
Particularly important for probabilistic languages, e.g. stochastic grammars
Desideratum #6: Use Support

The support of a variable is the set of values it can take with positive probability
Knowing the support of subexpressions can simplify computation

f() = f()
x = false
y = if x then f() else true
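A Python sketch of the idea (names illustrative): support computation only explores branches whose guard can actually take the required value, so the divergent f is never touched:

    def support_f():
        # f() = f(): naively computing its support would never terminate
        return support_f()

    def support_if(cond_support, then_support, else_support):
        # supports are sets of values; thunks delay the branch computations
        result = set()
        if True in cond_support:
            result |= then_support()
        if False in cond_support:
            result |= else_support()
        return result

    # x = false, so support(x) = {False}; the then-branch is pruned:
    y_support = support_if({False}, support_f, lambda: {True})
    # y_support == {True}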
Desideratum #7: Use Evidence

Evidence can restrict the possible values of a variable
It can be used like support to simplify computation

f() = f()
x = flip 0.6
y = if x then f() else true
observe x = false
Outline
Motivation
The IBAL Language
Inference Goals
Probabilistic Inference Algorithm
Lessons Learned
Two-Phase Inference
Phase 1: decide what computations need to be performed
Phase 2: perform the computations
Natural Division of Labor
Responsibilities of phase 1:
utilizing query, support, and evidence
taking advantage of repetition
Responsibilities of phase 2:
exploiting conditional independence, low-level structure, and inter-object structure
Phase 1

IBAL Program → Computation Graph
Computation Graph
Nodes are subexpressions
Edge from X to Y means “Y needs to be computed in order to compute X”
Graph, not tree
different expressions may share subexpressions
memoization used to make sure each subexpression occurs once in graph
Construction of Computation Graph

1. Propagate evidence throughout program
2. Compute support for each node
Evidence Propagation
Backwards and forwards

let x = <a: flip 0.4, b: 1> in
observe x.a = true in
if x.a then 'a else 'b
Construction of Computation Graph

1. Propagate evidence throughout program
2. Compute support for each node
   • this is an evaluator for a non-deterministic programming language
   • lazy evaluation
   • memoization
Gotcha!
Laziness and memoization don’t go together
Memoization: when a function is called, look up arguments in cache
But with lazy evaluation, arguments are not evaluated before function call!
Lazy Memoization
Speculatively evaluate function without evaluating arguments
When argument is found to be needed
abort function evaluation
store in cache that argument is needed
evaluate the argument
speculatively evaluate function again
When function evaluates successfully
cache mapping from evaluated arguments to result
Lazy Memoization

let f(x,y,z) = if x then y else z in f(true, 'a, 'b)

f(_,_,_)      → need x
f(true,_,_)   → need y
f(true,'a,_)  → 'a
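A Python sketch of this mechanism (all names illustrative): unevaluated arguments are thunks, and touching one aborts the speculative attempt with an exception:

    CACHE = {}

    class NeedArg(Exception):
        def __init__(self, index):
            self.index = index

    class Thunk:
        def __init__(self, index, expr):
            self.index, self.expr, self.forced = index, expr, False
        def force(self):
            if not self.forced:
                raise NeedArg(self.index)  # abort: this argument is needed
            return self.value

    def lazy_memo_call(f, exprs):
        thunks = [Thunk(i, e) for i, e in enumerate(exprs)]
        while True:
            try:
                result = f(*thunks)        # speculative evaluation
                # cache only the arguments that were actually evaluated
                key = tuple(t.value if t.forced else None for t in thunks)
                CACHE[key] = result
                return result
            except NeedArg as n:
                t = thunks[n.index]
                t.value, t.forced = t.expr(), True  # evaluate needed arg,
                                                    # then speculate again

    # f(x,y,z) = if x then y else z
    f = lambda x, y, z: y.force() if x.force() else z.force()
    r = lazy_memo_call(f, [lambda: True, lambda: 'a', lambda: 'b'])
    # attempts: f(_,_,_) needs x; f(true,_,_) needs y; f(true,'a,_) -> 'a'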
Phase 2

Computation Graph → Microfactors → Solution
e.g. P(Outcome=true) = 0.6
Microfactors
Representation of a function from variables to reals

X      Y      | Value
true   *     | 1
false  true  | 1
false  false | 0

This is the indicator function of X ∨ Y
More compact than complete tables
Can represent low-level structure
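One possible Python encoding of a microfactor, using None as a wildcard so the X ∨ Y indicator takes three rows instead of four (the encoding is illustrative, not IBAL's internal one):

    # A microfactor: variable names plus pattern rows; None matches anything.
    OR_INDICATOR = (('X', 'Y'), [
        ((True,  None),  1.0),   # X true -> X v Y is true, whatever Y is
        ((False, True),  1.0),
        ((False, False), 0.0),
    ])

    def lookup(factor, assignment):
        variables, rows = factor
        values = tuple(assignment[v] for v in variables)
        for pattern, weight in rows:
            if all(p is None or p == v for p, v in zip(pattern, values)):
                return weight
        raise KeyError(values)

    # lookup(OR_INDICATOR, {'X': False, 'Y': True}) == 1.0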
Producing Microfactors
Goal: Translate an IBAL program into a set of microfactors F and a set of variables X such that
P(Output) = Σ_X Π_{f ∈ F} f
Similar to Bayes net
Can solve by variable elimination
exploits independence
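A brute-force Python sketch of the sum-product identity above (plain enumeration over X rather than true variable elimination, which would sum variables out one at a time; the example numbers are made up):

    from itertools import product

    def sum_product(factors, hidden, domains, out):
        # P(out) = sum over hidden assignments of the product of all factors
        result = {}
        for out_val in domains[out]:
            total = 0.0
            for vals in product(*(domains[v] for v in hidden)):
                asg = dict(zip(hidden, vals))
                asg[out] = out_val
                w = 1.0
                for f in factors:
                    w *= f(asg)
                total += w
            result[out_val] = total
        return result

    fX = lambda a: 0.4 if a['X'] else 0.6            # X ~ flip 0.4
    fY = lambda a: 0.5                               # Y ~ flip 0.5
    fO = lambda a: 1.0 if a['Out'] == (a['X'] or a['Y']) else 0.0
    domains = {'X': [True, False], 'Y': [True, False], 'Out': [True, False]}
    p = sum_product([fX, fY, fO], ['X', 'Y'], domains, 'Out')
    # p[True] ≈ 0.7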
Producing Microfactors
Accomplished by recursive descent on computation graph
Use production rules to translate each expression type into microfactors
Introduce temporary variables where necessary
Producing Microfactors

Example rule: if e1 then e2 else e3
  introduce a temporary variable X for the value of the test e1
  X = true  selects the factors produced for e2
  X = false selects the factors produced for e3
Phase 2

Computation Graph → Microfactors → Structured Variable Elimination → Solution
e.g. P(Outcome=true) = 0.6
Learning and Decisions
Learning uses EM
like BNs, HMMs, SCFGs etc.
Decision making uses backward induction
like influence diagrams
Memoization provides dynamic programming
simulates value iteration for MDPs
Lessons Learned
Stochastic programming languages are more complex than they appear
Single mechanism is insufficient for inference in a complex language
Different approaches may each contribute ideas to solution
Beware of unexpected interactions
Conclusion
IBAL is a very general language for constructing probabilistic models
captures many existing frameworks, and allows many new ones
Building an IBAL model = writing a program describing how values are generated
Probabilistic reasoning is like programming
Future Work
Approximate inference
loopy belief propagation
likelihood weighting
Markov chain Monte Carlo
special methods for IBAL?
Ease of use
Reading formatted data
Programming interface
Obtaining IBAL
www.eecs.harvard.edu/~avi/ibal