Outline
• Bayes’ Rule
• Bayesian Statistics (Briefly!)
• Conjugates
• General parametric using BUGS/MC software
• Bayesian Networks
• Some software: GeNIe, SamIam, Hugin, gR R-packages
• Bayesian Hypothesis Testing
• The “Bayesian Framework” in Forensic Science
• Likelihood Ratios with Bayesian Network software
A little about conditional probability
Pr(A | B) = Pr(A ∩ B) / Pr(B)

Pr(B | A) = Pr(B ∩ A) / Pr(A)

Since Pr(A ∩ B) = Pr(B ∩ A), it follows that Pr(A | B) Pr(B) = Pr(B | A) Pr(A).

Bayes’ Rule:

Pr(A | B) = Pr(B | A) Pr(A) / Pr(B)
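A quick numeric sanity check of Bayes’ rule in R, with all three input probabilities invented purely for illustration:

```r
# Toy check of Bayes' rule: Pr(A | B) = Pr(B | A) * Pr(A) / Pr(B)
p_B_given_A <- 0.9   # invented value
p_A         <- 0.2   # invented value
p_B         <- 0.3   # invented value
p_B_given_A * p_A / p_B   # Pr(A | B) = 0.6
```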
Probability
• Frequency: the ratio of the number of observations of interest (n_i) to the total number of observations (N):

frequency of observation i = n_i / N

• It is EMPIRICAL!
• Probability (frequentist): the frequency of observation i in the limit of a very large number of observations
Probability
• Belief: a “Bayesian’s” interpretation of probability.
• An observation (outcome, event) is a “measure of the state of knowledge” Jaynes.
• Bayesian probabilities reflect degree of belief and can be assigned to any statement
• Beliefs (probabilities) can be updated in light of new evidence (data) via Bayes’ theorem.
Bayesian Statistics
• The basic Bayesian philosophy: update a prior belief with data to obtain a posterior belief (posterior ∝ likelihood × prior)
Bayesian Statistics
• Bayesian-ism can be a lot like a religion
• Different “sects” of (dogmatic) Bayesians don’t believe other
“sects” are “true Bayesians”
The major Bayesian “churches”
• Parametric: BUGS (Bayesian inference Using Gibbs Sampling), MCMC (Markov chain Monte Carlo); Andrew Gelman (Columbia), David Spiegelhalter (Cambridge)
• Bayes Nets / Graphical Models: Steffen Lauritzen (Oxford), Judea Pearl (UCLA)
• Empirical Bayes: data-driven; Brad Efron (Stanford)
Bayesian Statistics
• Someone who adheres ONLY to the belief interpretation of probability?
• Someone who uses Bayesian methods?
• Someone who ONLY uses Bayesian methods?
• Usually likes to beat up on frequentist methodology…
Bayesian Statistics
• Actually DOING Bayesian statistics is hard!
Parametric Bayesian Methods
• All probability functions are “parameterized”, e.g. a Gaussian is specified by its mean μ and standard deviation σ
Bayesian Statistics
• We have a prior “belief” for the value of the mean
• We observe some data
• What can we say about the mean now?
We need Bayes’ rule:

Pr(μ | data) = Pr(data | μ) Pr(μ) / ∫ Pr(data | μ) Pr(μ) dμ

YUK! And this is for an “easy” problem
Bayesian Statistics
• So what can we do?
• Until ~1990: get lucky…
• Sometimes we can work out the integrals by hand
• Sometimes you get posteriors that are of the same form as the priors (Conjugacy; see the sketch after this list)
• Now there is software to evaluate the integrals. Some free stuff:
• MCMC: WinBUGS, OpenBUGS, JAGS
• HMC: Stan
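As a minimal illustration of the conjugacy bullet above, here is a sketch in R of the textbook Normal-Normal update for a mean with known variance; the function name and all of the numbers are invented for this example.

```r
# Conjugate Normal-Normal update for a mean with known sigma:
# prior mu ~ N(mu0, tau0^2), data x_i ~ N(mu, sigma^2); the posterior is Normal, no integration needed.
normal_mean_update <- function(x, sigma, mu0, tau0) {
  n         <- length(x)
  prec_post <- 1 / tau0^2 + n / sigma^2                       # posterior precision
  mu_n      <- (mu0 / tau0^2 + sum(x) / sigma^2) / prec_post  # posterior mean
  c(mean = mu_n, sd = sqrt(1 / prec_post))
}

set.seed(1)
x <- rnorm(25, mean = 3, sd = 2)                              # some observed data
normal_mean_update(x, sigma = 2, mu0 = 0, tau0 = 10)
```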
Bayesian Networks
• A “scenario” is represented by a joint probability function
• It contains variables relevant to a situation which represent uncertain information
• It contains “dependencies” between variables that describe how they influence each other
• A graphical way to represent the joint probability function is with nodes and directed lines
• Called a Bayesian Network Pearl
Bayesian Networks
• (A very!!) simple example Wiki:
• What is the probability the Grass is Wet?
• Influenced by the possibility of Rain
• Influenced by the possibility of Sprinkler action
• Sprinkler action is influenced by the possibility of Rain
• Construct a joint probability function to answer questions about this scenario:
• Pr(Grass Wet, Rain, Sprinkler)
Pr(Sprinkler | Rain)
Rain: yes   Sprinkler was on:  1%   was off: 99%
Rain: no    Sprinkler was on: 40%   was off: 60%
Bayesian Networks
Pr(Rain)
Rain: yes 20%   no 80%

Pr(Grass Wet | Rain, Sprinkler)
Sprinkler was on,  Rain yes:  Grass Wet 99%   not wet 1%
Sprinkler was on,  Rain no:   Grass Wet 90%   not wet 10%
Sprinkler was off, Rain yes:  Grass Wet 80%   not wet 20%
Sprinkler was off, Rain no:   Grass Wet 0%    not wet 100%
Bayesian Networks
• You observe the grass is wet: evidence is entered on Pr(Grass Wet).
• The other probabilities, Pr(Rain) and Pr(Sprinkler), are adjusted given the observation. A sketch of this update follows.
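A minimal sketch of this update in R, using the gRain package (part of the gR suite listed later in these slides). The conditional probability tables are the Wikipedia sprinkler values shown above; the variable, node, and state names are chosen just for this sketch.

```r
# Build the Rain / Sprinkler / GrassWet network and query Pr(Rain | GrassWet = wet).
library(gRain)

r <- cptable(~Rain, values = c(20, 80), levels = c("yes", "no"))
s <- cptable(~Sprinkler | Rain, values = c(1, 99, 40, 60),
             levels = c("on", "off"))            # order: Rain = yes block, then Rain = no block
w <- cptable(~GrassWet | Sprinkler:Rain,
             values = c(99, 1, 80, 20, 90, 10, 0, 100),
             levels = c("wet", "dry"))           # Sprinkler varies fastest, then Rain

net <- grain(compileCPT(list(r, s, w)))

querygrain(net, nodes = "Rain")                  # prior: 20% yes / 80% no
querygrain(setEvidence(net, evidence = list(GrassWet = "wet")),
           nodes = "Rain")                       # posterior Pr(Rain = yes | wet) is about 36%
```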
Bayesian Networks
• Areas where Bayesian Networks are used
• Medical recommendation/diagnosis
• IBM/Watson, Massachusetts General Hospital/DXplain
• Image processing
• Business decision support
• Boeing, Intel, United Technologies, Oracle, Philips
• Information search algorithms and on-line recommendation engines
• Space vehicle diagnostics
• NASA
• Search and rescue planning
• US Military
• Requires software. Some free stuff:
• GeNIe (University of Pittsburgh) G ,
• SamIam (UCLA) S
• Hugin (Free only for a few nodes) H
• gR R-packages gR
Bayesian Statistics
Bayesian network for the provenance of a painting given trace evidence found on that painting
Bayesian Statistics
• Frequentist hypothesis testing:
• Assume/derive a “null” probability model for a statistic
• E.g.: Sample averages follow a Gaussian curve
Say the sample statistic falls here: “Wow!” That’s an unlikely value under the null hypothesis (small p-value).
Bayesian Statistics
• Bayesian hypothesis testing:
• Assume/derive a “null” probability model for a statistic
• Assume an “alternative” probability model
• Compare p(x | null) and p(x | alt) at the point where the sample statistic falls
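A minimal sketch of that comparison in R, with made-up Gaussian models for the null and the alternative and a made-up observed statistic:

```r
# Compare the observed statistic under the two assumed models.
x_obs  <- 2.5                              # hypothetical observed statistic
p_null <- dnorm(x_obs, mean = 0, sd = 1)   # assumed "null" model
p_alt  <- dnorm(x_obs, mean = 3, sd = 1)   # assumed "alternative" model
p_alt / p_null                             # relative support for the alternative over the null
```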
The “Bayesian Framework”
• Bayes’ Rule Aitken, Taroni:
• Hp = the prosecution’s hypothesis
• Hd = the defence’s hypothesis
• E = any evidence
• I = any background information

Pr(Hp | E, I) = Pr(E | Hp, I) Pr(Hp | I) / Pr(E | I)

Pr(Hd | E, I) = Pr(E | Hd, I) Pr(Hd | I) / Pr(E | I)
The “Bayesian Framework”
• Odds form of Bayes’ Rule:

Pr(Hp | E, I) / Pr(Hd | E, I) = [ Pr(E | Hp, I) / Pr(E | Hd, I) ] × [ Pr(Hp | I) / Pr(Hd | I) ]

Posterior odds in favour of the prosecution’s hypothesis = Likelihood Ratio × Prior odds in favour of the prosecution’s hypothesis

Posterior Odds = Likelihood Ratio × Prior Odds
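A quick worked example of the odds form, with all numbers invented: a 50/50 prior on Hp versus Hd (prior odds 1) combined with a likelihood ratio of 1000 gives posterior odds of 1000, i.e. a posterior probability of about 0.999 for Hp.

```r
# Posterior odds = Likelihood Ratio x Prior odds (all numbers invented).
prior_odds     <- 0.5 / 0.5                # 50/50 prior on Hp vs Hd
LR             <- 1000
posterior_odds <- LR * prior_odds
posterior_odds / (1 + posterior_odds)      # posterior Pr(Hp | E, I), about 0.999
```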
The “Bayesian Framework”
• The likelihood ratio has largely come to be the main quantity of interest in the forensic statistics literature:

LR = Pr(E | Hp, I) / Pr(E | Hd, I)

• A measure of how much “weight” or “support” the “evidence” gives to one hypothesis relative to the other
• Here, Hp relative to Hd
• Major players: Evett, Aitken, Taroni, Champod
• Influenced by Dennis Lindley
The “Bayesian Framework”
LR = Pr(E | Hp, I) / Pr(E | Hd, I)

• The likelihood ratio ranges from 0 to infinity
• Points of interest on the LR scale:
• LR = 0 means the evidence TOTALLY DOES NOT SUPPORT Hp in favour of Hd
• LR = 1 means the evidence does not support either hypothesis more strongly
• LR = ∞ means the evidence TOTALLY SUPPORTS Hp in favour of Hd
The “Bayesian Framework”
LR = Pr(E | Hp, I) / Pr(E | Hd, I)

• A standard verbal scale of LR “weight of evidence” IS IN NO WAY, SHAPE OR FORM settled in the statistics literature!
• A popular verbal scale is due to Jeffreys, but there are others
• READ the British R v. T footwear case!
Bayesian Networks
• The Likelihood Ratio can be obtained from the BN once the evidence is entered
• Use the odds form of Bayes’ Theorem: divide the probabilities of the hypotheses after the evidence is entered (posterior odds) by the probabilities of the hypotheses before the evidence is entered (prior odds), as in the sketch below
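A minimal sketch of that calculation, reusing the sprinkler network net from the earlier gRain sketch and treating Rain = yes versus Rain = no as stand-ins for the two competing hypotheses (purely for illustration):

```r
# LR from a Bayesian network: posterior odds (after evidence) divided by prior odds (before evidence).
prior     <- querygrain(net, nodes = "Rain")$Rain
posterior <- querygrain(setEvidence(net, evidence = list(GrassWet = "wet")),
                        nodes = "Rain")$Rain

prior_odds     <- prior["yes"] / prior["no"]
posterior_odds <- posterior["yes"] / posterior["no"]
posterior_odds / prior_odds    # likelihood ratio for Rain = yes vs Rain = no
```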
The “Bayesian Framework”
• Computing the LR from our painting provenance example:
Empirical Bayes’ Efron
• An I.D. is output for each questioned toolmark
• This is a computer “match”
• What’s the probability the tool is truly the source of the toolmark?
• A similar problem arises in genomics when detecting disease from microarray data
• They use data and Bayes’ theorem to get an estimate
• “No disease” (genomics) = not a true “match” (toolmarks)
Empirical Bayes’
• We use Efron’s machinery for the “empirical Bayes’ two-groups model” Efron
• Surprisingly simple!
• Use binned data to do a Poisson regression
• Some notation:
• S⁻: truly no association, the Null hypothesis
• S⁺: truly an association, the Non-null hypothesis
• z: a score derived from a machine learning task to I.D. an unknown pattern with a group
• z is a Gaussian random variate under the Null
Empirical Bayes’
• From Bayes’ Theorem we can get Efron the estimated probability of not a true “match”, given the algorithm’s output z-score associated with its “match”:

Pr̂(S⁻ | z) = π̂0 f̂0(z) / f̂(z)

Names: posterior error probability (PEP) Kall; local false discovery rate (lfdr) Efron

• Suggested interpretation for casework:
• We agree with Gelman and Shalizi Gelman:
“…posterior model probabilities …[are]… useful as tools for prediction and for understanding structure in data, as long as these probabilities are not taken too seriously.”

1 − Pr̂(S⁻ | z) = estimated “believability” of a machine-made association
Empirical Bayes’
• Bootstrap procedure to get an estimate of the KNM distribution of “Platt scores” Platt, e1071
• Use a “Training” set
• Use this to get p-values/z-values on a “Validation” set
• Inspired by Storey and Tibshirani’s Null estimation method Storey
• Use an SVM to get the KM and KNM “Platt score” distributions
• Use a “Validation” set
From the fitted histogram, Efron’s method gives:
• f̂(z): the “mixture” density
• f̂0(z): the density of z given a KNM => should be Gaussian
• π̂0: an estimate of the prior probability of a KNM
What’s the point?? We can test the fits to f̂(z) and f̂0(z)! A sketch of such a fit follows.
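A minimal sketch of such a fit in R using the locfdr package (the same tool named in the figure caption below); the z-scores here are simulated stand-ins rather than real comparison scores, so the numbers are illustrative only.

```r
# Efron's two-groups fit: estimate the local false discovery rate from a vector of z-scores.
library(locfdr)

set.seed(1)
z <- c(rnorm(950), rnorm(50, mean = 3))   # simulated: 950 "KNM" null scores, 50 "KM" non-null scores

fit <- locfdr(z, nulltype = 0)            # nulltype = 0 uses the theoretical N(0,1) null; 1 fits an empirical null
pep           <- fit$fdr                  # local fdr = Pr-hat(S- | z), the posterior error probability
believability <- 1 - pep                  # estimated "believability" of each association
head(cbind(z, pep, believability))
```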
Posterior Association Probability: Believability Curve
12D PCA-SVM locfdr fit for Glock primer shear patterns, +/- 2 standard errors
• In the “Forensic Bayesian Framework”, the Likelihood Ratio is the measure of the weight of evidence.
• LRs are called Bayes Factors by most statisticians
• LRs give the measure of support the “evidence” lends to the “prosecution hypothesis” vs. the “defense hypothesis”
• From Bayes’ Theorem:
LR = Pr(E | Hp) / Pr(E | Hd) = [ Pr(Hp | E) / Pr(Hd | E) ] / [ Pr(Hp) / Pr(Hd) ] = Posterior Odds / Prior Odds
• Once the “fits” for the Empirical Bayes method are obtained, it is easy to compute the corresponding likelihood ratios.
o Using the identity Pr(Hp | E) = 1 − Pr̂(S⁻ | z), the likelihood ratio can be computed as:

LR(z) = [ (1 − Pr̂(S⁻ | z)) / Pr̂(S⁻ | z) ] × [ Pr̂(S⁻) / (1 − Pr̂(S⁻)) ]

(A sketch of this calculation follows below.)
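A minimal sketch of that formula in R, where pep stands for Pr̂(S⁻ | z) (for example fit$fdr from the locfdr sketch above) and pi0 stands for Pr̂(S⁻); the numbers plugged in are invented:

```r
# LR(z) = posterior odds / prior odds, built from the local fdr and the estimated null prior.
lr_from_pep <- function(pep, pi0) ((1 - pep) / pep) * (pi0 / (1 - pi0))

lr_from_pep(pep = 0.01, pi0 = 0.9)   # e.g. a local fdr of 1% with a 90% null prior gives LR = 891
```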
• Using the fitted posteriors and priors we can obtain the likelihood ratios Tippett, Ramos
(Plots: known match LR values; known non-match LR values)
• Professor Chris Saunders (SDSU)
• Professor Christophe Champod (Lausanne)
• Alan Zheng (NIST)
• Research Team:
•
Dr. Martin Baiker
•
Ms. Helen Chan
•
Ms. Julie Cohen
•
Mr. Peter Diaczuk
•
Dr. Peter De Forest
•
Mr. Antonio Del Valle
•
Ms. Carol Gambino
•
Dr. James Hamby
•
Ms. Alison Hartwell, Esq.
•
Dr. Thomas Kubic, Esq.
•
Ms. Loretta Kuo
•
Ms. Frani Kammerman
•
Dr. Brooke Kammrath
•
Mr. Chris Luckie
•
Off. Patrick McLaughlin
•
Dr. Linton Mohammed
•
Mr. Nicholas Petraco
•
Dr. Dale Purcel
•
Ms. Stephanie Pollut
•
Dr. Peter Pizzola
•
Dr. Graham Rankin
•
Dr. Jacqueline Speir
•
Dr. Peter Shenkin
•
Ms. Rebecca Smith
•
Mr. Chris Singh
•
Mr. Peter Tytell
•
Ms. Elizabeth Willie
•
Ms. Melodie Yu
•
Dr. Peter Zoon