Cornelia Parker, "Cold Dark Matter" (title-slide artwork)

Introduction to Monte Carlo
Carlos Mana, Astrofísica y Física de Partículas, CIEMAT. Benasque 2014.

What is the Monte Carlo method? A numerical method in which the solution of a problem is estimated from the outcome of a stochastic process, analysed with the Theory of Probability and Statistics.

How does it work? Design an experiment whose outcome is related to the problem; do it... or simulate it on a computer.

What kind of problems can we tackle? Why Monte Carlo (inability to do otherwise)? What do we get (a magic box?, ..., more information?, ...)? Applications: particle physics, Bayesian inference, ...

Important and general feature of Monte Carlo estimations
Example: estimation of pi (... a la Laplace). Throw points uniformly inside a square S circumscribing a circle C. The probability that a draw falls inside the circle is theta = area(C)/area(S) = pi/4. With N trials and X = "number of points falling inside the circle",
  X ~ Bi(x | N, theta),   theta ~ Be(theta | n + 1/2, N - n + 1/2),   pi ~ 4n/N.

  Throws (N)    | 100    | 1000   | 10000  | 100000 | 1000000
  Accepted (n)  | 83     | 770    | 7789   | 78408  | 785241
  4n/N          | 3.32   | 3.08   | 3.1156 | 3.1363 | 3.141
  sigma         | 0.1503 | 0.0532 | 0.0166 | 0.0052 | 0.0016

For large N, sigma(4n/N) ~ 4 [theta(1-theta)/N]^{1/2} ~ 1.6 N^{-1/2}. This N^{-1/2} dependence of the uncertainty is a general feature of Monte Carlo estimations, regardless of the number of dimensions of the problem.

Basis of Monte Carlo simulations
Design the experiment, then get a sampling of size n from the stochastic process: we need a sequence z1, z2, ..., zn of random numbers. Do not do the experiment but generate sequences of (pseudo-)random numbers on a computer...
"Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin." (J. von Neumann)
Pseudo-random generators (infinitely many references) deliver X ~ Un(x | 0, 1). Check with the sampling moments m_k = (1/n) sum_j x_j^k and c_k = (1/n) sum_j (x_j - m_1)^k. For a sample of size 10^6:
  Un(0,1):  mu = 0.5000, sigma^2 = 1/12 = 0.0833, gamma_1 = 0.0, gamma_2 = -1.2 (= -6/5);
  sample:   m_1 = 0.5002, sigma^2_sam = 0.0833, gamma_1,sam = 0.0025, gamma_2,sam = -1.2029;
  correlations rho(x_i, x_{i+1}), rho(x_i, x_{i+2}), rho(x_i, x_{i+3}) all at the 10^{-3}-10^{-4} level.

We can generate pseudo-random sequences z1, z2, ..., zk with Z ~ Un(z | 0, 1). Great! But life is not uniform... In Physics, Statistics, ... we want to generate sequences from X = {x1, x2, ..., xn} ~ p(x | theta). How?
  Inverse Transform; Acceptance-Rejection ("Hit-Miss"); "Importance Sampling", ...; + tricks (decomposition, the Normal distribution, ...), usually a combination of them; M(RT)^2 (Markov Chains), Gibbs Sampling, ...

Method 1: Inverse Transform
We want to generate a sample of X ~ p(x | theta). Consider
  Y = F(X | theta),   F(x | theta) = P(X in (-inf, x] | theta) = int_{-inf}^{x} p(x' | theta) dx',   F: Omega_X -> [0, 1].
How is Y distributed?
  F_Y(y) = P(Y <= y) = P(F(X | theta) <= y) = P(X <= F^{-1}(y | theta)) = int^{F^{-1}(y | theta)} dF(x | theta) = y,
so Y = F(X | theta) ~ Un(y | 0, 1).
Algorithm
 i) sample y_i ~ Un(y | 0, 1);
 ii) transform x_i = F^{-1}(y_i);
(repeat n times) -> x1, x2, ..., xn ~ p(x | theta).
Examples
Sampling of X ~ Ex(x | tau): p(x | tau) = tau e^{-tau x}, F(x | tau) = 1 - e^{-tau x}:
 i) u_i ~ Un(u | 0, 1); ii) x_i = F^{-1}(u_i) = -(1/tau) ln u_i.  [Histogram for Ex(x | 1) shown.]
Sampling of X ~ We(x | theta_1, theta_2): p(x | theta) = theta_1 theta_2 x^{theta_2 - 1} e^{-theta_1 x^{theta_2}} 1_[0,inf)(x), F(x | theta) = 1 - e^{-theta_1 x^{theta_2}}:
 i) u_i ~ Un(u | 0, 1); ii) x_i = F^{-1}(u_i | theta) = [-(1/theta_1) ln u_i]^{1/theta_2}.  [Histogram for We(x | 0.2, 1.7) shown.]
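A minimal sketch (assuming NumPy is available) of the two inverse-transform examples above; the parameter values tau = 1 and (theta_1, theta_2) = (0.2, 1.7) are the ones quoted on the slides.

```python
import numpy as np
from math import gamma

rng = np.random.default_rng(seed=1)
n = 1_000_000

# Exponential Ex(x | tau): F(x) = 1 - exp(-tau x)  ->  x = -ln(u)/tau
tau = 1.0
u = rng.uniform(size=n)
x_exp = -np.log(u) / tau
print("Ex(1): sample mean", x_exp.mean(), " expected", 1.0 / tau)

# Weibull We(x | th1, th2): F(x) = 1 - exp(-th1 x^th2)  ->  x = (-ln(u)/th1)^(1/th2)
th1, th2 = 0.2, 1.7
u = rng.uniform(size=n)
x_wei = (-np.log(u) / th1) ** (1.0 / th2)
mean_exact = th1 ** (-1.0 / th2) * gamma(1.0 + 1.0 / th2)   # exact mean in this parameterisation
print("We(0.2,1.7): sample mean", x_wei.mean(), " exact", mean_exact)
```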
...some fun... Problems:
1) Generate a sample x1, x2, ..., xn (n = 10^6) for each of
  X ~ Un(x | 0, 1),   X ~ p(x) = (3/2) x^2 1_[-1,1](x),   X ~ Ca(x | 0, 1) = 1/[pi (1 + x^2)].
2) Get, for each case, the sampling distribution of Y_m = (1/m) sum_{k=1}^{m} X_k with m in {1, 2, 5, 10, 20, 50} (the X_k assumed to be independent random quantities).
3) Discuss the sampling distribution of Y_m in connection with the Law of Large Numbers and the Central Limit Theorem.
4) If U_k ~ Un(x | 0, 1) and W_n = prod_{k=1}^{n} U_k in [0, 1], how is Z_n = -log W_n distributed?
5) If X_k ~ Ga(x | theta, k), how is Z_n = sum_k X_k distributed?
6) If X_i ~ Ga(x | theta, nu_i), how are Y = sum_i X_i and X_n/(X_n + X_m) distributed?

Discrete random quantities: X ~ P(X = x_k) = p_k, k = 0, 1, 2, ...
  F_0 = P(X <= x_0) = p_0,  F_1 = P(X <= x_1) = p_0 + p_1,  ...,  F_n = P(X <= x_n) = sum_{k=0}^{n} p_k   (F_{-1} = 0).
Algorithm
 i) sample u_i ~ Un(u | 0, 1);
 ii) find k such that F_{k-1} < u_i <= F_k and deliver x_i = x_k;
(repeat n times) -> x1, ..., xn ~ p(x).
Example: sampling of X ~ Po(X | mu): P(X = k) = e^{-mu} mu^k/k! = P(X = k-1) mu/k, p_0 = e^{-mu}.
 i) u_i ~ Un(u | 0, 1);
 ii) starting from k = 0: a) p_k = p_{k-1} mu/k; b) F_k = F_{k-1} + p_k; stop when F_{k-1} < u_i <= F_k.
[50,000 events from Po(k | 2) shown.]
Example: sampling of X ~ Bi(X | N, theta): P(X = n) = C(N, n) theta^n (1-theta)^{N-n}, with P(X = n) = P(X = n-1) [(N-n+1)/n] [theta/(1-theta)], p_0 = (1-theta)^N.
 i) u_i ~ Un(u | 0, 1);
 ii) starting from n = 0: a) p_n = p_{n-1} [(N-n+1)/n] [theta/(1-theta)]; b) F_n = F_{n-1} + p_n; stop when F_{n-1} < u_i <= F_n.
[50,000 events from Bi(n | 10, 0.5) shown.]

Even though discrete distributions can be sampled this way, usually one can find more efficient algorithms...
Example: X ~ Po(X | mu). With U_i ~ Un(u | 0, 1), the product W_n = prod_{k=1}^{n} U_k in [0, 1] has density p(w_n | n) = (-log w_n)^{n-1}/Gamma(n), so
  P(W_n <= a) = (1/Gamma(n)) int_{-log a}^{inf} e^{-x} x^{n-1} dx   and   P(W_n <= e^{-mu}) = e^{-mu} sum_{m=0}^{n-1} mu^m/m! = P(X <= n - 1).
Algorithm:
 i) generate u_i ~ Un(u | 0, 1);
 ii) multiply them, w_n = u_1 u_2 ... u_n, and go back to i) while w_n > e^{-mu};
 iii) deliver x = n - 1 ~ Po(x | mu).

PROBLEMS
Gamma distribution: 1) show that if X ~ Ga(x | a, b) then Y = aX ~ Ga(y | 1, b); 2) show that if b = n (natural) then Y = sum_{i=1}^{n} Z_i ~ Ga(y | 1, n) with Z_i ~ Ga(z | 1, 1) = Ex(z | 1) (so it is enough to generate exponentials).
Beta distribution: show that if X_1 ~ Ga(x_1 | a, b_1) and X_2 ~ Ga(x_2 | a, b_2), then Y = X_1/(X_1 + X_2) ~ Be(y | b_1, b_2).

Generalisation to n dimensions: ... trivial but ...
  F(x1, ..., xn) = F_n(x_n | x_{n-1}, ..., x_1) F_{n-1}(x_{n-1} | x_{n-2}, ..., x_1) ... F_2(x_2 | x_1) F_1(x_1)
  p(x1, ..., xn) = p_n(x_n | x_{n-1}, ..., x_1) p_{n-1}(x_{n-1} | x_{n-2}, ..., x_1) ... p_2(x_2 | x_1) p_1(x_1).
If the X_k are independent, p(x1, ..., xn) = prod_i p_i(x_i) and F(x1, ..., xn) = prod_i F_i(x_i); ... but if not ... there are n! ways to decompose the pdf and some are easier to sample than others.
Example: (X, Y) ~ p(x, y) = 2 e^{-x/y}, x in [0, inf), y in [0, 1].
Marginal densities: p_y(y) = int p(x, y) dx = 2y and p_x(x) = int p(x, y) dy = 2x int_x^{inf} e^{-u} u^{-2} du; since p(x, y) != p_x(x) p_y(y), X and Y are not independent.
Conditional densities:
  p(x, y) = p(x | y) p_y(y):  F_y(y) = y^2 and F_x(x | y) = 1 - e^{-x/y}  ->  easy;
  p(x, y) = p(y | x) p_x(x)  ->  difficult.
So sample y_i from F_y(y) = y^2 (inverse transform) and then x_i from the exponential p(x | y_i).

Properties (Inverse Transform):
- Direct and efficient in the sense that one u_i ~ Un(u | 0, 1) gives one x_i ~ p(x).
- Useful for many distributions of interest (Exponential, Cauchy, Weibull, Logistic, ...).
- ... but in general F(x) is difficult to invert and numerical approximations are slow, ...
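Going back to the discrete examples above, a minimal sketch (assuming NumPy) of the two Poisson samplers: the sequential search on the cumulative distribution and the product-of-uniforms trick, both for Po(k | 2) as on the slides.

```python
import numpy as np

rng = np.random.default_rng(seed=2)
mu = 2.0

def poisson_inverse_transform(mu, rng):
    """Sequential search on the CDF using the recursion p_k = p_{k-1} * mu / k."""
    u = rng.uniform()
    k, p, F = 0, np.exp(-mu), np.exp(-mu)
    while u > F:
        k += 1
        p *= mu / k
        F += p
    return k

def poisson_product_of_uniforms(mu, rng):
    """Multiply uniforms until the product drops below exp(-mu); deliver n - 1."""
    limit, w, n = np.exp(-mu), 1.0, 0
    while w > limit:
        w *= rng.uniform()
        n += 1
    return n - 1

s1 = [poisson_inverse_transform(mu, rng) for _ in range(50_000)]
s2 = [poisson_product_of_uniforms(mu, rng) for _ in range(50_000)]
print(np.mean(s1), np.var(s1))   # both mean and variance ~ mu = 2
print(np.mean(s2), np.var(s2))
```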
Method 2: Acceptance-Rejection ("Hit-Miss", J. von Neumann, 1951)
We want to sample X ~ p(x | theta) with x in [a, b] and 0 <= p(x | theta) <= max_x p(x | theta).
1) Enclose the pairs (x, y = p(x | theta)) in a domain [a, b] x [0, k] with k >= max_x p(x | theta) (not necessarily a hypercube).
2) Consider a two-dimensional random quantity Z = (Z1, Z2) uniformly distributed in [a, b] x [0, k]: g(z1, z2 | theta) = 1/[k(b - a)].
3) Which is the conditional distribution P(Z1 <= x | Z2 <= p(Z1 | theta))? That is:
  P(Z1 <= x | Z2 <= p(Z1 | theta)) = P(Z1 <= x, Z2 <= p(Z1 | theta)) / P(Z2 <= p(Z1 | theta))
   = int_a^x dz1 int_0^{p(z1|theta)} g dz2 / int_a^b dz1 int_0^{p(z1|theta)} g dz2 = int_a^x p(z1 | theta) dz1 / int_a^b p(z1 | theta) dz1 = F(x | theta);
the accepted values of Z1 are therefore distributed as p(x | theta).
Algorithm
 i) sample z_{1i} ~ Un(z1 | a, b) and z_{2i} ~ Un(z2 | 0, k), with k >= max_x p(x | theta);
 ii) if z_{2i} <= p(z_{1i}) accept x_i = z_{1i}; if z_{2i} > p(z_{1i}) reject and go back to i);
(repeat) -> x1, x2, ..., xn ~ p(x | theta).

Example (pedagogical; low efficiency): X ~ Be(x | a, b), density p(x | a, b) = x^{a-1}(1-x)^{b-1}, x in [0, 1] (the normalisation int_0^1 p(x | a, b) dx = B(a, b) is not needed). Covering adjusted for maximum efficiency:
  max_x p(x | a, b) = p(x_max | a, b) = (a-1)^{a-1}(b-1)^{b-1}/(a+b-2)^{a+b-2},   x_max = (a-1)/(a+b-2);
the pairs (x, y) are generated in the domain [0, 1] x [0, max p(x | a, b)].
Algorithm: i) sample x_i ~ Un(x | 0, 1) and y_i ~ Un(y | 0, max p(x | a, b)); ii) accept x_i if y_i <= p(x_i | a, b), reject otherwise and go back to i).
For a = 2.3, b = 5.7: n_gen = 2,588,191 generated pairs for n_acc = 1,000,000 accepted events, efficiency e_f = n_acc/n_gen = 0.3864, and
  int p(x) dx ~ e_f x (area of the covering) = 0.016791 (+- 0.000013), to be compared with B(2.3, 5.7) = 0.01678944.

... weighted events??? We can do the rejection as follows:
 i) generate the x_i uniformly in the domain and assign to each generated pair a weight w_i = p(x_i)/k in [0, 1] (if we know k >= max p(x | theta));
 ii) acceptance-rejection: u_i ~ Un(u | 0, 1); accept x_i if u_i <= w_i, reject x_i if u_i > w_i.
Obviously, events with a larger weight are more likely to be accepted. After step ii) all weights are 0 (rejected) or 1 (accepted). Sometimes it is interesting not to apply step ii) and keep the weights; then
  E_p[g(X)] = int g(x) p(x) dx ~ (1/w) sum_{i=1}^{n} g(x_i) w_i,   w = sum_{i=1}^{n} w_i.

And suddenly... many times we do not know max p(x | theta). Start with an initial guess k_1 and generate N_t events in [a, b] x [0, k_1]. If after having generated them we find a value x_m with p(x_m | theta) > k_1, the estimation of k = max p(x | theta) was incorrect. Don't throw away all the generated events: the pairs (x, y) have been generated with constant density in [a, b] x [0, k_1], so for a new bound k_2 >= p(x_m | theta) > k_1 we only have to generate
  N_e = N_t (k_2/k_1 - 1)
additional pairs (x, y) in the domain [a, b] x [k_1, k_2], that is, from the pseudo-distribution g(x) = [p(x) - k_1] 1_{p(x) > k_1}(x), and proceed with [a, b] x [0, k_2].

Properties:
- Easy to implement and to generalise to n dimensions.
- The efficiency depends on the covering domain:
  e_f = (area under p(x)) / (area of the covering domain) = (# accepted events)/(# generated events),
 so int p(x) dx ~ (N_acc/N_gen) x (area of the covering domain). The better adjusted the covering is, the higher the efficiency; e_f = 1 is equivalent to the Inverse Transform.

Straightforward generalisation to n dimensions. Algorithm:
 i) generate n + 1 uniformly distributed random quantities
  (x_i^(1), x_i^(2), ..., x_i^(n); y_i) ~ Un(x^(1) | Omega_1) Un(x^(2) | Omega_2) ... Un(x^(n) | Omega_n) Un(y | 0, k);
 ii) accept (x_i^(1), ..., x_i^(n)) if y_i <= p(x_i^(1), ..., x_i^(n) | theta); reject if not and go back to i).
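A minimal sketch (assuming NumPy) of the Be(2.3, 5.7) acceptance-rejection example above; it should reproduce an efficiency close to the 0.3864 quoted on the slides and the estimate of the normalisation B(2.3, 5.7).

```python
import numpy as np
from math import gamma

rng = np.random.default_rng(seed=3)
a, b = 2.3, 5.7

def p_unnorm(x):                        # unnormalised Beta density x^(a-1) (1-x)^(b-1)
    return x**(a - 1) * (1 - x)**(b - 1)

k = p_unnorm((a - 1) / (a + b - 2))     # covering height = maximum of the density

n_target, n_acc, n_gen = 1_000_000, 0, 0
while n_acc < n_target:
    x = rng.uniform(size=100_000)
    y = rng.uniform(0.0, k, size=100_000)
    n_gen += x.size
    n_acc += int(np.sum(y <= p_unnorm(x)))

eff = n_acc / n_gen
print("efficiency        ~", eff)                      # ~0.386 on the slides
print("integral estimate ~", eff * k,                  # covering area is 1 * k
      " vs B(2.3,5.7) =", gamma(a) * gamma(b) / gamma(a + b))
```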
3D example (pedagogical; low efficiency): points uniformly distributed inside a torus of radii (r_i, r_o) centred at (0, 0, 0). Cover the torus with the parallelepiped [-(r_i + r_o), r_i + r_o] x [-(r_i + r_o), r_i + r_o] x [-r_i, r_i].
Algorithm:
 i) generate x_i ~ Un(x | -(r_i + r_o), r_i + r_o), y_i ~ Un(y | -(r_i + r_o), r_i + r_o), z_i ~ Un(z | -r_i, r_i);
 ii) accept (x_i, y_i, z_i) if [(x_i^2 + y_i^2)^{1/2} - r_o]^2 + z_i^2 <= r_i^2, reject otherwise and go back to i);
(repeat) -> (x_1, y_1, z_1), ..., (x_n, y_n, z_n) inside T(r_o, r_i; 0, 0, 0).
For r_i = 1 and r_o = 3: N_generated = 10,786 for N_accepted = 5,000, so e_f = N_acc/N_gen = 0.4636. Knowing that V_parallelepiped = [2(r_i + r_o)]^2 (2 r_i) = 128, we can estimate
  V_torus ~ e_f V_parallelepiped = 59.34 +- 0.61, to be compared with the exact V_torus = 2 pi^2 r_o r_i^2 = 59.218.

Problem (3D): sampling of the Hydrogen-atom wave functions psi_{n,l,m}(r, theta, phi) = R_{n,l}(r) Y_{l,m}(theta, phi):
  P(r, theta, phi | n, l, m) ~ |R_{n,l}(r) Y_{l,m}(theta, phi)|^2 r^2 sin(theta).
[(3,1,0), (3,2,0) and (3,2,+-1) projections on the (x, y), (x, z) and (y, z) planes shown.] Evaluate the energy using the Virial theorem, <T> = -(1/2)<V>, so E_n = (1/2)<V>_n with V(r) = -e^2/(4 pi eps_0 r) (a problem for the convergence of the estimations, ...).

Problems with low efficiency
Example in n dimensions: n-dimensional sphere of radius 1 centred at (0, ..., 0).
 i) Sample (x_i^(1), ..., x_i^(n)) ~ Un(x^(1) | -1, 1) ... Un(x^(n) | -1, 1);
 ii) acceptance-rejection: if y^2 = (x_i^(1))^2 + ... + (x_i^(n))^2 <= 1 accept the n-tuple as an inner point of the sphere (or (x^(1)/y, ..., x^(n)/y) as a point on the surface); if y^2 > 1 reject it and go back to i).
Efficiency: e_f = V(S_n)/V(H_n) = [2 pi^{n/2}/(n Gamma(n/2))]/2^n -> 0 as n grows; e_f(n = 4) ~ 31%, e_f(n = 5) ~ 16%, ... (a small numerical check is given after this section).
Why do we have low efficiency? Most of the samplings are done in regions of the sample space that have low probability p(x).
Example: sampling of X ~ p(x) ~ e^{-x}, x in [0, 1], F(x) = (1 - e^{-x})/(1 - e^{-1}). Generating values of x uniformly in [0, 1] wastes trials where p(x) is small; ... do a more clever sampling ... acceptance-rejection is usually used in combination with "Importance Sampling".

Example of inefficiencies: a usual problem in Particle Physics. The cross-section involves
  sigma ~ (1/F) int (2 pi)^4 |iM_fi|^2 dPhi_n(p_1, ..., p_n | p):
 i) sample the phase-space variables from dPhi_n(p_1, ..., p_n | p);
 ii) acceptance-rejection on |iM_fi|^2  ->  a sample of events with the dynamics of the process;
 iii) estimate the cross-section from E[|iM_fi|^2].
"Naive" and simple generation of the n-body phase space: using the recurrence
  dPhi_n(p_1, ..., p_n | p) = dPhi_j(p_1, ..., p_j | q) dPhi_{n-j}(q, p_{j+1}, ..., p_n | p) (2 pi)^3 dm_q^2,
in the p rest frame, with S_k = sum_{i=k}^{n} m_i, m_{q_0} = M_p and m_{q_{n-1}} = m_n, the intermediate invariant masses are generated sequentially for the k = 1, ..., n-2 intermediate states:
  S_{k+1} <= m_{q_k} <= m_{q_{k-1}} - m_k,   m_{q_k} = S_{k+1} + u_k (m_{q_{k-1}} - S_k),   u_k ~ Un(0, 1);
the two-body factors dPhi_2(p_1, p_2 | p) ~ (|p_1|/m_p) dOmega_1 are elementary, and each event is then weighted with
  W_T ~ prod_{k=0}^{n-2} lambda^{1/2}(m_{q_k}, m_{n-k}, m_{q_{k+1}})/m_{q_k},   lambda(x, y, z) = [x^2 - (y + z)^2][x^2 - (y - z)^2],
plus Lorentz boosts to the overall centre-of-mass system ... but |iM_fi|^2, cuts, ...! make this
usually ("very") inefficient.
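A minimal sketch (assuming NumPy) of the n-dimensional sphere example, comparing the Monte Carlo acceptance with the exact ratio V(S_n)/V(H_n) and showing how fast the efficiency collapses with the dimension.

```python
import numpy as np
from math import pi, gamma

rng = np.random.default_rng(seed=4)
n_trials = 200_000

for n in (2, 3, 4, 5, 8, 10):
    x = rng.uniform(-1.0, 1.0, size=(n_trials, n))
    accepted = np.sum(np.sum(x * x, axis=1) <= 1.0)     # inner points of the unit sphere
    eff_mc = accepted / n_trials
    eff_exact = pi ** (n / 2) / (2 ** n * gamma(n / 2 + 1))   # V(S_n) / V(H_n)
    print(f"n = {n:2d}   MC efficiency {eff_mc:.4f}   exact {eff_exact:.4f}")
```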
Method 3: Importance Sampling
We want to sample X ~ p(x | theta) >= 0, x in Omega_X.
1) Express p(x | theta) = g(x | theta_1) h(x | theta_2) with
 i) g(x | theta_1) >= 0 and h(x | theta_2) >= 0 for x in Omega_X;
 ii) h(x | theta_2) a probability density (in particular, one chosen for easiness of sampling),
so that p(x | theta) dx = g(x | theta_1) h(x | theta_2) dx = g(x | theta_1) dH(x | theta_2).
2) Sample X ~ h(x | theta_2) and apply the acceptance-rejection algorithm to g(x | theta_1) >= 0.
How are the accepted values distributed? With Y ~ Un(y | 0, k), k >= max g:
  P(X <= x | Y <= g(X | theta_1)) = int^x h(x' | theta_2) g(x' | theta_1) dx' / int h(x' | theta_2) g(x' | theta_1) dx'
   = int^x p(x' | theta) dx' / int p(x' | theta) dx' = F(x | theta).
Algorithm
 i) sample x_i ~ h(x | theta_2);
 ii) apply acceptance-rejection to g(x | theta_1);
-> x_1, x_2, ..., x_n is a sample drawn from p(x | theta).
There are infinitely many manners to choose the decomposition g(x | theta_1) h(x | theta_2); h should be easy to invert so we can apply the Inverse Transform method. The easiest choice, h(x | theta_2) = 1_{Omega_X}(x)/|Omega_X|, ... is just the acceptance-rejection procedure. We would like to choose h(x | theta_2) such that, instead of sampling uniformly the whole domain Omega_X, we sample with higher probability from those regions where p(x | theta) is "more important": if h(x | theta_2) is as "close" as possible to p(x | theta), then g(x | theta_1) = p/h will be "as flat as possible" ("flatter" than p(x | theta)) and we shall have a higher efficiency when applying the acceptance-rejection algorithm.

Example: X ~ Be(x | a, b) with a = 2.3, b = 5.7; density p(x | a, b) = x^{a-1}(1-x)^{b-1} 1_[0,1](x).
1) We did the "standard" acceptance-rejection above.
2) "Importance Sampling": take m = [a] = 2 and n = [b] = 5, so a = m + delta_1 (delta_1 = 0.3) and b = n + delta_2 (delta_2 = 0.7):
  x^{a-1}(1-x)^{b-1} = [x^{delta_1}(1-x)^{delta_2}] [x^{m-1}(1-x)^{n-1}],  i.e.  dP(x | a, b) ~ [x^{delta_1}(1-x)^{delta_2}] dBe(x | m, n).
2.1) Generate X ~ Be(x | m, n): with U_k ~ Un(x | 0, 1), Y_m = -sum_{k=1}^{m} log U_k ~ Ga(x | 1, m) and Z = Y_m/(Y_m + Y_n) ~ Be(x | m, n).
2.2) Acceptance-rejection on g(x) = x^{delta_1}(1-x)^{delta_2}, concave on [0, 1] for 0 < delta_1, delta_2 < 1, with max_x g(x) = delta_1^{delta_1} delta_2^{delta_2}/(delta_1 + delta_2)^{delta_1 + delta_2}.
Results (1,000,000 accepted events from Be(2.3, 5.7)):
  acceptance-rejection, Un(x | 0, 1) covering:  n_gen = 2,588,191, e_f = 0.3864, estimate e_f max p = 0.016791 +- 0.000013;
  importance sampling, Be(x | 2, 5) proposal:   n_gen = 1,077,712, e_f = 0.9279, estimate e_f (max g) B(2, 5) = 0.0167912 +- 0.0000045;
  exact: B(2.3, 5.7) = 0.01678944.

Method 4: Decomposition (... trick)
Idea: decompose the pdf as a sum of simpler densities,
  p(x | theta) = sum_{i=1}^{m} a_i p_i(x | theta),   a_i > 0,   p_i(x | theta) >= 0.
Normalisation: int p(x | theta) dx = sum_i a_i int p_i(x | theta) dx = sum_i a_i = 1, so each density p_i(x | theta), i = 1, ..., m, has a relative weight a_i and we sample X ~ p_i(x | theta) with probability a_i ... thus, more often from those with larger weight.
Algorithm
 i) generate u_i ~ Un(u | 0, 1) to select p_i(x | theta) with probability a_i;
 ii) sample X ~ p_i(x | theta);
(repeat n times).
Note: sometimes the pdf p_i(x | theta) cannot be easily integrated (normalisation unknown). Writing p(x | theta) = sum_i a_i I_i f_i(x) with p_i(x) = f_i(x)/I_i, one can generate from the f_i(x), evaluate the normalisation integrals I_i during the generation process (numerical integration) and eventually assign a weight w_i ~ a_i I_i to each event.
Example: X ~ p(x) = (3/8)(1 + x^2), x in [-1, 1]. Normalising g_1(x) = 1 and g_2(x) = x^2:
  p(x) = (3/4) p_1(x) + (1/4) p_2(x),   p_1(x) = 1/2 (constant),   p_2(x) = (3/2) x^2.
Algorithm: i) generate u_i ~ Un(u | 0, 1); ii) if u_i <= 3/4 sample from p_1(x) = const (75% of the times); if u_i > 3/4 sample from p_2(x) ~ x^2 (25% of the times).
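A minimal sketch (assuming NumPy) of the decomposition example just above, sampling p(x) = (3/8)(1 + x^2) on [-1, 1] as a 75%/25% mixture; p_2 is sampled by inverse transform of F_2(x) = (x^3 + 1)/2.

```python
import numpy as np

rng = np.random.default_rng(seed=5)
n = 1_000_000

u = rng.uniform(size=n)
x = np.empty(n)

from_p1 = u <= 0.75                               # 75%: p1(x) = 1/2 on [-1, 1]
x[from_p1] = rng.uniform(-1.0, 1.0, size=from_p1.sum())

v = rng.uniform(size=(~from_p1).sum())            # 25%: p2(x) = (3/2) x^2 on [-1, 1]
x[~from_p1] = np.cbrt(2.0 * v - 1.0)              # F2(x) = (x^3 + 1)/2  ->  x = (2v - 1)^(1/3)

# quick check: for p(x) = (3/8)(1 + x^2), E[X^2] = 2/5
print("sample E[X^2] =", np.mean(x**2), " (exact 0.4)")
```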
Generalisation: extend the discrete decomposition sum_i a_i p_i(x | theta) to a continuous family and consider p(x | theta) as a marginal density:
  p(x | theta) = int p(x, y | theta) dy = int p(x | y, theta) p(y | theta) dy.
Algorithm: i) sample y_i ~ p(y | theta); ii) sample x_i ~ p(x | y_i, theta); (repeat n times).
This is the structure found in Bayesian analysis, p(theta | x) ~ p(x | theta) pi(theta), and in the folding of the experimental resolution:
  p_obs(y | theta, lambda) = int R(y | x, lambda) p_t(x | theta) dx,   i.e.   p(x, y | theta, lambda) = R(y | x, lambda) p_t(x | theta).

The Normal Distribution
Standardisation: if X ~ N(x | mu, sigma), p(x) = (1/(sigma sqrt(2 pi))) e^{-(x-mu)^2/(2 sigma^2)}, then Z = (X - mu)/sigma ~ N(z | 0, 1) with p(z) = (1/sqrt(2 pi)) e^{-z^2/2}, and X = mu + sigma Z; it is enough to sample the standard Normal. But...
 i) Inverse Transform: F(z) is not an elementary function (it is an entire function; the series expansion is convergent but the inversion is slow, ...).
 ii) Central Limit Theorem: Z = sum_{i=1}^{12} U_i - 6 with U_i ~ Un(u | 0, 1) has Z in [-6, 6], E[Z] = 0, V[Z] = 1 and Z ~ N(z | 0, 1) only approximately.
 iii) Acceptance-rejection: not efficient, although...
 iv) Importance sampling: easy to find good approximations.
Sometimes, going to higher dimensions helps: Inverse Transform in two dimensions (G. E. P. Box, M. E. Muller, 1958).
 i) Consider two independent random quantities X_1, X_2 ~ N(x | 0, 1):
  p(x_1, x_2) = (1/(2 pi)) e^{-(x_1^2 + x_2^2)/2};
 ii) polar transformation X_1 = R cos(Phi), X_2 = R sin(Phi), (X_1, X_2) -> (R, Phi) in [0, inf) x [0, 2 pi):
  p(r, phi) = (1/(2 pi)) r e^{-r^2/2} = p(r) p(phi)  ->  (R, Phi) independent;
 iii) marginal distributions F_r(r) = 1 - e^{-r^2/2} and F_phi(phi) = phi/(2 pi) ... trivial inversion.
Algorithm
 i) generate u_1, u_2 ~ Un(u | 0, 1);
 ii) invert F_r(r) and F_phi(phi): r = [-2 ln u_1]^{1/2}, phi = 2 pi u_2;
 iii) obtain x_1, x_2 ~ N(x | 0, 1) (independent) as
  x_1 = [-2 ln u_1]^{1/2} cos(2 pi u_2),   x_2 = [-2 ln u_1]^{1/2} sin(2 pi u_2).

Example: 2-dimensional Normal random quantity; sampling from conditional densities.
(X_1, X_2) ~ N(x_1, x_2 | mu_1, mu_2, sigma_1, sigma_2, rho). After the standardisation W_i = (X_i - mu_i)/sigma_i,
  p(w_1, w_2 | rho) = p(w_2 | w_1, rho) p(w_1) = N(w_2 | rho w_1, (1 - rho^2)^{1/2}) N(w_1 | 0, 1),
so Z_1 = W_1 ~ N(z_1 | 0, 1) and Z_2 = (W_2 - rho W_1)/(1 - rho^2)^{1/2} ~ N(z_2 | 0, 1) are independent.
 i) Generate z_1, z_2 ~ N(z | 0, 1);
 ii) x_1 = mu_1 + sigma_1 z_1,   x_2 = mu_2 + sigma_2 [rho z_1 + (1 - rho^2)^{1/2} z_2].
(... there are different ways.)

n-dimensional Normal density, X ~ N(x | mu, V). Factorisation theorem (Cholesky): for V (n x n, symmetric and positive definite) there is a unique lower-triangular matrix C, with positive diagonal elements, such that V = C C^T. If X = mu + C Y with Y ~ N(y | 0, I), then
  (x - mu)^T V^{-1} (x - mu) = [C^{-1}(x - mu)]^T [C^{-1}(x - mu)],
so X ~ N(x | mu, V). Since C is lower triangular (C_ij = 0 for j > i), its elements follow from
  C_i1 = V_i1 / V_11^{1/2}   (1 <= i <= n),
  C_jj = [V_jj - sum_{k=1}^{j-1} C_jk^2]^{1/2},
  C_ij = [V_ij - sum_{k=1}^{j-1} C_ik C_jk] / C_jj   (1 <= j < i <= n).
Algorithm
 0) determine the matrix C from V;
 i) generate n independent random quantities z_1, ..., z_n, each z_i ~ N(z | 0, 1);
 ii) get x_i = mu_i + sum_j C_ij z_j, i.e. x = mu + C z  ->  x ~ N(x | mu, V).
Example: 2-dimensional Normal random quantity,
  V = ( sigma_1^2, rho sigma_1 sigma_2 ; rho sigma_1 sigma_2, sigma_2^2 )
  ->  C_11 = sigma_1,  C_21 = rho sigma_2,  C_12 = 0,  C_22 = sigma_2 (1 - rho^2)^{1/2},
and x = mu + C z reproduces the conditional-density sampling above.
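A minimal sketch (assuming NumPy) of Box-Muller plus the Cholesky construction for a correlated 2-dimensional Normal; the values of mu, sigma_1, sigma_2 and rho are illustrative, not taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(seed=6)
n = 500_000

# Box-Muller: two independent N(0,1) from two uniforms
u1, u2 = rng.uniform(size=n), rng.uniform(size=n)
r = np.sqrt(-2.0 * np.log(u1))
z1, z2 = r * np.cos(2 * np.pi * u2), r * np.sin(2 * np.pi * u2)

# Correlated 2-d Normal via the Cholesky factor of V (illustrative parameters)
mu = np.array([1.0, -2.0])
s1, s2, rho = 2.0, 1.0, 0.6
V = np.array([[s1**2, rho * s1 * s2],
              [rho * s1 * s2, s2**2]])
C = np.linalg.cholesky(V)                 # lower triangular, V = C C^T
z = np.vstack([z1, z2])                   # (2, n) standard normals
x = mu[:, None] + C @ z
print("sample mean      :", x.mean(axis=1))   # ~ mu
print("sample covariance:\n", np.cov(x))      # ~ V
```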
Example: interaction of photons with matter (Compton scattering).
Photon interaction processes: Photoelectric, Rayleigh (Thomson), Compton, e+e- pair production, photonuclear absorption. [Radiography illustration: photon beam, photographic film, iron and carbon absorbers.]
1) What kind of process? The cross-section sigma ("probability of interaction" with an atom, expressed in cm^2) is sigma_t = sigma_Compton + sigma_pairs + ..., so p_Compton = sigma_Compton/sigma_t, p_pairs = sigma_pairs/sigma_t, ... Sample u ~ Un(u | 0, 1) and select the process from the cumulative fractions F_1 = p_Compton, F_2 = p_Compton + p_pairs, ..., F_n = 1: choose the process i for which F_{i-1} < u <= F_i.
2) Where do we have the first/next interaction? In a thin slab of surface S and thickness delta (volume S delta cm^3, density rho g cm^-3) there are (rho S delta / A) N_A atoms, so the "interaction surface" covered by the atoms is S_eff = sigma (rho S delta / A) N_A cm^2 and the probability to interact with an atom in the thin slab is
  p = P_I(delta) = S_eff / S = sigma rho N_A delta / A.
A thickness x contains m = x/delta slabs, so the probability of the first interaction occurring after a depth x is
  P_one_I(x) = P_no_I(x) P_I(delta) = (1 - p)^m p ~ e^{-mp} p = e^{-x/lambda} dx/lambda,   i.e.   p(x | lambda) = (1/lambda) e^{-x/lambda},  x in [0, inf),
with the "mean free path" lambda = E[x] = A/(rho N_A sigma_t) cm. Generate the distance until the next interaction by the Inverse Transform: F_int(x | lambda) = 1 - e^{-x/lambda}, u ~ Un(u | 0, 1) -> x = -lambda ln u cm.
In this example we shall simulate only the Compton process, so the "mean free path" is taken for the Compton interaction only: lambda(E) = A/(rho N_A Z sigma_0(E)) cm, with sigma_atom(E) ~ Z sigma_0(E) and sigma_0(E) the Compton cross-section per free electron (Klein-Nishina),
  sigma_0(E) = sigma_Thomson (3/4) { [(1+a)/a^2] [2(1+a)/(1+2a) - ln(1+2a)/a] + ln(1+2a)/(2a) - (1+3a)/(1+2a)^2 },   a = E/m_e.
3) Compton interaction: easy, 2 variables (8 - 2 - 4), the polar angle theta and E_out. With a = E_in/m_e,
  E_out = E_in / [1 + a(1 - cos theta)],   E_in/(1 + 2a) = min E_out <= E_out <= max E_out = E_in.
From the perturbative expansion in Relativistic Quantum Mechanics, with x = cos theta in [-1, 1],
  dsigma/dx = (3/8) sigma_Thomson f(x),   f(x) = [1/(1 + a(1 - x))^2] [1 + x^2 + a^2(1 - x)^2/(1 + a(1 - x))],
  sigma_Thomson = 0.665 barn = 0.665 x 10^-24 cm^2.
3.1) Generate the polar angle of the outgoing photon, x = cos theta in [-1, 1], with density ~ f(x): complicated to apply the inverse transform, and straightforward acceptance-rejection is very inefficient... Use Decomposition + Inverse Transform + Acceptance-Rejection. With f_n(x) = 1/(1 + a(1 - x))^n and b = 1 + 2a, write f(x) = [sum_n w_n p_n(x)] g(x), where
 - the p_n(x) = f_n(x)/w_n are normalised densities, with weights w_n = int_{-1}^{1} f_n(x) dx easy to integrate (e.g. w_1 = ln(b)/a) and easy to invert, so each p_n(x) is sampled by Inverse Transform (e.g. F_1(x) = 1 - ln(1 + a(1 - x))/ln b gives x_g = (1 + a - b^u)/a with u ~ Un(0, 1), and similarly for the other terms);
 - the remaining factor g(x) >= 0 is "fairly flat" and is handled by acceptance-rejection.
Thus: 1) select the component n with probability w_n/w_T (w_T = sum_n w_n) and sample x_g ~ p_n(x) by inverse transform; 2) sample u ~ Un(u | 0, g_M) with g_M = max_x g(x) and accept x_g if u <= g(x_g); otherwise reject it and go back to 1)  ->  x_g ~ f(x).
2-body kinematics: once x_g = cos theta_gamma is accepted, generate the azimuthal angle phi_gamma ~ Un(phi | 0, 2 pi); the outgoing photon energy is E_out = E_in/[1 + a(1 - x_g)], the electron carries the rest (E_e^out = E_in - E_out plus its rest mass), and
  cot theta_e = (1 + a) tan(theta_gamma/2),
with all angles referred to the direction of incidence of the photon (!!), followed by a rotation to the general reference frame.
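A heavily simplified sketch (assuming NumPy) of the Compton step above: it samples the scattering angle by plain acceptance-rejection on the Klein-Nishina f(x) (instead of the more efficient decomposition scheme described on the slides), and draws the azimuth and an exponential free path with an illustrative, energy-independent mean free path of 5 cm; no geometry or material data are modelled.

```python
import numpy as np

rng = np.random.default_rng(seed=7)
ME = 0.511                      # electron mass in MeV
E_IN = 1.0                      # incident photon energy in MeV (as in the slides)
a = E_IN / ME

def f(x):                       # Klein-Nishina angular density, x = cos(theta), up to constants
    k = 1.0 / (1.0 + a * (1.0 - x))
    return k * k * (1.0 + x * x + a * a * (1.0 - x) ** 2 * k)

f_max = f(1.0)                  # the maximum is at x = 1 (forward scattering)

def sample_cos_theta():
    while True:                 # plain acceptance-rejection (inefficient but simple)
        x = rng.uniform(-1.0, 1.0)
        if rng.uniform(0.0, f_max) <= f(x):
            return x

n = 20_000
cos_th = np.array([sample_cos_theta() for _ in range(n)])
e_out = E_IN / (1.0 + a * (1.0 - cos_th))         # outgoing photon energy
phi = rng.uniform(0.0, 2.0 * np.pi, size=n)       # azimuthal angle, uniform
step = -5.0 * np.log(rng.uniform(size=n))         # free path in cm, lambda = 5 cm (illustrative)
print("mean E_out/E_in:", e_out.mean() / E_IN, "  mean free path:", step.mean())
```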
[Results: E_out/E_in distribution for 100,000 generated photons with E_in = 1 MeV, and the trajectory of one photon, shown for C (Z = 6, A = 12, rho = 2.26 g cm^-3) and Fe (Z = 26, A = 55.85, rho = 7.87 g cm^-3).]

END OF FIRST PART
To come: Markov Chain Monte Carlo (Metropolis, Hastings, Gibbs, ...). Examples: Path Integrals in Quantum Mechanics; Bayesian Inference.

Method 5: Markov Chain Monte Carlo
Example: we want a sampling of X ~ Bi(k | n, p), P(X = k | n, p) = C(n, k) p^k (1 - p)^{n-k}, k = 0, 1, ..., n, with n = 10 and p = 0.45 (n_bins = 11 values k = 0, 1, ..., 10). The target is the probability vector pi = (pi_1, pi_2, ..., pi_11), pi_i >= 0, sum_i pi_i = 1, with pi_k = P(X = k).
Step 0) Generate N = 100,000 events distributed over the 11 bins according to UnD(k | 1, ..., 11), i.e. P(X = k) = 1/11; the bin contents d_1, d_2, ..., d_11 (sum_i d_i = N) define the sampling probability vector pi^(0) = (d_1/N, d_2/N, ..., d_11/N).
First step done: we already have the N = 100,000 events. But we want them distributed as P(X = k).
Second step: redistribute all N generated events, moving each event from its present bin to a different one (or the same one), to get eventually a sampling from P(X = k). HOW?
In one step: an event in bin i goes to bin j with probability P(X = j | n, p)... but this is equivalent to sampling directly from Bi(k | n, p).
Sequentially: at every step of the evolution we move all the events from the bin where they are to a new bin (which may be the same) with a migration probability (P)_ij = P(i -> j) = a(j | i):

  step:                 (0)      (1)      ...  (i)      (i+1)    ...
  populations:          d^(0)    d^(1)    ...  d^(i)    d^(i+1)  ...
  probability vectors:  pi^(0)   pi^(1)   ...  pi^(i)   pi^(i+1) ...

Transitions among the states of the system are driven by the transition matrix P (n x n), (P)_ij = P(i -> j) = a(j | i):
  pi^(1) = pi^(0) P,   pi^(2) = pi^(1) P = pi^(0) P^2,   ...,   pi^(k) = pi^(k-1) P = pi^(0) P^k.
Goal: find a transition matrix P that takes us from an arbitrary pi^(0) to the desired pi = (pi_1, pi_2, ..., pi_N). Since pi^(i+1) depends on pi^(i) and not on the pi^(j<i), this is a Markov Chain with transition matrix P. The transition matrix is a probability matrix: sum_{j=1}^{n} p_ij = sum_{j=1}^{n} P(i -> j) = 1 (the probability to go from state i to whatever state, including itself, is 1).
Reminder: if the Markov Chain is
 - irreducible: all the states of the system communicate among themselves, and
 - ergodic, that is, the states of the system are recurrent (being at one state, we shall return to it with probability 1), positive (in a finite expected number of steps) and aperiodic (the system is not trapped in cycles),
then
 i) there is a unique stationary distribution pi with pi = pi P (unique fixed vector);
 ii) starting at any arbitrary state pi^(0), the sequence pi^(0), pi^(1) = pi^(0) P, ..., pi^(n) = pi^(n-1) P = pi^(0) P^n tends asymptotically to the fixed vector pi = pi P;
 iii) lim_{n -> inf} P^n is the matrix whose rows are all equal to pi = (pi_1, pi_2, ..., pi_N).
There are many ways to choose the Probability Transition Matrix.
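A small numerical illustration (assuming NumPy) of properties i)-iii); the 3-state transition matrix is arbitrary and irreducible/aperiodic, its entries are illustrative and not taken from the slides.

```python
import numpy as np

# Arbitrary irreducible, aperiodic transition matrix (rows sum to 1); illustrative values.
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.1, 0.4, 0.5]])

pi0 = np.array([1.0, 0.0, 0.0])          # arbitrary starting probability vector

# ii) pi^(k) = pi^(0) P^k tends to the fixed vector
pik = pi0.copy()
for _ in range(50):
    pik = pik @ P
print("pi^(50)      :", pik)

# i) the unique stationary vector: left eigenvector of P with eigenvalue 1
vals, vecs = np.linalg.eig(P.T)
pi = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
pi /= pi.sum()
print("fixed vector :", pi, "  check pi P :", pi @ P)

# iii) all rows of P^n converge to the fixed vector
print("P^50 rows    :\n", np.linalg.matrix_power(P, 50))
```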
A sufficient (but not necessary) condition is that the Detailed Balance relation
  pi_i (P)_ij = pi_j (P)_ji
is satisfied. Why? It assures that pi is a fixed vector of P:
  pi P = (sum_i pi_i P_i1, sum_i pi_i P_i2, ..., sum_i pi_i P_iN),  and if pi_i (P)_ij = pi_j (P)_ji then sum_i pi_i P_ik = sum_i pi_k P_ki = pi_k for k = 1, 2, ..., N,
so pi P = pi.

Procedure to follow. After specifying (P)_ij = P(i -> j) = a_ij according to the Detailed Balance condition, at each step (i) -> (i+1), for all the 11 bins:
 i) for each event at bin i = 1, ..., 11, choose a candidate bin j to go to among the 11 possible ones as UnD(k | 1, ..., 11);
 ii) accept the migration i -> j with probability a_ij; keep the event at bin i (migration i -> i) with probability 1 - a_ij.
This takes the state of the system from pi^(i) (populations d^(i)) to pi^(i+1) (populations d^(i+1)).
How do we take a_ij = a(j | i) so that the Detailed Balance condition holds and lim_i pi^(i) = pi = (p_0, p_1, ..., p_10)? Take a(j | i) = min(1, pi_j/pi_i):
 - if pi_j < pi_i:  a(j | i) = pi_j/pi_i and a(i | j) = 1;
 - if pi_j >= pi_i: a(j | i) = 1 and a(i | j) = pi_i/pi_j;
in both cases pi_i a(j | i) = pi_j a(i | j), so pi_i (P)_ij = pi_j (P)_ji and lim_i pi^(i) = pi.
For instance, at step t, for an event in bin i = 7 choose a bin j to go to as J ~ UnD(j | 1, ..., 11):
 - if j = 2: a_72 = a(7 -> 2) = min(1, P(X = 1)/P(X = 6)) = 0.026; draw u ~ Un(u | 0, 1): if u <= 0.026 move the event to bin 2, if u > 0.026 leave the event in bin 7;
 - if j = 6: a_76 = a(7 -> 6) = min(1, P(X = 5)/P(X = 6)) = min(1, 1.47) = 1; move the event to bin 6.
After 20 steps... the populations follow X ~ Bi(k | n = 10, p = 0.45): E[X] = np = 4.5, V[X] = np(1 - p) = 2.475.
Convergence? Monitor for instance the Kullback-Leibler divergence D_KL[p | p~] = sum_k p_k log(p_k/p~_k); watch for trends, correlations, "good mixing", ...

There is still freedom to choose the Transition Matrix P; this is the basis for Markov Chain Monte Carlo simulation:
  (P)_ij = q(j | i) a_ij,
with q(j | i) the probability to choose a new possible bin j for an event that is at bin i (simple), and a_ij the probability to accept the proposed new bin j for an event at bin i, taken in such a way that the Detailed Balance condition pi_i (P)_ij = pi_j (P)_ji is satisfied.
Metropolis-Hastings algorithm:
  a_ij = min{1, [pi_j q(i | j)] / [pi_i q(j | i)]}.
Indeed, if pi_j q(i | j) < pi_i q(j | i) then a_ij = pi_j q(i | j)/[pi_i q(j | i)] and a_ji = 1, so
  pi_i (P)_ij = pi_i q(j | i) a_ij = pi_j q(i | j) = pi_j q(i | j) a_ji = pi_j (P)_ji
(and analogously in the opposite case). One can take q(j | i) != q(i | j) (not symmetric), even q(j | i) = q(j), but the closer the proposal is to the desired distribution, the higher the acceptance probability. If the proposal is symmetric, q(j | i) = q(i | j), then a_ij = min(1, pi_j/pi_i): the Metropolis algorithm.
For absolutely continuous distributions, with probability density function X ~ pi(x | theta):
  p(x -> x') = q(x' | x) a(x -> x'),   a(x -> x') = min{1, [pi(x' | theta) q(x | x')] / [pi(x | theta) q(x' | x)]},   x, x' in Omega_X.
Example: target X ~ Be(x | 4, 2), pi(s) ~ s^3(1 - s); at step 0 start from X ~ Un(x | 0, 1).
  Metropolis, q(s' | s) = Un(s' | 0, 1) (symmetric):  a(s -> s') = min{1, s'^3(1 - s')/[s^3(1 - s)]};
  Metropolis-Hastings, q(s' | s) = 2s' (not symmetric):  a(s -> s') = min{1, [s'^3(1 - s') 2s]/[s^3(1 - s) 2s']} = min{1, s'^2(1 - s')/[s^2(1 - s)]}.
[Evolution of the sampling after step 1 and after 20 steps shown.]
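A minimal sketch (assuming NumPy) of the Metropolis variant of the Be(4, 2) example above, with the uniform (symmetric) proposal; the burn-in length is an illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(seed=9)

def target(s):                        # unnormalised Be(s | 4, 2) density
    return s**3 * (1.0 - s)

n_steps, s = 200_000, 0.5             # arbitrary admissible starting point
chain = np.empty(n_steps)
for t in range(n_steps):
    s_new = rng.uniform()             # Metropolis: uniform (symmetric) proposal
    if rng.uniform() <= min(1.0, target(s_new) / target(s)):
        s = s_new
    chain[t] = s

sample = chain[1000:]                 # discard some initial steps ("burn-in")
print("mean", sample.mean(), " (exact 4/6 = 0.667)")
print("var ", sample.var(),  " (exact", 4 * 2 / (36 * 7), ")")
```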
In practice we do not proceed as in the previous examples (meant for illustration): once equilibrium is reached, different steps provide different samplings from the "same" distribution.
1) At step t = 0, choose an admissible arbitrary state x^(0), pi(x^(0) | theta) > 0; set t -> t + 1.
2) Generate a proposed new value x' from the distribution q(x' | x^(t-1), ...).
3) Accept the new value x' with probability
  Metropolis (symmetric proposal, q(x' | x^(t-1), ...) = q(x^(t-1) | x', ...)):   a(x', x^(t-1)) = min{1, pi(x' | theta)/pi(x^(t-1) | theta)};
  Metropolis-Hastings (not symmetric):   a(x', x^(t-1)) = min{1, [pi(x' | theta) q(x^(t-1) | x', ...)] / [pi(x^(t-1) | theta) q(x' | x^(t-1), ...)]}.
4) If accepted, set x^(t) = x'; otherwise set x^(t) = x^(t-1). Go back to 2).
Notes: 1) some initial steps are needed to reach "equilibrium" (stable running conditions); 2) for samplings, take draws every few steps to reduce the correlations (if important).
Properties:
 - "Easy" and powerful: virtually any density, regardless of the number of dimensions and the analytic complexity.
 - Changes of configuration depend on the ratio r = [pi(s') q(s | s')]/[pi(s) q(s' | s)], so the normalisation is not relevant.
 - The sampling is "correct" only asymptotically: previous sweeps ("burn-in", "thermalisation") are needed to approach the asymptotic limit.
 - There are correlations among the different states; if this is an issue, take additional sweeps between draws.
... nevertheless, in many circumstances it is the only practical solution...

And last...
1) Some ("many") times we do not have the explicit expression of the pdf, but we know the conditional densities.
2) Usually, the conditional densities have simpler expressions.
3) Bayesian structure: p(theta, phi | x) ~ p(x | theta, phi) p(theta | phi) p(phi) (and in particular hierarchical models).
... sampling from conditional densities: Method 5 (continued), Markov Chain Monte Carlo with Gibbs Sampling.

EXAMPLE: sampling from conditionals. Sampling of X ~ St(x | nu):
  p(x | nu) = N (1 + x^2/nu)^{-(nu+1)/2},   F(x | nu) = 1/2 + x [Gamma((nu+1)/2)/(sqrt(pi nu) Gamma(nu/2))] 2F1(1/2, (nu+1)/2; 3/2; -x^2/nu),
which is not easy to invert.
0) Consider that int_0^inf u^{b-1} e^{-au} du = Gamma(b)/a^b.
1) Introduce a new random quantity (an extra dimension) U ~ Ga(u | a, b) with a = 1 + x^2/nu and b = (nu+1)/2, and consider the joint density
  p(x, u | nu) = N u^{b-1} e^{-au} / Gamma(b).
Marginals: p(x | nu) = int_0^inf p(x, u | nu) du = N/a^b ~ St(x | nu) (as desired), and p(u | nu) = int p(x, u | nu) dx ~ u^{b-3/2} e^{-u} ~ Ga(u | 1, b - 1/2).
Conditionals: p(x | u, nu) ~ e^{-u x^2/nu} = N(x | 0, sigma^2 = nu/(2u)) and p(u | x, nu) = Ga(u | a, b).
Sampling: 1) start at t = 0 with some x_0 in R; 2) at step t sample
  U | X ~ Ga(u | a, b) with a = 1 + x_{t-1}^2/nu, b = (nu+1)/2,   and then   X | U ~ N(x | 0, sigma^2 = nu/(2u)).
Example: X ~ St(x | nu = 5): E[X] = 0, E[X^2] = nu/(nu - 2) = 1.667.
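A minimal sketch (assuming NumPy) of the Student-t Gibbs sampler just described, alternating the two full conditionals for nu = 5; burn-in length is an illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(seed=10)
nu = 5.0
b = (nu + 1.0) / 2.0

n_steps, x = 200_000, 0.0
chain = np.empty(n_steps)
for t in range(n_steps):
    a = 1.0 + x * x / nu
    u = rng.gamma(shape=b, scale=1.0 / a)          # U | X ~ Ga(u | a, b), density ~ u^(b-1) e^(-a u)
    x = rng.normal(0.0, np.sqrt(nu / (2.0 * u)))   # X | U ~ N(0, nu/(2u))
    chain[t] = x

sample = chain[1000:]                              # drop a short burn-in
print("E[X]   ~", sample.mean(), " (exact 0)")
print("E[X^2] ~", np.mean(sample**2), " (exact nu/(nu-2) =", nu / (nu - 2.0), ")")
```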
Yet another EXAMPLE: trivial marginals for the Behrens-Fisher problem (lecture 2). The posterior density of W = mu_1 - mu_2 is
  p(w | x_1, x_2) ~ int [s_1^2 + (x1bar - w - u)^2]^{-(n_1 - 1/2)} [s_2^2 + (x2bar - u)^2]^{-(n_2 - 1/2)} du.
However, p(mu_1, mu_2 | x_1, x_2) = p(mu_1 | x_1) p(mu_2 | x_2) with
  T_i = (mu_i - xibar) (n_i - 1)^{1/2}/s_i ~ St(t_i | n_i - 1),   i = 1, 2;
so, instead of sampling from p(w | x_1, x_2):
Algorithm (repeat n times): 1) t_i ~ St(t_i | n_i - 1); 2) mu_i = xibar + t_i s_i (n_i - 1)^{-1/2} (i = 1, 2) and w = mu_1 - mu_2.

Basic idea: we want a sampling of X = (X_1, X_2, ..., X_n) ~ p(x_1, x_2, ..., x_n). With s_i = {x_1, ..., x_{i-1}, x_{i+1}, ..., x_n},
  marginal densities: p(s_i) = int p(x_1, ..., x_n) dx_i;
  conditional densities: p(x_i | s_i) = p(x_1, ..., x_n)/p(s_i);
and we sample the n-dimensional random quantity X from the conditional distributions
  p(x_1 | x_2, x_3, ..., x_n),  p(x_2 | x_1, x_3, ..., x_n),  ...,  p(x_n | x_1, x_2, ..., x_{n-1}).
First, sampling from conditional densities within Metropolis-Hastings, p(x -> x') = q(x' | x) a(x -> x'). Consider:
 1) the probability density function p(x_1, x_2, ..., x_n) we want to sample;
 2) the conditional (proposal) densities q(x_1 | x_2, x_3, ..., x_n), q(x_2 | x_1, x_3, ..., x_n), ..., q(x_n | x_1, x_2, ..., x_{n-1});
 3) an arbitrary initial state (x_1(0), x_2(0), ..., x_n(0)) in Omega_X (usually fewer than n are needed);
 4) the sampling of the X_k, k = 1, ..., n, at step t-1 -> t: generate a possible new value x_k(t) from the conditional density
  q(x_k | x_1(t), x_2(t), ..., x_{k-1}(t), x_{k+1}(t-1), ..., x_n(t-1)),
i.e. propose a change of the system from the state
  s = (x_1(t), ..., x_{k-1}(t), x_k(t-1), x_{k+1}(t-1), ..., x_n(t-1))
to the state
  s' = (x_1(t), ..., x_{k-1}(t), x_k(t), x_{k+1}(t-1), ..., x_n(t-1)).
Metropolis-Hastings: with the desired density p(x_1, x_2, ..., x_n), the acceptance factor is
  rho = [ p(x_1(t), ..., x_{k-1}(t), x_k(t), x_{k+1}(t-1), ..., x_n(t-1)) q(x_k(t-1) | x_1(t), ..., x_{k-1}(t), x_{k+1}(t-1), ..., x_n(t-1)) ]
      / [ p(x_1(t), ..., x_{k-1}(t), x_k(t-1), x_{k+1}(t-1), ..., x_n(t-1)) q(x_k(t) | x_1(t), ..., x_{k-1}(t), x_{k+1}(t-1), ..., x_n(t-1)) ],
so we shall accept the change with probability a(s -> s') = min(1, rho). At the end of step t the sequences {x_1, x_2, ..., x_n} = (x_1(t), x_2(t), ..., x_n(t)) converge towards the stationary p.d.f. p(x_1, x_2, ..., x_n) regardless of the starting sampling (x_1(0), x_2(0), ..., x_n(0)).
In particular... the Gibbs algorithm: take as conditional densities, to generate the proposed values, the desired conditional densities,
  q(x_k | x_1(t), ..., x_{k-1}(t), x_{k+1}(t-1), ..., x_n(t-1)) = p(x_k | x_1(t), ..., x_{k-1}(t), x_{k+1}(t-1), ..., x_n(t-1)).
Then, since p(x_k | s_k) = p(x_1, ..., x_n)/p(s_k), the acceptance factor becomes rho = 1 and we accept the "proposed" change with probability a(s -> s') = 1. After enough steps to erase the effect of the initial values in the samplings and to achieve a good approximation to the asymptotic limit, we shall have sequences
  (x_1^(1), x_2^(1), ..., x_n^(1)), ..., (x_1^(m), x_2^(m), ..., x_n^(m)) ~ p(x_1, x_2, ..., x_n).

Example: X = (X_1, ..., X_n) ~ Di(x | alpha) (Dirichlet; conjugate prior for the Multinomial):
  p(x | alpha) ~ x_1^{alpha_1 - 1} x_2^{alpha_2 - 1} ... x_n^{alpha_n - 1},   x_k in [0, 1],  sum_k x_k = 1,  alpha_0 = sum_k alpha_k.
Conditionals: with s_k = {x_1, ..., x_{k-1}, x_{k+1}, ..., x_{n-1}} and S_k = sum_{j != k, j < n} x_j,
  p(x_k | s_k, alpha) ~ x_k^{alpha_k - 1} (1 - S_k - x_k)^{alpha_n - 1}   ->   x_k = z_k (1 - S_k) with z_k ~ Be(z | alpha_k, alpha_n).
[Example: n = 5, alpha = (1, 2, 3, 4, 5), starting point x(0) = (0.211, 0.273, 0.262, 0.101, 0.152) with sum_k x_k(0) = 1.]

Example: Dirichlet and Generalised Dirichlet (conjugate priors for the Multinomial).
Dirichlet, X = (X_1, ..., X_n) ~ Di(x | alpha):
  p(x | alpha) = D(alpha) x_1^{alpha_1 - 1} x_2^{alpha_2 - 1} ... x_n^{alpha_n - 1},   D(alpha) = Gamma(alpha_0)/prod_{k=1}^{n} Gamma(alpha_k),   alpha_0 = sum_k alpha_k,   x_k in [0, 1],  sum_k x_k = 1.
Degenerate distribution: x_n = 1 - sum_{k=1}^{n-1} x_k, so
  p(x | alpha) = D(alpha) [prod_{i=1}^{n-1} x_i^{alpha_i - 1}] (1 - sum_{k=1}^{n-1} x_k)^{alpha_n - 1}.
Moments: E[X_i] = alpha_i/alpha_0,   V[X_i, X_j] = (delta_ij alpha_i alpha_0 - alpha_i alpha_j)/[alpha_0^2 (alpha_0 + 1)].
Sampling: Z_k ~ Ga(1, alpha_k), k = 1, ..., n  ->  X_j = Z_j / sum_{k=1}^{n} Z_k gives X ~ Di(x | alpha).
Generalised Dirichlet, X ~ GDi(x | alpha, beta):
  p(x | alpha, beta) = prod_{i=1}^{n-1} [Gamma(alpha_i + beta_i)/(Gamma(alpha_i) Gamma(beta_i))] x_i^{alpha_i - 1} (1 - sum_{k=1}^{i} x_k)^{gamma_i},
with gamma_i = beta_i - alpha_{i+1} - beta_{i+1} for i = 1, ..., n-2 and gamma_{n-1} = beta_{n-1} - 1; x_i in [0, 1], x_n = 1 - sum_{k=1}^{n-1} x_k. The Dirichlet Di(x | alpha) is recovered when beta_i = alpha_{i+1} + beta_{i+1} (i = 1, ..., n-2) and beta_{n-1} = alpha_n.
Moments: E[X_i] = [alpha_i/(alpha_i + beta_i)] prod_{k=1}^{i-1} beta_k/(alpha_k + beta_k), and V[X_i] = E[X_i](T_i - E[X_i]) with T_i = [(alpha_i + 1)/(alpha_i + beta_i + 1)] prod_{k=1}^{i-1} (beta_k + 1)/(alpha_k + beta_k + 1).
Sampling: with Z_k ~ Be(alpha_k, beta_k), k = 1, ..., n-1:
  X_1 = Z_1,   X_2 | X_1 = Z_2 (1 - X_1),   X_3 | X_1, X_2 = Z_3 (1 - X_1 - X_2),   ...,   X_k = Z_k (1 - sum_{j=1}^{k-1} X_j),   X_n = 1 - sum_{j=1}^{n-1} X_j
  ->  X = (X_1, ..., X_n) ~ GDi(x | alpha, beta).
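A minimal sketch (assuming NumPy) of the direct Dirichlet sampler above (normalised Gammas), using the slides' example alpha = (1, 2, 3, 4, 5) and checking E[X_i] = alpha_i/alpha_0.

```python
import numpy as np

rng = np.random.default_rng(seed=11)
alpha = np.array([1.0, 2.0, 3.0, 4.0, 5.0])        # the slides' example, n = 5
n_samples = 200_000

# Z_k ~ Ga(1, alpha_k) (unit scale), X_j = Z_j / sum_k Z_k  ->  X ~ Di(x | alpha)
z = rng.gamma(shape=alpha, scale=1.0, size=(n_samples, alpha.size))
x = z / z.sum(axis=1, keepdims=True)

print("sample means:", x.mean(axis=0))
print("exact  means:", alpha / alpha.sum())        # alpha_i / alpha_0
```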
Quantum Mechanics: Path Integral formulation of Q.M. (R. P. Feynman)
Probability amplitude / propagator, summing over all paths x(t) from (x_i, t_i) to (x_f, t_f):
  K(x_f, t_f | x_i, t_i) = int D[x(t)] e^{(i/hbar) S[x(t)]},   S[x(t)] = int_{t_i}^{t_f} L(x, xdot; t) dt,
  psi(x_f, t_f) = int K(x_f, t_f | x_i, t_i) psi(x_i, t_i) dx_i   (t_f > t_i);
for local Lagrangians (additive actions), Chapman-Kolmogorov:
  K(x_f, t_f | x_i, t_i) = int K(x_f, t_f | x, t) K(x, t | x_i, t_i) dx.
Feynman-Kac expansion theorem:
  K(x_f, t_f | x_i, t_i) = sum_n e^{-(i/hbar) E_n (t_f - t_i)} phi_n(x_f) phi_n*(x_i).
Expected value of an operator A[x]:
  <A> = int A[x(t)] e^{(i/hbar) S[x(t)]} D[x(t)] / int e^{(i/hbar) S[x(t)]} D[x(t)].
One-dimensional particle: L(x, xdot; t) = (1/2) m xdot(t)^2 - V(x(t)).
Wick rotation t -> -i tau: the weight becomes real,
  (i/hbar) S[x(t)]  ->  -(1/hbar) S_E[x(tau)],   S_E[x(tau)] = int [ (1/2) m xdot(tau)^2 + V(x(tau)) ] dtau,
and the Feynman-Kac expansion reads K(x_f, tau_f | x_i, tau_i) = sum_n e^{-E_n (tau_f - tau_i)} phi_n(x_f) phi_n*(x_i). Taking x_f = x_i = 0 and tau_i = 0,
  K(0, tau_f | 0, 0) = sum_n e^{-E_n tau_f} |phi_n(0)|^2  ->  e^{-E_1 tau_f} |phi_1(0)|^2 for large tau_f,
which isolates the fundamental state.
Discretised time: t_n = t_in + n eps, n = 0, ..., N, t_fin = t_in + N eps; D[x(t)] -> A_N prod_{j=1}^{N-1} dx_j and
  S_N(x_0, x_1, ..., x_N) = sum_{j=1}^{N} [ (m/2) ((x_j - x_{j-1})/eps)^2 + V(x_j) ] eps,
  K(x_f, t_f | x_i, t_i) = A_N int dx_1 ... dx_{N-1} exp{-S_N(x_0, x_1, ..., x_N)},
  <A> = int prod_j dx_j A(x_0, x_1, ..., x_N) exp{-S_N(x_0, ..., x_N)} / int prod_j dx_j exp{-S_N(x_0, ..., x_N)}.
Goal: generate N_tray trajectories (x_0, x_1^(k), ..., x_{N-1}^(k), x_N) distributed as p(x_0, x_1, ..., x_N) ~ exp{-S_N(x_0, x_1, ..., x_N)} (importance sampling) and estimate
  <A> ~ (1/N_tray) sum_{k=1}^{N_tray} A(x_0, x_1^(k), ..., x_{N-1}^(k), x_N).
Very complicated... Markov Chain Monte Carlo.

Harmonic potential V(x(t)) = (1/2) k x(t)^2:
  S_N(x_0, ..., x_N) = sum_{j=1}^{N} [ (m/2) ((x_j - x_{j-1})/eps)^2 + (k/2) x_j^2 ] eps.
Parameters: x_0 = x(t_in) = 0 and x_N = x(t_fin) = 0 (to isolate the fundamental state); x_j in R (in practice x_j in [-10, 10]), j = 1, ..., N-1; eps = 0.25 (t continuous in the limit eps -> 0); N = 2000. Thermalisation and correlations: N_term = 1000, N_tray = 3000, N_util = 1000.
Generation of trajectories:
 1) generate an initial trajectory (x_0, x_1^(0), ..., x_{N-1}^(0), x_N);
 2) sweep over x_1, ..., x_{N-1}: propose x'_j ~ Un(x | -10, 10) and, since P(x_j -> x'_j) ~ exp{-S_N(x_0, x_1, ..., x'_j, ..., x_N)}, accept it with probability
  a(x_j -> x'_j) = min{1, exp[S_N(..., x_{j-1}, x_j, x_{j+1}, ...) - S_N(..., x_{j-1}, x'_j, x_{j+1}, ...)]}
 (only the local terms containing x_j contribute to the difference);
 3) repeat 2) N_term = 1000 times (thermalisation);
 4) repeat 2) N_tray = 3000 times and take one trajectory out of 3 -> N_util = 1000 trajectories (x_0 = 0, x_1, ..., x_{N-1}, x_N = 0).
Virial theorem: <T> = (1/2)<x V'(x)>, so E = <(1/2) x V'(x) + V(x)>.
Harmonic potential, V(x) = (1/2) k x^2: E = k <x^2>. With k = 1 and eps = 0.25 the estimate is E_0 = <x^2>_0 = 0.486, to be compared with the exact E_0 = 0.5.
X^4 potential, V(x(t)) = (x(t)^2 - a^2/4)^2: with eps = 0.25 and N = 9000, the Virial-theorem estimate of the ground-state energy for a = 0 (pure quartic) is E_0 = 0.697, to be compared with the exact E_0 = 0.668. [Results for non-zero a (double well) also shown.]
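A compact sketch (assuming NumPy) of the harmonic-oscillator path-integral Monte Carlo above. Two deviations from the slides, for speed: the lattice and sweep counts are reduced (the slides use N = 2000, 1000 + 3000 sweeps), and the proposal is a local step of width 1 instead of the global Un(-10, 10) draw; the fixed ends x_0 = x_N = 0 and the Virial estimate E_0 = k <x^2> follow the slides.

```python
import numpy as np

rng = np.random.default_rng(seed=12)

m, k, eps = 1.0, 1.0, 0.25        # mass, spring constant, time step (hbar = 1)
N = 200                           # lattice points (reduced; the slides use N = 2000)
n_therm, n_sweeps, thin = 300, 1500, 3

def local_dS(x, j, xp):
    """Change of the discretised Euclidean action when x[j] -> xp (only local terms)."""
    kin_old = ((x[j] - x[j-1])**2 + (x[j+1] - x[j])**2) * m / (2.0 * eps)
    kin_new = ((xp   - x[j-1])**2 + (x[j+1] - xp  )**2) * m / (2.0 * eps)
    pot_old, pot_new = 0.5 * k * x[j]**2 * eps, 0.5 * k * xp**2 * eps
    return (kin_new + pot_new) - (kin_old + pot_old)

x = np.zeros(N + 1)               # trajectory with fixed ends x[0] = x[N] = 0
x2_values = []
for sweep in range(n_therm + n_sweeps):
    for j in range(1, N):
        xp = x[j] + rng.uniform(-1.0, 1.0)        # local Metropolis proposal
        dS = local_dS(x, j, xp)
        if dS <= 0.0 or rng.uniform() < np.exp(-dS):
            x[j] = xp
    if sweep >= n_therm and (sweep - n_therm) % thin == 0:
        x2_values.append(np.mean(x[1:N]**2))      # keep one measurement out of `thin`

E0 = k * np.mean(x2_values)       # Virial theorem: E = k <x^2> for the harmonic potential
print("E0 estimate ~", E0, " (exact 0.5; the slides quote 0.486 for eps = 0.25)")
```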