Model 1: Repeatable independent trials with a free (unknown) success rate, p.

With a given success rate, p, and data consisting of k successes out of n trials, the probability of getting them in a specific order is:

\[ \Pr(\text{data} \mid p, \text{model 1}) = p^{k} (1-p)^{n-k} \]

General outcomes: $X_i$ = outcome on day i (either 0 or 1 in our case).

Dataset: $D = \{X_1, X_2, X_3, \ldots, X_n\}$

\[
\begin{aligned}
\Pr(D) = \Pr(X_1, X_2, X_3, \ldots, X_n)
 &= \Pr(X_2, X_3, \ldots, X_n \mid X_1)\,\Pr(X_1) \\
 &= \Pr(X_3, \ldots, X_n \mid X_2, X_1)\,\Pr(X_2 \mid X_1)\,\Pr(X_1) \\
 &= \ldots \\
 &= \Pr(X_1)\,\Pr(X_2 \mid X_1)\,\Pr(X_3 \mid X_1, X_2) \cdots \Pr(X_n \mid X_{n-1}, \ldots, X_2, X_1)
\end{aligned}
\]

Low prediction strength for the data compared to other models => Model probability goes down.

Markov chains:

\[ \Pr(X_n \mid X_{n-1}, X_{n-2}, \ldots, X_2, X_1) = \Pr(X_n \mid X_{n-1}) \]

Example:

\[
\begin{aligned}
\Pr(X_{10}, X_{12} \mid X_{11})
 &= \Pr(X_{12} \mid X_{11}, X_{10})\,\Pr(X_{10} \mid X_{11}) && \text{(using general conditional probability)} \\
 &= \Pr(X_{12} \mid X_{11})\,\Pr(X_{10} \mid X_{11}) && \text{(using the Markov property)}
\end{aligned}
\]

[Diagram: the chain of outcomes $X_1, X_2, X_3, \ldots, X_{n-1}, X_n$]

In general:

\[
\begin{aligned}
\Pr(X_1, X_2, X_3, \ldots, X_n)
 &= \Pr(X_1)\,\Pr(X_2 \mid X_1)\,\Pr(X_3 \mid X_1, X_2) \cdots \Pr(X_n \mid X_{n-1}, \ldots, X_2, X_1) \\
 &= \Pr(X_1)\,\Pr(X_2 \mid X_1)\,\Pr(X_3 \mid X_2) \cdots \Pr(X_n \mid X_{n-1}) && \text{(because of the Markov property)}
\end{aligned}
\]

Stationarity: Pr(outcome X on day i | outcome Y on day i-1) = Pr(outcome X on day j | outcome Y on day j-1) for any two days i and j.

Can now define:
$p_1$ = Pr(rain | rain the previous day)
$p_2$ = Pr(rain | no rain the previous day)

Problem: What happens on the first day?

Start with rule 6 to get the unconditional probability of rain on a given day:

\[
\begin{aligned}
\Pr(\text{rain on day } i) ={}& \Pr(\text{rain on day } i \mid \text{rain on day } i-1)\,\Pr(\text{rain on day } i-1) \\
 &+ \Pr(\text{rain on day } i \mid \text{no rain on day } i-1)\,\Pr(\text{no rain on day } i-1)
\end{aligned}
\]
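The rule for the unconditional probability of rain can be read as a recursion from one day to the next. The sketch below iterates it from two extreme starting points (certainly dry, certainly wet) to illustrate that the first-day assumption washes out quickly. The function name and the values of `P1` and `P2` are hypothetical and chosen only for illustration; they are not taken from the slides.

```python
def rain_probability_sequence(p1, p2, start, days):
    """Iterate Pr(rain on day i) = p1 * Pr(rain on day i-1) + p2 * Pr(no rain on day i-1).

    p1    = Pr(rain | rain the previous day)
    p2    = Pr(rain | no rain the previous day)
    start = assumed Pr(rain on day 1)
    """
    probs = [start]
    for _ in range(days - 1):
        prev = probs[-1]
        probs.append(p1 * prev + p2 * (1.0 - prev))
    return probs


# Hypothetical transition probabilities, for illustration only.
P1, P2 = 0.34, 0.13

from_dry = rain_probability_sequence(P1, P2, start=0.0, days=30)
from_wet = rain_probability_sequence(P1, P2, start=1.0, days=30)

# Both sequences settle on the same value, whatever day 1 looked like.
print(from_dry[-1], from_wet[-1])
```

Because the two runs end at the same value, treating the unconditional rain probability as the same on every day is a natural way to handle the first day.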
Assume a stationary unconditional probability, $p' = \Pr(\text{rain on any given day})$:

\[ p' = p_1 p' + p_2 (1 - p') \;\Rightarrow\; p' = \frac{p_2}{1 - p_1 + p_2} \]

Data likelihood under the Markov model (model 2) given the parameters $p_1$ and $p_2$:

\[
\Pr(X_2, X_3, \ldots, X_n \mid X_1) = \Pr(X_2 \mid X_1)\,\Pr(X_3 \mid X_2) \cdots \Pr(X_n \mid X_{n-1})
 = p_1^{k_1} (1-p_1)^{n_1-k_1}\, p_2^{k_2} (1-p_2)^{n_2-k_2}
\]

when we have $k_1$ rainy days out of $n_1$ days with rain the previous day and $k_2$ rainy days out of $n_2$ days with no rain the previous day.

Including the first day:

\[
\Pr(X_1, X_2, X_3, \ldots, X_n) =
\begin{cases}
p'\, p_1^{k_1} (1-p_1)^{n_1-k_1}\, p_2^{k_2} (1-p_2)^{n_2-k_2} & \text{if it rained on day 1} \\
(1-p')\, p_1^{k_1} (1-p_1)^{n_1-k_1}\, p_2^{k_2} (1-p_2)^{n_2-k_2} & \text{if no rain on day 1}
\end{cases}
\]

\[
 = p'^{X_1} (1-p')^{1-X_1}\, p_1^{k_1} (1-p_1)^{n_1-k_1}\, p_2^{k_2} (1-p_2)^{n_2-k_2}
\]

Data: k=596 rainy days of n=3652 days in total.

Data for the Markov chain (model 2):
X1=0
k1=202 of n1=596 days of rain when it rained the previous day
k2=393 of n2=3055 days of rain with no rain the previous day

Data likelihoods:

\[ \Pr(D \mid \text{Model 1}) = \sum_{p} \Pr(D \mid p, M_1)\,\Pr(p \mid M_1) \]

\[ \Pr(D \mid \text{Model 2}) = \sum_{p_1, p_2} \Pr(D \mid p_1, p_2, M_2)\,\Pr(p_1 \mid M_2)\,\Pr(p_2 \mid M_2) \]

Bayes factor:

\[ B = \frac{\Pr(D \mid M_1)}{\Pr(D \mid M_2)} = 2.32 \times 10^{-29} \]

[Figure: posterior Pr(p1 | D), rain the previous day, and posterior Pr(p2 | D), no rain the previous day]

[Figure: posterior of the difference, Pr(p1-p2 | D), shown for p1-p2 between 0.10 and 0.30]

Covariance between outcomes on two consecutive days (under the Markov chain):

\[
\begin{aligned}
\mathrm{Cov}(X_i, X_{i-1}) &= E\big((X_i - E(X_i))(X_{i-1} - E(X_{i-1}))\big) \\
 &= E\big((X_i - p')(X_{i-1} - p')\big) \\
 &= E(X_i X_{i-1}) - E(X_i)\,p' - E(X_{i-1})\,p' + p'^2 \\
 &= E(X_i X_{i-1}) - p'^2 \\
 &= \Pr(X_i X_{i-1} = 1) - p'^2 \\
 &= \Pr(X_i = 1 \mid X_{i-1} = 1)\,\Pr(X_{i-1} = 1) - p'^2 \\
 &= p_1 p' - p'^2 = p'(p_1 - p')
\end{aligned}
\]

Stop here if you want to go through the algebra.

Definition of correlation:

\[ \mathrm{Corr}(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\mathrm{sd}(X)\,\mathrm{sd}(Y)} \]

where sd() is the standard deviation, the square root of the variance.

In our case:
$\mathrm{Var}(X_i) = p'(1-p')$ (see clip 18)
$\mathrm{Cov}(X_i, X_{i-1}) = p'(p_1 - p')$
=> $\mathrm{Corr}(X_i, X_{i-1}) = (p_1 - p')/(1 - p')$

Note: $p_1 = p_2 = p'$ => zero correlation.

[Figure: posterior Pr(correlation | D), with the expectation and the 95% credibility interval marked]
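The model comparison and the correlation formula can be sketched numerically. The sketch rests on assumptions the slides do not state: it puts uniform priors on p, p1 and p2 and replaces the sums by averages over a midpoint grid, so its Bayes factor need not reproduce the quoted value exactly. All function names are hypothetical. The correlation is a plug-in value computed from the observed frequencies, not the posterior distribution shown in the figure.

```python
import math


def log_mean_exp(log_values):
    """Numerically stable log of the average of exp(log_values)."""
    m = max(log_values)
    return m + math.log(sum(math.exp(v - m) for v in log_values) / len(log_values))


def midpoint_grid(size):
    """Grid of probabilities strictly inside (0, 1)."""
    return [(i + 0.5) / size for i in range(size)]


def log_sequence_likelihood(p, k, n):
    """log of p^k (1-p)^(n-k), the likelihood of one specific ordering."""
    return k * math.log(p) + (n - k) * math.log(1.0 - p)


def log_marginal_model1(k, n, size=2000):
    """log Pr(D | M1) under an assumed uniform prior on p."""
    return log_mean_exp([log_sequence_likelihood(p, k, n) for p in midpoint_grid(size)])


def log_marginal_model2(x1, k1, n1, k2, n2, size=400):
    """log Pr(D | M2) under assumed independent uniform priors on p1 and p2."""
    grid = midpoint_grid(size)
    part1 = [log_sequence_likelihood(p1, k1, n1) for p1 in grid]
    part2 = [log_sequence_likelihood(p2, k2, n2) for p2 in grid]
    terms = []
    for p1, l1 in zip(grid, part1):
        for p2, l2 in zip(grid, part2):
            p_stat = p2 / (1.0 - p1 + p2)  # stationary rain probability p'
            first_day = math.log(p_stat if x1 == 1 else 1.0 - p_stat)
            terms.append(first_day + l1 + l2)
    return log_mean_exp(terms)


def lag1_correlation(p1, p2):
    """Corr(X_i, X_{i-1}) = (p1 - p') / (1 - p') with p' = p2 / (1 - p1 + p2)."""
    p_stat = p2 / (1.0 - p1 + p2)
    return (p1 - p_stat) / (1.0 - p_stat)


# Counts as given on the data slides.
log10_bayes_factor = (
    log_marginal_model1(596, 3652) - log_marginal_model2(0, 202, 596, 393, 3055)
) / math.log(10)
plug_in_corr = lag1_correlation(202 / 596, 393 / 3055)

print(log10_bayes_factor, plug_in_corr)
```

Working in log space avoids underflow, since the likelihoods involved are far smaller than the smallest representable floating-point number. Substituting p' = p2/(1-p1+p2) into the correlation formula also shows that it simplifies algebraically to p1-p2, which is why the posterior of the difference and the posterior of the correlation cover the same range.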