Basic properties:

Probability and Expectation:
• The conditional probability of A given B: P(A | B) = P(A ∩ B) / P(B).
• Expectation in the discrete case: E(h(X)) = ∑x h(x) p(x).
• Expectation in the continuous case: E(h(X)) = ∫ h(x) f(x) dx.
• The law of total expectation: E(X) = E(E(X | Y)).
• Bayes' rule: P(A | B) = P(B | A) P(A) / P(B).
• The law of total probability: P(A) = ∑y P(A | Y = y) P(Y = y) in the discrete case, and P(A) = ∫ P(A | Y = y) fY(y) dy in the continuous case.
• Variance: Var(X) = E(Var(X | Y)) + Var(E(X | Y)), with Var(X | Y) = E(X² | Y) − (E(X | Y))².

Exponential distribution:
• The probability density function (pdf) of an exponential distribution is: f(x) = λ e^(−λx), x ≥ 0.
• The cumulative distribution function of an exponential distribution is: F(x) = 1 − e^(−λx), x ≥ 0.
• Properties:
  ▸ E(X) = 1/λ
  ▸ Var(X) = 1/λ²
  ▸ MGF: φ(t) = λ / (λ − t), t < λ
• Memoryless property: P(X > s + t | X > t) = P(X > s) for all s, t ≥ 0.
• The hazard/failure rate is equal to λ (the exponential is the only distribution with a constant hazard rate).
• P(X1 < X2) = λ1 / (λ1 + λ2).
• Let X1, …, Xn be i.i.d. exponential with common mean 1/λ. Then the random variable Y = ∑i Xi has a gamma distribution with parameters n and λ.
• minj Xj has an exponential distribution with rate ∑j λj (this and the previous property are checked in the simulation sketch below).
• The random variable mini Xi and the rank ordering of the Xi (i.e., Xi1 < Xi2 < ··· < Xin) are independent.

Poisson distribution:
• The probability mass function of a Poisson distribution is: P(X = k) = e^(−λ) λ^k / k!, k = 0, 1, ….
• Properties:
  ▸ E(X) = Var(X) = λ
  ▸ If Xi ~ Pois(λi) independently, then ∑(i=1..n) Xi ~ Pois(∑(i=1..n) λi)
  ▸ MGF: φ(t) = e^(λ(e^t − 1))

Discrete time Markov Chain:

Markov Chain definition:
• A stochastic process is a collection of (infinitely many) random variables:
  - A discrete time stochastic process is of the form {Xn, n ≥ 0}.
  - A continuous time stochastic process is of the form {X(t), t ≥ 0}.
• A stochastic process {Xn, n ≥ 0} with state space S is called a discrete time Markov chain if for all states i, j, s0, …, sn−1 ∈ S:
  P(Xn+1 = j | Xn = i, Xn−1 = sn−1, …, X0 = s0) = P(Xn+1 = j | Xn = i)  (Markov property).
• In time homogeneous Markov chains we have: P(Xn+1 = j | Xn = i) = P(X1 = j | X0 = i) = Pij.
• A random walk is a Markov chain where, if the chain is in state i, it can only go to i + 1 or i − 1.

Recurrent and Transient States:
• Recall the notation Pij^k: the probability that, starting from state i, the chain is in state j after k steps. State j is called accessible from i if Pij^k > 0 for some k ≥ 0.
• States i and j are said to communicate if they are accessible from each other. We denote this by i ↔ j.
• Communicating states form a class. If there is only one class, the MC is 'irreducible', otherwise it is 'reducible'.
• A state is recurrent if fi = 1, and transient if fi < 1 (fi is the probability that, starting in i, you ever return to i).
• ▸ State i is recurrent if ∑n Pii^n is infinite.
  ▸ State i is transient if ∑n Pii^n is finite.
• Recurrence and transience are class properties.
• In a finite MC not all states can be transient, and in a finite irreducible MC all states are recurrent.
• Two types of recurrence (both are class properties). Denote Nj = min{n > 0 : Xn = j}:
  ▸ Positive recurrent if the expected time until the process returns to the same state is finite: E(Nj | X0 = j) < +∞. In a finite state MC, all recurrent states are positive recurrent.
  ▸ Null recurrent if the expected time until the process returns to the same state is infinite: E(Nj | X0 = j) = +∞.
• The period d of state i is (a class property): d = gcd{n > 0 : Pii^n > 0}, with 'gcd' the greatest common divisor:
  ▸ A state is periodic if d > 1.
  ▸ A state is aperiodic if d = 1.
• An aperiodic, positive recurrent state is called ergodic.
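A quick simulation check of the exponential-distribution properties listed above. This is a minimal sketch (numpy assumed; the rates and sample size are illustrative) verifying P(X1 < X2) = λ1/(λ1 + λ2) and that min(X1, X2) is exponential with rate λ1 + λ2:

```python
import numpy as np

rng = np.random.default_rng(0)
lam1, lam2 = 2.0, 3.0                      # illustrative rates
n = 100_000

# numpy parametrizes the exponential by scale = 1/rate
x1 = rng.exponential(scale=1/lam1, size=n)
x2 = rng.exponential(scale=1/lam2, size=n)

# P(X1 < X2) should equal lam1 / (lam1 + lam2)
print((x1 < x2).mean(), lam1 / (lam1 + lam2))

# min(X1, X2) should be exponential with rate lam1 + lam2,
# hence have mean 1 / (lam1 + lam2)
m = np.minimum(x1, x2)
print(m.mean(), 1 / (lam1 + lam2))
```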
Long run limit:
• For an irreducible ergodic Markov chain, lim(n→∞) Pij^n exists and is independent of i.
• Denote πj = lim(n→∞) Pij^n for j ∈ S, and the limiting distribution π = (πj) for j ∈ S.
• Denote the stationary distribution by w = (wj), j ∈ S, the unique solution of the steady-state equations:
  wj = ∑i wi Pij for j ∈ S, with ∑j wj = 1   (in matrix form: w = w·P, ∑j wj = 1).
• Once the MC starts from w, we always have P(Xn = j) = wj.
• For an irreducible ergodic Markov chain, the limiting distribution π coincides with the stationary distribution w.
• Let {Xn, n ≥ 1} be an irreducible Markov chain with stationary probabilities πj, j ≥ 0, and let r(j) be the reward of being in state j. Then ∑j r(j) πj is called the average reward per unit time.

Standard questions:
• Denote T = {1, …, t} as the transient states and {t+1, …, s} as the recurrent states.
• Let P_T = (P11 ⋯ P1t; … ; Pt1 ⋯ Ptt) be the t × t matrix of one-step transition probabilities restricted to the transient states, and let S = (s11 ⋯ s1t; … ; st1 ⋯ stt) be the t × t matrix of the sij.

Notation:
• fi = probability that, starting in state i, the process will ever re-enter state i.
• sij = expected number of time periods the MC is in j, given that it started in i (mean time spent). Note: i and j are transient states.
• fij = probability that, starting in state i, the process will ever enter state j. Note: i and j are transient states.
• miR = expected number of steps to enter recurrent class R, given that the process started in i (mean time it takes to enter R). Note: i is a transient state and R is the only recurrent class.
• fiR1 = probability that, starting in state i, the process will ever enter recurrent class R1. Note: i is transient and there can be multiple recurrent classes.

Solutions (a numerical sketch follows this list):
• sij = 1 + ∑(k=1..t) Pik skj for i = j, and sij = ∑(k=1..t) Pik skj for i ≠ j; in matrix form S = (I − P_T)^(−1).
• fij = Pij + ∑(k=1..t, k≠j) Pik fkj; equivalently fij = (sii − 1)/sii for i = j, and fij = sij/sjj for i ≠ j.
• miR = 1 + ∑(k=1..t) Pik mkR; in matrix form m = S·1, with 1 the all-ones vector.
• fiR1 = PiR1 + ∑(k=1..t) Pik fkR1; in matrix form f_R1 = S·D_R1, with D_R1 the column vector with entries ∑(j∈R1) P1j, …, ∑(j∈R1) Ptj.
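A minimal numerical sketch of these formulas. The chain is illustrative: a symmetric random walk on {0, 1, 2, 3} where 0 and 3 are absorbing and 1, 2 are transient:

```python
import numpy as np

# One-step transition probabilities among the transient states {1, 2}
P_T = np.array([[0.0, 0.5],
                [0.5, 0.0]])

S = np.linalg.inv(np.eye(2) - P_T)   # S = (I - P_T)^(-1): mean times s_ij
m = S @ np.ones(2)                   # m = S * 1: mean steps until absorption
D_R1 = np.array([0.5, 0.0])          # one-step probabilities into R1 = {0}
f_R1 = S @ D_R1                      # probabilities of ever entering R1

print(S)      # [[4/3, 2/3], [2/3, 4/3]]
print(m)      # [2., 2.]
print(f_R1)   # [2/3, 1/3]
```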
Continuous time Markov Chain:

Counting process:
• A stochastic process {N(t), t ≥ 0} is a counting process whenever N(t) denotes the total number of events that occur by time t. It should satisfy the following:
  ▸ N(t) ≥ 0.
  ▸ N(t) is integer valued.
  ▸ For s < t, N(s) ≤ N(t).
• For s < t: N(t) − N(s) represents the number of events that occur in the interval (s, t].
• A counting process has independent increments whenever the number of events that occur in one time interval is independent of the number of events that occur in another (disjoint) time interval.
  ▸ That is, N(s) is independent of N(s + t) − N(s).
• A counting process has stationary increments whenever the number of events that occur in any interval depends only on the length of the interval.
  ▸ That is, the number of events in the interval (s, s + t] has the same distribution for all s.

Poisson Process:

First definition:
• The counting process {N(t), t ≥ 0} is a Poisson process with rate λ, λ > 0, when:
  1. N(0) = 0.
  2. The process has independent increments.
  3. The number of events in any interval of length t is Poisson distributed with mean λt. In other words, for all s, t ≥ 0:
     P(N(t + s) − N(s) = n) = e^(−λt) (λt)^n / n!, n = 0, 1, …
• Note that the last condition implies that a Poisson process:
  ▸ has stationary increments;
  ▸ satisfies E(N(t)) = λt.

Second definition:
• The counting process {N(t), t ≥ 0} is a Poisson process with rate λ, λ > 0, when:
  1. N(0) = 0.
  2. The process has stationary and independent increments.
  3. P(N(h) = 1) = λh + o(h), as h → 0.
  4. P(N(h) ≥ 2) = o(h), as h → 0.
• A function g(·) is said to be o(h) if lim(h→0) g(h)/h = 0, i.e. g(h) goes to zero faster than h.

Third definition:
• For a Poisson process, let Tn, n ≥ 1, be the nth interarrival time: the time elapsed between the (n − 1)th event and the nth event. It follows that Ti is exponential with rate λ for every i.
• The arrival time of the nth event, Sn, is also called the waiting time until the nth event. Clearly, Sn = ∑(i=1..n) Ti, n ≥ 1.
• Thus, Sn has a gamma distribution with parameters n and λ, yielding
  fSn(t) = λ e^(−λt) (λt)^(n−1) / (n − 1)!, t ≥ 0,  with E(Sn) = n/λ and Var(Sn) = n/λ².
• Note that N(t) ≥ n ⟺ Sn ≤ t.
• If we define N(t) by N(t) ≡ max{n : Sn ≤ t}, with Sn = ∑(i=1..n) Ti and the Ti i.i.d. exponential random variables with rate λ, it follows that {N(t), t ≥ 0} is a Poisson process with rate λ.

Merging two Poisson processes:
• Suppose that {N1(t), t ≥ 0} and {N2(t), t ≥ 0} are independent Poisson processes with respective rates λ1 and λ2, where Ni(t) corresponds to type i arrivals. Let N(t) = N1(t) + N2(t) for t ≥ 0. Then the following holds:
  ▸ The merged process {N(t), t ≥ 0} is a Poisson process with rate λ = λ1 + λ2.
  ▸ The probability that an arrival in the merged process is of type i is λi / (λ1 + λ2).

Decomposing a Poisson process:
• Consider a Poisson process {N(t), t ≥ 0} with rate λ. Suppose that each event in this process is classified as type I with probability p and type II with probability (1 − p), independently of all other events. Let N1(t) and N2(t) respectively denote the number of type I and type II events occurring in (0, t]. Then the counting processes {N1(t), t ≥ 0} and {N2(t), t ≥ 0} are two independent Poisson processes with respective rates λp and λ(1 − p).

Conditional arrival process:
• If Y1, …, Yn are i.i.d. with density f, then the joint density of the order statistics Y(1), …, Y(n) is:
  f(y1, …, yn) = n! ∏(i=1..n) f(yi), y1 ≤ ··· ≤ yn.
• Given that N(t) = n, the n arrival times S1, …, Sn have the same distribution as the order statistics of n independent random variables uniformly distributed on the interval (0, t). So:
  (S1, S2, …, Sn) =d (U(1), U(2), …, U(n)),
  with U(1) ≤ U(2) ≤ … ≤ U(n) the order statistics of i.i.d. random variables from U(0, t).
• For any function f (the sum being a symmetric operation): ∑(i=1..n) f(Si) =d ∑(i=1..n) f(U(i)) =d ∑(i=1..n) f(Ui).

Nonhomogeneous Poisson process:
• The counting process {N(t), t ≥ 0} is said to be a nonhomogeneous Poisson process with intensity function λ(t), t ≥ 0, if:
  1. N(0) = 0.
  2. {N(t), t ≥ 0} has independent increments.
  3. P(N(t + h) − N(t) = 1) = λ(t)h + o(h).
  4. P(N(t + h) − N(t) ≥ 2) = o(h).
• For a nonhomogeneous Poisson process N(t) with intensity function λ(t), N(s + t) − N(t) is a Poisson random variable with mean m(s + t) − m(t), where
  m(t) = ∫(0..t) λ(y) dy  (see the simulation sketch below).
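The standard way to simulate such a process is by thinning. The sketch below is minimal and illustrative (numpy assumed; the intensity, bound and horizon are made-up examples): it generates candidates at a constant rate λmax ≥ λ(t) and keeps a candidate at time t with probability λ(t)/λmax:

```python
import numpy as np

def thinned_nhpp(lam, lam_max, horizon, rng):
    """Nonhomogeneous Poisson process on (0, horizon] by thinning:
    candidates arrive at rate lam_max; a candidate at time t is kept
    with probability lam(t) / lam_max (requires lam(t) <= lam_max)."""
    t, arrivals = 0.0, []
    while True:
        t += rng.exponential(scale=1/lam_max)     # next candidate arrival
        if t > horizon:
            return np.array(arrivals)
        if rng.uniform() <= lam(t) / lam_max:     # accept / thin
            arrivals.append(t)

rng = np.random.default_rng(1)
lam = lambda t: 2 + np.sin(t)                     # illustrative intensity, <= 3
counts = [len(thinned_nhpp(lam, 3.0, 10.0, rng)) for _ in range(2000)]
# E(N(10)) should be m(10) = integral of (2 + sin y) dy = 21 - cos(10)
print(np.mean(counts), 21 - np.cos(10))
```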
General CTMC:
• ▸ Let {X(t), t ≥ 0} be a continuous-time stochastic process taking values in {0, 1, 2, …}.
  ▸ Let {x(t), t ≥ 0} be any deterministic function taking values in {0, 1, 2, …}.
• The process {X(t), t ≥ 0} is called a continuous-time Markov chain if:
  P(X(t + s) = j | X(s) = i, X(u) = x(u), 0 ≤ u < s) = P(X(t + s) = j | X(s) = i)
  for all s, t ≥ 0, all functions {x(u), 0 ≤ u < s}, and all i, j = 0, 1, 2, …
• If a continuous-time Markov chain {X(t), t ≥ 0} satisfies
  P(X(t + s) = j | X(s) = i) = P(X(t) = j | X(0) = i)
  for every s, t ≥ 0, then {X(t), t ≥ 0} is stationary or time-homogeneous.
• Let Ti denote the time the process {X(t), t ≥ 0} spends in state i before making a transition into a different state. By the Markov property, Ti must have an exponential distribution, with rate vi.
• Let Pij denote the probability of next entering state j, given that the current state is i. Then:
  Pii = 0 and ∑j Pij = 1 for every i = 0, 1, 2, …

Birth and death process:
• If only the transition from i to i + 1 is allowed, the process is called a pure birth process.
• If only transitions from i to i − 1 or i + 1 are allowed, the process is called a birth and death process.
• A pure birth process starting at zero is a counting process.
• Arrivals occur with rate λi: the time until the next arrival is exponentially distributed with mean 1/λi.
• Departures occur with rate µi: the time until the next departure is exponentially distributed with mean 1/µi.
• The Poisson process is a pure birth process with all λi equal to a common arrival rate λ.
• E(Ti) = 1/(λi + µi),  Pi,i−1 = µi/(λi + µi),  Pi,i+1 = λi/(λi + µi).

Transition probabilities:
• The transition probability function of the continuous-time Markov chain is given by Pij(t) = P(X(t + s) = j | X(s) = i).
• The rate of transition from state i into state j is given by qij = vi Pij.
  ▸ The qij values are called the instantaneous transition rates.
  ▸ The vi values are the rates of the time until the next transition, given that the chain is currently in state i.
  ▸ Note that qii = 0 as a consequence of the fact that Pii = 0.
• It follows that vi = ∑(j≠i) qij and Pij = qij / ∑(j≠i) qij.
• It can be proven that lim(h→0) Pij(h)/h = qij, which shows that the instantaneous transition rate qij is the derivative Pij′(0) of the transition probability Pij(t) with respect to t, evaluated at t = 0.
• It can be proven that lim(h→0) (1 − Pii(h))/h = vi, which shows that −vi is the derivative Pii′(0) of the transition probability Pii(t) with respect to t, evaluated at t = 0.

Kolmogorov equations:
• Chapman-Kolmogorov equations: for all s ≥ 0, t ≥ 0,
  Pij(t + s) = ∑(k=0..∞) Pik(t) Pkj(s).
• Kolmogorov backward equations:
  Pij′(t) = ∑(k≠i) qik Pkj(t) − vi Pij(t).
• Kolmogorov forward equations:
  Pij′(t) = ∑(k≠j) qkj Pik(t) − vj Pij(t).

Limiting probabilities:
• Balance equations: vj Pj = ∑(k≠j) qkj Pk for every state j, together with ∑j Pj = 1.
• Limiting probabilities exist if and only if all states of the MC communicate and the MC is positive recurrent, i.e. the MC is ergodic. As in discrete time, the Pj are also called stationary probabilities.
• For a birth and death process, the balance equations become:
  λ0 P0 = µ1 P1 for j = 0,
  (λj + µj) Pj = µj+1 Pj+1 + λj−1 Pj−1 for j > 0.
• It follows that P0 = 1 / (1 + ∑(n=1..∞) (λ0 ··· λn−1)/(µ1 ··· µn)), and hence Pn = P0 (λ0 ··· λn−1)/(µ1 ··· µn) (see the sketch below).
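For a finite birth and death process these balance equations are solved directly by the product form above. A minimal sketch (numpy assumed; rates and capacity are illustrative):

```python
import numpy as np

def bd_limiting_probs(lam, mu):
    """Limiting probabilities of a finite birth and death process.
    lam[n] is the birth rate in state n (n = 0..K-1),
    mu[n-1] is the death rate in state n (n = 1..K).
    Implements P0 = 1/(1 + sum of rate ratios) and Pn = P0 * ratio_n."""
    ratios = np.cumprod(np.asarray(lam) / np.asarray(mu))
    p0 = 1.0 / (1.0 + ratios.sum())
    return np.concatenate(([p0], p0 * ratios))

# Illustrative chain with constant rates and capacity 4 (an M/M/1/4 queue)
probs = bd_limiting_probs(lam=[1.0] * 4, mu=[2.0] * 4)
print(probs, probs.sum())   # the probabilities sum to 1
```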
Queueing theory:
• Kendall's notation: queueing systems are often indicated by two letters followed by one or two numbers, e.g. M/M/1, M/M/2/5.
  ▸ The first letter indicates the arrival process:
    D: Deterministic: clients arrive at equidistant time points.
    M: Markovian: clients arrive according to a Poisson process.
    G: General: clients arrive according to a general arrival process.
  ▸ The second letter indicates the type of service times:
    D: Deterministic: service times are fixed.
    M: Markovian: service times S1, S2, … are independent exponential random variables with a common rate.
    G: General: service times S1, S2, … are independent and identically distributed (i.i.d.) random variables; they may have any distribution.
  ▸ The first number indicates the number of servers.
  ▸ The second number indicates the capacity of the system, that is, the maximum number of clients in the system. The capacity is equal to the number of servers plus the maximum number of waiting clients.
• M/M/1 queue:
  ▸ Customers arrive at the server according to a Poisson process with rate λ.
  ▸ Each service takes some time to complete.
  ▸ The successive service times are independent exponential random variables with mean 1/µ.
  ▸ The number of clients in the system is a birth and death process with common arrival rate λ and common departure rate µ.
  ▸ The limiting probabilities are Pn = (λ/µ)^n (1 − λ/µ), provided that λ/µ < 1.

Little's law:
• N(t) is the number of arrivals up to time t.
• The overall arrival rate into the system: λ = lim(t→∞) N(t)/t.
• Let Vn denote the sojourn time of client n, that is, the time client n spends in the system.
• The average sojourn time W (the average time a client spends in the system) is W = lim(n→∞) (1/n) ∑(k=1..n) Vk.
• Let X(t) denote the number of clients in the system at time t. Then L = lim(t→∞) (1/t) ∫(0..t) X(s) ds is the average number of clients in the system (over time). (Sometimes L = ∑(n=0..s) n Pn if there are s + 1 states.)
• Little's law: L = λW.
• For the M/M/1 queue: L = λ/(µ − λ), W = 1/(µ − λ).

PASTA principle:
• ▸ Define the long-run or steady-state probability of exactly n clients in the system by Pn = lim(t→∞) P(X(t) = n); Pn is often also the long-run proportion of time the system contains exactly n clients.
  ▸ an: long-run proportion of clients that find n in the system upon arrival.
  ▸ dn: long-run proportion of clients that leave n behind in the system upon departure.
• In systems in which clients arrive and depart one at a time, the two proportions an and dn coincide.
• The PASTA property: Poisson Arrivals See Time Averages. If the arrival process is a Poisson process:
  ▸ the arrivals occur homogeneously over time;
  ▸ the averages over time and over clients are the same;
  and it then holds that an = Pn.

Gaussian Processes:
• A random variable R which satisfies P(R = −1) = P(R = +1) = 1/2 is called a Rademacher random variable. Properties: E(R) = 0, Var(R) = 1.

Brownian Motion:

First definition:
• 1. Brownian motion starts at zero: W(0) = 0.
  2. Brownian motion has stationary and independent increments.
  3. Brownian motion evaluated at a fixed time t1 is a normal random variable with mean zero and variance t1.

Second definition:
• 1. Brownian motion starts at zero: W(0) = 0.
  2. For t1 ≤ t2, W(t1) and W(t2) have a bivariate normal distribution with mean zero and covariance t1.

Properties:
• W(0) = 0.
• Cov(W(t1), W(t2)) = min(t1, t2).
• For t1 ≤ t2: Cov(W(t1), W(t2) − W(t1)) = 0.
• Brownian motion is the limit of a random walk: W(t) = lim(n→∞) Wn(t) = lim(n→∞) (1/√n) ∑(i=1..[nt]) Ri, with the Ri i.i.d. Rademacher variables.

Reflection principle:
• General case, if Sn is the sum of n Rademacher variables (n and k of different parity, so that P(Sn = k) = 0):
  P(max(i=1..n) Si ≥ k) = 2 P(Sn ≥ k).
• Brownian motion case: P(sup(0≤t≤b) W(t) > y) = 2 P(W(b) > y).
• Let Ta and Tb be the hitting times of the levels a and b (the first time BM hits that level). Then P(Ta < Tb) is:
  ▸ 0 if a > b > 0;
  ▸ 1 if b > a > 0;
  ▸ −b/(a − b) if a > 0 > b.
• Boundary crossing from both sides:
  P(sup(0≤t≤b) |W(t)| > y) = 2 ∑(j=1..∞) (−1)^(j+1) P(sup(0≤t≤b) W(t) > (2j − 1)y).
• Butler test: used to test whether a sample Y1, …, Yn is symmetric around 0 (see the sketch after this list):
  ▸ Rearrange the sample so as to satisfy |Y(1)| ≤ |Y(2)| ≤ … ≤ |Y(n)|.
  ▸ Define random variables R1, R2, …, Rn by Ri = +1 if Y(i) > 0 and Ri = −1 if Y(i) < 0.
  ▸ Butler's test statistic is Tn = sup(0≤t≤1) |(1/√n) ∑(i=1..[nt]) Ri|.
  ▸ Under the null hypothesis, Tn converges in distribution to the absolute supremum of Brownian motion on the unit interval. Use the critical values to perform the test, and reject the null for large values of Tn.
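A minimal sketch of Butler's statistic (numpy assumed; the samples are illustrative, and critical values come from the table at the end of these notes):

```python
import numpy as np

def butler_statistic(y):
    """Butler's T_n: order the sample by absolute value, record the signs
    R_i, and take the maximum of |partial sums| / sqrt(n), which equals
    the supremum of the scaled partial-sum path on [0, 1]."""
    y = np.asarray(y, dtype=float)
    signs = np.sign(y[np.argsort(np.abs(y))])    # R_i in the order |Y_(i)|
    path = np.cumsum(signs) / np.sqrt(len(y))
    return np.abs(path).max()

rng = np.random.default_rng(2)
print(butler_statistic(rng.normal(size=200)))        # symmetric: small T_n
print(butler_statistic(rng.exponential(size=200)))   # asymmetric: large T_n
```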
• Linear boundaries (Doob): P(sup(t≥0) W(t)/(1 + βt) > y) = e^(−2βy²).
  ▸ This can be used for Brownian motion with drift, X(t) = W(t) + µt (with µ < 0, so that the supremum is finite). It follows that
    P(sup(t≥0) X(t) > y) = P(sup(t≥0) W(t)/(1 − (µ/y)t) > y) = e^(2µy).

Brownian Bridge:

Empirical process:
• Empirical distribution function: Fn(x) = (1/n) ∑(i=1..n) I{Xi ≤ x}.
• Empirical process: √n (Fn(x) − F(x)).
• The CLT gives, as n goes to infinity: √n (Fn(x0) − F(x0)) →d N(0, F(x0)(1 − F(x0))).
• Uniform empirical process: Bn(u) = √n ((1/n) ∑(i=1..n) I{Ui ≤ u} − u) for 0 ≤ u ≤ 1, with U1, …, Un i.i.d. U[0, 1].
• If the random variable X has CDF F(x), then F(X) ~ U[0, 1]. From this it follows that √n (Fn(x) − F(x)) =d Bn(F(x)).
• Property: the uniform empirical process at two points s and t converges to a bivariate normal distribution with mean zero and Cov(Bn(t), Bn(s)) = min(s, t) − st.

Definition:
• A Brownian bridge is the limiting process of {Bn(u), 0 ≤ u ≤ 1} and is denoted by {B(u), 0 ≤ u ≤ 1}.
• Definition:
  1. For every 0 ≤ u ≤ 1, B(u) is a normal random variable with mean zero and variance u(1 − u).
  2. For every 0 ≤ u1, u2 ≤ 1, (B(u1), B(u2)) is a bivariate normal vector with Cov(B(u1), B(u2)) = min(u1, u2) − u1u2.

Asymptotic statistics:
• As the sample size n → ∞, the general empirical process {√n (Fn(x) − F(x)), x ∈ R} converges to a limiting process {B(F(x)), x ∈ R}.
• A more rigorous way to formulate the convergence is: sup(x∈R) |√n (Fn(x) − F(x)) − B(F(x))| →d 0.
• Delta method (univariate): if √n (Tn − θ) →d N(0, σ²), then √n (g(Tn) − g(θ)) →d g′(θ) N(0, σ²) for every differentiable g.
• Delta method (bivariate): if √n ((Xn, Yn) − (a, b)) →d (Δ1, Δ2), then √n (g(Xn, Yn) − g(a, b)) →d gx(a, b) Δ1 + gy(a, b) Δ2, where gx and gy are the partial derivatives of the differentiable function g(x, y).

Brownian motion to Brownian bridge:
• Let {W(t), 0 ≤ t ≤ 1} be a Brownian motion. The process {B(t), 0 ≤ t ≤ 1} defined by B(t) = W(t) − tW(1) for 0 ≤ t ≤ 1 is a Brownian bridge on the unit interval.
• By conditioning on the event {W(1) = 0}, the Brownian motion {W(t), t ≥ 0} becomes a Brownian bridge on the unit interval.

Brownian bridge to Brownian motion:
• Let Z be a standard normal random variable, independent of the Brownian bridge {B(t), 0 ≤ t ≤ 1}. Then the process {W(t), 0 ≤ t ≤ 1} defined by W(t) = B(t) + tZ, 0 ≤ t ≤ 1, is a Brownian motion on the unit interval.

Kolmogorov-Smirnov test:
• Suppose we have a random sample Y1, Y2, …, Yn drawn from an unknown distribution, and we want to test the null hypothesis that the unknown distribution has a given CDF F0(y). The Kolmogorov statistic Kn = √n sup(y∈R) |Fn(y) − F0(y)| can be used (a computational sketch follows below).
• Under the null: √n sup(y∈R) |Fn(y) − F0(y)| →d sup(y∈R) |B(F0(y))| = sup(0≤u≤1) |B(u)|.
• For the Brownian bridge:
  P(sup(0≤u≤1) B(u) > y) = e^(−2y²) and P(sup(0≤u≤1) |B(u)| > y) = 2 ∑(j=1..∞) (−1)^(j+1) e^(−2j²y²).
• Now suppose we have drawn the sample. One way to perform the Kolmogorov-Smirnov test graphically is to draw Fn(y) first, and then draw the two red lines Fn(y) ± kα/√n, where kα is the critical value:
  ▸ If F0(y) falls completely between the red lines, do not reject the null hypothesis.
  ▸ If F0(y) crosses one of the red lines, reject the null hypothesis.
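A minimal sketch of the Kolmogorov statistic and its limiting p-value (numpy assumed; the uniform null is illustrative). The sup over y is attained at the data points, just before or at each jump of Fn:

```python
import numpy as np

def ks_statistic(y, F0):
    """K_n = sqrt(n) * sup_y |F_n(y) - F0(y)| for a continuous null CDF F0."""
    y = np.sort(np.asarray(y, dtype=float))
    n = len(y)
    cdf = F0(y)
    d_plus = np.max(np.arange(1, n + 1) / n - cdf)   # F_n above F0
    d_minus = np.max(cdf - np.arange(0, n) / n)      # F0 above F_n
    return np.sqrt(n) * max(d_plus, d_minus)

def ks_pvalue(k, terms=100):
    """Limiting tail P(sup |B(u)| > k) = 2 * sum (-1)^(j+1) exp(-2 j^2 k^2)."""
    j = np.arange(1, terms + 1)
    return float(2 * np.sum((-1.0) ** (j + 1) * np.exp(-2 * j**2 * k**2)))

rng = np.random.default_rng(3)
sample = rng.uniform(size=500)
k = ks_statistic(sample, F0=lambda y: np.clip(y, 0.0, 1.0))  # H0: U(0, 1)
print(k, ks_pvalue(k))   # under H0 the p-value is roughly uniform
```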
Other continuous time stochastic processes:

Ornstein-Uhlenbeck:
• Let {W(t), t ≥ 0} be Brownian motion on the interval [0, ∞), and let α ≥ 0. The stochastic process {V(t), t ≥ 0} defined by V(t) = e^(−αt/2) W(e^(αt)) is called the Ornstein-Uhlenbeck process.
• The Ornstein-Uhlenbeck process {V(t), t ≥ 0} is a Gaussian process with zero mean function and covariance function Cov(V(t1), V(t2)) = exp{−α|t1 − t2|/2}.
• Stationarity (a process is stationary if its distribution at any given time point does not depend on the time, i.e. is the same for every t):
  ▸ Brownian motion is NOT a stationary process.
  ▸ The Ornstein-Uhlenbeck process is a stationary process.
• Increments:
  ▸ Brownian motion is a process with independent (and stationary) increments.
  ▸ The Ornstein-Uhlenbeck process does not have independent increments, but it does have stationary increments.

Geometric Brownian Motion:
• Let {W(t), t ≥ 0} be Brownian motion on the interval [0, ∞). The stochastic process {Y(t), t ≥ 0} defined by Y(t) = e^(µt + σW(t)) is called geometric Brownian motion with drift coefficient µ and variance parameter σ².
• If {Y(t), t ≥ 0} is a geometric Brownian motion with drift coefficient µ and variance parameter σ², then {σ^(−1) ln Y(t), t ≥ 0} is a Brownian motion with drift coefficient µ/σ (see the simulation sketch at the end).
• Note that geometric Brownian motion is not a Gaussian process!
  ▸ For a fixed t, the random variable Y(t) has a log-normal distribution with parameters µt and σ²t.

Tables:
• Critical values for the absolute supremum of Brownian motion (Butler test), the Kolmogorov-Smirnov test, and the standard normal distribution.
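To close, a minimal simulation sketch (numpy assumed; parameters illustrative) of the geometric Brownian motion facts above: ln Y(t) is N(µt, σ²t), and σ^(−1) ln Y(t) is Brownian motion with drift µ/σ:

```python
import numpy as np

rng = np.random.default_rng(4)
mu, sigma, t, n = 0.1, 0.4, 2.0, 100_000   # illustrative parameters

# Y(t) = exp(mu*t + sigma*W(t)) at a fixed time, with W(t) ~ N(0, t)
w_t = rng.normal(scale=np.sqrt(t), size=n)
y_t = np.exp(mu * t + sigma * w_t)

# ln Y(t) should be N(mu*t, sigma^2 * t), i.e. Y(t) is log-normal
print(np.log(y_t).mean(), mu * t)
print(np.log(y_t).var(), sigma**2 * t)

# sigma^(-1) * ln Y(t) = (mu/sigma)*t + W(t): BM with drift mu/sigma
print((np.log(y_t) / sigma).mean(), (mu / sigma) * t)
```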