Chapter 4
Stochastic Processes

Definition 4.0.1 (Stochastic Process)
A stochastic process is a set of random variables indexed by time: X(t). Modeling requires specifying, in a mathematically consistent way, the joint distribution of (X(t_1), X(t_2), X(t_3), ..., X(t_k)) for any choice of t_1 < t_2 < t_3 < ... < t_k. The values of X(t) are called states; the set of all possible values of X(t) is called the state space.

We have been looking at a Poisson process for some time - our example "hits on a web page" is a typical example of a Poisson process - so here is a formal definition:

4.1 Poisson Process

Definition 4.1.1 (Poisson Process)
A stochastic process X(t) is called a homogeneous Poisson process with rate λ if

1. for t > 0, X(t) takes values in {0, 1, 2, 3, ...};

2. for any 0 ≤ t_1 < t_2 the increment X(t_2) − X(t_1) ~ Po_{λ(t_2 − t_1)}, i.e. the distribution depends only on the length of the interval;

3. for any 0 ≤ t_1 < t_2 ≤ t_3 < t_4 the increments X(t_2) − X(t_1) and X(t_4) − X(t_3) are independent, i.e. non-overlapping intervals are independent.

Jargon: X(t) is a "counting process" with independent Poisson increments.

Example 4.1.1 Hits on a web page
A counter of the number of hits on our webpage is an example of a Poisson process with rate λ = 2. In the example, X(t) = 3 for t between 5 and 8 minutes.

[Figure: counter X(t) of the number of hits on the webpage against time t (in min).]

Note:
• X(t) can be thought of as the number of occurrences until time t.
• Similarly, X(t_2) − X(t_1) is the number of occurrences in the interval (t_1, t_2].
• By the same argument, X(0) = 0 - ALWAYS!
• The distribution of X(t) is Poisson with rate λt, since X(t) = X(t) − X(0) ~ Po_{λ(t − 0)}.

For a given Poisson process X(t) we define the occurrence times

O_0 = 0,
O_j = time of the jth occurrence = the first t for which X(t) ≥ j,

and the inter-arrival time between successive hits:

I_j = O_j − O_{j−1} for j = 1, 2, ...

The time until the kth hit, O_k, is therefore the sum of the inter-arrival times: O_k = I_1 + ... + I_k.

Theorem 4.1.2
X(t) is a Poisson process with rate λ ⇐⇒ the inter-arrival times I_1, I_2, ... are i.i.d. Exp_λ.
Further: the time until the kth hit, O_k, is an Erlang_{k,λ} distributed variable ⇐⇒ X(t) is a Poisson process with rate λ.

This theorem is very important! It links the Poisson, Exponential, and Erlang distributions tightly together.
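The link stated in Theorem 4.1.2 can be illustrated numerically. The following is a small sketch (not part of the original text) using scipy.stats; the values λ = 10, k = 4 and t = 0.4 are purely illustrative, and the Erlang distribution is represented as a Gamma distribution with integer shape parameter.

```python
# Sketch: the Erlang_{k, lambda} CDF at t equals P(X(t) >= k) for a Poisson
# process X(t) with the same rate, i.e. P(O_k <= t) = 1 - Po_{lambda*t}(k - 1).
from scipy import stats

lam, k, t = 10.0, 4, 0.4   # illustrative values only

erlang_cdf = stats.gamma.cdf(t, a=k, scale=1.0 / lam)   # Erlang = Gamma with integer shape
poisson_tail = stats.poisson.sf(k - 1, mu=lam * t)      # P(X(t) >= k)
print(erlang_cdf, poisson_tail)                         # both about 0.567

# the time until the kth hit is a sum of k Exp(lambda) inter-arrival times,
# so its mean is k / lambda
print(stats.gamma.mean(a=k, scale=1.0 / lam))           # 0.4
```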
Consider the following very important example:

Example 4.1.2 Hits on a webpage
Hits on a popular web page occur according to a Poisson process with a rate of 10 hits/min. One begins observation at exactly noon.

1. Evaluate the probability of 2 or fewer hits in the first minute.
Let X be the number of hits in the first minute; then X is a Poisson variable with λ = 10:
P(X ≤ 2) = Po_10(2) = e^{−10} + 10 e^{−10} + (10^2/2) e^{−10} = 0.0028 (or table lookup, p. 788).

2. Evaluate the probability that the time till the first hit exceeds 10 seconds.
Let Y be the time until the first hit; then Y has an Exponential distribution with parameter λ = 10 per minute, or λ = 1/6 per second.
P(Y ≥ 10) = 1 − P(Y ≤ 10) = 1 − (1 − e^{−10·1/6}) = e^{−5/3} = 0.1889.

3. Evaluate the mean and the variance of the time till the 4th hit.
Let Z be the time till the 4th hit. Then Z has an Erlang distribution with stage parameter k = 4 and λ = 10 per minute.
E[Z] = k/λ = 4/10 = 0.4 minutes,
Var[Z] = k/λ^2 = 4/100 = 0.04 minutes^2.

4. Evaluate the probability that the time till the 4th hit exceeds 24 seconds.
P(Z > 24) = 1 − P(Z ≤ 24) = 1 − Erlang_{4, 1/6}(24) = 1 − (1 − Po_{1/6·24}(4 − 1)) = Po_4(3) = 0.433 (table, p. 786).

5. The number of hits in the first hour is Poisson with mean 600. You would like to know the probability of more than 650 hits. An exact calculation isn't really feasible, so approximate this probability and justify your approximation.
A Poisson distribution with large rate λ can be approximated by a normal distribution (a corollary of the Central Limit Theorem) with mean μ = λ and variance σ^2 = λ. Then X is approximately N(600, 600), and Z := (X − 600)/√600 is approximately N(0, 1). Then:
P(X > 650) = 1 − P(X ≤ 650) = 1 − P(Z ≤ (650 − 600)/√600) ≈ 1 − Φ(2.05) ≈ 1 − 0.9798 = 0.0202 (table, p. 789).

Another interesting property of the Poisson process model, consistent with thinking of it as "random occurrences" in time t, is

Theorem 4.1.3
Let X(t) be a Poisson process. Given that X(T) = k, the conditional distribution of the times of the k occurrences O_1, ..., O_k is the same as the distribution of k ordered independent uniform variables on (0, T), i.e. of T·U_(1), T·U_(2), ..., T·U_(k) for independent standard uniforms U_1, ..., U_k.

This tells us a way to simulate a Poisson process with rate λ on the interval (0, T):
- first, draw a Poisson value w from Po_{λT}; this tells us how many uniform values U_i we need to simulate;
- secondly, generate w standard uniform values u_1, ..., u_w;
- finally, define o_i = T · u_(i), where u_(i) is the ith smallest value among u_1, ..., u_w.
A short code sketch of this recipe is given below.

The above theorem also tells us that, if we pick k values at random from an interval (0, t) and order them, we can treat the distances between successive values as (approximately) exponentially distributed with rate λ = k/t.
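The simulation recipe above translates directly into code. The sketch below is our own illustration (the function name and the example values λ = 2 hits/min and T = 10 min, echoing Example 4.1.1, are not part of the original text); it assumes numpy is available.

```python
# Sketch of the recipe from Theorem 4.1.3: draw the number of occurrences from
# a Poisson distribution, then place them as sorted uniform points on (0, T).
import numpy as np

rng = np.random.default_rng(seed=1)

def simulate_poisson_process(rate, T):
    """Occurrence times of a Poisson process with the given rate on (0, T)."""
    w = rng.poisson(rate * T)           # number of occurrences ~ Po_{rate * T}
    u = rng.uniform(0.0, 1.0, size=w)   # w standard uniform values
    return T * np.sort(u)               # o_i = T * u_(i)

# example: 2 hits per minute, observed for 10 minutes
occurrences = simulate_poisson_process(rate=2.0, T=10.0)
inter_arrivals = np.diff(occurrences, prepend=0.0)  # should look Exp(2)-distributed
print(occurrences)
print(inter_arrivals)
```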
So far, we have only looked at arrivals of events. Besides that, we could, for example, look at the number of surfers that are on our web site at the same time. There we have departures as well and, related to that, the time each surfer stays, which we will call the service time (from the perspective of the web server). This leads us to another model:

4.2 Birth & Death Processes

Birth & Death processes (B&D) are a generalization of Poisson processes that allows the modelling of queues, i.e. we assume that arrivals stay in the system for some time and leave again afterwards. A B&D process X(t) is a stochastic process that monitors the number of people in a system. If X(t) = k, we assume that at time t there are k people in the system. Again, X(t) is called the state at time t, and X(t) ∈ {0, 1, 2, 3, ...} for all t.

We can visualize the set-up for a B&D process in a state diagram (see fig. 4.1) as movements between consecutive states. Conditional on X(t) = k, we either move to state k + 1 or to k − 1, depending on whether a birth or a death occurs first.

[Figure 4.1: State diagram of a Birth & Death process: states 0, 1, 2, 3, ..., with arrows between consecutive states.]

Example 4.2.1 Stat Printer
The "heavy-duty" printer in the Stats department gets 3 jobs per hour. On average, it takes 15 min to complete printing. The printer queue is monitored for a day (8 h total time). Jobs arrive at the following points in time (in h):

job i          1     2     3     4     5     6     7     8     9    10
arrival time   0.10  0.40  0.78  1.06  1.36  1.84  1.87  2.04  3.10  4.42

job i         11    12    13    14    15    16    17    18    19    20
arrival time   4.46  4.66  4.68  4.89  5.01  5.56  5.56  5.85  6.32  6.99

The printer finishes jobs at:

job i           1     2     3     4     5     6     7     8     9    10
finishing time  0.22  0.63  1.61  1.71  1.76  1.90  2.32  2.68  3.42  4.67

job i          11    12    13    14    15    16    17    18    19    20
finishing time  5.31  5.54  5.59  5.62  5.84  6.04  6.83  7.10  7.23  7.39

Let X(t) be the number of jobs in the printer and its queue at time t. X(t) is a Birth & Death process.

(a) Draw the graph of X(t) for the values monitored.

[Figure: number of jobs X(t) in the system plotted against time t (in h) over the 8 h of monitoring.]

(b) What is the (empirical) probability that there are 5 jobs in the printer and its queue at some time t?
The empirical probability of 5 jobs in the printer is the time X(t) spends in state 5, divided by the total time:
P(X(t) = 5) = [(5.31 − 5.01) + (5.59 − 5.56)] / 8 = 0.33/8 = 0.04125.

The model for a birth or a death, given X(t) = k, is:

B = time till a potential birth ~ Exp_{λ_k},
D = time till a potential death ~ Exp_{μ_k}.

If B < D, the move is to state k + 1 at time t + B; if B > D, the move is to state k − 1 at time t + D. (Remember: P(B = D) = 0!) B and D are independent for each state k.

This implies that, given the process is in state k, the probability to move to state k + 1 is λ_k/(λ_k + μ_k), and the probability to move to state k − 1 is μ_k/(λ_k + μ_k).

Then Y = min(B, D) is the remaining time in state k until the move. What can we say about the distribution of Y := min(B, D)?

P(Y ≤ y) = P(min(B, D) ≤ y) = P(B ≤ y ∪ D ≤ y)
         = P(B ≤ y) + P(D ≤ y) − P(B ≤ y ∩ D ≤ y)                     (way, way back, we looked at this kind of probability)
         = P(B ≤ y) + P(D ≤ y) − P(B ≤ y) · P(D ≤ y)                  (B, D are independent)
         = 1 − e^{−λ_k y} + 1 − e^{−μ_k y} − (1 − e^{−λ_k y})(1 − e^{−μ_k y})    (B ~ Exp_{λ_k}, D ~ Exp_{μ_k})
         = 1 − e^{−(λ_k + μ_k) y} = Exp_{λ_k + μ_k}(y),

i.e. Y itself is again an exponential variable; its rate is the sum of the rates of B and D.

Knowing the distribution of Y, the staying time in state k, lets us compute, for example, the mean staying time in state k. The mean staying time in state k is the expected value of an exponential distribution with rate λ_k + μ_k, and is therefore 1/(λ_k + μ_k). We will mark this result by (*) and use it below.

Note: A Poisson process with rate λ is a special case of a Birth & Death process, where the birth and death rates are constant: λ_k = λ and μ_k = 0 for all k.

The analysis of this model for small t is mathematically difficult because of "start-up" effects, but in some cases we can compute the "large t" behaviour. A lot depends on the ratio of births and deaths.

[Figure: three simulated Birth & Death processes, number of jobs X(t) in the system against time t (in sec), 0 to 2000 sec.]

In the picture, three different simulations of Birth & Death processes are shown. Only in the first case is the process stable (birth rate < death rate). The other two processes are unstable (birth rate = death rate in the 2nd process, birth rate > death rate in the 3rd process).

Only if the B&D process is stable will it find an equilibrium after some time; this is called the steady state of the B&D process. Mathematically, the notion of a steady state translates to

lim_{t→∞} P(X(t) = k) = p_k for all k,

where the p_k are numbers between 0 and 1 with Σ_k p_k = 1. The p_k are called the steady state probabilities of the B&D process; they form a density function for X. At the moment it is not clear why the steady state probabilities need to exist at all - in fact, for some systems they do not. For the moment, though, we will assume that they exist and try to compute them. On the way to the result we will come across conditions under which they actually exist.
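The birth/death mechanism described above (an exponential race between B ~ Exp_{λ_k} and D ~ Exp_{μ_k} in each state) can be simulated directly. The sketch below is our own illustration; the constant rates λ = 3 and μ = 4 are borrowed from the printer setting of Example 4.2.1 purely as example values, and the treatment of state 0 (no death possible) is our modelling choice for a queue.

```python
# Sketch of a constant-rate Birth & Death simulation: in state k draw
# B ~ Exp(lambda) and D ~ Exp(mu) and move up or down, whichever comes first.
import numpy as np

rng = np.random.default_rng(seed=2)

def simulate_bd(lam, mu, T):
    """Simulate a constant-rate B&D process on (0, T); returns (times, states)."""
    t, k = 0.0, 0
    times, states = [0.0], [0]
    while t < T:
        B = rng.exponential(1.0 / lam)                        # time to next potential birth
        D = rng.exponential(1.0 / mu) if k > 0 else np.inf    # no death possible in state 0
        if B < D:
            t, k = t + B, k + 1
        else:
            t, k = t + D, k - 1
        times.append(t)
        states.append(k)
    return np.array(times), np.array(states)

times, states = simulate_bd(lam=3.0, mu=4.0, T=10_000.0)
durations = np.diff(times)
# long-run fraction of time spent in state 0; compare with the steady state
# probability p_0 derived below (here p_0 = 1 - 3/4 = 0.25)
print(np.sum(durations[states[:-1] == 0]) / times[-1])
```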
We can figure out what the p_k must be as follows. As t → ∞,

(time in state k until time t) / (total time t) → p_k,

and since the time in state k is (number of visits to state k by time t) × (mean stay in state k), also

(number of visits to state k by time t) × (mean stay in state k) / (total time t) → p_k.

Using (*), the mean stay in state k is 1/(λ_k + μ_k), so

(number of visits to state k by time t) / (total time t) → p_k (λ_k + μ_k) = long run rate of visits to state k.

A fraction λ_k/(λ_k + μ_k) of the visits to state k result in moves to state k + 1, so

λ_k/(λ_k + μ_k) · p_k (λ_k + μ_k) = λ_k p_k

is the long run rate of transitions from state k to k + 1 and, similarly, μ_k p_k is the long run rate of transitions from state k to state k − 1. From the very simple principle that everything that flows into state k has to flow out again, we get the so-called balance equations for the steady state probabilities.

Balance equations
The Flow-In = Flow-Out principle provides us with the means to derive equations between the steady state probabilities.

1. For state 0:
μ_1 p_1 = λ_0 p_0, i.e. p_1 = (λ_0/μ_1) p_0.

2. For state 1:
μ_1 p_1 + λ_1 p_1 = λ_0 p_0 + μ_2 p_2, i.e. p_2 = (λ_1/μ_2) p_1 = (λ_0 λ_1)/(μ_1 μ_2) p_0.

3. For state 2:
μ_2 p_2 + λ_2 p_2 = λ_1 p_1 + μ_3 p_3, i.e. p_3 = (λ_2/μ_3) p_2 = (λ_0 λ_1 λ_2)/(μ_1 μ_2 μ_3) p_0.

4. ... For state k we get:
p_k = (λ_0 λ_1 λ_2 · ... · λ_{k−1}) / (μ_1 μ_2 μ_3 · ... · μ_k) · p_0.

OK, so now we know all the steady state probabilities in terms of p_0. But what use is that if we don't know p_0? Here we need another trick: we know that the steady state probabilities form the density function for the state X. Their sum must therefore be 1! Then

1 = p_0 + p_1 + p_2 + ... = p_0 · (1 + λ_0/μ_1 + (λ_0 λ_1)/(μ_1 μ_2) + ...) =: p_0 · S.

If this sum S converges, we get p_0 = S^{−1}. If it does not converge, we know that we don't have any steady state probabilities, i.e. the B&D process never reaches an equilibrium. The analysis of S is crucial!

If S exists, p_0 does, and with p_0 all p_k, which implies that the Birth & Death process is stable. If S does not exist, then the B&D process is unstable, i.e. it has no equilibrium and no steady state probabilities.

Special case: Birth & Death process with constant birth and death rates
If all birth rates are equal to a constant λ (λ_k = λ) and all death rates are equal to a constant μ (μ_k = μ) for all k, the ratio between birth and death rates is constant, too:

a := λ/μ.

a is called the traffic intensity. In order to decide whether a specific B&D process is stable or not, we have to look at S. For constant traffic intensity, S can be written as

S = 1 + λ_0/μ_1 + (λ_0 λ_1)/(μ_1 μ_2) + ... = 1 + a + a^2 + a^3 + ... = Σ_{k=0}^{∞} a^k.

This sum is a geometric series. If 0 < a < 1, the series converges:

S = 1/(1 − a) for 0 < a < 1.

Then:
p_0 = S^{−1} = 1 − a,
p_k = a^k · (1 − a) = P(X(t) = k) for large t,

i.e. X(t) has a Geometric distribution for large t: X(t) ~ Geo_{1−a} for large t and 0 < a < 1.
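The product formula for the p_k and the normalization by S translate directly into a few lines of code. The helper below is our own sketch (the name steady_state, the argument convention, and the truncation at 200 states are arbitrary choices, not part of the text); for constant rates it reproduces the geometric steady state just derived.

```python
# Sketch: steady-state probabilities from the product formula
#   p_k = (lambda_0 * ... * lambda_{k-1}) / (mu_1 * ... * mu_k) * p_0,
# normalized so that the p_k sum to 1.
import numpy as np

def steady_state(birth_rates, death_rates):
    """birth_rates[k] = lambda_k and death_rates[k] = mu_{k+1}, k = 0, ..., n-1.

    Returns the steady-state probabilities p_0, ..., p_n.
    """
    ratios = np.cumprod(np.asarray(birth_rates, dtype=float) /
                        np.asarray(death_rates, dtype=float))
    terms = np.concatenate(([1.0], ratios))   # 1, lam0/mu1, lam0*lam1/(mu1*mu2), ...
    return terms / terms.sum()                # p_k = terms[k] / S

# constant rates lambda = 3, mu = 4 (traffic intensity a = 0.75), truncated at a
# large state as an approximation of the infinite chain:
p = steady_state([3.0] * 200, [4.0] * 200)
print(p[:4])   # close to (1 - a) * a^k: 0.25, 0.1875, 0.1406, 0.1055
```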
Example 4.2.2 Printer queue (continued)
A certain printer in the Stat Lab gets jobs at a rate of 3 per hour. On average, the printer needs 15 min to finish a job. Let X(t) be the number of jobs in the printer and its queue at time t. X(t) is a Birth & Death process with constant arrival rate λ = 3 and constant death rate μ = 4.

(a) Draw a state diagram for X(t), the (technically possible) number of jobs in the printer (and its queue).

[State diagram: states 0, 1, 2, 3, ...; each arrow to the next higher state has rate 3, each arrow to the next lower state has rate 4.]

(b) What is the (true) probability that at some time t the printer is idle?
P(X(t) = 0) = p_0 = 1 − 3/4 = 0.25.

(c) What is the probability that more than 7 jobs arrive during one hour?
Let Y(t) be the number of arrivals by time t. Y(t) is a Poisson process with arrival rate λ = 3, so Y(t) ~ Po_{λt}.
P(Y(1) > 7) = 1 − P(Y(1) ≤ 7) = 1 − Po_{3·1}(7) = 1 − 0.988 = 0.012.

(d) What is the probability that the printer is idle for more than 1 hour at a time? (Hint: this is the probability that X(t) = 0 and, at the same time, no job arrives for more than one hour.)
Let Z be the time until the next arrival; then Z ~ Exp_3. Since X(t) and Z are independent,
P(X(t) = 0 ∩ Z > 1) = P(X(t) = 0) · P(Z > 1) = p_0 · (1 − Exp_3(1)) = 0.25 · e^{−3} = 0.0124.

(e) What is the probability that there are 3 jobs in the printer system at time t (including the job being printed at the moment)?
P(X(t) = 3) = p_3 = 0.75^3 · 0.25 ≈ 0.105.

(f) What is the difference between the true and the empirical probability of exactly 5 jobs in the printer system?
p_5 (true) = 0.75^5 · 0.25 = 0.0593,
p_5 (empirical, from Example 4.2.1) = 0.04125.
The probabilities are reasonably close, which suggests that this particular printer queue actually behaves like a Birth & Death process.

Two Examples of Birth & Death Processes

Communication System
A communication system has two processors for decoding messages and a buffer that will hold at most two further messages. If the buffer is full, any incoming message is lost. Each processor needs on average 2 min to decode a message. Messages come in at a rate of 1 per min. Assume exponential distributions both for the interarrival times between messages and for the time needed to decode a message. Use a Birth & Death process to model the number of messages in the system.

(a) Carefully draw a transition state diagram.

[State diagram: states 0, 1, 2, 3, 4; birth rate 1 out of states 0 to 3; death rates μ_1 = 0.5 (one processor busy) and μ_2 = μ_3 = μ_4 = 1 (both processors busy).]

(b) Find the steady state probability that there are no messages in the system.
Since p_0 = S^{−1}, we need to compute S first:
S = 1 + λ_0/μ_1 + (λ_0 λ_1)/(μ_1 μ_2) + (λ_0 λ_1 λ_2)/(μ_1 μ_2 μ_3) + (λ_0 λ_1 λ_2 λ_3)/(μ_1 μ_2 μ_3 μ_4) = 1 + 2 + 2 + 2 + 2 = 9.
Therefore p_0 = 1/9.

ICB - International Campus Bank Ames
The ICB Ames employs three tellers. Customers arrive according to a Poisson process with a mean rate of 1 per minute. If a customer finds all tellers busy, he or she joins a queue that is serviced by all tellers. Transaction times are independent and have exponential distributions with mean 2 minutes.

(a) Sketch an appropriate state diagram for this queueing system.

[State diagram: states 0, 1, 2, 3, 4, ...; birth rate 1 out of every state; death rates μ_1 = 0.5, μ_2 = 1, and μ_k = 1.5 for k ≥ 3.]

(b) As it turns out, the large-t probability that there are no customers in the system is p_0 = 1/9. What is the probability that a customer entering the bank must enter the queue and wait for service?
A person entering the bank must queue for service if at least three people are already in the bank (not including the one entering at the moment). We are therefore looking for the large-t probability that X(t) is at least 3:
P(X(t) ≥ 3) = 1 − P(X(t) < 3) = 1 − P(X(t) ≤ 2) = 1 − (p_0 + p_1 + p_2) = 1 − (1/9 + 2·1/9 + 2·1/9) = 4/9.
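As a numerical cross-check, the same product-formula computation reproduces both worked examples above. The helper from the earlier sketch is repeated here so the block is self-contained (our own code; the rates are read off the state diagrams, and truncating the bank's infinite chain at a large state is our approximation).

```python
import numpy as np

def steady_state(birth_rates, death_rates):
    ratios = np.cumprod(np.asarray(birth_rates, dtype=float) /
                        np.asarray(death_rates, dtype=float))
    terms = np.concatenate(([1.0], ratios))
    return terms / terms.sum()

# Communication system: states 0..4, births at rate 1 out of states 0-3,
# deaths at rate 0.5 (one processor busy) or 1 (both processors busy).
p_comm = steady_state([1, 1, 1, 1], [0.5, 1, 1, 1])
print(p_comm[0])               # 1/9

# Bank: births at rate 1, deaths at rate 0.5, 1, 1.5, 1.5, ... (one, two, then
# all three tellers busy); truncated at a large state to approximate the chain.
n = 200
p_bank = steady_state([1.0] * n, [0.5, 1.0] + [1.5] * (n - 2))
print(1 - p_bank[:3].sum())    # P(X(t) >= 3) = 4/9
```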