Positive Walks - Department of Mathematics | University of Nebraska

Steven R. Dunbar
Department of Mathematics
203 Avery Hall
University of Nebraska-Lincoln
Lincoln, NE 68588-0130
http://www.math.unl.edu
Voice: 402-472-3731
Fax: 402-472-8466
Topics in
Probability Theory and Stochastic Processes
Steven R. Dunbar
Positive Walks
Rating
Mathematicians Only: prolonged scenes of intense rigor.
Section Starter Question
How many random walks with 6 steps are there? Among those walks, how
many are positive for all 6 steps? How many are non-negative for all 6 steps?
Key Concepts
1. The Reflection Principle for Paths is a one-to-one correspondence
showing that the number of paths with origin $(0, a)$ and endpoint $(n, b)$
touching or crossing the x-axis (call these paths of the first type) equals
the number of paths with origin $(0, -a)$ and endpoint $(n, b)$ (call these
paths of the second type).
2. The probability of a positive walk is
\[ P_{2n}[T_1 > 0, T_2 > 0, \ldots, T_{2n} > 0] = \frac{1}{2^{2n+1}} \binom{2n}{n}. \]
3. The probability of a non-negative walk is
\[ P_{2n}[T_1 \ge 0, T_2 \ge 0, \ldots, T_{2n} \ge 0] = \frac{1}{2^{2n}} \binom{2n}{n}. \]
4. The probability that one player is ahead until the last coin toss in the
game, which ties the two players, is
\[ P_{2n}[T_1 > 0, T_2 > 0, \ldots, T_{2n-1} > 0, T_{2n} = 0] = \frac{1}{n 2^{2n}} \binom{2n-2}{n-1}. \]
Vocabulary
1. To each element $\omega \in \Omega_n$ we associate a piecewise linear curve in
$\mathbb{R}^2$ consisting of a finite union of segments of the form
$[(i, j), (i + 1, j + 1)]$ or $[(i, j), (i + 1, j - 1)]$, where $i, j$ are
integers, called a path.
2. The Reflection Principle for Paths is a one-to-one correspondence
showing that the number of paths with origin $(0, a)$ and endpoint $(n, b)$
touching or crossing the x-axis (call these paths of the first type) equals
the number of paths with origin $(0, -a)$ and endpoint $(n, b)$ (call these
paths of the second type).
Mathematical Ideas
Probability Interpretation: Positive Walks
Recall that $Y_i$ is a sequence of independent random variables which take
the value 1 with probability 1/2 and −1 with probability 1/2. This is a
mathematical model of a fair coin flip game where a 1 results from “heads” on
the $i$th coin toss and a −1 results from “tails”. Let $H_n$ and $L_n$ be the
number of heads and tails respectively in $n$ flips. Then
$T_n = \sum_{i=1}^{n} Y_i = H_n - L_n = 2H_n - n$
counts the difference between the number of heads and tails, an excess of
heads if positive.
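As a small illustration of these definitions, the following Python sketch computes the partial sums $T_0, T_1, \ldots, T_n$ from a sequence of flips and checks the identity $T_n = H_n - L_n = 2H_n - n$. The function name `partial_sums` and the sample flips are illustrative assumptions, not part of the original notes.

```python
from itertools import accumulate

def partial_sums(flips):
    """Return [T_0, T_1, ..., T_n] for a list of +1/-1 flips, with T_0 = 0."""
    return [0] + list(accumulate(flips))

flips = [1, -1, 1, 1, -1, 1]   # hypothetical sample: H T H H T H
walk = partial_sums(flips)
heads = flips.count(1)         # H_n
tails = flips.count(-1)        # L_n
n = len(flips)

# T_n equals both H_n - L_n and 2 H_n - n
assert walk[-1] == heads - tails == 2 * heads - n
print(walk)
```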
A common interpretation of this probability game is to imagine it as a
random walk. That is, we imagine an individual on a number line, starting
at some position $T_0$. The person takes a step to the right to $T_0 + 1$ with
probability $p$ and takes a step to the left to $T_0 - 1$ with probability
$q = 1 - p$ (here $p = q = 1/2$) and continues this random process. Then
instead of the total fortune at any time, we consider the geometric position
on the line at any time.
Create a common graphical representation of the game. A continuous
piecewise linear curve in $\mathbb{R}^2$ consisting of a finite union of
segments of the form $[(i, j), (i + 1, j + 1)]$ or $[(i, j), (i + 1, j - 1)]$,
where $i, j$ are integers, is called a path. A path has an origin $(a, b)$ and
an endpoint $(c, d)$, which are points on the curve with integer coordinates
satisfying $a \le i \le c$ for all $(i, j)$ on the curve. We will say the
length of the path is $c - a$. (Note that the Euclidean length of the path is
$(c - a)\sqrt{2}$.) To each element $\omega \in \Omega_n$ (see Binomial
Distribution), we associate the path
$\bigcup_{i=0}^{n-1} [(i, T_i(\omega)), (i + 1, T_{i+1}(\omega))]$
with origin $(0, 0)$ and endpoint $(n, T_n(\omega))$.
If $c$ and $d$ are two integers such that $0 \le |d| \le c$, then the number
of paths with origin $(0, 0)$ and endpoint $(c, d)$ is zero if $c + d$ is odd
and $\binom{c}{(c+d)/2}$ if $c + d$ is even. More generally, if $a$, $b$, $c$
and $d$ are integers such that $0 \le |d - b| \le c - a$ and $c - a + d - b$ is
even, then the number of paths with origin $(a, b)$ and endpoint $(c, d)$ is
$\binom{c-a}{(c-a+d-b)/2}$, the number of ways to choose which of the $c - a$
steps go up.
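The path-counting formula can be checked by brute force for short paths. The following Python sketch enumerates every sequence of ±1 steps and compares the count with the binomial coefficient; the function names are hypothetical and the test cases are arbitrary small examples.

```python
from itertools import product
from math import comb

def count_paths_brute(a, b, c, d):
    """Count paths from (a, b) to (c, d) by enumerating all +1/-1 step sequences."""
    steps = c - a
    return sum(1 for w in product((1, -1), repeat=steps) if b + sum(w) == d)

def count_paths_formula(a, b, c, d):
    """Binomial count: choose which of the c - a steps go up."""
    steps, rise = c - a, d - b
    if steps < 0 or abs(rise) > steps or (steps + rise) % 2 != 0:
        return 0
    return comb(steps, (steps + rise) // 2)

for (a, b, c, d) in [(0, 0, 6, 2), (0, 0, 5, 2), (1, 1, 8, 4), (1, -1, 8, 4)]:
    assert count_paths_brute(a, b, c, d) == count_paths_formula(a, b, c, d)
    print((a, b, c, d), count_paths_formula(a, b, c, d))
```

The enumeration grows like $2^{c-a}$, so it is only a sanity check for small lengths.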
Proposition 1 (Reflection Principle for Paths). Let $a, b \ge 0$ and $n > 0$ be
integers. Then
\[ P_n[T_n = b - a \text{ and } T_k = -a \text{ for some } k \in [0, n]] = P_n[T_n = b + a]. \]
Proof. The case $a = 0$ is trivial, so suppose $a > 0$. The proof will establish
a one-to-one correspondence between paths with origin $(0, 0)$ and endpoint
$(n, b - a)$ that touch the horizontal line $y = -a$ and paths with origin
$(0, 0)$ and endpoint $(n, a + b)$. Translating the first set of paths up by $a$
units and the second set down by $a$ units, this is equivalent to showing that
the number of paths with origin $(0, a)$ and endpoint $(n, b)$ touching or
crossing the x-axis (call these paths of the first type) equals the number of
paths with origin $(0, -a)$ and endpoint $(n, b)$ (call these paths of the
second type). If $C$ is a path of the first type, set $t(C)$ to be the smallest
$i > 0$ such that $(i, 0) \in C$. The path $C$ is a union of a path $C_1$ with
origin $(0, a)$ and endpoint $(t(C), 0)$ and a path $C_2$ with origin
$(t(C), 0)$ and endpoint $(n, b)$. Then to each $C$, we associate the path $C'$
that is the union of $C_2$ with the reflection of $C_1$ across the x-axis. The
path $C'$ is a path of the second type. Conversely, every path of the second
type must meet the x-axis, because it starts below the axis at $(0, -a)$ and
ends at height $b \ge 0$, so reflecting its initial portion up to its first
visit to the x-axis recovers a path of the first type. Hence the
correspondence $C \leftrightarrow C'$ is one-to-one between the two sets of
paths.
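The Reflection Principle can also be checked numerically for short walks. The sketch below, under the assumption that "crossing" the level $-a$ means visiting it at some time $k \in [0, n]$, enumerates all $n$-step walks and compares the two counts; the function names are hypothetical.

```python
from itertools import product, accumulate
from math import comb

def touching_paths(n, a, b):
    """Count n-step walks from 0 ending at b - a that visit -a at some time in [0, n]."""
    count = 0
    for w in product((1, -1), repeat=n):
        heights = [0] + list(accumulate(w))
        if heights[-1] == b - a and -a in heights:
            count += 1
    return count

def paths_ending_at(n, h):
    """Count n-step walks from 0 ending at height h."""
    if abs(h) > n or (n + h) % 2 != 0:
        return 0
    return comb(n, (n + h) // 2)

# Both sides are counts out of the same 2^n equally likely walks,
# so equality of counts is equality of probabilities.
for n in range(1, 11):
    for a in range(0, 4):
        for b in range(0, 4):
            assert touching_paths(n, a, b) == paths_ending_at(n, a + b)
```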
Remark. The probability the random walk will return to 0 after $2n$ steps,
so that its path ends at $(2n, 0)$, is
\[ P_{2n}[T_{2n} = 0] = \frac{1}{2^{2n}} \binom{2n}{n}. \]
This is also the probability that in a coin-tossing game, the players will be
tied at the end of $2n$ tosses. The next theorem shows that this is double the
probability that the walk remains positive for all $2n$ steps, or equivalently
that the Heads player in the coin-tossing game is always ahead.
Figure 1: A path $C$ starting at $(0, 0)$, touching the level $-a$, ending at
$(n, b - a)$, and its reflection and translation to a path $C'$ starting at
$(0, 0)$, passing through $a$, ending at $(n, b + a)$. (Axes: $n$ horizontal,
$T_n$ vertical.)
Theorem 2 (Positive Walks Theorem).
\[ P_{2n}[T_1 > 0, T_2 > 0, \ldots, T_{2n} > 0] = \frac{1}{2^{2n+1}} \binom{2n}{n}. \]
Proof. Count the paths with origin $(0, 0)$ and length $2n$ that, after leaving
the origin, are strictly contained in the upper half-plane. Every such path
passes through $(1, 1)$, so to obtain this count, sum over $k = 1, \ldots, n$
the number of paths from $(1, 1)$ to $(2n, 2k)$ that do not touch the x-axis.
There is only one path that connects the point $(1, 1)$ to the point
$(2n, 2n)$ and this path does not return to the x-axis. If $1 \le k < n$, the
number of paths connecting $(1, 1)$ to the point $(2n, 2k)$ that do not return
to the x-axis equals the number of paths connecting $(1, 1)$ to the point
$(2n, 2k)$ minus the number of paths that do return to the x-axis. The number
of paths connecting $(1, 1)$ to the point $(2n, 2k)$ is $\binom{2n-1}{n+k-1}$.
By the reflection principle the number of paths connecting $(1, 1)$ to the
point $(2n, 2k)$ that return to the x-axis equals the number of paths
connecting the point $(1, -1)$ to the point $(2n, 2k)$, which is
$\binom{2n-1}{n+k}$.
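Therefore, the number of paths with origin $(0, 0)$ and length $2n$ that are
contained in the upper half-plane is
\[ 1 + \sum_{k=1}^{n-1} \left[ \binom{2n-1}{n+k-1} - \binom{2n-1}{n+k} \right]. \]
This sum telescopes to $\binom{2n-1}{n}$ and
\[ \binom{2n-1}{n} = \frac{(2n-1)!}{n!(n-1)!} = \frac{1}{2} \frac{(2n)(2n-1)!}{n! \cdot n \cdot (n-1)!} = \frac{1}{2} \binom{2n}{n}. \]
Dividing by the $2^{2n}$ equally likely paths of length $2n$ gives the stated
probability.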
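For small $n$ the count in the Positive Walks Theorem can be confirmed by direct enumeration. The following Python sketch counts the walks of length $2n$ that stay strictly positive and compares the count with $\binom{2n-1}{n} = \frac{1}{2}\binom{2n}{n}$; the function name `count_positive_walks` is an illustrative assumption.

```python
from itertools import product, accumulate
from math import comb
from fractions import Fraction

def count_positive_walks(steps):
    """Count walks of the given length with T_1 > 0, ..., T_steps > 0."""
    return sum(
        1
        for w in product((1, -1), repeat=steps)
        if all(t > 0 for t in accumulate(w))
    )

for n in range(1, 8):
    brute = count_positive_walks(2 * n)
    assert brute == comb(2 * n - 1, n) == comb(2 * n, n) // 2
    # exact probability as a fraction of the 2^(2n) equally likely walks
    prob = Fraction(brute, 2 ** (2 * n))
    assert prob == Fraction(comb(2 * n, n), 2 ** (2 * n + 1))
    print(2 * n, brute, prob)
```

With $2n = 6$ this also answers part of the Section Starter Question: of the $2^6 = 64$ walks with 6 steps, 10 are positive for all 6 steps.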
Remark. The probability the random walk will return to 0 after $2n$ steps is
\[ P_{2n}[T_{2n} = 0] = \frac{1}{2^{2n}} \binom{2n}{n}. \]
This is also the probability that in a coin-tossing game, the players will be
tied at the end of $2n$ tosses. The next theorem shows that this also equals
the probability that the walk remains nonnegative for all $2n$ steps, or
equivalently that the Tails player in the coin-tossing game is never ahead.
Notice the difference between the previous theorem and the next theorem: the
first is about positive walks and the second is about nonnegative walks.
Theorem 3 (Non-Negative Walks Theorem).
\[ P_{2n}[T_1 \ge 0, T_2 \ge 0, \ldots, T_{2n} \ge 0] = \frac{1}{2^{2n}} \binom{2n}{n}. \]
Proof. The proof shows that
\[ P_{2n}[T_1 \ge 0, T_2 \ge 0, \ldots, T_{2n} \ge 0] = 2 P_{2n}[T_1 > 0, T_2 > 0, \ldots, T_{2n} > 0]. \]
The claim is that the number of paths with origin $(0, 0)$ and length $2n$ that
are contained in the upper half-plane with no return to the x-axis equals the
number of paths with origin $(0, 0)$ and length $2n - 1$ that are contained in
the upper half-plane including the x-axis. The claim is true because there
is a bijection between the sets of these two types of paths: We associate a
path of the second type to each path of the first type by removing the initial
segment and translating the path 1 unit left and 1 unit down.
Note that $T_{2n-1}$ is never zero, because $2n - 1$ is odd. Then to each path
with origin $(0, 0)$ and length $2n - 1$ that is contained in the upper
half-plane including the x-axis, we can associate exactly two paths with
origin $(0, 0)$ and length $2n$ that are contained in the upper half-plane
including the x-axis. To do this, we add a segment of length 1 and slope 1 or
−1 to the end of the path of length $2n - 1$. Therefore, the cardinality of the
event $\{T_1 \ge 0, T_2 \ge 0, \ldots, T_{2n} \ge 0\}$ is twice the cardinality
of the event $\{T_1 > 0, T_2 > 0, \ldots, T_{2n} > 0\}$, and the result follows
from the Positive Walks Theorem.
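The two counting steps in this proof can be checked by enumeration for small $n$. The Python sketch below is an illustration only; the function `count_walks` and its `strict` flag are hypothetical names introduced here.

```python
from itertools import product, accumulate
from math import comb

def count_walks(steps, strict):
    """Count walks of the given length staying > 0 (strict) or >= 0 (not strict) after time 0."""
    def ok(t):
        return t > 0 if strict else t >= 0
    return sum(
        1
        for w in product((1, -1), repeat=steps)
        if all(ok(t) for t in accumulate(w))
    )

for n in range(1, 7):
    positive = count_walks(2 * n, strict=True)
    nonneg_odd = count_walks(2 * n - 1, strict=False)
    nonneg_even = count_walks(2 * n, strict=False)
    assert positive == nonneg_odd          # bijection: drop the first step
    assert nonneg_even == 2 * nonneg_odd   # two ways to extend by one step
    assert nonneg_even == comb(2 * n, n)   # count behind the theorem
```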
Remark. The next corollary calculates the probability that one player is
ahead until the last coin toss in the game, which ties the two players.
Corollary 1.
\[ P_{2n}[T_1 > 0, T_2 > 0, \ldots, T_{2n-1} > 0, T_{2n} = 0] = \frac{1}{n 2^{2n}} \binom{2n-2}{n-1}. \]
Proof. Count the paths with origin $(0, 0)$ and endpoint $(2n, 0)$ that, apart
from these two points, are contained in the open upper half-plane, which does
not include the x-axis. In other words, count the paths with origin $(1, 1)$
and endpoint $(2n - 1, 1)$ that are contained in the open upper half-plane.
This number equals the number of all paths with origin $(1, 1)$ and endpoint
$(2n - 1, 1)$ minus the number of such paths that touch the x-axis at some
point. The number of paths with origin $(1, 1)$ and endpoint $(2n - 1, 1)$
equals $\binom{2n-2}{n-1}$. By the reflection principle, the number of paths
with origin $(1, 1)$ and endpoint $(2n - 1, 1)$ that touch the x-axis equals
the number of paths with origin $(1, -1)$ and endpoint $(2n - 1, 1)$. This
number is $\binom{2n-2}{n-2}$.
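Finally, it is easy to check that
\begin{align*}
\binom{2n-2}{n-1} - \binom{2n-2}{n-2}
  &= \frac{(2n-2)!}{(n-1)!(n-1)!} - \frac{(2n-2)!}{(n-2)!\,n!} \\
  &= \frac{n(2n-2)! - (n-1)(2n-2)!}{(n-1)!\,n!} \\
  &= \frac{1}{n} \frac{(2n-2)!}{(n-1)!(n-1)!} \\
  &= \frac{1}{n} \binom{2n-2}{n-1}.
\end{align*}
Dividing by the $2^{2n}$ equally likely paths of length $2n$ gives the stated
probability.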
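As a check on this count, the following Python sketch enumerates the walks of length $2n$ that stay positive until a first return to 0 at step $2n$ and compares the result with $\frac{1}{n}\binom{2n-2}{n-1}$. The function name is hypothetical, and the case $n = 1$ is handled separately because $\binom{0}{-1}$ is taken to be 0.

```python
from itertools import product, accumulate
from math import comb

def count_first_return_paths(n):
    """Count 2n-step walks with T_1 > 0, ..., T_{2n-1} > 0 and T_{2n} = 0."""
    count = 0
    for w in product((1, -1), repeat=2 * n):
        heights = list(accumulate(w))
        if heights[-1] == 0 and all(t > 0 for t in heights[:-1]):
            count += 1
    return count

for n in range(1, 8):
    expected = comb(2 * n - 2, n - 1) // n
    assert count_first_return_paths(n) == expected
    # difference form used in the proof; the reflected count is 0 when n = 1
    reflected = comb(2 * n - 2, n - 2) if n >= 2 else 0
    assert expected == comb(2 * n - 2, n - 1) - reflected
```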
Remark. The corollaries imply the following combinatorial identity.
Corollary 2. If $n$ and $k$ are integers with $0 \le k \le n$, then
\[ \sum_{j=1}^{n-k} \frac{1}{j} \binom{2j-2}{j-1} \binom{2(n-j)-k}{n-j} = \binom{2n-k-1}{n}. \]
In particular,
\[ 2 \sum_{j=1}^{n} \frac{1}{j} \binom{2j-2}{j-1} \binom{2(n-j)}{n-j} = \binom{2n}{n}. \]
Proof. Count the number of paths with origin $(0, 0)$ and endpoint
$(2n - k, k)$ in two different ways. On one hand, the number of such paths is
$\binom{2n-k}{n}$. On the other hand, the number of paths equals the number of
paths with origin $(1, 1)$ and endpoint $(2n - k, k)$ plus the number of paths
with origin $(1, -1)$ and endpoint $(2n - k, k)$.
The number of paths with origin $(1, 1)$ and endpoint $(2n - k, k)$ is
$\binom{2n-k-1}{n-1}$. Next, for every path with origin $(1, -1)$ and endpoint
$(2n - k, k)$, there exists a minimum integer $j$ with $1 \le j \le n - k$ such
that the path passes through $(2j, 0)$. The number of such paths is therefore
equal to the sum for $j$ from 1 to $n - k$ of the product of the number of
paths with origin $(0, 0)$ and endpoint $(2j, 0)$ that, apart from these two
points, are contained in the open lower half-plane and the number of paths
with origin $(2j, 0)$ and endpoint $(2n - k, k)$. By the previous corollary
and symmetry, the first factor is $\frac{1}{j}\binom{2j-2}{j-1}$, and the
second factor is $\binom{2(n-j)-k}{n-j}$, so the number of paths with origin
$(1, -1)$ and endpoint $(2n - k, k)$ is
\[ \sum_{j=1}^{n-k} \frac{1}{j} \binom{2j-2}{j-1} \binom{2(n-j)-k}{n-j}. \]
Considering all of the information above, we see that this equals
\[ \binom{2n-k}{n} - \binom{2n-k-1}{n-1} = \binom{2n-k-1}{n}. \]
The second form follows from setting $k = 0$ and using the fact that
\[ \binom{2n}{n} = 2 \binom{2n-1}{n}. \]
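The identity of Corollary 2 is easy to test numerically with exact rational arithmetic. The following Python sketch evaluates both sides for small $n$ and $k$; the names `lhs` and `rhs` are illustrative assumptions.

```python
from fractions import Fraction
from math import comb

def lhs(n, k):
    """Sum over j = 1..n-k of (1/j) C(2j-2, j-1) C(2(n-j)-k, n-j)."""
    return sum(
        Fraction(comb(2 * j - 2, j - 1), j) * comb(2 * (n - j) - k, n - j)
        for j in range(1, n - k + 1)
    )

def rhs(n, k):
    """C(2n-k-1, n), which is 0 when k = n."""
    return comb(2 * n - k - 1, n)

for n in range(1, 15):
    for k in range(0, n + 1):
        assert lhs(n, k) == rhs(n, k)
    # special case k = 0: twice the sum is the central binomial coefficient
    assert 2 * lhs(n, 0) == comb(2 * n, n)
```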
Remark. Note that:
• The Positive Walks Theorem is different from the Hitting Time Theorem. The Positive Walks Theorem gives the probability that an n-step
random walk starting at 0 is always positive throughout the remainder
of the walk, without regard to the value of the endpoint.
• The Hitting Time Theorem expresses a conditional probability.
• By time reversal and symmetry around k, the Hitting Time Theorem
is equivalent to a statement about the probability that an n-step random
walk taking independent and identically distributed ±1 steps stays positive
after time 0, given that it ends at height k > 0.
Sources
This section is adapted from: Heads or Tails, by Emmanuel Lesigne, Student
Mathematical Library Volume 28, American Mathematical Society, Providence, 2005, Chapter 10.3. [1].
Algorithms, Scripts, Simulations
Algorithm
The scripts generate k trials of n-step random walks. Each random walk is
the sequence of cumulative sums from a sequence of n coin flips, embedded
in a k × (n + 1) matrix walks whose first column is 0 to set the initial
condition. Each random walk is examined for positive steps, creating a
k × n Boolean matrix findposwalks. The rows of the Boolean matrix are summed
and rows with sum n correspond to positive walks. Another comparison finds
a Boolean vector poswalks corresponding to the positive walks. The sum
of the Boolean vector poswalks gives the number of positive walks in the
k trials. Dividing by the number of trials gives the empirical probability of
positive walks. This empirical probability is compared to the probability from
the Positive Walks Theorem, which is computed directly from the binomial
coefficient for a walk with an even number n of steps.
Scripts
R R script for Positive Walks Theorem.
p <- 0.5
n <- 100   # number of steps in each walk (even)
k <- 200   # number of trials

# k x (n+1) matrix of walks; column 1 stays 0 for the initial condition
walks <- array(0, c(k, n+1))
# each row of rw is the cumulative sum of n coin flips valued +1 or -1
rw <- t(apply(2 * matrix((runif(n*k) <= p), k, n) - 1, 1, cumsum))
walks[, 2:(n+1)] <- rw

# number of positive positions in each walk; a positive walk has all n
findposwalks <- apply(0 + (walks[, 2:(n+1)] > 0), 1, sum)
poswalks <- sum(0 + (findposwalks == n))
prob <- poswalks / k

# Positive Walks Theorem for a walk of n = 2m steps: 2^(-(2m+1)) * choose(2m, m)
theoretical <- 2^(-(n+1)) * choose(n, n/2)

cat(sprintf("Empirical probability: %f \n", prob))
cat(sprintf("Positive Walks Theorem probability: %f \n", theoretical))
Octave Octave script for Positive Walks Theorem.
p = 0.5;
n = 100;   % number of steps in each walk (even)
k = 200;   % number of trials

% k x (n+1) matrix of walks; column 1 stays 0 for the initial condition
walks = zeros(k, n+1);
walks(:, 2:n+1) = cumsum((2 * (rand(k, n) <= p) - 1), 2);

% row sums (dimension 2) count the positive positions in each walk
findposwalks = sum(walks(:, 2:n+1) > 0, 2);
poswalks = sum(findposwalks == n);
prob = poswalks / k;

% Positive Walks Theorem for a walk of n = 2m steps
theoretical = 2^(-(n+1)) * bincoeff(n, n/2);

disp("Empirical probability: "), disp(prob)
disp("Positive Walks Theorem probability: "), disp(theoretical)
Perl Perl PDL script for Positive Walks Theorem.
use PDL;
use PDL::NiceSlice;
use PDL::GSLSF::GAMMA;

$p = 0.5;
$n = 100;    # number of steps in each walk (even)
$k = 200;    # number of trials

# (n+1) x k piddle of walks; column 0 stays 0 for the initial condition
$walks = zeros($n + 1, $k);
$rw = cumusumover(2 * (random($n, $k) <= $p) - 1);
$walks(1:$n, 0:$k-1) .= $rw;

# count the positive positions in each walk; a positive walk has all n
$findposwalks = sumover($walks(1:$n, 0:$k-1) > 0);
$poswalks = sum($findposwalks == $n);
$prob = $poswalks / $k;

# Positive Walks Theorem for a walk of n = 2m steps
$x = 2.**(-($n + 1)) * pdl[gsl_sf_choose($n, $n/2)];
$theoretical = $x(0);

print "Empirical probability ", $prob, "\n";
print "Positive Walks Theorem probability ", $theoretical, "\n";
SciPy Scientific Python script for Positive Walks Theorem.
import numpy as np
import scipy.special

p = 0.5
n = 100   # number of steps in each walk (even)
k = 200   # number of trials

# k x (n+1) array of walks; column 0 stays 0 for the initial condition
walks = np.zeros((k, n + 1), dtype=int)
rw = np.cumsum(2 * (np.random.random((k, n)) <= p) - 1, axis=1)
walks[:, 1:n + 1] = rw

# row sums count the positive positions in each walk
findposwalks = 0 + (walks[:, 1:n + 1] > 0)
poswalks = np.sum(np.sum(findposwalks, axis=1) == n)
prob = float(poswalks) / float(k)

# Positive Walks Theorem for a walk of n = 2m steps
theoretical = scipy.special.exp2(-(n + 1)) * scipy.special.binom(n, n // 2)

print("Empirical probability: ", prob)
print("Positive Walks Theorem probability: ", theoretical)
Problems to Work for Understanding
1. Modify the scripts to compare the empirical probability of nonnegative
walks to the probability given in the Nonnegative Walks Theorem.
2. Use the asymptotic growth rate of the central binomial coefficient to
find the asymptotic growth rate of the probability of positive walks, as
the length n of the walks goes to infinity.
Reading Suggestion:
References
[1] Emmanuel Lesigne. Heads or Tails: An Introduction to Limit Theorems
in Probability, volume 28 of Student Mathematical Library. American
Mathematical Society, 2005.
Steve Dunbar’s Home Page, http://www.math.unl.edu/~sdunbar1
Email to Steve Dunbar, sdunbar1 at unl dot edu
Last modified: Processed from LATEX source on September 24, 2014