Urban Operations Research Compiled by James S. Kang Fall 2001

Problem Set 1 Solutions
10/3/2001
1. LO Problem 2.3 (Ingolfsson, 1993; Kang, 2001)
Anyone who arrives at the transfer station observes two independent Poisson processes, A and B.
Each arrival of the combined Poisson process comes from process A (B) with probability $\frac{\lambda_A}{\lambda_A+\lambda_B}$ ($\frac{\lambda_B}{\lambda_A+\lambda_B}$), independently of all other arrivals. Thus the combined Poisson process has an embedded Bernoulli process. To solve this problem, you need a good grasp of the fundamental properties of
Poisson and Bernoulli processes. If you feel uncomfortable with the answers below, now is a good
time to review Poisson and Bernoulli processes, for example, by reading Chapter 4 of “Fundamentals
of Applied Probability Theory” by Alvin Drake.
(a)
(i) The times between successive A train arrivals are independent exponential random variables with parameter $\lambda_A = 3$/hour. By the memoryless property, the fact that Bart arrives at a random time has no relevance. The time he has to wait until the next A train is still an exponential random variable with parameter $\lambda_A$. So if we call his waiting time $X$, then the PDF of $X$ is given by
\[ f_X(x) = \lambda_A e^{-\lambda_A x} = 3e^{-3x}, \quad x \ge 0. \]
This is in fact a random incidence question, so we can also solve it by using formula (2.65) in the textbook:
\[ f_X(x) = \frac{1 - F_Y(x)}{E[Y]}. \]
Let $Y$ be the interarrival time of A trains. Then $E[Y] = \frac{1}{\lambda_A} = \frac{1}{3}$ and
\[ F_Y(x) = P\{Y \le x\} = \int_0^x \lambda_A e^{-\lambda_A y}\,dy = 1 - e^{-\lambda_A x} = 1 - e^{-3x}. \]
Hence
\[ f_X(x) = \frac{1 - (1 - e^{-3x})}{1/3} = 3e^{-3x}, \quad x \ge 0. \]
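The random incidence result in (i) can be checked empirically. The following is a minimal simulation sketch, not part of the original solution: it generates one long Poisson stream of A trains, drops observers at uniformly random times, and records each observer's wait until the next train. The function name `simulate_waits`, the horizon, the sample size, and the seed are all illustrative assumptions.

```python
import bisect
import math
import random


def simulate_waits(rate, horizon, n_observers, seed=0):
    """Waits until the next arrival for observers who show up at random times.

    Arrivals form a Poisson process with the given rate (per hour);
    observers arrive uniformly over [0, horizon]. All names are hypothetical.
    """
    rng = random.Random(seed)
    arrivals = []
    t = 0.0
    # Keep generating until one arrival lies beyond the horizon, so that
    # every observer is guaranteed to have a "next train".
    while t <= horizon:
        t += rng.expovariate(rate)
        arrivals.append(t)
    waits = []
    for _ in range(n_observers):
        u = rng.uniform(0.0, horizon)
        next_arrival = arrivals[bisect.bisect_right(arrivals, u)]
        waits.append(next_arrival - u)
    return waits


waits = simulate_waits(rate=3.0, horizon=10_000.0, n_observers=50_000)
mean_wait = sum(waits) / len(waits)
tail = sum(w > 1 / 3 for w in waits) / len(waits)
print("mean wait (hours):", mean_wait, "vs 1/lambda_A =", 1 / 3)
print("P{X > 1/3}:", tail, "vs exp(-1) =", math.exp(-1))
```

If the memoryless argument holds, the sample mean should be close to 1/3 hour and the tail fraction close to $e^{-1}$, which is what the exponential PDF $3e^{-3x}$ predicts.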
(ii) An easy way to answer this question is to first translate it into a Bernoulli process that one is familiar with, for example, a sequence of coin tosses. An A train then becomes a “head” and a B train becomes a “tail”. The problem now reads: What is the probability that one obtains at least 3 tails before a head is obtained? This probability is the same as the probability that the next three tosses result in tails (i.e. the next three trains are B trains). The outcomes of the fourth and subsequent tosses are irrelevant for the purposes of answering this question. Thus,
\[ P\{\text{at least 3 B trains arrive}\} = P\{\text{next 3 trains are B trains}\} = \left(\frac{\lambda_B}{\lambda_A + \lambda_B}\right)^3 = \left(\frac{6}{3+6}\right)^3 = \left(\frac{2}{3}\right)^3. \]
Another way to answer this question is to consider the complementary event. Let $N_B$ be a random variable denoting the number of B trains that arrive while Bart is waiting. Then
\[ P\{N_B \ge 3\} = 1 - P\{N_B = 0\} - P\{N_B = 1\} - P\{N_B = 2\} \]
\[ = 1 - \frac{\lambda_A}{\lambda_A + \lambda_B} - \frac{\lambda_B}{\lambda_A + \lambda_B}\cdot\frac{\lambda_A}{\lambda_A + \lambda_B} - \left(\frac{\lambda_B}{\lambda_A + \lambda_B}\right)^2\frac{\lambda_A}{\lambda_A + \lambda_B} = \left(\frac{\lambda_B}{\lambda_A + \lambda_B}\right)^3 = \left(\frac{2}{3}\right)^3. \]
(iii) In order for exactly 3 B trains to arrive while Bart is waiting, the next 3 trains should be B trains and the fourth one should be an A train.
\[ P\{\text{exactly 3 B trains}\} = P\{\text{next 3 trains are B trains, the 4th train is an A train}\} = \left(\frac{\lambda_B}{\lambda_A + \lambda_B}\right)^3\frac{\lambda_A}{\lambda_A + \lambda_B} = \left(\frac{2}{3}\right)^3\frac{1}{3}. \]
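The Bernoulli-process answers in (ii) and (iii) can be reproduced with exact arithmetic. This is only an illustrative sketch; the variable names are not from the original solution.

```python
from fractions import Fraction

lam_a, lam_b = 3, 6  # trains per hour, as given in the problem

p_b = Fraction(lam_b, lam_a + lam_b)  # next train is a B train ("tail")
p_a = 1 - p_b                         # next train is an A train ("head")

# (ii) at least 3 B trains before the next A train, computed two ways
at_least_3 = p_b ** 3
via_complement = 1 - sum(p_b ** k * p_a for k in range(3))

# (iii) exactly 3 B trains, then an A train
exactly_3 = p_b ** 3 * p_a

print(at_least_3, via_complement, exactly_3)
```

Both routes to (ii) give the same fraction, which is the point of the complementary-event argument: the number of B trains before the first A train is geometric.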
(b) The combined Poisson process of train arrivals has an arrival rate of $\lambda = \lambda_A + \lambda_B = 9$/hour. The probability of having exactly 9 arrivals during any hour ($t = 1$) in this process is obtained by using the Poisson PMF formula.
\[ P_K(9) = \frac{(\lambda t)^9 e^{-\lambda t}}{9!} = \frac{9^9 e^{-9}}{9!} \approx 0.1318. \]
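As a quick numerical check of the Poisson PMF value, the sketch below evaluates $P_K(9)$ directly. It also prints the reciprocal, which is the mean of a geometric random variable with that success probability. The helper name `poisson_pmf` is an assumption made for illustration.

```python
import math


def poisson_pmf(k, mean):
    """P{K = k} for a Poisson random variable with the given mean."""
    return mean ** k * math.exp(-mean) / math.factorial(k)


rate, hours = 9.0, 1.0          # combined rate (per hour) and window length
p9 = poisson_pmf(9, rate * hours)
expected_trials = 1.0 / p9      # mean of a geometric r.v. with parameter p9

print(round(p9, 4), round(expected_trials, 2))
```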
(c) To answer this question, one needs to consider another Bernoulli process associated with the combined Poisson process. In this Bernoulli process, each hour is an independent trial and an hour is a success if exactly 9 trains arrive during that hour. From (b), a success occurs with probability $P_K(9) \approx 0.1318$. The question asks what the expected number of trials until the first success is. The answer is the expected value of a geometric random variable with parameter $P_K(9)$, i.e.
\[ E[\text{number of hours until exactly 9 trains per hour}] = \frac{1}{P_K(9)} \approx \frac{1}{0.1318} = 7.59. \]
(d)
(i) An A train will be delayed if the time, denoted by $Z$, from the arrival of the A train to the moment of the arrival of the next B train is less than 30 seconds. By the memoryless property, this time has an exponential PDF with parameter $\lambda_B$. Thus the probability that an A train is delayed is obtained by
\[ P_D = P\{Z \le 30 \text{ seconds}\} = \int_0^{(30\text{ secs})(1\text{ hr}/3600\text{ secs})} \lambda_B e^{-\lambda_B t}\,dt = \int_0^{1/120} 6e^{-6t}\,dt = 1 - e^{-1/20}. \]
Since probabilities equal long-run frequencies, this is also approximately the fraction of
A trains that are delayed over some reasonably long period, say a month.
(ii) A B train passenger that benefits from the delay policy does not have to wait at all for an A train, so her expected waiting time under the policy is zero. Without the policy, her mean waiting time would be $\frac{1}{\lambda_A} = \frac{1}{3}$ hour $= 20$ minutes. Therefore such a passenger’s mean waiting time reduction is 20 minutes.
Next, consider a passenger in an A train. With probability $P_D$ computed in (i), the passenger’s travel time will increase by the amount of time that the A train waits for a B train. Note that the mean increase in travel time for a passenger in an A train that is held for a B train is equal to $E[Z \mid Z \le 1/120]$. To compute this quantity, we invoke the total expectation theorem:
\[ E[Z] = E[Z \mid Z \le 1/120]\,P\{Z \le 1/120\} + E[Z \mid Z > 1/120]\,P\{Z > 1/120\}. \]
Since $E[Z] = \frac{1}{\lambda_B}$ and, by the memoryless property, $E[Z \mid Z > 1/120] = \frac{1}{120} + \frac{1}{\lambda_B}$, we have
\[ \frac{1}{\lambda_B} = E[Z \mid Z \le 1/120]\,P_D + \left(\frac{1}{120} + \frac{1}{\lambda_B}\right)(1 - P_D). \]
Rearranging terms,
\[ E[Z \mid Z \le 1/120] = \frac{1/\lambda_B - (1/120 + 1/\lambda_B)(1 - P_D)}{P_D} \approx 0.00413 \text{ hours} = 14.9 \text{ seconds}. \]
Therefore the mean increase in travel time for an A train passenger is
\[ E[Z \mid Z \le 1/120] \times P_D = 14.9 \text{ seconds} \times (1 - e^{-1/20}) = 0.73 \text{ seconds}. \]
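The numbers in this step are easy to reproduce. The sketch below evaluates $P_D$, the conditional mean from the total expectation identity, and the unconditional mean increase. As a cross-check that is not in the original solution, it also evaluates the conditional mean from the closed form obtained by integrating $t\,\lambda_B e^{-\lambda_B t}$ over $[0, 1/120]$. Variable names are illustrative.

```python
import math

lam_b = 6.0            # B trains per hour
hold = 30.0 / 3600.0   # 30-second holding window, in hours (= 1/120)

p_delay = 1.0 - math.exp(-lam_b * hold)  # P_D = 1 - e^{-1/20}

# Total expectation theorem, solved for E[Z | Z <= hold]
cond_mean = (1 / lam_b - (hold + 1 / lam_b) * (1 - p_delay)) / p_delay

# Cross-check (an added derivation): direct integration gives
# E[Z | Z <= c] = 1/lam - c * e^{-lam c} / (1 - e^{-lam c})
cond_mean_direct = 1 / lam_b - hold * math.exp(-lam_b * hold) / p_delay

mean_increase = cond_mean * p_delay  # per A-train passenger, in hours

print("P_D =", p_delay)
print("E[Z | Z <= 1/120] in seconds:", cond_mean * 3600)
print("mean increase in seconds:", mean_increase * 3600)
```

The conditional mean comes out slightly below 15 seconds, the midpoint of the holding window, because the exponential density is decreasing on that window.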
One might attempt to obtain the mean increase in travel time (i.e. the expected waiting time, denoted by $E[W]$) for an A train passenger as follows:
\[ E[W] = E[W \mid Z \le 1/120]\,P_D + E[W \mid Z > 1/120]\,(1 - P_D) \overset{?}{=} \left(\frac{1}{2}\cdot\frac{1}{120}\right)\times(1 - e^{-1/20}) + 0. \]
This is not correct because $E[W \mid Z \le 1/120] \ne \frac{1}{2}\cdot\frac{1}{120}$. The restriction $Z \le 1/120$ cuts off the right tail of the exponential density curve, $f_W(w)$, but what is left is NOT uniform.
Let us assume that it never happens that two or more A trains are delayed waiting for
the same B train, i.e. we ignore the possibility that two A trains may arrive within
30 seconds of each other. Under this assumption, one A train is held for every B train
receiving the benefits. Therefore the policy will lead to a net global travel time reduction
if
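\[ E[\text{total time reduction}] > E[\text{total time increase}] \]
\[ \Rightarrow\; E[N_{BA}] \times \tfrac{1}{3} \text{ hours} > E[N_A] \times 0.00413 \text{ hours} \]
\[ \Rightarrow\; E[N_{BA}] > 0.012 \times E[N_A], \]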
where NBA is the number of people on a B train who wish to transfer to an A train and
NA is the number of people on an A train being held. In words, the policy is favored if
the average number of people on a B train who wish to transfer is at least 1.2% of the
average number of people on an A train.
This is a condition that one would expect to hold true for most subway transfer stations.
You might want to think about the effect of ignoring the possibility of two A trains
arriving within 30 seconds of each other. How would one assess the reasonableness of
this simplification? How would the analysis change if one did not make this simplifying
assumption?
2. LO Exercise 3.5 (Kang, 2001)
As in class, let U be a random variable denoting the distance from the center of the needle to the
nearest line, and let Θ be a random variable denoting the acute angle between the needle and the
vertical line as shown in the figure below.
[Figure: the needle and the lines, with labels d, U, l/2, and Θ.]
If the needle is thrown randomly, $U$ is uniformly distributed between 0 and $\frac{d}{2}$, and $\Theta$ is also uniformly distributed over $[0, \frac{\pi}{2}]$. Let $T$ be the event that the needle touches a line (or lines). If $l \le d$, $P(T) = \frac{2l}{\pi d}$ as we saw in class.
We now compute $P(T)$ when $l > d$. Note that if $\frac{l}{2}\cos\theta \ge \frac{d}{2}$, i.e. $\theta \le \cos^{-1}\frac{d}{l}$, then the needle always touches at least one line no matter where it lands. For the sake of notational simplicity, let $A$ represent the event that $0 \le \Theta \le \cos^{-1}\frac{d}{l}$. Using
the total probability theorem, P (T ) is computed by
P (T ) = P (T | A)P (A) + P (T | Ac )P (Ac ) .
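Since $\Theta \sim U[0, \frac{\pi}{2}]$, $P(A) = \frac{2}{\pi}\cos^{-1}\frac{d}{l}$. Because $P(T \mid A) = 1$, we have
\[ P(T) = \frac{2}{\pi}\cos^{-1}\frac{d}{l} + P(T \mid A^c)\left(1 - \frac{2}{\pi}\cos^{-1}\frac{d}{l}\right). \]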
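If we compute $P(T \mid A^c)$, we are all set. Given a $\theta$, $P(T \mid A^c, \Theta = \theta)$ is computed by
\[ P(T \mid A^c, \Theta = \theta) = P\left(U \le \frac{l}{2}\cos\theta \;\Big|\; A^c\right) = \frac{2}{d}\cdot\frac{l}{2}\cos\theta = \frac{l}{d}\cos\theta, \]
because $U \sim U[0, \frac{d}{2}]$. Now we can compute $P(T \mid A^c)$ as follows:
\[ P(T \mid A^c) = \int_{\cos^{-1}\frac{d}{l}}^{\frac{\pi}{2}} \frac{l}{d}\cos\theta\, f_{\Theta \mid A^c}(\theta)\,d\theta, \]
where $f_{\Theta \mid A^c}(\theta)$ is the conditional PDF of $\Theta$ conditioned on $A^c$. Since $f_{\Theta \mid A^c}(\theta) = \frac{f_\Theta(\theta)}{P(A^c)}$,
\[ P(T \mid A^c) = \int_{\cos^{-1}\frac{d}{l}}^{\frac{\pi}{2}} \frac{l}{d}\cos\theta \cdot \frac{\frac{2}{\pi}}{1 - \frac{2}{\pi}\cos^{-1}\frac{d}{l}}\,d\theta \]
\[ = \frac{\frac{2l}{\pi d}}{1 - \frac{2}{\pi}\cos^{-1}\frac{d}{l}}\Big[\sin\theta\Big]_{\cos^{-1}\frac{d}{l}}^{\frac{\pi}{2}} \]
\[ = \frac{\frac{2l}{\pi d}}{1 - \frac{2}{\pi}\cos^{-1}\frac{d}{l}}\left(1 - \sin\left(\cos^{-1}\frac{d}{l}\right)\right). \]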
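Now we have
\[ P(T) = \frac{2}{\pi}\cos^{-1}\frac{d}{l} + \frac{2l}{\pi d}\left(1 - \sin\left(\cos^{-1}\frac{d}{l}\right)\right). \]
Because $\sin\theta = \sqrt{1 - \cos^2\theta}$,
\[ \sin\left(\cos^{-1}\frac{d}{l}\right) = \sqrt{1 - \left(\cos\left(\cos^{-1}\frac{d}{l}\right)\right)^2} = \sqrt{1 - \left(\frac{d}{l}\right)^2}. \]
Therefore, $P(T)$ is simplified as
\[ P(T) = \frac{2}{\pi}\cos^{-1}\frac{d}{l} + \frac{2l}{\pi d}\left(1 - \sqrt{1 - \left(\frac{d}{l}\right)^2}\right). \]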
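To summarize,
\[ P(T) = \begin{cases} \dfrac{2l}{\pi d} & \text{if } l \le d, \\[2ex] \dfrac{2}{\pi}\cos^{-1}\dfrac{d}{l} + \dfrac{2l}{\pi d}\left(1 - \sqrt{1 - \left(\dfrac{d}{l}\right)^2}\right) & \text{if } l > d. \end{cases} \]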
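The two-case formula can be sanity-checked numerically. The sketch below implements the closed form and compares it with a direct midpoint-rule integration of $P(T \mid \Theta = \theta) = \min\{1, \frac{l}{d}\cos\theta\}$ against the uniform density $\frac{2}{\pi}$ on $[0, \frac{\pi}{2}]$. The function names and step count are assumptions chosen for illustration.

```python
import math


def touch_probability(l, d):
    """Closed-form P(T) for a needle of length l and line spacing d."""
    if l <= d:
        return 2 * l / (math.pi * d)
    r = d / l
    return (2 / math.pi) * math.acos(r) + (2 * l / (math.pi * d)) * (
        1 - math.sqrt(1 - r * r)
    )


def touch_probability_numeric(l, d, steps=200_000):
    """Midpoint rule for (2/pi) * integral of min(1, (l/d) cos(theta))."""
    h = (math.pi / 2) / steps
    total = sum(
        min(1.0, (l / d) * math.cos((i + 0.5) * h)) for i in range(steps)
    )
    return (2 / math.pi) * total * h


for length in (0.5, 1.0, 2.0, 5.0):
    print(length, touch_probability(length, 1.0),
          touch_probability_numeric(length, 1.0))
```

The two branches agree at $l = d$ (both give $\frac{2}{\pi}$), and $P(T)$ approaches 1 as $l/d$ grows, as one would expect for a very long needle.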
3. (Kang, 2001)
Let random variable Y denote the interarrival time of buses.
\[ E[Y] = 3 \times 0.4 + 5 \times 0.5 + 12 \times 0.1 = 4.9 \]
\[ E[Y^2] = 3^2 \times 0.4 + 5^2 \times 0.5 + 12^2 \times 0.1 = 30.5 \]
(a) Let $V$ be the waiting time. Using Equation (2.66) in the textbook,
\[ E[V] = \frac{E[Y^2]}{2E[Y]} = \frac{30.5}{2 \times 4.9} = 3.11 \text{ minutes} \]
(b) Consider N intervals, where N is very large. We can expect that the number of intervals with
length of 3 minutes is 0.4N . Similarly, 0.5N and 0.1N are the numbers of intervals with length
of 5 minutes and of 12 minutes, respectively. Therefore, the total length (minutes) of the N
intervals is 3 × 0.4N + 5 × 0.5N + 12 × 0.1N = 4.9N . The probability that he arrives during
a 12-minute interval is the proportion of the total length taken up by 12-minute intervals to
4.9N .
\[ P(\text{Mendel arrives during a 12-minute interval}) = \frac{12 \times 0.1N}{4.9N} = 0.245 \]
(c) Let W be the length of the interval in which Mendel arrives. We can compute P (V < 1) by
P (V < 1) =P (V < 1 | W = 3)P (W = 3) +
P (V < 1 | W = 5)P (W = 5) +
P (V < 1 | W = 12)P (W = 12).
Given that Mendel arrives in a 3-minute interval, the probability that he waits less than one minute is $\frac{1}{3}$ because the moment of his arrival is totally random (uniformly distributed) over the 3-minute interval. Similarly, $P(V < 1 \mid W = 5) = \frac{1}{5}$ and $P(V < 1 \mid W = 12) = \frac{1}{12}$. Hence,
\[ P(V < 1) = \frac{1}{3}P(W = 3) + \frac{1}{5}P(W = 5) + \frac{1}{12}P(W = 12) \]
\[ = \frac{1}{3}\cdot\frac{3 \times 0.4}{4.9} + \frac{1}{5}\cdot\frac{5 \times 0.5}{4.9} + \frac{1}{12}\cdot\frac{12 \times 0.1}{4.9} \]
\[ = 0.204 \]
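All three parts of this problem follow from the interarrival distribution, so they can be recomputed in a few lines. This is an illustrative sketch using exact fractions; the dictionary `gaps` and the other names are assumptions, not notation from the original.

```python
from fractions import Fraction

# Interarrival time in minutes -> probability, from the problem statement
gaps = {3: Fraction(4, 10), 5: Fraction(5, 10), 12: Fraction(1, 10)}

mean_gap = sum(y * p for y, p in gaps.items())           # E[Y]
second_moment = sum(y * y * p for y, p in gaps.items())  # E[Y^2]

# (a) mean wait under random incidence: E[Y^2] / (2 E[Y])
mean_wait = second_moment / (2 * mean_gap)

# (b) length-biased probability of landing in a gap of length y
p_interval = {y: y * p / mean_gap for y, p in gaps.items()}

# (c) within a gap of length y >= 1, the arrival is uniform, so P(V < 1) = 1/y
p_wait_under_1 = sum(Fraction(1, y) * p_interval[y] for y in gaps)

print(float(mean_wait), float(p_interval[12]), float(p_wait_under_1))
```

Note that in (c) the factors $y$ and $\frac{1}{y}$ cancel, so the answer reduces to $\frac{1}{E[Y]}$, which is why it equals $\frac{1}{4.9}$.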
4. LO Problem 3.13 (Chew, 1997; Kang, 2001)
Following the notation given in the text, $(X_1, Y_1)$ and $(X_2, Y_2)$ denote the locations of the response unit and incident, respectively. $S$ ($S'$) denotes the set of points within (outside) the central square. Let $A = \{(X_1, Y_1) \in S\}$ and $B = \{(X_2, Y_2) \in S\}$, and let $A'$ and $B'$ denote the complementary events.
[Figure: unit square containing a central square of side a, with surrounding regions labeled R1 and R2.]
(a) Let us consider the case in which incidents and the response unit are uniformly, independently distributed over the entire square. In this case, the expected travel distance can be decomposed as
\[ E[D] = E[D \mid A \cap B]P(A \cap B) + E[D \mid A \cap B']P(A \cap B') + E[D \mid A' \cap B]P(A' \cap B) + E[D \mid A' \cap B']P(A' \cap B'). \]
By symmetry, $E[D \mid A \cap B']P(A \cap B') = E[D \mid A' \cap B]P(A' \cap B)$. Hence,
\[ E[D] = E[D \mid A \cap B]P(A \cap B) + 2E[D \mid A \cap B']P(A \cap B') + E[D \mid A' \cap B']P(A' \cap B'). \]
We know that $E[D] = \frac{2}{3}$ and $E[D \mid A \cap B] = \frac{2}{3}a$ from class. Since $A$ and $B$ are independent and $P(A) = P(B) = a^2$, we have
\[ E[D] = \frac{2}{3} = E[D \mid A \cap B]P(A)P(B) + 2E[D \mid A \cap B']P(A)P(B') + E[D \mid A' \cap B']P(A')P(B') \]
\[ = \frac{2}{3}a(a^2)^2 + 2E[D \mid A \cap B']\,a^2(1 - a^2) + E[D \mid A' \cap B'](1 - a^2)^2. \]
(b)
(i) The set $B'$ can be divided into two classes of identically-sized shapes: four rectangles of type $R_1$ (bordering the central square) and four rectangles of type $R_2$ (at corners of the unit-square). Hence,
\[ E[D \mid A \cap B'] = 4E[D \mid A \cap R_1]P(R_1 \mid B') + 4E[D \mid A \cap R_2]P(R_2 \mid B'). \]
Note that $P(R_1 \mid B') = P(R_1 \mid R_1 \cup R_2)\,P(R_1 \cup R_2 \mid B')$. Since $P(R_1 \cup R_2 \mid B') = \frac{1}{4}$, we can rewrite $E[D \mid A \cap B']$ as
\[ E[D \mid A \cap B'] = E[D \mid A \cap R_1]P(R_1 \mid R_1 \cup R_2) + E[D \mid A \cap R_2]P(R_2 \mid R_1 \cup R_2). \]
(ii) By the definition of the conditional probability,
\[ P\{(X_2, Y_2) \in R_1 \mid (X_2, Y_2) \in R_1 \cup R_2\} = \frac{P\{(X_2, Y_2) \in R_1 \text{ and } (X_2, Y_2) \in R_1 \cup R_2\}}{P\{(X_2, Y_2) \in R_1 \cup R_2\}} = \frac{P\{(X_2, Y_2) \in R_1\}}{P\{(X_2, Y_2) \in R_1 \cup R_2\}} \]
\[ = \frac{a(1 - a)\frac{1}{2}}{a(1 - a)\frac{1}{2} + \left(\frac{1}{2}(1 - a)\right)^2} = \frac{a}{a + \frac{1}{2}(1 - a)} = \frac{2a}{1 + a}. \]
(iii) Let $D_x$, $D_y$ be the travel distances in the x axis and in the y axis, respectively. From class, we know $E[D_x \mid A \cap R_1] = \frac{a}{3}$. If the locations of the response unit and incident are uniformly distributed over $S$ and $R_1$ respectively, $E[D_y \mid A \cap R_1] = \frac{a}{2} + \frac{1}{2}\cdot\frac{1-a}{2}$. Hence,
\[ E[D \mid A \cap R_1] = E[D_x \mid A \cap R_1] + E[D_y \mid A \cap R_1] = \frac{a}{3} + \frac{a}{2} + \frac{1}{2}\cdot\frac{1-a}{2} = \frac{1}{4} + \frac{7}{12}a. \]
Note that $E[D_y \mid A \cap R_2]$ is the same as $E[D_y \mid A \cap R_1]$. It is easy to see $E[D_x \mid A \cap R_2] = \frac{a}{2} + \frac{1}{2}\cdot\frac{1-a}{2}$. Therefore,
\[ E[D \mid A \cap R_2] = E[D_x \mid A \cap R_2] + E[D_y \mid A \cap R_2] = 2\left(\frac{a}{2} + \frac{1}{2}\cdot\frac{1-a}{2}\right) = \frac{1}{2} + \frac{1}{2}a. \]
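(c) From (a), we have
\[ \bar{W}(a) = E[D \mid A' \cap B'] = \frac{\frac{2}{3} - \frac{2}{3}a(a^2)^2 - 2E[D \mid A \cap B']\,a^2(1 - a^2)}{(1 - a^2)^2}. \]
From (b), $E[D \mid A \cap B']$ is computed as follows:
\[ E[D \mid A \cap B'] = E[D \mid A \cap R_1]P(R_1 \mid R_1 \cup R_2) + E[D \mid A \cap R_2]P(R_2 \mid R_1 \cup R_2) \]
\[ = \left(\frac{1}{4} + \frac{7}{12}a\right)\frac{2a}{a+1} + \left(\frac{1}{2} + \frac{1}{2}a\right)\left(1 - \frac{2a}{a+1}\right) \]
\[ = \frac{3 + 7a}{12}\cdot\frac{2a}{a+1} + \frac{1+a}{2}\cdot\frac{1-a}{a+1} \]
\[ = \frac{1}{2(a+1)}\left(a + \frac{7a^2}{3} + 1 - a^2\right) = \frac{1}{2(a+1)}\left(\frac{4}{3}a^2 + a + 1\right). \]
$\bar{W}(a)$ is then given by
\[ \bar{W}(a) = \frac{\frac{2}{3}(1 - a^5) - \left(\frac{4}{3}a^2 + a + 1\right)a^2(1 - a)}{(1 - a^2)^2} = \frac{-2a^4 - a^3 - a^2 + 2a + 2}{3(1 + a)(1 - a^2)} = \frac{2a^3 + 3a^2 + 4a + 2}{3(1 + a)^2}. \]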
$\bar{W}(0)$ indicates the expected travel distance when no zero-demand zone exists, which should be equal to $E[D]$. Indeed, $\bar{W}(0) = \frac{2}{3}$. At the other extreme, $\bar{W}(1) = \frac{11}{12}$. If $a = 1$, the entire unit-square is a zero-demand zone. In this case, the response unit and incidents are uniformly distributed along the perimeter of the unit-square. $\bar{W}(1)$ is the expected travel distance from one point on the perimeter to another point on the perimeter. Let us compute this quantity in another way. Consider the locations of the response unit and incident. There are 16 possible cases to consider:
• The response unit and the incident are on the same edge of the square (4 cases). The expected travel distance between two locations is $\frac{1}{3}$.
• The response unit and the incident are on adjacent edges of the square (8 cases). The expected travel distance between two locations is $\frac{1}{2} + \frac{1}{2} = 1$.
• The response unit and the incident are on opposite edges of the square (4 cases). The expected travel distance between two locations is $\frac{1}{3} + 1 = \frac{4}{3}$.
Since all cases are equally likely, $\bar{W}(1) = \frac{4}{16}\cdot\frac{1}{3} + \frac{8}{16}\cdot 1 + \frac{4}{16}\cdot\frac{4}{3} = \frac{11}{12}$.
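The algebra leading to the simplified $\bar{W}(a)$ is long enough that an exact-arithmetic check is worthwhile. The sketch below compares the simplified rational function with the unsimplified expression from (a) and (b) at a sample value of $a$, and confirms the two boundary values. Function names are illustrative assumptions; the unsimplified form divides by $(1 - a^2)^2$, so it is only evaluated for $0 \le a < 1$.

```python
from fractions import Fraction


def w_bar(a):
    """Simplified result: (2a^3 + 3a^2 + 4a + 2) / (3 (1 + a)^2)."""
    a = Fraction(a)
    return (2 * a**3 + 3 * a**2 + 4 * a + 2) / (3 * (1 + a) ** 2)


def w_bar_unsimplified(a):
    """Expression from (a) with E[D | A and B'] from (b); needs a < 1."""
    a = Fraction(a)
    mixed = (Fraction(4, 3) * a**2 + a + 1) / (2 * (a + 1))
    numerator = (Fraction(2, 3) - Fraction(2, 3) * a**5
                 - 2 * mixed * a**2 * (1 - a**2))
    return numerator / (1 - a**2) ** 2


# Perimeter argument for a = 1: same, adjacent, and opposite edges
perimeter = (Fraction(4, 16) * Fraction(1, 3) + Fraction(8, 16) * 1
             + Fraction(4, 16) * Fraction(4, 3))

print(w_bar(0), w_bar(1), perimeter, w_bar(Fraction(1, 2)))
```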
5. LO Problem 3.14 (Kang, 2001)
[Figure: unit square containing the central square, with surrounding regions labeled R1 through R8 and distances Z1 and Z2 marked.]
(a) Given that $(X_1, Y_1) \in S'$ and $(X_2, Y_2) \in S'$ (i.e. given $A' \cap B'$), the cases where the perturbation term is strictly positive are:
• $(X_1, Y_1) \in R_1$ and $(X_2, Y_2) \in R_5$
• $(X_1, Y_1) \in R_5$ and $(X_2, Y_2) \in R_1$
• $(X_1, Y_1) \in R_3$ and $(X_2, Y_2) \in R_7$
• $(X_1, Y_1) \in R_7$ and $(X_2, Y_2) \in R_3$
The probability of the first case is computed by
\[ P((X_1, Y_1) \in R_1 \cap (X_2, Y_2) \in R_5 \mid A' \cap B') = \frac{P((X_1, Y_1) \in R_1 \cap (X_2, Y_2) \in R_5)}{P(A' \cap B')} = \frac{\left(\frac{a(1-a)}{2}\right)^2}{(1 - a^2)^2} = \frac{a^2}{4(a + 1)^2}. \]
By symmetry, the probabilities of the other three cases are equal to $\frac{a^2}{4(a+1)^2}$. Therefore,
\[ P(\bar{W}_E(a) > 0) = 4 \times \frac{a^2}{4(a + 1)^2} = \left(\frac{a}{a+1}\right)^2. \]
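(b) Consider the case where $(X_1, Y_1) \in R_1$ and $(X_2, Y_2) \in R_5$. Note that in this case, there is no extra travel distance in the y direction. The travel distance in the x direction is given by
\[ D_x^B \mid R_1 \cap R_5 = \min(Z_1 + Z_2,\; 2a - Z_1 - Z_2) = \begin{cases} Z_1 + Z_2, & \text{if } Z_1 + Z_2 \le a, \\ 2a - Z_1 - Z_2, & \text{otherwise,} \end{cases} \]
where $Z_1$ and $Z_2$ are the x-distances from the left edges of $R_1$ and $R_5$ to the response unit and the incident, respectively (see the figure above).
\[ E[D_x^B \mid R_1 \cap R_5] = \int_0^a\!\!\int_0^{a - z_1} (z_1 + z_2) f_{Z_1}(z_1) f_{Z_2}(z_2)\,dz_2\,dz_1 + \int_0^a\!\!\int_{a - z_1}^{a} (2a - z_1 - z_2) f_{Z_1}(z_1) f_{Z_2}(z_2)\,dz_2\,dz_1. \]
Since $f_{Z_1}(z_1) = f_{Z_2}(z_2) = \frac{1}{a}$,
\[ E[D_x^B \mid R_1 \cap R_5] = \frac{1}{a^2}\int_0^a\!\!\int_0^{a - z_1} (z_1 + z_2)\,dz_2\,dz_1 + \frac{1}{a^2}\int_0^a\!\!\int_{a - z_1}^{a} (2a - z_1 - z_2)\,dz_2\,dz_1 \]
\[ = \frac{1}{a^2}\int_0^a \left[z_1 z_2 + \frac{1}{2}z_2^2\right]_0^{a - z_1} dz_1 + \frac{1}{a^2}\int_0^a \left[2a z_2 - z_1 z_2 - \frac{1}{2}z_2^2\right]_{a - z_1}^{a} dz_1 \]
\[ = \frac{1}{2a^2}\int_0^a \left(a^2 - z_1^2\right) dz_1 + \frac{1}{a^2}\int_0^a \left(a z_1 - \frac{1}{2}z_1^2\right) dz_1 \]
\[ = \frac{1}{2a^2}\left[a^2 z_1 - \frac{1}{3}z_1^3\right]_0^a + \frac{1}{a^2}\left[\frac{1}{2}a z_1^2 - \frac{1}{6}z_1^3\right]_0^a \]
\[ = \frac{1}{3}a + \frac{1}{3}a = \frac{2}{3}a. \]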
The expected travel distance in the x direction without the square barrier is $\frac{1}{3}a$. Therefore the extra travel distance, given the perturbation term is positive, is $\frac{2}{3}a - \frac{1}{3}a = \frac{1}{3}a$. This gives
\[ \bar{W}_E(a) = \frac{1}{3}a\left(\frac{a}{a+1}\right)^2. \]
When $a = 1$, the total expected travel distance is $\bar{W}(1) + \bar{W}_E(1) = \frac{11}{12} + \frac{1}{12} = 1$. To support this result, consider again the locations of the response unit and incident when $a = 1$ (see Problem 3.13 (c)). There are 16 possible cases to consider. Here, we focus on the extra travel distance due to the zero-demand zone barrier, which is additionally required compared to Problem 3.13 (c).
• The response unit and the incident are on the same edge of the square (4 cases). The expected extra travel distance between two locations is 0.
• The response unit and the incident are on adjacent edges of the square (8 cases). The expected extra travel distance between two locations is also 0.
• The response unit and the incident are on opposite edges of the square (4 cases). Note that the expected travel distance between two locations was $\frac{1}{3} + 1 = \frac{4}{3}$. However, since no travel is allowed through the zero-demand zone, the expected travel distance becomes $\frac{2}{3} + 1$ (we can obtain $\frac{2}{3}$ by using the same procedure as we used in (b)). Hence the extra travel distance between two locations is $\frac{1}{3}$.
Since all cases are equally likely, $\bar{W}_E(1)$ is computed by $\bar{W}_E(1) = \frac{4}{16}\cdot 0 + \frac{8}{16}\cdot 0 + \frac{4}{16}\cdot\frac{1}{3} = \frac{1}{12}$.
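The double integral in (b) and the resulting perturbation term can be checked numerically. The sketch below approximates $E[\min(Z_1 + Z_2,\, 2a - Z_1 - Z_2)]$ with a midpoint rule over the $a \times a$ square and evaluates $\bar{W}_E(a)$ from the closed form. The function names and grid size are illustrative assumptions.

```python
def mean_detour_x(a, steps=400):
    """Midpoint-rule estimate of E[min(Z1 + Z2, 2a - Z1 - Z2)],
    with Z1 and Z2 independent and uniform on [0, a]."""
    h = a / steps
    total = 0.0
    for i in range(steps):
        z1 = (i + 0.5) * h
        for j in range(steps):
            z2 = (j + 0.5) * h
            total += min(z1 + z2, 2 * a - z1 - z2)
    return total / steps**2


def w_extra(a):
    """Closed form from the text: (a / 3) * (a / (a + 1))^2."""
    return (a / 3) * (a / (a + 1)) ** 2


a = 1.0
print("E[D_x^B | R1, R5] ~", mean_detour_x(a), "vs 2a/3 =", 2 * a / 3)
print("W_E(1) =", w_extra(a), "and 11/12 + W_E(1) =", 11 / 12 + w_extra(a))
```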
6. LO Problem 3.18 (Kang, 2001)
For this problem, we employ the notation used in class, which is a little different from the notation
in the textbook.
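Let $G(a) \equiv E[D^p] \equiv E[|X_1 - X_2|^p]$. Let us consider $G(a + \varepsilon)$, that is, $E[D^p]$ when the highway segment under consideration is extended by $\varepsilon$, where $\varepsilon$ is very small. Suppose $a < X_1 \le a + \varepsilon$ and $0 \le X_2 \le a$. Since $X_1$ and $X_2$ are independent, $G(a + \varepsilon)$ for this case is computed as follows:
\[ G(a + \varepsilon) = E[(X_1 - X_2)^p] = \int_a^{a+\varepsilon}\!\!\int_0^a (x_1 - x_2)^p f_{X_2}(x_2) f_{X_1}(x_1)\,dx_2\,dx_1, \]
where $f_{X_1}(x_1)$ and $f_{X_2}(x_2)$ are the probability density functions of $X_1$ and $X_2$, respectively. Because $X_1$ and $X_2$ are uniformly distributed over $(a, a+\varepsilon]$ and $[0, a]$ respectively, $f_{X_1}(x_1) = \frac{1}{\varepsilon}$ and $f_{X_2}(x_2) = \frac{1}{a}$. Thus,
\[ G(a + \varepsilon) = \frac{1}{a\varepsilon}\int_a^{a+\varepsilon}\!\!\int_0^a (x_1 - x_2)^p\,dx_2\,dx_1 \]
\[ = \frac{1}{a\varepsilon}\int_a^{a+\varepsilon}\left[\frac{-1}{p+1}(x_1 - x_2)^{p+1}\right]_0^a dx_1 \]
\[ = \frac{1}{a\varepsilon}\cdot\frac{1}{p+1}\int_a^{a+\varepsilon}\left(x_1^{p+1} - (x_1 - a)^{p+1}\right) dx_1 \]
\[ = \frac{1}{a\varepsilon}\cdot\frac{1}{p+1}\left[\frac{1}{p+2}x_1^{p+2} - \frac{1}{p+2}(x_1 - a)^{p+2}\right]_a^{a+\varepsilon} \]
\[ = \frac{1}{a\varepsilon}\cdot\frac{1}{(p+1)(p+2)}\left((a + \varepsilon)^{p+2} - \varepsilon^{p+2} - a^{p+2}\right) \]
\[ = \frac{1}{a\varepsilon}\cdot\frac{1}{(p+1)(p+2)}\left((p+2)a^{p+1}\varepsilon + o(\varepsilon)\right), \]
where $o(\varepsilon)$ represents higher order terms of $\varepsilon$ satisfying $\lim_{\varepsilon \to 0}\frac{o(\varepsilon)}{\varepsilon} = 0$ (“pathetic terms”). Clearly, $G(a + \varepsilon) \approx \frac{a^p}{p+1}$ as $\varepsilon \to 0$.
When $0 \le X_1 \le a$ and $a < X_2 \le a + \varepsilon$, we also have $G(a + \varepsilon) \approx \frac{a^p}{p+1}$ as $\varepsilon \to 0$ by symmetry. If $0 \le X_1 \le a$ and $0 \le X_2 \le a$, then $G(a + \varepsilon) = G(a)$. Finally, we do not have to compute $G(a + \varepsilon)$ for the case where $a < X_1 \le a + \varepsilon$ and $a < X_2 \le a + \varepsilon$ because the associated probability is negligible.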
The following table summarizes G(a + ε)’s.
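Case | Probability of a case | G(a + ε) given a case
$0 \le X_1 \le a$, $0 \le X_2 \le a$ | $\frac{a}{a+\varepsilon}\cdot\frac{a}{a+\varepsilon} = \left(\frac{a}{a+\varepsilon}\right)^2$ | $G(a)$
$a < X_1 \le a + \varepsilon$, $0 \le X_2 \le a$ | $\frac{\varepsilon}{a+\varepsilon}\cdot\frac{a}{a+\varepsilon} = \frac{\varepsilon a}{(a+\varepsilon)^2}$ | $\frac{a^p}{p+1}$
$0 \le X_1 \le a$, $a < X_2 \le a + \varepsilon$ | $\frac{a}{a+\varepsilon}\cdot\frac{\varepsilon}{a+\varepsilon} = \frac{\varepsilon a}{(a+\varepsilon)^2}$ | $\frac{a^p}{p+1}$
$a < X_1 \le a + \varepsilon$, $a < X_2 \le a + \varepsilon$ | $\frac{\varepsilon}{a+\varepsilon}\cdot\frac{\varepsilon}{a+\varepsilon} = \left(\frac{\varepsilon}{a+\varepsilon}\right)^2$ | We do not care.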
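Using the total expectation theorem, we obtain
\[ G(a + \varepsilon) = G(a)\left(\frac{a}{a+\varepsilon}\right)^2 + \frac{a^p}{p+1}\cdot\frac{\varepsilon a}{(a+\varepsilon)^2} + \frac{a^p}{p+1}\cdot\frac{\varepsilon a}{(a+\varepsilon)^2} + o(\varepsilon^2) \]
\[ = G(a)\left(\frac{a}{a+\varepsilon}\right)^2 + \frac{2a^p}{p+1}\cdot\frac{\varepsilon a}{(a+\varepsilon)^2} + o(\varepsilon^2) \]
\[ \approx G(a)\left(\frac{a}{a+\varepsilon}\right)^2 + \frac{2a^p}{p+1}\cdot\frac{\varepsilon a}{(a+\varepsilon)^2}. \]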
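From the formula for the sum of an infinite geometric series, we know
\[ \frac{a}{a+\varepsilon} = \frac{1}{1 + \frac{\varepsilon}{a}} = 1 - \frac{\varepsilon}{a} + \left(\frac{\varepsilon}{a}\right)^2 - \left(\frac{\varepsilon}{a}\right)^3 + \cdots. \]
Ignoring higher order terms of $\varepsilon$, we get
\[ \frac{a}{a+\varepsilon} \approx 1 - \frac{\varepsilon}{a}. \]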
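This gives the following approximations:
\[ \left(\frac{a}{a+\varepsilon}\right)^2 \approx \left(1 - \frac{\varepsilon}{a}\right)^2 = 1 - \frac{2\varepsilon}{a} + \frac{\varepsilon^2}{a^2} \approx 1 - \frac{2\varepsilon}{a}, \]
\[ \frac{\varepsilon a}{(a+\varepsilon)^2} = \frac{\varepsilon}{a}\left(\frac{a}{a+\varepsilon}\right)^2 \approx \frac{\varepsilon}{a}\left(1 - \frac{2\varepsilon}{a}\right) = \frac{\varepsilon}{a} - \frac{2\varepsilon^2}{a^2} \approx \frac{\varepsilon}{a}. \]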
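Therefore, we can rewrite $G(a + \varepsilon)$ as
\[ G(a + \varepsilon) \approx G(a)\left(1 - \frac{2\varepsilon}{a}\right) + \frac{2a^p}{p+1}\cdot\frac{\varepsilon}{a} = G(a)\left(1 - \frac{2\varepsilon}{a}\right) + \frac{2a^{p-1}\varepsilon}{p+1}. \]
Rearranging terms, we have
\[ \frac{G(a + \varepsilon) - G(a)}{\varepsilon} = -\frac{2G(a)}{a} + \frac{2a^{p-1}}{p+1}. \]
If $\varepsilon \to 0$, we have the following differential equation:
\[ G'(a) = -\frac{2G(a)}{a} + \frac{2a^{p-1}}{p+1}. \]
“Judicious” guesses (or consultation with books on differential equations) lead us to the following solution:
\[ G(a) \equiv E[D^p] = \frac{2a^p}{(p+1)(p+2)}. \]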
We can skip the derivation of the differential equation by directly using Equation (3.64) in the
textbook. Once we obtain G(a + ε), we can plug it in (3.64), which gives the same differential
equation as above.
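The closed form for $E[D^p]$ can be checked in two independent ways: by substituting it into the differential equation, and by numerically averaging $|X_1 - X_2|^p$ over the square $[0, a]^2$. The sketch below does both. The function names, grid size, and finite-difference step are assumptions made for illustration.

```python
def g_closed(a, p):
    """Closed form from the text: 2 a^p / ((p + 1)(p + 2))."""
    return 2 * a**p / ((p + 1) * (p + 2))


def g_numeric(a, p, steps=400):
    """Midpoint-rule estimate of E|X1 - X2|^p for X1, X2 ~ U[0, a]."""
    h = a / steps
    total = 0.0
    for i in range(steps):
        for j in range(steps):
            # Cell midpoints differ by (i - j) * h
            total += abs((i - j) * h) ** p
    return total / steps**2


def ode_residual(a, p, eps=1e-6):
    """G'(a) - (-2 G(a)/a + 2 a^(p-1)/(p+1)), with G' by central difference."""
    deriv = (g_closed(a + eps, p) - g_closed(a - eps, p)) / (2 * eps)
    return deriv - (-2 * g_closed(a, p) / a + 2 * a ** (p - 1) / (p + 1))


for p in (1, 2, 3):
    print(p, g_closed(2.0, p), g_numeric(2.0, p), ode_residual(2.0, p))
```

For $p = 1$ the formula reduces to the familiar $\frac{a}{3}$, and for $p = 2$ it gives $\frac{a^2}{6}$, twice the variance of a uniform random variable on $[0, a]$.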