Solutions of All Homework Problems, CS/EE 143, Fall 2015

Prof. Steven Low. TA: Changhong Zhao {czhao@caltech.edu}
Last updated: 12/08/2015
1
Introduction to Internet
1.1
W&P, P1.1
2 Points. How many hosts can one have on the Internet if each one needs a distinct IP address?
[Solution] Since an IPv4 address is 4 bytes (32 bits) long, there can be at most $2^{32}$ distinct addresses.
1.2
Adapted from W&P, P1.3
8 Points. Imagine that all routers have 17 ports. There are $2^{16}$ hosts to be connected. Assume you
can organize the hosts and routers any way you like. Your goal is to (i) design the network structure so
as to minimize the number of routers required, and (ii) based on the network design in (i), assign the
addresses to minimize the size (number of entries) of the routing table required in each router.
(a) 4 Points. Describe your network structure and explain why your design gives the minimum number
of routers.
(b) 4 Points. Describe your scheme for assigning addresses and routing. Explain why your design
gives the minimum routing table size in each router.
There exist theoretical lower bounds on the number of routers and the size of the routing table. Point
out these lower bounds explicitly in your solution, and use them to verify your design.
[Solution] The routers will be arranged in a tree topology, with each of 16 of the ports representing
the next half-byte in the network address. The 17th port would be the link to the parent router. The
root router would have a dead 17th port, and so there would be a routing table of size 16 on the first
router, and 17 on every other router. This would be of the form:
• (prefix, halfbyte, unmatched) routes to the port indexed by halfbyte, where prefix is the total
specified address on the parent router that points to this router on its 17th port;
• default route goes to port 17;
• the root router is just: (halfbyte, unmatched) routes to the port indexed by halfbyte.
The total number of routers we use is $2^{12} + 2^{8} + 2^{4} + 2^{0} = 4369$.
This is optimal because:
(a) Consider all the routers and hosts as vertices and the links as edges. The resulting graph should
be a tree; otherwise we could remove edges and/or vertices to simplify the graph. Assume we
have N routers. Each router has 17 ports and each host has 1 port. To provide connectivity, the total
number of edges must be no less than the total number of vertices minus 1, which is $N + 2^{16} - 1$.
Also, the number of edges can be no larger than half of the total number of ports, which
is $(17N + 2^{16})/2$. Hence we have $(17N + 2^{16})/2 \ge N + 2^{16} - 1$, so $N \ge 4368.9$, i.e., $N \ge 4369$
since N is an integer.
(b) The routing table at every router except the root has only 17 entries, which is minimum with a
router fanout of 17.
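As a quick numerical check of the two counts above, here is a minimal Python sketch. The function names are illustrative and not part of the original solution; the sketch assumes a complete tree with 16 downlinks per router, as described in the solution.

```python
import math

PORTS = 17          # ports per router (from the problem statement)
HOSTS = 2 ** 16     # hosts to connect
FANOUT = PORTS - 1  # 16 downlinks per router; the 17th port is the uplink


def routers_in_tree(hosts: int, fanout: int) -> int:
    """Count the routers in a tree whose leaves are the hosts."""
    total, level = 0, hosts
    while level > 1:
        level = math.ceil(level / fanout)  # routers needed one level up
        total += level
    return total


def router_lower_bound(hosts: int, ports: int) -> int:
    """Smallest integer N with (ports*N + hosts)/2 >= N + hosts - 1."""
    return math.ceil((hosts - 2) / (ports - 2))


print(routers_in_tree(HOSTS, FANOUT), router_lower_bound(HOSTS, PORTS))
```

Both values agree, which is the sense in which the tree design meets the lower bound.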
1.3
W&P, P1.4
3 Points. Assume that a host A in Berkeley sends packets with a bit rate of 100Mbps to a host B
in Boston. Assume also that it takes 130ms for the first acknowledgment to come back after A sends
the first packet. Say that A sends one packet of 1Kbyte and then waits for an acknowledgment before
sending the next packet, and so on. What is the long-term average bit rate of the connection? Assume
now that A sends N packets before it waits for the first acknowledgment. Express the long-term average
bit rate of the connection as a function of N. [Note: 1 Mbps = $10^{6}$ bits per second; 1 ms = 1 millisecond
= $10^{-3}$ second.]
[Solution] It takes 1KB/100Mbps = 0.08ms to transmit one packet. When N = 1, the connection
transfers 1 packet per RTT, i.e., 1KB per 130.08ms ∼ 130ms. This represents a throughput of 1KB/0.13s
= 61.5 kbps. (If you use 1KB = 1,024 bytes instead of 1,000 bytes, then the throughput will be 63 kbps.)
It is OK to ignore the packet transmission time (0.08ms) since it is orders of magnitude smaller than
the round-trip time of 130ms.
When N is small enough (see below), the sender sends N packets, and then stops and waits for the first
ACK which arrives after one RTT of 130.08ms ∼ 130ms. Therefore the throughput is N KB/130ms =
61.5N kbps.
Let K be the largest integer such that $K \cdot \frac{1\,\text{KB}}{100\,\text{Mbps}} \le 130.08$ ms, i.e., K = 1,626. When N ≥ K, the
sender would have already received the first ACK when it finishes transmitting N packets, and therefore
can immediately transmit the N + 1st packet. In this case, the throughput is 100Mbps.
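The two regimes can be evaluated with a short Python sketch. It assumes 1 KB = 1,000 bytes and neglects the 0.08 ms transmission time, as the solution does; the constant and function names are illustrative only.

```python
PACKET_BITS = 8_000   # 1 KB taken as 1,000 bytes, as in the solution
LINK_BPS = 100e6      # sender bit rate
RTT_S = 0.130         # round-trip time; the 0.08 ms transmission time is neglected


def throughput_bps(n: int) -> float:
    """Long-term average bit rate when n packets are sent per round trip."""
    window_rate = n * PACKET_BITS / RTT_S  # window-limited regime
    return min(window_rate, LINK_BPS)      # capped by the link rate


for n in (1, 100, 2000):
    print(n, round(throughput_bps(n) / 1e3, 1), "kbps")
```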
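In summary, the throughput is:
\[
\text{throughput} =
\begin{cases}
61.5N \text{ kbps} & \text{if } N < 1{,}626 \\
100 \text{ Mbps} & \text{if } N \ge 1{,}626
\end{cases}
\;=\; \min\{61.5N \text{ kbps},\ 100 \text{ Mbps}\}
\]
1.4
B&G, 2.9: Horizontal and vertical parity checks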
4 Points. A horizontal and vertical parity check of size K by L takes (K − 1) binary sequences each of
length (L − 1) bits and outputs K binary sequences each of length L bits such that if these sequences
are arranged into a K by L matrix, then each row and each column will have an even number of 1’s.
The parity check works as follows. To send (K − 1) data sequences each of length (L − 1) bits, the
sender sends the K by L matrix. The receiver checks the parity of each row and each column. If there
is any row or column that has odd parity, an error is detected.
(a) 2 Points. Give an example of a pattern of six errors that cannot be detected by horizontal and
vertical parity checks.
(b) 2 Points. Find the number of different patterns of four errors that will not be detected by horizontal
and vertical parity checks.
[Solution] (a) Any 6 bit errors in the following entries of the K × L matrix will not be detected: $(i_1, j_1)$,
$(i_1, j_2)$, $(i_2, j_2)$, $(i_2, j_3)$, $(i_3, j_3)$, $(i_3, j_1)$, because every row and every column then has zero or 2 bit
errors, which maintains valid parity of the matrix.
(b) Any 4 bit errors distributed such that every row and every column has zero or 2 bit errors will not
be detected. There are
\[
\binom{K}{2} \binom{L}{2} = \frac{1}{4} K L (K - 1)(L - 1)
\]
such configurations.
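The count in (b) can be checked by brute force for small matrices. The sketch below is illustrative and not part of the original solution: it enumerates every error pattern of a given size and keeps those that leave all row and column parities even.

```python
from itertools import combinations
from math import comb


def undetected_patterns(K: int, L: int, n_errors: int) -> int:
    """Count n_errors-bit error patterns in a K x L matrix that flip no parity."""
    cells = [(r, c) for r in range(K) for c in range(L)]
    count = 0
    for pattern in combinations(cells, n_errors):
        row_errs = [0] * K
        col_errs = [0] * L
        for r, c in pattern:
            row_errs[r] += 1
            col_errs[c] += 1
        # A pattern is undetected when every row and column has an even error count.
        if all(x % 2 == 0 for x in row_errs) and all(x % 2 == 0 for x in col_errs):
            count += 1
    return count


print(undetected_patterns(3, 4, 4), comb(3, 2) * comb(4, 2))
```

The enumeration is exponential in the matrix size, so it is only meant for small K and L.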
1.5
Adapted from B&G, P3.1
2 Points. Customers arrive at a fast food restaurant at a rate of five per minute and wait to receive
their order for an average of 10 minutes. Customers eat in the restaurant with probability 1/2 and take
out their order without eating with probability 1/2. A meal requires an average of 20 minutes. What
is the average number of customers in the restaurant?
[Solution] For customers who take out their orders, the average time spent in the restaurant is 10
minutes; for those who eat in, the average time spent is 10 + 20 = 30 minutes. Therefore, the average
time spent in the restaurant over all customers equals 0.5 × 10 + 0.5 × 30 = 20 minutes. Applying
Little’s law, the average number of customers in the restaurant equals 20 × 5 = 100.
1.6
Capacity constrained network design
3 Points. Two nodes A and B need to be connected via a communication link. On average, it is
estimated that 1000 packets will arrive at A destined for B each second, each packet having an average
size of 1 Kbyte. There are two design choices: (i) you build a single link with rate 10Mbps between
A and B, (ii) you build two parallel 5Mbps links, and send a packet arriving at A randomly via either
link. Assume each link is equipped with an infinite sized buffer. Assuming the M/M/1 formula for the
delay on a link, compute the average packet delay in each case. Which design choice is better? Can
you explain why? You may ignore packet propagation times.
[Solution] In case (i), the link service rate equals 10Mbps/8Kb = 1250 pkts/sec. Therefore, the mean
delay equals 1/(1250-1000) = 4 ms.
In case (ii), each link has a service rate of 625 pkts/sec, and sees an arrival rate of 500 pkts/sec. The
mean delay therefore is 1/(625-500) = 8ms.
The first design choice is better. The M/M/1 mean delay is 1/(µ − λ), so doubling both the arrival
rate λ and the service rate µ halves the average delay.
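The two delays follow directly from the M/M/1 formula. The sketch below is illustrative (the function name is not from the original) and assumes 1 Kbyte = 8,000 bits, as in the solution.

```python
def mm1_delay_s(link_bps: float, packet_bits: float, arrivals_pps: float) -> float:
    """Mean M/M/1 delay 1/(mu - lambda), with mu derived from the link rate."""
    mu = link_bps / packet_bits  # service rate in packets/sec
    if arrivals_pps >= mu:
        raise ValueError("queue is unstable: arrival rate >= service rate")
    return 1.0 / (mu - arrivals_pps)


single_link = mm1_delay_s(10e6, 8_000, 1_000)  # design (i)
split_links = mm1_delay_s(5e6, 8_000, 500)     # design (ii), per link
print(single_link * 1e3, "ms vs", split_links * 1e3, "ms")
```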
1.7
W&P: P2.6
3 Points. Consider a router in the backbone of the Internet. Assume that the router has 24 ports,
each attached to a 1Gbps link. Say that each link is used at 15% of its capacity by connections that
have an average throughput of 200kbps. How many connections go through the router at any given
time? Say that the connections last an average of 1 minute. How many new connections are set up
that go through the router in any given second, on average?
[Solution] Aggregate average throughput per link is
1Gbps × 15% = 150Mbps.
Hence average number of connections per link is 150Mbps/200kbps = 750, and average number of
connections at the router is 750 × 24 = 18,000.
Then Little’s Law implies the number of new connections set up at the router:
λ = N/T = 18,000/60 = 300 connections/sec.
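The same arithmetic in a few lines of Python, as a sketch with illustrative variable names; the only inputs are the figures given in the problem statement.

```python
PORTS = 24
LINK_BPS = 1e9
UTILIZATION = 0.15
CONN_BPS = 200e3   # average throughput per connection
MEAN_LIFETIME_S = 60.0

connections_per_link = UTILIZATION * LINK_BPS / CONN_BPS
connections_at_router = PORTS * connections_per_link        # N
new_connections_per_s = connections_at_router / MEAN_LIFETIME_S  # lambda = N / T

print(connections_at_router, new_connections_per_s)
```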
1.8
W&P, P2.7
4 Points. We would like to transfer a 20 KB (1 Byte=8 bits) file across a network from node A to node
F. Packets have a length of 1 KB (neglecting header). The path from node A to node F passes through
5 links, and 4 intermediate nodes. Each of the links is a 10 km optical fiber with a rate of 10 Mbps
(assume speed of light in optical fiber is $2.0 \times 10^{8}$ m/s). The 4 intermediate nodes are store-and-forward
devices, and each intermediate node must perform a 50 µs routing table look up after receiving a packet
before it can begin sending it on the outgoing link. How long does it take to send the entire file across
the network?
[Solution] Look at the time diagram in Figure 1.
Figure 1: Time diagram of packets transferred through 5 links and 4 intermediate nodes.
Let:
link propagation delay = T1 = 10 km / ($2 \times 10^{8}$ m/s) = 0.05 ms
packet transmission time = T2 = 1 kB/10 Mbps = 0.8 ms
node processing time = T3 = 0.05 ms.
Assume A starts transmission at time 0, then:
The time at which B receives and finishes processing first packet is TB = T1 + T2 + T3 .
The time at which C receives and finishes processing first packet is TC = TB + T1 + T2 + T3 = 2TB .
The time at which D receives and finishes processing first packet is TD = 3TB .
The time at which E receives and finishes processing first packet is TE = 4TB .
The time at which F receives the first packet is TF = 4TB + T1 + T2 since F does not need to look up
the routing table.
Therefore, the time at which F receives all 20 packets is
\[
\begin{aligned}
T_F + 19T_2 &= 4(T_1 + T_2 + T_3) + T_1 + 20T_2 \\
&= 5T_1 + 24T_2 + 4T_3 \\
&= 5 \times 0.05 + 24 \times 0.8 + 4 \times 0.05 = 19.65 \text{ ms.}
\end{aligned}
\]
1.9
W&P, P2.9
3 Points. Consider the case of GSM cell phones. GSM operates at 270.88 Kbps and uses a spectrum
spanning 200 KHz. What is the theoretical SNR (in dB) that these phones need for operation? In
reality, the phones use an SNR of 10 dB. Use Shannon’s theorem to calculate the theoretical capacity
of the channel, under this signal-to-noise ratio. How does the utilized capacity compare with the
theoretical capacity?
[Solution]
(a) By Shannon’s formula C = W log2 (1 + SNR), we have
270.88kbps = 200kHz log2 (1 + SNR).
Hence, the required SNR is $2^{270.88/200} - 1 = 1.557$, i.e., $10 \log_{10}(1.557)$ dB = 1.9 dB.
(b) An SNR of 10 dB corresponds to a linear ratio of 10, so the Shannon capacity is
200 kHz × $\log_2(1 + 10)$ = 692 kbps. Therefore, the utilization is 270.88/692 = 39%.
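Both parts are direct applications of C = W log2(1 + SNR). A small Python sketch, with illustrative function names, makes the dB-to-linear conversion explicit:

```python
import math

RATE_BPS = 270.88e3     # GSM bit rate
BANDWIDTH_HZ = 200e3    # GSM channel bandwidth


def required_snr_db(rate_bps: float, bandwidth_hz: float) -> float:
    """Invert Shannon's formula for the SNR, then convert it to dB."""
    snr_linear = 2 ** (rate_bps / bandwidth_hz) - 1
    return 10 * math.log10(snr_linear)


def shannon_capacity_bps(bandwidth_hz: float, snr_db: float) -> float:
    """Shannon capacity for an SNR given in dB."""
    snr_linear = 10 ** (snr_db / 10)
    return bandwidth_hz * math.log2(1 + snr_linear)


capacity = shannon_capacity_bps(BANDWIDTH_HZ, 10.0)
print(required_snr_db(RATE_BPS, BANDWIDTH_HZ), capacity, RATE_BPS / capacity)
```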
1.10
IP addresses
2 Points. Consider an IPv4 subnet with private IP address space 166.111.8.0/24. If each IP interface
in the subnet needs a distinct IP address, then how many IP interfaces can there be in the subnet?
[Hint: to obtain the right answer, please look up what is a broadcast address online.]
[Solution] There is a total of $2^{8} = 256$ IP addresses in this subnet. One of these IP addresses (the one
whose last 8 bits are all 1) should be reserved for broadcasting, leaving 255 IP addresses for physical
interfaces. Hence, a maximum number of 255 IP interfaces is allowed in the subnet.
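Python's standard `ipaddress` module can reproduce these numbers. The sketch follows the solution's counting, which reserves only the broadcast address; the variable names are illustrative.

```python
import ipaddress

subnet = ipaddress.ip_network("166.111.8.0/24")

total_addresses = subnet.num_addresses   # 2**8 addresses in a /24
broadcast = subnet.broadcast_address     # host bits all set to 1
usable = total_addresses - 1             # exclude the broadcast address only

print(total_addresses, broadcast, usable)
```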
1.11
Network operations
6 Points. Assume that host A sends a file of 20 KByte (1 Byte=8 bits) to host F. Host A segments the
file into 22 packets, each of 1 KB (why 22 packets instead of 20?). The path from A to F passes through
5 links and 4 routers. Each link is a 10km optical fiber with 10 Mbps capacity (the speed of light in an
optical fiber is $2.0 \times 10^{8}$ m/s). Each router performs a 50 µs routing table look up after receiving a whole
packet, before forwarding the packet to an output port. How long does it take for the file to reach F?
(Assume that a router can look up the routing table for multiple packets simultaneously.)
[Solution] The 20KByte file is segmented into 22 packets of 1KB instead of 20, since each packet has
a header (so that the data in a packet is less than 1KB) and redundant packets may be introduced for
forward error correction.
Let B, C, D, E denote the four intermediate routers on the path from A to F, and let tp , ts , tl denote
the propagation time of a link, transmission time of a packet, and table lookup time respectively. The
timeline of the overall data transmission is shown in Figure 2. It is clear from the figure that the
overall transmission time is
\[
\begin{aligned}
T &= 4(t_p + t_s + t_l) + t_p + 22t_s \\
&= 5t_p + 26t_s + 4t_l \\
&= 5 \times \frac{10\,\text{km}}{2 \times 10^{8}\,\text{m/s}} + 26 \times \frac{1\,\text{KB}}{10\,\text{Mbps}} + 4 \times 50\,\mu\text{s} \\
&= 5 \times 0.05\,\text{ms} + 26 \times 0.8\,\text{ms} + 4 \times 0.05\,\text{ms} \\
&= 21.25\,\text{ms}.
\end{aligned}
\]
Figure 2: Timeline of the data transfer from A to F. Here B, C, D, E represent the four routers on the
path from A to F, and tp , ts , tl represent the propagation time of a link, transmission time of a packet,
and table lookup time, respectively.
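The closed-form result can be cross-checked with a small store-and-forward simulation. This is a sketch under the problem's assumptions (lookups do not block each other, and the destination performs no lookup); the function name and parameters are illustrative.

```python
def transfer_time_ms(n_packets: int, n_links: int,
                     t_prop: float, t_tx: float, t_lookup: float) -> float:
    """Time until the last packet is fully received at the destination."""
    link_free = [0.0] * n_links  # when each link finishes its current transmission
    finish = 0.0
    for _ in range(n_packets):
        ready = 0.0  # every packet is available at the source at time 0
        for hop in range(n_links):
            start = max(ready, link_free[hop])  # wait for the outgoing link
            link_free[hop] = start + t_tx
            arrival = start + t_tx + t_prop     # whole packet received downstream
            # Intermediate routers add a lookup delay; the destination does not.
            ready = arrival + (t_lookup if hop < n_links - 1 else 0.0)
        finish = ready
    return finish


print(transfer_time_ms(22, 5, t_prop=0.05, t_tx=0.8, t_lookup=0.05))
```

Calling the same function with 20 packets gives the total for Problem 1.8.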
2
Ethernet
2.1
W&P, P3.2
3 Points. Consider the Slotted ALOHA MAC protocol. There are N nodes sharing a medium, and
time is divided into slots. Each packet takes up a single slot. If a node has a packet to send, it
attempts transmission with a certain probability. The transmission succeeds if no other node attempts
transmission in that slot.
Now, suppose that we want to give differentiated services to these nodes, i.e., we want different nodes
to get a different share of the medium. The scheme we choose works as follows: If node i has a packet
to send, it will try to send the packet with probability $p_i$. Assume that every node has a packet to
send at all times. In such a situation, will the nodes indeed get a share of the medium in the ratio of their
probability of access?
[Solution] Consider a given slot. Since every node has a packet to send and node i transmits with
probability $p_i$, the probability of success for any node i is
\[
p_i \prod_{j \ne i} (1 - p_j) = \frac{p_i}{1 - p_i}\, \beta
\]
where $\beta := \prod_{j=1}^{N} (1 - p_j)$ is the same for every user i.
Therefore, the share of the medium any user i gets is proportional to $\frac{p_i}{1 - p_i}$, not to $p_i$.
2.2
W&P, P3.4
4 Points. Consider a commercial 10 Mbps Ethernet configuration with one hub (i.e., all end stations
are in a single collision domain).
(a) 2 Points. Find the Ethernet efficiency for transporting 512 byte packets (including Ethernet overhead) assuming that the propagation delay between the communicating end stations is always 25.6 µs,
and that there are many pairs of end stations trying to communicate.
(b) 2 Points. Recall that the maximum efficiency of Slotted Aloha is 1/e. Find the threshold for the
frame size (including Ethernet overhead) such that Ethernet is more efficient than Slotted Aloha if the
fixed frame size is larger than this threshold. Explain why Ethernet becomes less efficient as the frame
size becomes smaller.
[Solution]
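(a) Following the derivation of efficiency in the text, we have
\[
\eta = \frac{1}{1 + 3.4A} \quad \text{where} \quad A := \frac{\text{PROP}}{\text{TRANS}}.
\]
Now
\[
\text{TRANS} = \text{packet transmission time} = \frac{512 \times 8 \text{ bits}}{10 \text{ Mbps}} = 409.6\ \mu\text{s}.
\]
Therefore $A = \frac{25.6\ \mu\text{s}}{409.6\ \mu\text{s}} = 0.0625$, and hence efficiency
\[
\eta = \frac{1}{1 + 3.4 \times 0.0625} = 82\%.
\]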
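(b) Let the Ethernet frame size be s bits, then
\[
\text{TRANS} = \frac{s \text{ bits}}{10 \text{ Mbps}} = \frac{s}{10}\ \mu\text{s}
\quad \text{and} \quad
A = \frac{\text{PROP}}{\text{TRANS}} = \frac{10\,\text{PROP}}{s}.
\]
We want s such that
\[
\eta = \frac{1}{1 + 3.4 \times \frac{10\,\text{PROP}}{s}} \ge \frac{1}{e}
\]
which implies
\[
s \ge \frac{34\,\text{PROP}}{e - 1} = \frac{34 \times 25.6}{2.718 - 1} = 506.6 \text{ bits}.
\]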
A smaller frame size implies a larger A, which implies a smaller efficiency. Note that on average, an
interval of length 2(e − 1)PROP gets wasted between successful transmissions. A smaller frame size
implies therefore that the fraction of time spent transmitting data successfully is smaller.
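Parts (a) and (b) can be reproduced with the efficiency formula η = 1/(1 + 3.4A). The following Python sketch uses illustrative names and expresses the link rate in bits per microsecond so that PROP and TRANS share units.

```python
import math

RATE_BITS_PER_US = 10.0  # 10 Mbps = 10 bits per microsecond
PROP_US = 25.6           # propagation delay from the problem statement


def efficiency(frame_bits: float) -> float:
    """Ethernet efficiency 1 / (1 + 3.4 A) with A = PROP / TRANS."""
    trans_us = frame_bits / RATE_BITS_PER_US
    a = PROP_US / trans_us
    return 1.0 / (1.0 + 3.4 * a)


def min_frame_bits_to_beat_aloha() -> float:
    """Frame size at which the efficiency equals the Slotted Aloha bound 1/e."""
    return 3.4 * RATE_BITS_PER_US * PROP_US / (math.e - 1)


threshold = min_frame_bits_to_beat_aloha()
print(efficiency(512 * 8), threshold)
```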
2.3
W&P, P3.5
4 Points. Ethernet standards require a minimum frame size of 512 bits in order to ensure that a node
can detect any possible collision while it is still transmitting. This corresponds to the number of bits
that can be transmitted at 10 Mbps in one roundtrip time. It only takes one propagation delay, however,
for the first bit of an Ethernet frame to traverse the entire length of the network, and during this time,
256 bits are transmitted. Why, then, is it necessary that the minimum frame size be 512 bits instead
of 256?
[Solution] A station detects a collision by listening and comparing what it hears on the wire with what
it transmits. Therefore, if it transmits for a round-trip time, it can compare the signal it detects on
the wire with what it just transmitted onto the wire, and a collision is detected if they are different. In
the worst case, a station at the far end starts transmitting just before the first bit arrives, and the
colliding signal then needs another propagation delay to travel back to the sender. If the sender
transmits only for 1/2 RTT, it would have stopped transmitting before the signal from the other end
reaches it, and hence would miss the collision. This is illustrated in Figure 3.
2.4
Switch vs Hub, W&P, P3.7
6 Points. In the network shown in Figure 4, all of the devices want to transmit at an average rate
of R Mbps, with equal amounts of traffic going to every other node. Assume that all of the links are
half-duplex and operate at 100Mbps and that the medium access control protocol is perfectly efficient.
Thus, each link can only be used in one direction at a time, at 100Mbps. There is no delay to switch
from one direction to the other.
(a) 3 Points. What is the maximum value of R?
(b) 3 Points. The hub is now replaced with another switch. What is the maximum value of R now?
[Solution] Denote the flows on links as in Figure 5.
(a) There are two potential bottlenecks: the switch-hub link or the hub.
Switch-hub link bottleneck:
Each node generates 7 flows, each of rate R/7, one to each of the other nodes. Therefore the rate crossing
the link from switch to hub is
\[
X = \frac{R}{7} \times 4 \times 4 = \frac{16}{7} R.
\]
Figure 3: Diagram demonstrating why the minimum frame should be transmitted for RTT in order to
successfully detect collision.
Figure 4: An ethernet. Each circle represents a host, that wants to send an aggregate of R Mbps traffic,
evenly to other hosts.
Figure 5: Flows in the hub-switch ethernet.
The rate crossing the link from hub to switch is
\[
Y = 4R
\]
since the hub repeats all input flows on all links except the one they came from. Then we have
\[
X + Y \le 100 \text{ Mbps} \implies R \le 15.9 \text{ Mbps}.
\]
Hub bottleneck:
Take any link, say, from hub to node A. The total rate of traffic on this link is the sum of all input
traffic to the link except its own:
\[
S_A = 3R + X = \left(3 + \frac{16}{7}\right) R = \frac{37}{7} R.
\]
This traffic shares the link with the traffic from A to hub, which has the rate R. Therefore,
\[
S_A + R \le 100 \text{ Mbps} \implies \left(\frac{37}{7} + 1\right) R \le 100 \text{ Mbps} \implies R \le 15.9 \text{ Mbps}.
\]
Hence, both bottlenecks impose the same upper bound for R.
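The bounds can be computed exactly with rational arithmetic. The sketch below is illustrative: it covers the two hub-case bottlenecks above and, using X = Y, the switch-switch link of part (b). All quantities are expressed per unit of R.

```python
from fractions import Fraction

LINK_MBPS = Fraction(100)

x_per_r = Fraction(1, 7) * 4 * 4   # switch-to-hub traffic: 4 senders x 4 receivers x R/7
y_per_r = Fraction(4)              # hub repeats the full rate R of its 4 nodes

r_link_bound = LINK_MBPS / (x_per_r + y_per_r)        # switch-hub link, part (a)
r_hub_bound = LINK_MBPS / (3 + x_per_r + 1)           # hub-to-node link, part (a)
r_two_switches = LINK_MBPS / (2 * x_per_r)            # switch-switch link, part (b)

print(float(r_link_bound), float(r_hub_bound), float(r_two_switches))
```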
(b) When the hub is replaced with a switch, then X = Y and the switch-switch link is the only bottleneck.
\[
2X = \frac{32}{7} R \le 100 \text{ Mbps} \implies R \le 21.875 \text{ Mbps}.
\]
2.5
Spanning Tree Protocol, W&P, P3.6
6 Points. Consider the network topology shown in Figure 6, where 1, 2, . . . , 8 denote 8 switches
interconnecting 9 Ethernets.
Figure 6: Each circle represents a switch, interconnecting 9 Ethernets.
(a) 3 Points. Determine which links get deactivated after the Spanning Tree protocol runs, and indicate
them on the diagram by putting a small X through the deactivated links.
(b) 3 Points. A disgruntled employee wishes to disrupt the network, so she plans on unplugging central
bridge switch 8. How does this affect the spanning tree and the paths that Ethernet packets follow?
[Solution]
(a) The STP operates as follows:
1. The bridge with the smallest ID will be the root (Bridge 1).
2. Each bridge finds a shortest path to the root, where link cost is simply hop count (# bridges en
route to root). The port on a bridge through which it roots is called a root port (RP).
3. A tie is broken by choosing the port / path where the next hop (bridge) has a smallest ID.
4. Each LAN segment will choose a bridge towards the root; the port on the bridge that’s chosen by
the LAN segment is called a designated port (DP).
5. A tie is broken by the LAN choosing a bridge with the smallest ID.
In Figure 7:
BP: blocked port
DP: designated port (incoming chosen by LAN)
RP: root port (outgoing chosen by bridge).
Figure 7: Spanning Tree Protocol.
Note that LAN chooses Bridge 2 over 8 because of smaller bridge ID.
(b) Since Bridge 8 has no DP (it only has 1 RP which will be used by Bridge 8 to send / receive from
the root; the rest are all blocked ports), no LAN segment will use Bridge 8 to send or receive Ethernet
frames. Therefore, removing Bridge 8 has no effect on the spanning tree, except that it has one fewer
leaf node now.
2.6
Aloha
(a) (Equal Share). Assume that n hosts share a medium using the slotted ALOHA protocol: at
every time slot, each host attempts to send a packet with probability p. A host succeeds to send a
packet at a given time slot if and only if it is the only host that sends a packet at that time slot.
1. 2 Points. What is the probability that a host sends a packet successfully at a given time slot?
2. 2 Points. What is the probability P{a packet sent} that a packet is sent at a given time slot?
3. 4 Points. What choice of p maximizes the probability P{a packet sent}? How does this maximum
probability behave as n → ∞?
(b) (Unequal Share). Assume that n hosts share a medium using the slotted ALOHA protocol, but
at every time slot, each host attempts to send a packet with a different probability. More specifically,
let N := {1, . . . , n} denote the collection of hosts and assume that host i attempts to send a packet
with probability pi ∈ (0, 1) at every time slot for i = 1, . . . , n.
1. 2 Points. What is the probability Pi := P{i sends a packet successfully} that host i ∈ N sends
a packet successfully at a given time slot?
2. 2 Points. What is the ratio P1 : P2 : P3 : · · · : Pn ? This ratio characterizes the share of medium
among the hosts. Is the share of medium proportional to the probabilities pi that hosts attempt
to send packets, i.e., is the ratio P1 : P2 : P3 : · · · : Pn equal to p1 : p2 : p3 : · · · : pn ?
3. (*) 5 Points. Assume $\sum_{i=1}^{n} p_i = 1$ and let $P := \sum_{i=1}^{n} P_i$ denote the probability that a packet gets
successfully transmitted at a given time slot. Prove that $(p_1, p_2, p_3, \ldots, p_n) = (1/n, 1/n, \ldots, 1/n)$
minimizes P, i.e., $(1/n, 1/n, \ldots, 1/n)$ is the solution to
\[
\begin{aligned}
\min \quad & \sum_{i=1}^{n} p_i \prod_{j \ne i} (1 - p_j) \\
\text{s.t.} \quad & 0 \le p_i \le 1, \quad i = 1, \ldots, n; \\
& \sum_{i=1}^{n} p_i = 1.
\end{aligned}
\]
It means that equal share of the medium minimizes the throughput.
[Solution]
(a) Answers given in the lecture/textbook.
(b) 1–2 same answer as Problem 2.1 in this set;
3. Consider the nontrivial case where n ≥ 3. Let $P(p) := \sum_{i=1}^{n} p_i \prod_{j \ne i} (1 - p_j)$. Consider any
$p = (p_1, \ldots, p_n) \ne (\frac{1}{n}, \ldots, \frac{1}{n})$ which satisfies
\[
\sum_{i=1}^{n} p_i = 1 \quad \text{and} \quad 0 \le p_i \le 1, \ \forall i.
\]
Let $M := \max_{i=1,\ldots,n} p_i$, and $m := \min_{i=1,\ldots,n} p_i$. Hence we have $M > \frac{1}{n} > m$. Without loss of
generality, let $p_1 = M$ and $p_2 = m$, then $p = (M, m, \ldots, p_n)$. Let $p' = (\frac{M+m}{2}, \frac{M+m}{2}, \ldots, p_n)$. It is
sufficient to show $P(p) > P(p')$. We have
\[
\begin{aligned}
P(p) - P(p') ={} & \left\{ [M(1-m) + (1-M)m] \prod_{j=3}^{n} (1-p_j) + (1-m)(1-M) \sum_{i=3}^{n} p_i \prod_{j \ne 1,2,i} (1-p_j) \right\} \\
& - \left\{ 2 \times \frac{M+m}{2} \left(1 - \frac{M+m}{2}\right) \prod_{j=3}^{n} (1-p_j) + \left(1 - \frac{M+m}{2}\right)^{2} \sum_{i=3}^{n} p_i \prod_{j \ne 1,2,i} (1-p_j) \right\} \\
={} & \left[ \frac{(M+m)^2}{4} - Mm \right] \left\{ 2 \prod_{j=3}^{n} (1-p_j) - \sum_{i=3}^{n} p_i \prod_{j \ne 1,2,i} (1-p_j) \right\} \\
>{} & \left[ \frac{(M+m)^2}{4} - Mm \right] \left\{ 2 \sum_{i=3}^{n} p_i \prod_{j=3}^{n} (1-p_j) - \sum_{i=3}^{n} p_i \prod_{j \ne 1,2,i} (1-p_j) \right\} \qquad (1) \\
={} & \left[ \frac{(M+m)^2}{4} - Mm \right] \left\{ 2 \sum_{i=3}^{n} p_i (1-p_i) \prod_{j \ne 1,2,i} (1-p_j) - \sum_{i=3}^{n} p_i \prod_{j \ne 1,2,i} (1-p_j) \right\} \\
={} & \left[ \frac{(M+m)^2}{4} - Mm \right] \sum_{i=3}^{n} (1 - 2p_i)\, p_i \prod_{j \ne 1,2,i} (1-p_j) \\
\ge{} & 0 \qquad (2)
\end{aligned}
\]
where the inequality in (1) is because $(M+m)^2 - 4Mm > 0$ for $M \ne m$ and $\sum_{i=3}^{n} p_i < 1$, and the
inequality in (2) is because $p_i \le \frac{1}{2}$ since otherwise $p_i > M$, which leads to a contradiction.
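A numerical spot check of parts (b)1 to (b)3 is sketched below. It is only an illustration for one hand-picked probability vector, not a substitute for the proof; the function names are hypothetical.

```python
from math import prod


def success_probs(p):
    """P_i = p_i * prod_{j != i} (1 - p_j) for each host i."""
    return [p_i * prod(1 - p_j for j, p_j in enumerate(p) if j != i)
            for i, p_i in enumerate(p)]


def total_throughput(p):
    """P = sum_i P_i, the probability that some packet succeeds in a slot."""
    return sum(success_probs(p))


skewed = [0.5, 0.25, 0.25]
uniform = [1 / 3] * 3

P = success_probs(skewed)
# Shares follow p_i / (1 - p_i): here (0.5/0.5) : (0.25/0.75) = 3 : 1.
print(P[0] / P[1])
# The uniform vector gives the smaller total throughput of the two.
print(total_throughput(uniform), total_throughput(skewed))
```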
3
Routing
3.1
Longest prefix matching (exercise)
1 Point. Consider the following routing table.
IP               output port
166.111.8.0/24   1
166.111.0.0/16   2
Which output port should a packet be forwarded to, if its destination IP address is 166.111.8.28?
[Solution] The IP address 166.111.8.28 matches the first 24 bits of the first entry, and only 16 bits for
the second entry. Hence, the packet should be forwarded to port 1 according to longest prefix matching.
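Longest prefix matching over this two-entry table can be sketched with Python's standard `ipaddress` module. The `ROUTES` list and `lookup` function are illustrative names; a real router would use a trie or similar structure rather than a linear scan.

```python
import ipaddress

# (prefix, output port) pairs from the routing table above
ROUTES = [
    (ipaddress.ip_network("166.111.8.0/24"), 1),
    (ipaddress.ip_network("166.111.0.0/16"), 2),
]


def lookup(destination: str):
    """Return the port of the matching entry with the longest prefix, or None."""
    addr = ipaddress.ip_address(destination)
    matches = [(net.prefixlen, port) for net, port in ROUTES if addr in net]
    return max(matches)[1] if matches else None


print(lookup("166.111.8.28"))
```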
3.2
Static routing, W&P, P5.1
Consider the network topology depicted in Figure 8. Each link is marked with its weight/cost.
Figure 8: Network topology with link weights.
(a) 3 points. Run Dijkstra’s algorithm on the above network to determine the routing table for node
3. Please show steps of the algorithm.
(b) 3 points. Repeat (a) using Bellman-Ford algorithm.
[Solution]
(a) The steps of Dijkstra’s algorithm are:
steps  F              D1, pred  D2, pred  D4, pred  D5, pred  D6, pred  D7, pred
0      3              inf, –    1, 3      8, 3      inf, –    inf, –    inf, –
1      3,2            3, 2                8, 3      inf, –    inf, –    inf, –
2      3,2,1                              8, 3      7, 1      inf, –    6, 1
3      3,2,1,7                            8, 3      7, 1      10, 7
4      3,2,1,7,5                          8, 3                9, 5
5      3,2,1,7,5,4                                            9, 5
6      3,2,1,7,5,4,6
where F is the set of nodes for which the shortest distance and path to node 3 have been determined, Di
denotes the current-step shortest distance from node i to node 3, and “pred” stands for the predecessor
of node i on its shortest path to node 3. D3 = 0 at initialization. A blank entry means that the node is
already in F, so its distance is final.
Hence the routing table for node 3 is:
destination  next node
1            2
2            2
4            4
5            2
6            2
7            2
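The steps above can be reproduced with a heap-based Dijkstra sketch. Figure 8 is not reproduced in this text, so the edge list below is an assumption: the weights are inferred from the distance tables and may omit links that lie on no shortest path from node 3. The function names are illustrative.

```python
import heapq

# ASSUMPTION: link weights inferred from the tables, not read from Figure 8.
EDGES = [(3, 2, 1), (3, 4, 8), (2, 1, 2), (4, 5, 1),
         (1, 5, 4), (1, 7, 3), (5, 6, 2), (7, 6, 4)]


def build_graph(edges):
    """Adjacency list for an undirected weighted graph."""
    graph = {}
    for u, v, w in edges:
        graph.setdefault(u, []).append((v, w))
        graph.setdefault(v, []).append((u, w))
    return graph


def dijkstra(graph, source):
    """Return (distance, predecessor) maps for shortest paths from source."""
    dist = {v: float("inf") for v in graph}
    pred = {v: None for v in graph}
    dist[source] = 0
    heap, done = [(0, source)], set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue  # stale heap entry
        done.add(u)
        for v, w in graph[u]:
            if d + w < dist[v]:  # strict improvement only
                dist[v], pred[v] = d + w, u
                heapq.heappush(heap, (dist[v], v))
    return dist, pred


def next_hop(pred, source, dest):
    """Walk predecessors back from dest to find the first hop out of source."""
    node = dest
    while pred[node] != source:
        node = pred[node]
    return node


dist, pred = dijkstra(build_graph(EDGES), 3)
print({d: next_hop(pred, 3, d) for d in (1, 2, 4, 5, 6, 7)})
```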
(b) The steps of Bellman-Ford algorithm are:
steps  D1, pred  D2, pred  D4, pred  D5, pred  D6, pred  D7, pred
0      inf, –    1, 3      8, 3      inf, –    inf, –    inf, –
1      3, 2      1, 3      8, 3      9, 4      inf, –    inf, –
2      3, 2      1, 3      8, 3      7, 1      11, 5     6, 1
3      3, 2      1, 3      8, 3      7, 1      9, 5      6, 1
4      3, 2      1, 3      8, 3      7, 1      9, 5      6, 1
where D3 = 0 at initialization. The resulting routing table of node 3 is the same as (a).
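A synchronous Bellman-Ford sketch, in which every node updates from its neighbors' previous-round distances, reproduces the rows of the table. As Figure 8 is not reproduced here, the edge list is an assumption inferred from the tables, and the function name is illustrative. Round k of the sketch corresponds to step k − 1 of the table.

```python
import math

# ASSUMPTION: link weights inferred from the tables, not read from Figure 8.
EDGES = [(3, 2, 1), (3, 4, 8), (2, 1, 2), (4, 5, 1),
         (1, 5, 4), (1, 7, 3), (5, 6, 2), (7, 6, 4)]


def bellman_ford_rounds(edges, source, rounds):
    """Run synchronous relaxation rounds and return the distances after each."""
    nodes = {n for u, v, _ in edges for n in (u, v)}
    dist = {n: math.inf for n in nodes}
    dist[source] = 0
    history = []
    for _ in range(rounds):
        new_dist = dict(dist)
        for u, v, w in edges:
            # Links are bidirectional; relax both directions using the
            # previous round's distances only.
            new_dist[v] = min(new_dist[v], dist[u] + w)
            new_dist[u] = min(new_dist[u], dist[v] + w)
        dist = new_dist
        history.append(dict(dist))
    return history


history = bellman_ford_rounds(EDGES, 3, 5)
for step, row in enumerate(history):
    print(step, [row[n] for n in (1, 2, 4, 5, 6, 7)])
```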
3.3
Dynamic routing
Consider 5 stations connected in a bi-directional ring, as shown in Figure 9. Suppose station 0 is
the only sender, and it sends packets to all other stations 1, 2, 3, 4 at rates 4, 3, 2, 1 packets/sec,
respectively. Note that these are end-to-end traffic rates between source 0 and all destinations, not
the link flow rates which depend on the routing. These end-to-end source-to-destination rates and the
routing decision jointly induce a traffic pattern on the network and hence flow rates on the links.
Figure 9: A bi-directional ring topology.
(a) 3 points. Table 1 shows the routing tables at each station. For each station, the first column is
D (destination) and the second column is NN (next node). Indicate in a diagram the flow rates on the
links as implemented by the routing table.
Table 1: Routing tables of stations
Station 0   Station 1   Station 2   Station 3   Station 4
D  NN       D  NN       D  NN       D  NN       D  NN
1  1        0  2        0  3        0  4        0  0
2  1        2  2        1  3        1  4        1  0
3  1        3  2        3  3        2  4        2  0
4  1        4  2        4  3        4  4        3  0
(b) 3 points. Use the link flow rates obtained in (a) as the links costs (note that there are 10 links
in total). Fix those link costs, and use the Dijkstra algorithm (and show the steps) to compute the
new shortest paths (with minimum cost) from station 0 to all other stations, and calculate the new link
rates using the new shortest paths.
(c) 2 points. Use the links flow rates you obtained in (b) as the links costs. Again, fix those link
costs, and compute the new shortest paths from station 0 to all other stations, using the Bellman-Ford
algorithm (and show the steps), and calculate the new link rates using the new shortest paths.
(d) 1 point. If this procedure is repeated, will the routing ever converge?
[Solution] (a) The shortest-path tree from station 0 implemented by the routing table is such that all traffic goes clockwise, as shown in Fig. 10.
Figure 10: [Solution] Shortest-path tree in P3.3 (a).
(b) The costs (flow rates) on the links induced by the routing table are as shown in Fig. 11. Note that
all links in the counter-clockwise direction have zero costs. The steps of Dijkstra’s algorithm are shown
in the table below:
steps   F            D1, pred   D2, pred   D3, pred   D4, pred
0       0            10, 0      inf, -     inf, -     0, 0
1       0,4          10, 0      inf, -     0, 4
2       0,4,3        10, 0      0, 3
3       0,4,3,2      0, 2
4       0,4,3,2,1
Therefore, all traffic is routed in the counter-clockwise direction in the shortest-path tree, exactly opposite to the routing used in (a), as shown in Fig. 12.
Figure 11: [Solution] Network topology with link costs (flow rates) in P3.3 (b).
Figure 12: [Solution] New shortest path tree calculated in P3.3 (b).
(c) The costs (flow rates) on the links induced by the routing in (b) are as shown in Fig. 13. Note that all links in the clockwise direction have zero costs.
Figure 13: [Solution] Network topology with link costs (flow rates) in P3.3 (c).
The steps of the Bellman-Ford algorithm are:
steps   D1, pred   D2, pred   D3, pred   D4, pred
0       0, 0       inf, -     inf, -     10, 0
1       0, 0       0, 1       19, 4      10, 0
2       0, 0       0, 1       0, 2       10, 0
3       0, 0       0, 1       0, 2       0, 3
4       0, 0       0, 1       0, 2       0, 3
Therefore, all traffic is routed in the clockwise direction in the shortest-path tree, exactly the same as
the routing used in (a), as shown in Fig. 10.
(d) Indeed, the routing updates will continue to oscillate across routing update periods, between the
results of (a) and (b), and will never converge.
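The Dijkstra step in (b) can be checked with a short Python sketch. The ring topology and the cost 10 on link 0→1 come from the solution table; the other four clockwise costs are hypothetical placeholders (any positive values give the same tree), and the function name `dijkstra` is ours.

```python
import heapq

def dijkstra(cost, source):
    """cost maps a directed link (u, v) to its cost; returns (dist, pred)."""
    nodes = {u for u, _ in cost} | {v for _, v in cost}
    dist = {n: float("inf") for n in nodes}
    pred = {n: None for n in nodes}
    dist[source] = 0
    heap = [(0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue  # stale heap entry
        done.add(u)
        for (a, b), c in cost.items():
            if a == u and d + c < dist[b]:
                dist[b], pred[b] = d + c, u
                heapq.heappush(heap, (d + c, b))
    return dist, pred

# Clockwise links carry all the traffic of (a). Only the cost of 0->1 (10)
# is taken from the solution; the remaining values are assumed.
clockwise = {(0, 1): 10, (1, 2): 8, (2, 3): 6, (3, 4): 3, (4, 0): 1}
# Counter-clockwise links carry no traffic in (a), so their cost is 0.
counter = {(v, u): 0 for (u, v) in clockwise}

dist, pred = dijkstra({**clockwise, **counter}, 0)
```

With these costs every station is reached at cost 0 through its counter-clockwise neighbor, which matches the last row of the Dijkstra table.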
3.4
Dynamic routing (exercise)
Consider the case where H1 sends 2Mbps traffic to H2 via one of two links as in Figure 14, either
through R1 or through R2. Consider the dynamic routing case where the routing table is updated
every 3 minutes. When the routing table is updated, the link weight at a link is computed by the
following equation:
$$\text{weight} = \frac{1\,\text{Mbps} + \text{average throughput over the past 3 minutes}}{\text{capacity}}.$$
Assume that at t = 0, the routing table is updated, and at t = 1ms, H1 starts sending traffic to H2.
Figure 14: A sends packets of 1KB to B via a 1Mbps link with 20KB buffer.
[The figure shows H1 connected to H2 through R1 over two 2Mbps links, and through R2 over two 4Mbps links.]
Give the traffic throughput through Routers R1 and R2 at t = 1, 4, 7, . . . minutes.
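The update rule stated above can be simulated with a minimal Python sketch. It assumes that both links on a path have the same capacity (2Mbps via R1, 4Mbps via R2), that the path cost is the sum of its two link weights, and that the 1ms start-up offset is negligible, so the average throughput over a period is either 0 or the full 2Mbps. The function name `simulate` and its parameters are ours.

```python
def simulate(capacity, demand=2.0, updates=5):
    """capacity maps a path name to its link capacity in Mbps.

    Returns the path chosen at each routing update (one update per 3 minutes).
    """
    load = {p: 0.0 for p in capacity}  # average throughput over the past period
    choices = []
    for _ in range(updates):
        # Path cost = two identical links, each with weight (1 + load) / capacity.
        weight = {p: 2 * (1.0 + load[p]) / capacity[p] for p in capacity}
        best = min(weight, key=weight.get)
        choices.append(best)
        # All traffic uses the chosen path until the next update.
        load = {p: (demand if p == best else 0.0) for p in capacity}
    return choices

choices = simulate({"R1": 2.0, "R2": 4.0})
```

The returned sequence alternates between the two routers, which is the oscillation the problem asks about.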
[Solution] At t = 0, the traffic is 0 on both paths (H1→R1→H2 and H1→R2→H2). Therefore the link weights are

t = 0   R1    R2
H1      1/2   1/4
H2      1/2   1/4

After running the shortest-path algorithm, H1 chooses the path via R2 during minutes 0∼3.
At t = 3, the traffic is 0 on the path via R1 and 2Mbps on the path via R2. Therefore the link weights are

t = 3   R1    R2
H1      1/2   3/4
H2      1/2   3/4

After running the shortest-path algorithm, H1 chooses the path via R1 during minutes 3∼6.
At t = 6, the traffic is 2Mbps on the path via R1 and 0 on the path via R2. Therefore the link weights are

t = 6   R1    R2
H1      3/2   1/4
H2      3/2   1/4

After running the shortest-path algorithm, H1 chooses the path via R2 during minutes 6∼9.
Then at t = 9, the situation is the same as at t = 3, and the routing oscillates with a period of 6 minutes. As a result, the traffic throughput is as shown below.

t (min)   1       4       7       10      13      ···
R1        0       2Mbps   0       2Mbps   0       ···
R2        2Mbps   0       2Mbps   0       2Mbps   ···
3.5
Routing on a continuum of nodes (exercise)
Consider the network given in Figure 15. Each point represents a router, connected to its neighbors via links of capacity 1. The links form a ring. Label the routers by $x \in [0, 1)$, and give router 0 two labels: 0 and 1. Assume that all traffic has the same destination: router 0. Let $r(x)$ denote the source rate at $x$ for $x \in [0, 1)$, and assume $r(x) = 2x$.
Figure 15: Network topology for problem 3.5.
(a) (Static Routing). Consider the following static routing strategy: pick a $y \in (0, 1)$, and let every router $x \in (0, y)$ forward packets clockwise and every router $x \in (y, 1)$ forward packets counter-clockwise.
• What is the traffic throughput $f^-(x, y)$ at link $x$ for $x \in (0, y)$, and what is the traffic throughput $f^+(x, y)$ at link $x$ for $x \in (y, 1)$?
• Use the expression of the queueing delay for the M/M/1 queue. What is the queueing delay $d_s^-(x, y)$ at link $x$ for $x \in (0, y)$, and what is the queueing delay $d_s^+(x, y)$ at link $x$ for $x \in (y, 1)$? Let $d_s^-(y) := \sup_{x \in (0,y)} d_s^-(x, y)$ denote the maximum queueing delay over links $x \in (0, y)$, i.e., over the links that forward packets clockwise. And let $d_s^+(y) := \sup_{x \in (y,1)} d_s^+(x, y)$ denote the maximum queueing delay over links $x \in (y, 1)$, i.e., over the links that forward packets counter-clockwise. What are $d_s^+(y)$ and $d_s^-(y)$?
• Assume that the propagation delay $d_i^-(x)$ from $x$ to 0 (clockwise) is $x$, and that the propagation delay $d_i^+(x)$ from $x$ to 1 (counter-clockwise) is $1 - x$. Each router $x$ has two paths, clockwise or counter-clockwise, to forward packets to the destination, router 0(1). Label the clockwise path by superscript $-$ and the counter-clockwise path by superscript $+$, and define costs
$$D^+(x, y) := d_s^+(y) + d_i^+(x), \qquad D^-(x, y) := d_s^-(y) + d_i^-(x)$$
for the two paths. For what values of $x$ is $D^+(x, y)$ equal to $D^-(x, y)$?
• Let $x(y)$ denote the $x$ where $D^+(x, y) = D^-(x, y)$. Use matlab (or other tools) to draw $x(y)$ as $y$ increases from 0 to 1. When does the curve intersect $z(y) = y$?
• Let $y^*$ denote the $y \in (0, 1)$ where $x(y)$ intersects $z(y)$. Show that
$$0 < x < y \Rightarrow D^-(x, y) < D^+(x, y), \qquad y < x < 1 \Rightarrow D^-(x, y) > D^+(x, y)$$
when $y = y^*$. That is, when $x < y^*$, the left path has smaller cost, and the right path has bigger cost. This is considered "stationary." Give an interpretation of why this is called "stationary".
(b) (Dynamic Routing). Let's extend (a) to the dynamic routing case where routing $y$ is updated over time. Let $y^k$ denote the routing strategy at time $k = 0, 1, 2, \ldots$ and assume $y^{k+1} = x(y^k)$ for $k = 0, 1, 2, \ldots$ For what initial values of $y^0$ does the sequence $\{y^k\}_{k \ge 0}$ converge?
[Solution]
(a)
• $$f^-(x, y) = \int_x^y r(s)\,ds = y^2 - x^2, \quad x \in (0, y),$$
$$f^+(x, y) = \int_y^x r(s)\,ds = x^2 - y^2, \quad x \in (y, 1).$$
• $$d_s^-(x, y) = \frac{1}{1 - f^-(x, y)} = \frac{1}{1 - y^2 + x^2}, \quad x \in (0, y),$$
$$d_s^+(x, y) = \frac{1}{1 - f^+(x, y)} = \frac{1}{1 - x^2 + y^2}, \quad x \in (y, 1).$$
Hence we have
$$d_s^-(y) = d_s^-(0, y) = \frac{1}{1 - y^2}, \qquad d_s^+(y) = d_s^+(1, y) = \frac{1}{y^2}.$$
• Let
$$D^-(x, y) = \frac{1}{1 - y^2} + x = \frac{1}{y^2} + 1 - x = D^+(x, y),$$
and hence we have
$$x(y) = \frac{1}{2}\left(1 + \frac{1}{y^2} - \frac{1}{1 - y^2}\right).$$
• See Fig. 16. The unique solution of $x(y) = y$ on $[0, 1]$ is $y^* = 0.6756$.
• When $0 < x < y^*$ we have
$$D^-(x, y^*) < D^-(y^*, y^*) = D^-(x(y^*), y^*) = D^+(x(y^*), y^*) = D^+(y^*, y^*) < D^+(x, y^*).$$
We can show $D^-(x, y^*) > D^+(x, y^*)$ when $y^* < x < 1$ in a similar way. When $y = y^*$, the routing protocol happens to guarantee that all the nodes select the shorter path, and there is no motivation for them to deviate to any other routing.
Figure 16: [Solution] x(y) as a function of y. The marked intersection with the line z(y) = y is at approximately (0.68, 0.68).
(b) Let $\underline{y} := \sup\{y \in [0, 1] \mid x(y) = 1\}$ and $\overline{y} := \inf\{y \in [0, 1] \mid x(y) = 0\}$. From Fig. 16 we see that $|x'(y)| > 1$ for $y \in [\underline{y}, \overline{y}]$, which implies that $y^{k+1} = x(y^k)$ will not converge, unless $y^0 = y^*$.
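The fixed point and the instability claim can be checked numerically. The Python sketch below uses the expression for x(y) derived in (a); the bisection bracket [0.1, 0.9] and the helper names are our choices, and the derivative is computed by hand from that expression.

```python
def x_of_y(y):
    """x(y) = (1 + 1/y^2 - 1/(1 - y^2)) / 2, from part (a)."""
    return 0.5 * (1.0 + 1.0 / y**2 - 1.0 / (1.0 - y**2))

def dx_dy(y):
    """Derivative of x(y) with respect to y."""
    return -(1.0 / y**3 + y / (1.0 - y**2) ** 2)

def fixed_point(lo=0.1, hi=0.9, tol=1e-10):
    # g(y) = x(y) - y is strictly decreasing on (0, 1), so bisection applies.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if x_of_y(mid) - mid > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

y_star = fixed_point()
# One step of y^{k+1} = x(y^k) starting slightly away from the fixed point.
y_next = x_of_y(y_star + 0.01)
```

Because the slope at the fixed point has magnitude well above 1, a small perturbation of y^0 is amplified by one iteration rather than damped.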
3.6
Forward error correction code
The forward error correction code discussed in the lecture (and textbook) can be represented in matrix form as in Figure 17, i.e., $C = PA$, where $C$ is a $k \times m$ matrix that represents $m$ encoded packets each of $k$ bits (the $m$ columns of $C$), $P$ is a $k \times n$ matrix that represents $n$ original packets each of $k$ bits (the $n$ columns of $P$), and $A$ is an $n \times m$ 0-1 matrix that represents the coding, $A_{ij} \in \{0, 1\}$. For example, the $j$th column of $C$ is
$$C_j = \sum_{i=1}^{n} P_i A_{ij}$$
for $j = 1, \ldots, m$. Here the summation is elementwise XOR. Let
$$A = \begin{bmatrix} 1 & 1 & 0 & 1 & 0 & 1 \\ 0 & 1 & 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 & 0 & 1 \\ 1 & 0 & 1 & 1 & 0 & 1 \end{bmatrix}.$$
Figure 17: Matrix representation of forward error correction.
(a) 3 points. If packets $C_1 = [1\ 0\ 0\ 1]^T$, $C_2 = [0\ 0\ 1\ 1]^T$, $C_5 = [0\ 0\ 1\ 0]^T$, $C_6 = [1\ 1\ 1\ 1]^T$ are received, find the original packets $P_1, P_2, P_3, P_4$.
(b) 3 points. If another set of packets $C_1 = [1\ 0\ 0\ 1]^T$, $C_3 = [0\ 0\ 1\ 0]^T$, $C_4 = [1\ 0\ 1\ 0]^T$, $C_6 = [0\ 1\ 1\ 0]^T$ are received, find the original packets $P_1, P_2, P_3, P_4$.
[Solution]
(a) The corresponding matrices are
$$\begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 \\ 0 & 1 & 1 & 1 \\ 1 & 1 & 0 & 1 \end{bmatrix} = C' = PA' = P \begin{bmatrix} 1 & 1 & 0 & 1 \\ 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 1 \end{bmatrix}$$
where $P$ is the $4 \times 4$ matrix whose $j$-th column is the original packet $P_j$. It amounts to inverting $A'$ to obtain $P$, and can be done as follows. Note that $P_2 = C'_3 = [0\ 0\ 1\ 0]^T$, then $C'_2 = P_1 + P_2$ and hence $P_1 = C'_2 + P_2 = [0\ 0\ 1\ 1]^T + [0\ 0\ 1\ 0]^T = [0\ 0\ 0\ 1]^T$. Similarly, $C'_1 = P_1 + P_4$ and hence $P_4 = C'_1 + P_1 = C'_1 + C'_2 + P_2 = [1\ 0\ 0\ 1]^T + [0\ 0\ 1\ 1]^T + [0\ 0\ 1\ 0]^T = [1\ 0\ 0\ 0]^T$. Finally, $C'_4 = P_1 + P_2 + P_3 + P_4$ and hence $P_3 = C'_4 + P_1 + P_2 + P_4 = [0\ 1\ 0\ 0]^T$. Hence
$$P = \begin{bmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{bmatrix}.$$
(b) Similarly, the corresponding matrices are
$$\begin{bmatrix} 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 1 & 1 & 1 \\ 1 & 0 & 0 & 0 \end{bmatrix} = C'' = PA'' = P \begin{bmatrix} 1 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 \\ 0 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \end{bmatrix}.$$
Note that $C''_1 = P_1 + P_4$ and $C''_3 = P_1 + P_3 + P_4$, and hence $P_3 = C''_1 + C''_3 = [0\ 0\ 1\ 1]^T$. Then from $C''_2 = P_3 + P_4$ we have $P_4 = C''_2 + P_3 = [0\ 0\ 0\ 1]^T$. Then $P_1 = C''_1 + P_4 = [1\ 0\ 0\ 0]^T$ and, since $C''_4 = P_1 + P_2 + P_3 + P_4$, we have $P_2 = C''_4 + P_1 + P_3 + P_4 = [1\ 1\ 0\ 0]^T$. Hence
$$P = \begin{bmatrix} 1 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 1 \end{bmatrix}.$$
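The substitutions above are Gaussian elimination over GF(2) done by hand. A minimal Python sketch of the general procedure follows; the function name `decode` and its argument layout are ours, and it assumes the selected columns of A form an invertible matrix (otherwise the pivot search fails).

```python
A = [
    [1, 1, 0, 1, 0, 1],
    [0, 1, 0, 0, 1, 1],
    [0, 0, 1, 1, 0, 1],
    [1, 0, 1, 1, 0, 1],
]

def decode(A, idx, received):
    """Recover the original packets from the received coded packets.

    idx: 0-based indices j of the received packets C_j.
    received: the packets themselves, each a list of k bits.
    Returns [P_1, ..., P_n], each a list of k bits.
    """
    n = len(A)
    # One equation per received packet: sum_i A[i][j] * P_i = C_j (XOR).
    rows = [[A[i][j] for i in range(n)] + list(pkt)
            for j, pkt in zip(idx, received)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col])
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(n):
            if r != col and rows[r][col]:
                rows[r] = [a ^ b for a, b in zip(rows[r], rows[col])]
    return [row[n:] for row in rows]

packets_a = decode(A, [0, 1, 4, 5],
                   [[1, 0, 0, 1], [0, 0, 1, 1], [0, 0, 1, 0], [1, 1, 1, 1]])
packets_b = decode(A, [0, 2, 3, 5],
                   [[1, 0, 0, 1], [0, 0, 1, 0], [1, 0, 1, 0], [0, 1, 1, 0]])
```

Each returned list is one original packet, i.e., one column of the matrix P given in the solution.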
3.7
Network coding, W&P, P5.4
Consider a wireless network with nodes X and Y exchanging packets via an access point Z. For simplicity,
we assume that there are no link-layer acknowledgments. Suppose that X sends packets to Y at rate
2R packets/sec and Y sends packets to X at rate R packets/sec; all the packets are of the maximum
size allowed. The access point uses network coding. That is, whenever it can, it sends the “exclusive
or” of a packet from X and a packet from Y instead of sending the two packets separately.
(a) 2 points. What is the total rate of packet transmissions by the three nodes without network
coding?
(b) 2 points. What is the total rate of packet transmissions by the three nodes with network coding?
[Solution]
(a) Without network coding:
• X transmits at rate 2R to Z.
• Y transmits at rate R to Z.
• Z transmits at rate 3R to X and Y.
Hence, the total rate required is 6R.
(b) With network coding, in every 1 second, on average Z spends 0.5 second broadcasting R coded packets x+y to both X and Y (each one pairs a packet from Y with a packet from X). In the other 0.5 second, Z relays the remaining R packets from X to Y uncoded. Hence, the total rate required is 2R + R + (R + R)/(0.5 + 0.5) = 5R.
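The same count can be written for arbitrary rates. In this Python sketch (function name ours), the access point is assumed to pair as many packets as possible, so with coding its transmission rate is the larger of the two incoming rates.

```python
def total_rate(rate_x, rate_y, coding):
    """Total transmission rate of X, Y and the access point Z (packets/sec)."""
    uplink = rate_x + rate_y
    if coding:
        # min(rate_x, rate_y) coded packets plus the unpaired remainder.
        downlink = max(rate_x, rate_y)
    else:
        downlink = rate_x + rate_y
    return uplink + downlink

R = 1
without_coding = total_rate(2 * R, R, coding=False)
with_coding = total_rate(2 * R, R, coding=True)
```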
4
Internetworking
4.1
W&P, P6.1
(a) 2 points. How many IP addresses need to be leased from an ISP to support a DHCP server (with
L ports) that uses NAT to service N clients at the same time, if every client uses at most P ports?
(b) 2 points. If M unique clients request an IP address every day from the above mentioned DHCP
server, what is the maximum lease time allowable to prevent new clients from being denied access
assuming that requests are uniformly spaced throughout the day, and that the addressing scheme used
supports a max of N clients at the same time?
[Solution]
(a) Consider the worst case in which all P ports of all N clients are in use, i.e., there are NP connections. To use NAT, the DHCP server must maintain an address translation table in which every entry takes the form $(IP_a, TCP_b) \leftrightarrow (IP_b, TCP_m)$, where $IP_a$ is the IP address actually assigned by the ISP, $TCP_b$ is a port at the NAT server, $IP_b$ is the (private) IP of a client that is known only by the NAT server, and $TCP_m$ is the actual port used by the client for a connection. Hence the maximum number of entries in this table is restricted by the number L of ports of the NAT server multiplied by the number S of assigned IPs, and must not be smaller than the number of connections, i.e., $SL \ge$ #entries in the table $\ge NP$, which implies $S \ge NP/L$. Therefore at least $\lceil NP/L \rceil$ IP addresses should be assigned by the ISP.
(b) Suppose all the IP addresses are available at the beginning of the day (time 0). Every $24/M$ hours there is a request for an IP address from a new client, and therefore at time $24N/M$ hours the maximum number of N clients is supported and no more IP addresses are available. In that case the first client must release the IP address it has been occupying to satisfy any new request. Hence the maximum lease time is $24N/M$ hours.
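Both formulas are easy to evaluate. The Python sketch below uses hypothetical numbers purely for illustration; the function names are ours.

```python
def min_leased_addresses(n_clients, ports_per_client, ports_per_address):
    """Smallest S with S * L >= N * P, i.e. ceil(N * P / L), in integer arithmetic."""
    needed = n_clients * ports_per_client
    return (needed + ports_per_address - 1) // ports_per_address

def max_lease_hours(n_simultaneous, requests_per_day):
    """Longest lease that frees an address before request N + 1 arrives."""
    return 24 * n_simultaneous / requests_per_day

# Hypothetical example: 501 clients, 20 ports each, 2000 usable ports per address.
addresses = min_leased_addresses(501, 20, 2000)
# Hypothetical example: 240 simultaneous clients, 720 requests per day.
lease = max_lease_hours(240, 720)
```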
4.2
Insufficient IP addresses (exercise): A variant of P4.1
(a) Consider a type D subnet (there are 256 IP addresses in a type D network, and one of these IP
addresses is used for broadcasting). Assume that 15 IP addresses are assigned to servers, then how
many hosts can this subnet support if there is no DHCP service nor NAT service?
(b) Assume that there is a DHCP server in the subnet, and that each host connects to the Internet 8
hours a day. How many hosts can this subnet support?
(c) Assume that the DHCP server also runs the NAT protocol, and can use up to 2000 TCP ports for
each dynamic IP address. Also assume that each host needs 20 TCP connections. How many hosts can
this subnet support?
[Solution]
(a) There is a total of 256-1-15=240 IP addresses that can be assigned to hosts. If there is no DHCP
service nor NAT service, then 1 IP address can be assigned to at most one host. Hence, this subnet can
only support 240 hosts.
(b) If the subnet runs DHCP, then 1 IP address can be assigned to different hosts if the hosts get online
at different times of a day. If each host connects to the Internet 8 hours a day, then 3 hosts can share
a single IP address. Hence, this subnet can support 240*3=720 hosts if DHCP service is enabled.
(c) If NAT protocol is run, then multiple hosts can use the same IP address simultaneously, as long as
they are assigned different port numbers. If 2000 TCP ports can be used for NAT and each host has
20 TCP connections, then 1 IP address can be shared by 2000/20=100 hosts. Hence, this subnet can
support 720*100=72000 hosts.
5
Transport
5.1
W&P, P7.1
(a) 2 points. Suppose you and two friends named Alice and Bob share a 200 Kbps DSL connection to
the Internet. You need to download a 100 MB file using FTP. Bob is also starting a 100 MB file transfer,
while Alice is watching a 150 Kbps streaming video using UDP. You have the opportunity to unplug
either Alice or Bob’s computer at the router, but you cannot unplug both. To minimize the transfer
time of your file, whose computer should you unplug and why? Assume that the DSL connection is the
only bottleneck link, and that your connection and Bob’s connection have a similar round trip time.
(b) 2 points. What if the rate of your DSL connection were 500 Kbps? Again, assuming that the DSL
connection were the only bottleneck link, which computer should you unplug?
[Solution] If I share with Bob, then each of us will get half of the link rate because we both use the
same TCP algorithm for file transfer and both have the same RTT. If I share with Alice, then Alice
will get 150kbps since she is using UDP without congestion control, and I will get the remaining link
rate. Therefore, to maximize my own throughput:
(a) I should unplug Alice, so I will get 100kbps.
(b) I should unplug Bob, so I will get 500 − 150 = 350 kbps, as opposed to 250kbps if sharing with Bob.
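The comparison can be sketched in Python under the same assumptions as the solution: two TCP flows with equal RTT split the link evenly, and the UDP stream always takes its full 150 Kbps. The function name is ours.

```python
def who_to_unplug(link_kbps, udp_kbps=150):
    """Return (person to unplug, my resulting rate in Kbps)."""
    with_bob = link_kbps / 2                     # Alice unplugged: fair TCP share
    with_alice = max(link_kbps - udp_kbps, 0)    # Bob unplugged: leftover after UDP
    if with_bob >= with_alice:
        return ("Alice", with_bob)
    return ("Bob", with_alice)

case_a = who_to_unplug(200)
case_b = who_to_unplug(500)
```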
5.2
W&P, P7.2
As shown in Figure 18, station A has an unlimited amount of data to transfer to Station E. Station A
uses a sliding window transport protocol with a fixed window size. Thus, station A begins a new packet
transmission whenever the number of unacknowledged packets is less than W and any previous packet
being sent from A has finished transmitting.
The size of the packets is 10000 bits (neglect headers). So for example if W > 2, station A would start
sending packet 1 at time t = 0, and then would send packet 2 as soon as packet 1 finished transmission,
at time t = 0.33 ms. Assume that the speed of light is $3 \times 10^8$ meters/sec.
(a) 4 points. Suppose station B is silent, and that there is no congestion along the acknowledgement
path from C to A. (The only delay acknowledgements face is the propagation delay to and from the
satellite.) Plot the average throughput as a function of window size W . What is the minimum window
size that A should choose to achieve a throughput of 30 Mbps? Call this value W ∗ . With this choice
of window size, what is the average packet delay (time from leaving A to arriving at E)?
(b) 4 points. Suppose now that station B also has an unlimited amount of data to send to E, and that
station B and station A both use the window size W ∗ . What throughput would A and B get for their
flows? How much average delay do packets of both flows incur?
(c) 4 points. What average throughput and delays would A and B get for their flows if A and B both
used window size 0.5W ∗ ? What would be the average throughput and delay for each flow if A used a
window size of W ∗ and B used a window size of 0.5W ∗ ?
[Solution]
(a) Recalling that throughput is inversely proportional to the round-trip time, we start with calculating
the round-trip time of data transfer from A to E. The timeline of data transfer from A to E is shown in
Figure 19, where ts , tq , and tp denote the transmission time of a packet, queueing delay at the buffer of
C, and propagation delay on a space-to-ground link respectively. The transmission time of a packet is
ts = 0.33 ms.
Figure 18: Transmitting data from stations A, B to E.
Figure 19: [Solution] Timeline of data transfer from A to E.
When B does not transmit, the buffer at C is empty (since the incoming traffic rate is limited by the capacity of the link from A to C, which is no greater than the capacity of the output link from C to D), and therefore the queueing delay at the buffer of C is $t_q = 0$. The propagation delay on a space-to-ground link is
$$t_p = \frac{5 \times 10^4\ \text{km}}{3 \times 10^8\ \text{m/s}} = 166.7\ \text{ms}.$$
Hence, the round-trip time is
$$\text{RTT} = 3t_s + t_q + 4t_p = 667.7\ \text{ms}.$$
The throughput is given by
$$x = \frac{W \times \text{pkt size}}{\text{RTT}} = \frac{W \times 10{,}000\ \text{bits}}{667.7\ \text{ms}} = \frac{30W}{2003}\ \text{Mbps}$$
when $30W/2003 \le 30$, i.e., $W \le 2003$. When $W > 2003$, the throughput is 30Mbps. Average throughput as a function of the window size $W$ is shown in Figure 20.
Figure 20: [Solution P5.2] Throughput as a function of the window size (throughput in Mbps against window size in pkts; the curve rises linearly and saturates at 30Mbps at W = 2003).
The minimum window size to achieve a throughput of 30Mbps is $W^* = 2003 \approx 2000$. The average packet delay is
$$(\text{average packet delay}) = 3t_s + t_q + 2t_p = 334.3\ \text{ms}.$$
(b) When A and B both use a window size of W ∗ , the buffer at C will build up, leading to a longer
round-trip time for both flows. This increased round-trip time will slow down the throughput of both
A and B. Eventually, the system comes to a steady state, where the round-trip time does not change.
Now compute the steady state. Let $t_q$ denote the steady-state queueing delay at the buffer of C, and note that A and B have the same steady-state round-trip time
$$\text{RTT} = 3t_s + t_q + 4t_p.$$
Then the steady-state throughputs of A and B are
$$x_A = \frac{W^* \times \text{pkt size}}{\text{RTT}} = x_B,$$
and since the two flows together saturate the 30Mbps link, $x_A = x_B = 15$ Mbps. We have
$$\text{RTT} = \frac{W^* \times \text{pkt size}}{x_A} = \frac{2000 \times 10{,}000\ \text{bits}}{15\ \text{Mbps}} = 1.333\ \text{s}.$$
It follows that
$$t_q = \text{RTT} - 3t_s - 4t_p = 1333.3 - 3 \times 0.33 - 4 \times 166.7 \approx 665.7\ \text{ms}.$$
Therefore the average packet delay is
$$(\text{average packet delay}) = 3t_s + t_q + 2t_p = 1000\ \text{ms}.$$
(c) When each has window size $0.5W^*$, the two flows have combined window sizes that are sufficient to keep the pipe full. Their combined throughput will be 30 Mbps, so by symmetry each will get 15 Mbps. The round-trip delay, which is the same for A and B, can be found by:
$$\text{RTT} = \frac{0.5 \times 2000 \times 10{,}000\ \text{bits}}{15\ \text{Mbps}} = 666.7\ \text{ms}.$$
Subtracting the delay on the reverse path, i.e., $2t_p = 333.3$ ms, we get a forward path delay of 333.3ms, which is exactly the propagation delay $2t_p$, i.e., the queueing delay is $t_q = 0$ (we ignored the transmission time $3t_s$ from the RTT for simplicity).
When A's window is $W^*$ and B's is $0.5W^*$, the two flows have combined window sizes that are sufficient to keep the pipe full. Their combined throughput will be 30Mbps. A will have twice as many packets in the pipeline as B, so their throughputs will have a ratio of 2:1. Thus A gets 20 Mbps and B gets 10 Mbps. The round-trip time for A and B is the same, which is found by:
$$\text{RTT} = \frac{1 \times 2000 \times 10{,}000\ \text{bits}}{20\ \text{Mbps}} = \frac{0.5 \times 2000 \times 10{,}000\ \text{bits}}{10\ \text{Mbps}} = 1000\ \text{ms}.$$
Subtracting the delay on the reverse path, i.e., $2t_p = 333.3$ ms, we get a forward path delay of 666.7ms for both A and B. Their queueing delay is $t_q = 666.7 - 333.3 = 333.3$ ms (we ignored the transmission time $3t_s$ from the RTT for simplicity).
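The three cases in (b) and (c) follow one rule: if the windows do not fill the pipe, the RTT stays at its base value; otherwise the queue grows until the windows exactly fill one RTT at link capacity. A minimal Python sketch of that rule, with function and parameter names of our choosing and the base RTT of 667.7ms from (a):

```python
def steady_state(windows, pkt_bits=10_000, capacity_bps=30e6, base_rtt_s=0.6677):
    """Return (per-flow rates in bps, round-trip time in s) for fixed-window flows."""
    total_bits = sum(windows) * pkt_bits
    if total_bits / base_rtt_s <= capacity_bps:
        # Pipe not full: no queueing, one window per base RTT.
        rtt = base_rtt_s
    else:
        # Pipe full: RTT stretches so that all windows fit at link capacity.
        rtt = total_bits / capacity_bps
    return [w * pkt_bits / rtt for w in windows], rtt

rates_b, rtt_b = steady_state([2000, 2000])    # both use W*
rates_c1, rtt_c1 = steady_state([1000, 1000])  # both use 0.5 W*
rates_c2, rtt_c2 = steady_state([2000, 1000])  # A uses W*, B uses 0.5 W*
```

For the 0.5W* case this sketch keeps the 3ts term in the base RTT, so it returns rates slightly below the 15 Mbps obtained above with W* rounded to 2000.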
5.3
W&P, P7.3
As shown in Figure 21, flows 1 and 2 share a link with capacity C = 120 Kbps. There is no other bottleneck. The round trip time of flow 1 is 0.1 sec and that of flow 2 is 0.2 sec. Let x1 and x2 denote the rates obtained by the two flows, respectively. The hosts use AIMD to regulate their flows. That is, as long as x1 + x2 < C, the rates increase linearly over time: the window of a flow increases by one packet every round trip time. Rates are estimated as the window size divided by the round-trip time. Assume that as soon as x1 + x2 > C, the hosts divide their rates x1 and x2 by the factor α = 1.1.
Figure 21: Two flows sharing a link.
(a) 3 points. Draw the evolution of the vector (x1, x2) over time.
(b) 3 points. What is the approximate limiting value for the vector?
[Solution]
(a) Since flow 2's RTT is twice that of flow 1's RTT, and flow 1's window increases at a pace that is twice that of flow 2's, flow 1's rate (=window/RTT) increases at a pace that is 4 times that of flow 2's. Hence on the plot, the rates increase along a line that has a slope of 1/4. When x1 + x2 = C, both divide their rates by 1.1 immediately, i.e., their new rates are (about) 90% of those before the multiplicative decrease. Therefore, the evolution of their rates is as shown in Fig. 22.
Figure 22: [Solution] Rates change of two flows sharing a link. The trajectory in the (x1, x2) plane moves along lines of slope 1/4 between the lines x1 + x2 = 0.9C and x1 + x2 = C.
(b) In the steady state, the rates oscillate along the double-arrowed line segment indicated above, between the lines x1 + x2 = 0.9C and x1 + x2 = C. To calculate the approximate steady-state rates x1, x2, assume that x1 + x2 = C. Since we also have x1 = 4x2, we get x1 = 4C/5 = 96kbps and x2 = C/5 = 24kbps.
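The limiting point generalizes to other round-trip times. In this Python sketch (function name ours), the window of each flow grows by one packet per RTT and the rate is window/RTT, so the rate of flow i grows in proportion to 1/RTT_i^2; both flows are assumed to use the same packet size.

```python
def aimd_limit(capacity, rtt1, rtt2):
    """Approximate limiting rates (x1, x2) on the line x1 + x2 = capacity."""
    k = (rtt2 / rtt1) ** 2  # ratio x1 / x2 of the additive-increase slopes
    return capacity * k / (k + 1), capacity / (k + 1)

x1, x2 = aimd_limit(120, 0.1, 0.2)  # rates in Kbps
```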
5.4
W&P, P7.4
Consider a TCP connection between a client C and a server S.
(a) 3 points. Sketch a diagram of the window size of the server S as a function of time.
(b) 2 points. Using the diagram, argue that the time to transmit N packets from S to C is approximately equal to a + bN for large N .
(c) 3 points. Explain the key factors that determine the value of b in that expression for an Internet
connection.
[Solution] Here all reasonable answers are accepted.
(a) The graph depends on the protocol. But basically it looks like Figure 7.10 of the textbook, which is also shown in Fig. 23.
Figure 23: [Solution] Window size in Figure 7.10 of the textbook, for P5.4.
(b) The total number of packets transmitted equals the number of packets transmitted in the slow start stage plus the number transmitted in the congestion avoidance stage. In the slow start stage, you are transferring at lower than the maximum rate possible, which causes a constant part a. In the congestion avoidance stage, you are transferring at the maximum rate possible, say 1/b, which results in a linear part bN.
(c) Possible factors can be the maximum rate of the Ethernet connection, efficiency of the Ethernet, RTT, routing, etc.
5.5
Window size control (exercise)
Assume that a host A in Los Angeles sends packets, each of 1KByte, to a host B in San Francisco, through a connection of 100Mbps capacity. Also assume that the round-trip time of each packet is a constant of 130ms (no jitter).
Figure 24: Window size W fluctuates piecewise linearly in time, from Wmin to Wmax, in periods of length T, which is assumed to be much bigger than the round-trip time.
(a) If A uses the window flow control mechanism with a constant window size W, then what is the average bit rate of the connection as a function of W? What happens as W increases to infinity?
(b) Now consider the case where W fluctuates, at a much slower timescale than the round-trip time
(this is not true in the TCP protocol) as in Figure 24. What is the average rate of the connection
assuming Wmax is “not too big”?
[Solution]
(a) When the window size W is small, the average bit rate x is
$$x = \frac{W\ \text{pkts} \times \text{packet size}}{\text{round-trip time}} = \frac{W\ \text{pkts} \times 1\,\text{KB/pkt}}{130\ \text{ms}} = \frac{8W}{130}\ \text{Mbps}.$$
This expression holds as long as $8W/130 \le 100$.
When the window size is big, more specifically, when $8W/130 > 100$, W packets can no longer be sent within a round-trip time. In this case, there will always be fewer than W unacknowledged packets after A finishes transmitting a packet, and therefore A will keep transmitting packets at the link capacity.
To summarize, the average bit rate is
$$x = \begin{cases} 4W/65\ \text{Mbps} & \text{if } 4W/65 \le 100, \\ 100\ \text{Mbps} & \text{if } 4W/65 > 100. \end{cases}$$
And the rate stabilizes at 100Mbps as W tends to infinity.
(b) Let $W(t)$ denote the window size at time t, then the instantaneous bit rate at time t is $x(t) = 4W(t)/65$ Mbps provided $W(t)$ is not too big (i.e., $4W(t)/65 \le 100$). It follows that the average rate of the connection is
$$x = \lim_{T_0 \to \infty} \frac{1}{T_0} \int_0^{T_0} x(t)\,dt = \frac{1}{T} \int_0^{T} x(t)\,dt = \frac{1}{T} \int_0^{T} \frac{4W(t)}{65}\,dt = \frac{2(W_{\min} + W_{\max})}{65}\ \text{Mbps}.$$
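Both results can be evaluated with a short Python sketch. The function names are ours; 1KByte is taken as 1000 bytes, consistent with the 8W/130 expression, and the average in (b) assumes the rate stays below the link capacity throughout.

```python
def rate_mbps(window_pkts, pkt_bytes=1000, rtt_s=0.130, capacity_mbps=100.0):
    """Average bit rate for a constant window, capped at the link capacity."""
    return min(window_pkts * pkt_bytes * 8 / rtt_s / 1e6, capacity_mbps)

def average_rate_mbps(w_min, w_max):
    """Average rate when W(t) sweeps linearly between w_min and w_max.

    The time average of a piecewise-linear W(t) is (w_min + w_max) / 2.
    """
    return rate_mbps((w_min + w_max) / 2)

small_window = rate_mbps(65)       # 4 * 65 / 65 Mbps
huge_window = rate_mbps(10_000)    # saturates at the link capacity
fluctuating = average_rate_mbps(100, 200)
```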
5.6
TCP with AIMD (exercise), adapted from P5.3
Flows 1 and 2 share a link (the only bottleneck link) with capacity C = 20Mbps as in Figure 25. The
round-trip time of flow 1 is τ1 = 0.1s while that of flow 2 is τ2 = 0.2s (assume that there is no jitter).
Let x1 and x2 denote the throughput of flows 1 and 2 respectively. The hosts use AIMD to regulate
their flows:
• When x1 + x2 ≤ C, the throughput x1 and x2 increase linearly over time: the window of a flow
increases by one packet every round-trip time. Assume that the packet size is 1.5KBytes.
• When x1 + x2 > C, the hosts divide their window size by the factor α = 1.1. Assume that flow 1
and 2 decrease their window sizes at most once every 0.2ms.
Throughput is estimated as the window size divided by the round-trip time.
Figure 25: Two flows share a bottleneck link. The bottleneck link is highlighted in red.
(a) Assume that at time t = 0, the window size of both flows is 1. Draw the evolution of the vector
(x1 , x2 ) over time. [Hint: use matlab.]
(b) What is the approximate limiting behavior for the vector? You will notice that the end-point of
the vector moves on a line that goes through the origin as time evolves, and you are required to give the
slope of this line.
(c) Repeat (a) and (b) for α = 1.2, 1.3, 1.4, 1.5. Can you find out how the slope depends on α?
[Solution]
(a) The evolution of the vector (x1, x2) over time is shown in Figure 26. The matlab code is attached below.
Figure 26: [Solution] The evolution of the vector (x1, x2) over time with α = 1.1. The plot shows the throughput of flow 2 (Mbps) against the throughput of flow 1 (Mbps), together with a line of slope 1/4.
clear all; close all; clc;

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% initialization
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% initialize time stamps
t_resolution = 0.1;                        % length of a time stamp.
t_range = 60;                              % finishing time.
t_count = floor(t_range / t_resolution);   % number of time stamps

% initialize window size
W1 = 1;                                    % window size of flow 1
W2 = 1;                                    % window size of flow 2

% initialize round-trip time
tau1 = 0.1;                                % round-trip time of flow 1
tau2 = 0.2;                                % round-trip time of flow 2

% initialize throughput
x1 = zeros(t_count, 1);                    % allocating space for throughput of flow 1
x2 = zeros(t_count, 1);                    % allocating space for throughput of flow 2

% initialize link capacity
C = 20E6;                                  % link capacity in bps
P = 1.5E3 * 8;                             % packet size in bits
C = C / P;                                 % link capacity in pkt/s

% initialize alpha
alpha = 1.1;                               % backoff parameter used in MD

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% time evolution
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
for t = 1:t_count
    % compute current throughput
    x1(t) = W1 / tau1;                     % throughput of flow 1 at (t-1)*t_resolution
    x2(t) = W2 / tau2;                     % throughput of flow 2 at (t-1)*t_resolution
    % update window size
    if x1(t) + x2(t) >= C                  % if throughput exceeds link capacity
        if mod(t, 2) == 0                  % decrease window size every 0.2 seconds
            W1 = W1 / alpha;
            W2 = W2 / alpha;
        end
    else
        W1 = W1 + 1;                       % increase window size by 1
        if mod(t, 2) == 0                  % increase window size every 0.2 seconds
            W2 = W2 + 1;
        end
    end
end

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% postprocessing
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% change units
x1 = x1 * P / 1E6;                         % throughput of flow 1 in Mbps
x2 = x2 * P / 1E6;                         % throughput of flow 2 in Mbps

% plot figure
fig = figure(1); clf;
plot(x1, x2, 'linewidth', 2); hold on;
set(gca, 'fontsize', 24);
title(sprintf('alpha=%2.1f', alpha));
xlabel('throughput of flow 1 (Mbps)');
ylabel('throughput of flow 2 (Mbps)');
set(gca, 'xlim', [0 18]);
set(gca, 'ylim', [0 4.5]);
plot([0 18], [0 4.5], 'r--', 'linewidth', 2); hold off;
legend('throughput vector', 'line with slope 1/4', 'location', 'northwest');
% round() keeps the file name an integer despite floating-point error in alpha*10
print(fig, sprintf('alpha_%d.eps', round(alpha * 10)), '-depsc');
(b) See P5.3.
(c) The plots are shown in Figure 27. It can be observed that the slope does not depend on α.
Figure 27: [Solution] The evolution of the vector (x1 , x2 ) over time with α = 1.2, 1.3, 1.4, 1.5. (Four panels, one per value of α; each plots throughput of flow 2 (Mbps) against throughput of flow 1 (Mbps) and shows the throughput vector together with the line with slope 1/4.)
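The observation in (c) can be checked without plotting. The following is a minimal Python sketch that ports the update rule of the MATLAB listing above (same capacity, round-trip times and step parity); the function names `simulate_aimd` and `limiting_slope` are illustrative and not part of the original solution. It only reports the ratio x2/x1 at the last time step, which should sit near 1/4 for each α, up to the discretization error of the 0.1 s time step.

```python
def simulate_aimd(alpha, steps=600):
    """Return throughput samples (x1, x2) in pkt/s for the two flows.

    steps=600 corresponds to 60 s at a resolution of 0.1 s.
    """
    capacity = 20e6 / (1.5e3 * 8)   # 20 Mbps link, 1.5 KB packets -> pkt/s
    tau1, tau2 = 0.1, 0.2           # round-trip times of flows 1 and 2 (s)
    w1, w2 = 1.0, 1.0               # initial window sizes (pkts)
    x1, x2 = [], []
    for t in range(1, steps + 1):
        x1.append(w1 / tau1)
        x2.append(w2 / tau2)
        if x1[-1] + x2[-1] >= capacity:
            # multiplicative decrease, at most once every two time steps
            if t % 2 == 0:
                w1 /= alpha
                w2 /= alpha
        else:
            # additive increase: once per RTT of each flow
            w1 += 1.0
            if t % 2 == 0:
                w2 += 1.0
    return x1, x2


def limiting_slope(alpha):
    """Ratio x2/x1 at the final time step."""
    x1, x2 = simulate_aimd(alpha)
    return x2[-1] / x1[-1]


if __name__ == "__main__":
    for a in (1.1, 1.2, 1.3, 1.4, 1.5):
        print(a, round(limiting_slope(a), 3))
```

The additive phase moves the vector in the direction (1/τ1², 1/τ2²), which has slope 1/4 here, while the multiplicative phase moves it along a ray through the origin; this is why α affects the size of the oscillation but not the slope.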
5.7
TCP Vegas (exercise)
Figure 28: A sends packets of 1KB to B via a 1Mbps link with 20KB buffer.
Consider the case where A sends packets of 1KB to B via a 1Mbps link with 20KB buffer as in Figure 28.
The propagation delay (round-trip time with the buffer being empty) is assumed to be 20ms. Assume
that A uses TCP Vegas as the transmission control protocol, i.e., it keeps track of the minimum
round-trip time τmin and the current round-trip time τ , and updates the window size W according to
\[
W \leftarrow
\begin{cases}
W + 1 & \text{if } \dfrac{W}{\tau_{\min}} - \dfrac{W}{\tau} < \dfrac{\alpha}{\tau_{\min}}, \\[2mm]
W - 1 & \text{if } \dfrac{W}{\tau_{\min}} - \dfrac{W}{\tau} > \dfrac{\beta}{\tau_{\min}}, \\[2mm]
W & \text{otherwise,}
\end{cases}
\]
where α, β are TCP parameters. The parameters are chosen with α < β to introduce hysteresis, which
reduces the fluctuation in window size W .
Now we analyze the steady state of this scenario. We start with the following simplification: at steady state, one has
\[
\frac{W}{\tau_{\min}} - \frac{W}{\tau} = \frac{(\alpha + \beta)/2}{\tau_{\min}}.
\]
Assume α = 2.8 and β = 3.2.
(a) If τmin observed by A is equal to the round-trip time without queueing delay, which is 20ms, then
what is the window size W at steady state? How many packets are there in the buffer?
(b) If τmin observed by A is equal to the round-trip time without queueing delay, plus τe = 10ms, i.e.,
30ms, then what is the window size W at steady state? How many packets are there in the buffer?
(c) At what value of τe will the buffer be full? Let τe∗ denote this value. What happens if τe > τe∗ ?
[Solution]
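(a) At steady state, the traffic throughput x = W P/τ , where P is the packet size, equals the link capacity
C. Therefore
\[
\frac{W}{\tau} = \frac{x}{P} = \frac{C}{P} = \frac{1\,\text{Mbps}}{1\,\text{KB/pkt}} = 125\ \text{pkt/s}.
\]
Then, with (α + β)/2 = 3,
\[
\frac{W}{\tau_{\min}} - \frac{W}{\tau} = \frac{(\alpha + \beta)/2}{\tau_{\min}}
\iff \frac{W - 3}{\tau_{\min}} = \frac{W}{\tau}
\implies W = 3 + \tau_{\min} \frac{W}{\tau} = 3 + 20\,\text{ms} \times 125\ \text{pkt/s} = 5.5\ \text{pkt}.
\]
The round-trip time is
\[
\tau = \frac{W}{125\ \text{pkt/s}} = 44\ \text{ms}
\]
and therefore the queueing time is tq = 44 − 20 = 24ms. Hence, the number of packets in the buffer is
(average number of packets in buffer) = tq ∗ 125pkt/s = 3pkt.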
(b) One still has
\[
W = 3 + \tau_{\min} \frac{W}{\tau} = 3 + 30\,\text{ms} \times 125\ \text{pkt/s} = 6.75\ \text{pkt}.
\]
The round-trip time is
\[
\tau = \frac{W}{125\ \text{pkt/s}} = 54\ \text{ms}
\]
and therefore the queueing time is tq = 54 − 20 = 34ms. Hence, the number of packets in the buffer is
(average number of packets in buffer) = tq ∗ 125pkt/s = 4.25pkt.
(c) For the buffer to be full, one needs
20pkt = tq ∗ 125pkt/s =⇒ tq = 160ms.
Then, the round-trip time is τ = 160 + 20 = 180ms. The window size would be
W = τ ∗ 125pkt/s = 22.5pkt.
Substitute in W = 3 + τmin W/τ to obtain
\[
\tau_{\min} = \frac{W - 3}{W}\, \tau = 156\ \text{ms}.
\]
Then τe∗ = τmin − 20 = 136ms. When τe > τe∗ , the buffer cannot hold all packets and has
to drop some packets. Then, TCP Vegas will work as TCP Reno (since packet drop occurs).
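The arithmetic in (a) to (c) follows one pattern, which the Python sketch below reproduces. It assumes the simplified steady-state relation W = (α + β)/2 + τmin · (W/τ ) with W/τ fixed at the link rate; the function and parameter names are illustrative only.

```python
def vegas_steady_state(tau_min_obs, rate=125.0, prop_delay=0.020,
                       alpha=2.8, beta=3.2):
    """Steady-state window (pkts), RTT (s) and buffer backlog (pkts).

    tau_min_obs is the minimum RTT as observed by the sender (s),
    rate is the bottleneck rate W/tau (pkt/s),
    prop_delay is the true RTT with an empty buffer (s).
    """
    gap = (alpha + beta) / 2.0
    window = gap + tau_min_obs * rate      # W = 3 + tau_min * (W / tau)
    rtt = window / rate                    # tau = W / (W / tau)
    backlog = (rtt - prop_delay) * rate    # queueing time times link rate
    return window, rtt, backlog


def critical_offset(buffer_pkts=20.0, rate=125.0, prop_delay=0.020,
                    alpha=2.8, beta=3.2):
    """Measurement error tau_e (s) at which the buffer is exactly full."""
    gap = (alpha + beta) / 2.0
    window = buffer_pkts + prop_delay * rate   # packets in buffer plus in flight
    tau_min = (window - gap) / rate            # from W = gap + tau_min * rate
    return tau_min - prop_delay


if __name__ == "__main__":
    print(vegas_steady_state(0.020))   # part (a)
    print(vegas_steady_state(0.030))   # part (b)
    print(critical_offset())           # part (c)
```

Note that the backlog always equals (α + β)/2 plus the overestimate of τmin times the link rate, so an inflated τmin translates directly into extra queued packets.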
6
TCP models and equilibrium
6.1
Lipschitz continuity (From Lecture Notes)
4 points. Show that if f is locally Lipschitz at x0 then it is continuous at x0 , i.e., given any ε > 0
there exists a δ = δ(ε) > 0 such that ‖f (x) − f (x0 )‖ < ε for all x ∈ Bδ (x0 ).
[Solution]
By Definition 1.1(1) of local Lipschitz continuity, there exist r > 0 and L ≥ 0 such that
‖f (x) − f (x0 )‖ ≤ L‖x − x0 ‖ for all x ∈ Br (x0 ). We only consider the nontrivial case where L > 0.
Now, given any ε > 0, let δ = min{r, ε/(2L)}. Then Bδ (x0 ) ⊆ Br (x0 ) since δ ≤ r, and therefore
every x ∈ Bδ (x0 ) also lies in Br (x0 ), so the Lipschitz condition holds for x. It implies
\[
\|f(x) - f(x_0)\| \le L \|x - x_0\| \le L\delta \le L \cdot \frac{\epsilon}{2L} < \epsilon.
\]
6.2
Convex sets
(a) 6 points. (Examples of convex sets). Three types of convex sets—affine set, second-order
cone, and the collection of semidefinite matrices—are of particular interest in convex optimization, since
optimization over these convex sets has found efficient algorithms. Below we prove that these sets are
indeed convex.
1. (affine set). Let A ∈ Rm×n and b ∈ Rm for some m, n ≥ 1. Show that the set
C = {x ∈ Rn | Ax = b}
is convex.
[Solution] Let x1 ∈ C, x2 ∈ C, i.e., Ax1 = Ax2 = b. Then for any 0 ≤ t ≤ 1, we have
A(tx1 + (1 − t)x2 ) = tAx1 + (1 − t)Ax2 = (t + 1 − t)b = b, i.e., tx1 + (1 − t)x2 ∈ C. Hence C is
convex.
2. (Second-order cone). Let n ≥ 1. For a vector x = (x1 , x2 , . . . , xn ) ∈ Rn , introduce the ℓ2 norm
\[
\|x\| := \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}.
\]
Show that the set
C = {(x, t) ∈ Rn ⊗ R | ‖x‖ ≤ t}
is convex.
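[Solution] Let (x, t) ∈ C, (y, s) ∈ C. Note that s, t ≥ 0. Then for any 0 ≤ α ≤ 1, we have
\[
\begin{aligned}
\|\alpha x + (1-\alpha) y\|^2
&= \sum_{i=1}^n (\alpha x_i + (1-\alpha) y_i)^2
 = \alpha^2 \sum_{i=1}^n x_i^2 + (1-\alpha)^2 \sum_{i=1}^n y_i^2 + 2\alpha(1-\alpha) \sum_{i=1}^n x_i y_i \\
&\le \alpha^2 \sum_{i=1}^n x_i^2 + (1-\alpha)^2 \sum_{i=1}^n y_i^2
 + 2\alpha(1-\alpha) \sqrt{\sum_{i=1}^n x_i^2} \sqrt{\sum_{i=1}^n y_i^2} \\
&\le \alpha^2 t^2 + (1-\alpha)^2 s^2 + 2\alpha(1-\alpha) s t = (\alpha t + (1-\alpha) s)^2,
\end{aligned}
\]
where the first inequality holds due to the Cauchy–Schwarz inequality, and the second inequality holds
because (x, t) ∈ C, (y, s) ∈ C. Since αt + (1 − α)s ≥ 0, taking square roots gives
‖αx + (1 − α)y‖ ≤ αt + (1 − α)s. Therefore α · (x, t) + (1 − α) · (y, s) ∈ C, and hence C is convex.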
3. (Collection of semidefinite matrices). Let n ≥ 1 and Sn×n denote the collection of symmetric n×n
real matrices. A matrix A ∈ Sn×n is called positive semidefinite if xT Ax ≥ 0 for any x ∈ Rn . Let
A ⪰ 0 denote A being positive semidefinite. Show that the set
C = {A ∈ Sn×n | A ⪰ 0}
is convex.
[Solution] Let matrices A, B ∈ C. Hence A ⪰ 0 and B ⪰ 0. For any 0 ≤ α ≤ 1, αA + (1 − α)B ∈
Sn×n . Moreover, for any x ∈ Rn , we have
xT (αA + (1 − α)B)x = αxT Ax + (1 − α)xT Bx ≥ 0.
Therefore αA + (1 − α)B ⪰ 0, and hence C is convex.
(b) (Exercise only. Operations preserving convexity). We will explore several set operations
that preserve convexity. These operations are of fundamental importance to the convex optimization
theory.
1. (Linear transformation). Let X and Y be two linear spaces and f : X → Y be linear. For example,
consider X = Rn , Y = Rm , and f ∈ Rm×n . Show that if A ⊆ X is convex, then
f (A) = {f (x) | x ∈ A}
is convex. Also show that if B ⊆ Y is convex, then
f −1 (B) = {x ∈ X | f (x) ∈ B}
is convex.
[Solution] For any y1 , y2 ∈ f (A), there are x1 , x2 ∈ A such that y1 = f (x1 ), y2 = f (x2 ). Since f
is linear, for any λ ∈ [0, 1], we have f (λx1 + (1 − λ)x2 ) = λy1 + (1 − λ)y2 . Since A is convex, we
have λx1 + (1 − λ)x2 ∈ A, and hence λy1 + (1 − λ)y2 ∈ f (A). Therefore f (A) is convex.
For any x1 , x2 ∈ f −1 (B), we have y1 := f (x1 ) ∈ B and y2 := f (x2 ) ∈ B. Since f is linear, for
any λ ∈ [0, 1], we have f (λx1 + (1 − λ)x2 ) = λy1 + (1 − λ)y2 ∈ B, since B is convex. Therefore
λx1 + (1 − λ)x2 ∈ f −1 (B), and f −1 (B) is convex.
2. (Arbitrary concatenation). Let X and Y be two linear spaces and A ⊆ X, B ⊆ Y be convex. The
product space
X ⊗ Y := {(x, y) | x ∈ X, y ∈ Y}
with + and · defined by
(x1 , y1 ) + (x2 , y2 ) := (x1 + x2 , y1 + y2 ), ∀(x1 , y1 ), (x2 , y2 ) ∈ X ⊗ Y;
λ(x, y) := (λx, λy), ∀λ ∈ R, ∀(x, y) ∈ X ⊗ Y
is also a linear space (you don’t need to show this). For example, if X = Rm and Y = Rn for some
m, n ≥ 1, then X ⊗ Y = Rm+n . Show that the direct product
A ⊗ B := {(x, y) | x ∈ A, y ∈ B}
is convex. (*) Argue that the direct product of arbitrary number of convex sets is convex.
[Solution] Consider any (x1 , y1 ), (x2 , y2 ) ∈ A ⊗ B. Since both A and B are convex, for any
λ ∈ [0, 1], we have λx1 + (1 − λ)x2 ∈ A and λy1 + (1 − λ)y2 ∈ B. Then λ(x1 , y1 ) + (1 − λ)(x2 , y2 ) =
(λx1 + (1 − λ)x2 , λy1 + (1 − λ)y2 ) ∈ A ⊗ B. Therefore A ⊗ B is convex. Similar argument extends
to the direct product of arbitrary finite number of convex sets.
3. (Finite sum). Let X be a linear space and A, B ⊆ X be convex. Show that the set
A + B := {a + b | a ∈ A, b ∈ B}
is convex. Argue that the sum of finite convex sets is convex.
[Solution] Consider x, y ∈ A + B. Then there are ax , ay ∈ A and bx , by ∈ B such that x = ax + bx
and y = ay + by . Then for any λ ∈ [0, 1], we have λx + (1 − λ)y = λ(ax + bx ) + (1 − λ)(ay + by ) =
λax + (1 − λ)ay + λbx + (1 − λ)by . Since A and B are both convex, we have λax + (1 − λ)ay ∈ A
and λbx + (1 − λ)by ∈ B, and therefore λx + (1 − λ)y ∈ A + B. Therefore A + B is convex. Similar
argument extends to the sum of any finite number of convex sets.
4. (Arbitrary intersection). Let X be a linear space and A, B ⊆ X be convex. Show that the
intersection A ∩ B is convex. Argue that the intersection of arbitrary number of convex sets is
convex.
[Solution] Consider x, y ∈ A ∩ B. Then x, y ∈ A and x, y ∈ B. Since A and B are both convex,
for any λ ∈ [0, 1], we have λx + (1 − λ)y ∈ A and λx + (1 − λ)y ∈ B. Hence λx + (1 − λ)y ∈ A ∩ B
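and therefore A ∩ B is convex. The same argument extends to the intersection of an arbitrary (possibly
infinite) collection of convex sets, since a point lies in the intersection exactly when it lies in every set
of the collection.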
5. (Union?). Let X be a linear space and A, B ⊆ X be convex. Give an example where the union
A ∪ B is nonconvex. Hint: It suffices to consider X = R.
[Solution] Consider X = R2 , and the sets A := {(x, y) | x2 + y 2 ≤ 1} and B := {(x, y) | (x − 3)2 +
y 2 ≤ 1}. It is easy to show that both A and B are convex. However, if we look at two points
a = (1, 0) ∈ A and b = (2, 0) ∈ B (and therefore a, b ∈ A ∪ B), one of their convex combinations,
(1/2)(1, 0) + (1/2)(2, 0) = (1.5, 0), lies neither in A nor in B, and hence is not in A ∪ B. Hence A ∪ B is
nonconvex. (The same idea works in X = R with A = [−1, 1] and B = [2, 4], as the hint suggests.)
6.3
Convex functions (exercise)
(a) (Examples of convex functions).
1. (Exponential). Show that the function f (x) = e^{ax} defined on C = R, where a ∈ R, is convex.
2. (Entropy). Show that the function f (x) = x ln x defined on R++ = (0, ∞) is convex.
3. (Log-exponential). Show that the function f (x1 , x2 ) = ln(e^{x1} + e^{x2} ) defined on R2 is convex.
[Solution]
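1. Since f ′′(x) = a2 e^{ax} ≥ 0 on R (with strict inequality when a ≠ 0), f is convex.
2. We have f ′(x) = ln x + x · (1/x) = ln x + 1, and f ′′(x) = 1/x > 0 on (0, +∞), so f is convex.
3. The Hessian matrix of the function f = ln(e^{x1} + e^{x2} ) is
\[
H = \begin{pmatrix}
\dfrac{\partial^2 f}{\partial x_1^2} & \dfrac{\partial^2 f}{\partial x_1 \partial x_2} \\[2mm]
\dfrac{\partial^2 f}{\partial x_2 \partial x_1} & \dfrac{\partial^2 f}{\partial x_2^2}
\end{pmatrix}
= \frac{e^{x_1 + x_2}}{(e^{x_1} + e^{x_2})^2}
\begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}.
\]
The eigenvalues of the matrix with rows (1, −1) and (−1, 1) are 0 and 2, so the eigenvalues of H are both
nonnegative. Hence H is positive semidefinite on R2 and f is convex.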
(b) (Operations preserving convexity). We will show that addition, multiplication by positive
constant, and supremum preserves convexity.
1. Show that if f1 and f2 are two convex functions on the same domain, and α, β ≥ 0, then αf1 + βf2
is also convex.
2. Show that if f1 and f2 are two convex functions on the same domain, then f = max{f1 , f2 } is
also convex. Use this result to show that the function f (x, y) = |x| + |y| defined on R2 is convex.
[Solution]
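For any x, y in the domain of f1 , f2 , and any λ ∈ [0, 1], we have fi (λx + (1 − λ)y) ≤ λfi (x) + (1 − λ)fi (y),
for i = 1, 2, by the definition of convex function. Hence
\[
\begin{aligned}
(\alpha f_1 + \beta f_2)(\lambda x + (1-\lambda) y)
&= \alpha f_1(\lambda x + (1-\lambda) y) + \beta f_2(\lambda x + (1-\lambda) y) \\
&\le \alpha(\lambda f_1(x) + (1-\lambda) f_1(y)) + \beta(\lambda f_2(x) + (1-\lambda) f_2(y)) \\
&= \lambda(\alpha f_1 + \beta f_2)(x) + (1-\lambda)(\alpha f_1 + \beta f_2)(y),
\end{aligned}
\]
where the inequality uses α, β ≥ 0. Therefore αf1 + βf2 is convex.
Consider f = max{f1 , f2 }. Without loss of generality, suppose we have f (λx + (1 − λ)y) = f1 (λx +
(1 − λ)y) ≥ f2 (λx + (1 − λ)y). Hence
\[
\begin{aligned}
f(\lambda x + (1-\lambda) y) &= f_1(\lambda x + (1-\lambda) y) \\
&\le \lambda f_1(x) + (1-\lambda) f_1(y) && \text{(by convexity of } f_1\text{)} \\
&\le \lambda f(x) + (1-\lambda) f(y) && \text{(since } f_1 \le f\text{)}.
\end{aligned}
\]
Hence f is convex. Since f (x, y) = |x| + |y| = max{x + y, x − y, −x + y, −x − y}, where all four
functions are linear and hence convex, applying the result for the maximum repeatedly shows that f is convex.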
(c) (Level sets are convex). If f is a convex function defined on C, show that the level sets {x ∈ C |
f (x) ≤ α} are convex for α ∈ R.
[Solution] For any fixed α ∈ R, consider two points x, y in the level set Lα := {x ∈ C | f (x) ≤ α}. For
any λ ∈ [0, 1], we have
f (λx + (1 − λ)y) ≤ λf (x) + (1 − λ)f (y) (by convexity of f )
≤ λα + (1 − λ)α = α.
Hence λx + (1 − λ)y ∈ Lα and the level set Lα is convex.
6.4
Convex optimization (exercise)
Consider the following optimization problem. Let x ∈ Rn be the optimization variable and f , g1 , . . . , gk
be scalar functions defined on Rn where k ≥ 1. Let A ∈ Rm×n and b ∈ Rm be given. The optimization
problem is as follows.
(P) : min f (x)
s.t. Ax = b;
gi (x) ≤ 0,
i = 1, . . . , k,
When f, g1 , g2 , . . . , gk are convex, (P) is called convex. Use the results of the previous problems
to show that the feasible set
C = {x | Ax = b, gi (x) ≤ 0 for i = 1, . . . , k}
is convex. Hint: The set C is the intersection of k + 1 convex sets.
[Solution] The affine set {x | Ax = b} is convex by Problem 6.2(a). Since each gi for i = 1, . . . , k is
convex, its level set {x | gi (x) ≤ 0} is also convex by Problem 6.3(c). Then C is the intersection of k + 1
convex sets and hence is also convex by Problem 6.2(b).
6.5
Duality theory (exercise)
We will work through the duality theory in a simple case. Consider Problem (P) in Problem 6.4. Let
µ ∈ Rm , λ ∈ Rk+ = [0, ∞)^k , and define
L(x, µ, λ) := f (x) + µT (Ax − b) + λT g(x)
where g(x) = (g1 (x), g2 (x), . . . , gk (x))T .
(a) (Unconstrained optimization). Let L(µ, λ) := minx∈Rn L(x, µ, λ) denote the unconstrained
optimization over x for a fixed µ, λ. Assume that Problem (P) has an optimal solution and denote it
by x∗ . Show that L(µ, λ) ≤ f (x∗ ) for any µ ∈ Rm and λ ∈ Rk+ .
[Solution] Let x(µ, λ) := arg minx∈Rn L(x, µ, λ), and then L(µ, λ) = L(x(µ, λ), µ, λ) ≤ L(x∗ , µ, λ) since
x(µ, λ) is the minimizer of L(·, µ, λ) given (µ, λ). Since x∗ is feasible, we have Ax∗ −b = 0 and gi (x∗ ) ≤ 0
for i = 1, . . . , k. We also have λ ∈ Rk+ . Hence
L(µ, λ) ≤ L(x∗ , µ, λ)
= f (x∗ ) + µT · 0 + λT g(x∗ )
≤ f (x∗ ).
(b) (Dual problem). Consider the dual problem
(D) : max L(µ, λ)
s.t. λ ≥ 0.
Assume (D) has an optimal solution and denote it by (µ∗ , λ∗ ).
1. Show that
\[
L(\mu^*, \lambda^*) - f(x^*) \le \sum_{i=1}^k \lambda_i^* g_i(x^*) \le 0.
\]
It implies that Problem (D) provides a lower bound for Problem (P).
[Solution] The result in (a) still holds if we take (µ, λ) = (µ∗ , λ∗ ).
2. Assume now f, g1 , g2 , . . . , gk are convex. Show that the equality is attained if and only if
∂x L(x∗ , µ∗ , λ∗ ) = 0.
[Solution] The first equality, i.e., L(µ∗ , λ∗ ) = f (x∗ ) + λ∗,T g(x∗ ) = L(x∗ , µ∗ , λ∗ ) is attained if and
only if x∗ is the minimizer of L(·, µ∗ , λ∗ ). Since f and gi for i = 1, . . . , k are convex, L(x, µ∗ , λ∗ )
is a convex function of x. Then x∗ is a minimizer if and only if
∂x L(x∗ , µ∗ , λ∗ ) = 0.
3. Show that if there exists (x, µ, λ) such that x is feasible for (P), (µ, λ) is feasible for (D),
∂x L(x, µ, λ) = 0, and λi gi (x) = 0 for i = 1, . . . , k, then x solves (P) and (µ, λ) solves (D).
These are called the KKT conditions.
[Solution] Consider an (x∗ , µ∗ , λ∗ ) that satisfies the conditions above. Since ∂x L(x∗ , µ∗ , λ∗ ) = 0
and L(·, µ∗ , λ∗ ) is convex, x∗ minimizes L(·, µ∗ , λ∗ ), so L(µ∗ , λ∗ ) = f (x∗ ) + λ∗,T g(x∗ ) (the term
µ∗,T (Ax∗ − b) vanishes because x∗ is feasible). Since λ∗,T g(x∗ ) = Σ_{i=1}^{k} λ∗i gi (x∗ ) = 0, we have
L(µ∗ , λ∗ ) = f (x∗ ). Hence f attains its lower bound at x∗ , and L attains its upper bound at (µ∗ , λ∗ ).
Therefore x∗ solves (P) and (µ∗ , λ∗ ) solves (D).
6.6
Convex functions
4 points. For each of the following functions, determine if it is convex, concave, or neither.
• f (x) = e^x − 1 on R.
• f (x) = x1 x2 on {(x1 , x2 ) ∈ R2 | x1 > 0, x2 > 0}.
• f (x) = 1/(x1 x2 ) on {(x1 , x2 ) ∈ R2 | x1 > 0, x2 > 0}.
• f (x) = x1 /x2 on {(x1 , x2 ) ∈ R2 | x1 > 0, x2 > 0}.
[Solution]
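(1) f ′′(x) = e^x > 0 on R, so f is strictly convex.
(2) The Hessian matrix of the function f (x) = x1 x2 is
\[
H = \begin{pmatrix}
\dfrac{\partial^2 f}{\partial x_1^2} & \dfrac{\partial^2 f}{\partial x_1 \partial x_2} \\[2mm]
\dfrac{\partial^2 f}{\partial x_2 \partial x_1} & \dfrac{\partial^2 f}{\partial x_2^2}
\end{pmatrix}
= \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix},
\]
whose eigenvalues are ±1. Hence H is indefinite and f is neither convex nor concave.
(3) The Hessian matrix of the function f = 1/(x1 x2 ) is
\[
H = \begin{pmatrix}
\dfrac{\partial^2 f}{\partial x_1^2} & \dfrac{\partial^2 f}{\partial x_1 \partial x_2} \\[2mm]
\dfrac{\partial^2 f}{\partial x_2 \partial x_1} & \dfrac{\partial^2 f}{\partial x_2^2}
\end{pmatrix}
= \begin{pmatrix}
\dfrac{2}{x_1^3 x_2} & \dfrac{1}{x_1^2 x_2^2} \\[2mm]
\dfrac{1}{x_1^2 x_2^2} & \dfrac{2}{x_1 x_2^3}
\end{pmatrix}.
\]
The leading principal minors of H are 2/(x1^3 x2 ) > 0 and (4 − 1)/(x1^4 x2^4 ) > 0; one can also check that
the eigenvalues are strictly positive for x1 , x2 > 0. Hence H is positive definite and f is strictly convex.
(4) The Hessian matrix of the function f (x) = x1 /x2 is
\[
H = \begin{pmatrix}
\dfrac{\partial^2 f}{\partial x_1^2} & \dfrac{\partial^2 f}{\partial x_1 \partial x_2} \\[2mm]
\dfrac{\partial^2 f}{\partial x_2 \partial x_1} & \dfrac{\partial^2 f}{\partial x_2^2}
\end{pmatrix}
= \begin{pmatrix}
0 & -\dfrac{1}{x_2^2} \\[2mm]
-\dfrac{1}{x_2^2} & \dfrac{2 x_1}{x_2^3}
\end{pmatrix}.
\]
The eigenvalues of H are
\[
\frac{x_1}{x_2^3} \pm \sqrt{\frac{x_1^2}{x_2^6} + \frac{1}{x_2^4}}.
\]
Therefore, one of the eigenvalues is positive and the other is
negative if x1 , x2 > 0. Hence H is indefinite and f is neither convex nor concave.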
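The sign patterns claimed in (2) to (4) can be spot-checked numerically. The sketch below is only a sanity check at a few sample points, not a proof; it uses the closed-form eigenvalues of a symmetric 2×2 matrix, and all function names are illustrative.

```python
import math


def sym_eigs(a, b, d):
    """Eigenvalues (ascending) of the symmetric matrix [[a, b], [b, d]]."""
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return mean - radius, mean + radius


def hessian_eigs_product(x1, x2):
    """Hessian eigenvalues of f = x1 * x2 (constant Hessian [[0, 1], [1, 0]])."""
    return sym_eigs(0.0, 1.0, 0.0)


def hessian_eigs_reciprocal(x1, x2):
    """Hessian eigenvalues of f = 1 / (x1 * x2)."""
    return sym_eigs(2.0 / (x1**3 * x2),
                    1.0 / (x1**2 * x2**2),
                    2.0 / (x1 * x2**3))


def hessian_eigs_ratio(x1, x2):
    """Hessian eigenvalues of f = x1 / x2."""
    return sym_eigs(0.0, -1.0 / x2**2, 2.0 * x1 / x2**3)


if __name__ == "__main__":
    for point in [(0.5, 0.5), (1.0, 2.0), (3.0, 0.7)]:
        print(point,
              hessian_eigs_product(*point),
              hessian_eigs_reciprocal(*point),
              hessian_eigs_ratio(*point))
```

At each sample point the product and ratio functions should show one negative and one positive eigenvalue, and the reciprocal function two positive eigenvalues.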
6.7
Throughput vs. fairness
6 points. Consider a linear network with L links indexed by 1, . . . , L, each of capacity c = 1. There
are L + 1 flows indexed by 0, . . . , L. Flows l = 1, . . . , L traverse only link l and the flow indexed by 0
traverses all the L links. Suppose all flows have the following utility function with the same α ≥ 0:
\[
U_i(x_i) = \begin{cases}
\dfrac{x_i^{1-\alpha}}{1-\alpha}, & \alpha \ne 1, \\[2mm]
\log x_i, & \alpha = 1.
\end{cases}
\]
The rate at which each flow transmits is determined by the solution of the following utility maximization,
subject to capacity constraints:
\[
\max_{x \ge 0} \ \sum_{i=0}^{L} U_i(x_i) \quad \text{s.t.} \quad Rx \le c
\]
where x = (x0 , . . . , xL ), and the matrix R is the routing matrix introduced in class. The expression x ≥ 0 means
xi ≥ 0 for i = 0, . . . , L.
Calculate the aggregate throughput T (α) = Σ_{i=0}^{L} xi (α). Explain the dependence of T (α) on α. Also
comment on the dependence of fairness (how fairly the link capacity is divided among the flows) on α.
[Solution] Note that the constraints Rx ≤ c can be written as x0 + xl ≤ 1, ∀l = 1, . . . , L.
The KKT conditions for primal and dual optimal solutions (x(α), p(α)) of the utility maximization
problem (and its dual) with parameter α are:
\[
\begin{aligned}
& -U_l'(x_l(\alpha)) + p_l(\alpha) = 0, \ l = 1, \dots, L, \qquad
  -U_0'(x_0(\alpha)) + \sum_{l=1}^{L} p_l(\alpha) = 0 && \text{(Stationarity)} \\
& x_0(\alpha) + x_l(\alpha) \le 1, \ l = 1, \dots, L && \text{(Primal feasibility)} \\
& p_l(\alpha) \ge 0, \ l = 1, \dots, L && \text{(Dual feasibility)} \\
& p_l(\alpha)\,[x_0(\alpha) + x_l(\alpha) - 1] = 0, \ l = 1, \dots, L && \text{(Complementary slackness)}
\end{aligned}
\]
Since Ui′(xi (α)) = (xi (α))^{−α} > 0 (which also requires xi (α) > 0 and cannot be zero) for i = 0, . . . , L,
we have pl (α) > 0 for l = 1, . . . , L. Hence x0 (α) + xl (α) = 1 for l = 1, . . . , L, due to complementary
slackness. Therefore we have
\[
(x_0(\alpha))^{-\alpha} = \sum_{l=1}^{L} (x_l(\alpha))^{-\alpha} = L\,(1 - x_0(\alpha))^{-\alpha}
\]
which implies
\[
x_0(\alpha) = \frac{1}{L^{1/\alpha} + 1}, \qquad
x_l(\alpha) = \frac{L^{1/\alpha}}{L^{1/\alpha} + 1}, \quad l = 1, \dots, L.
\]
Hence the total throughput is
\[
T(\alpha) = \sum_{i=0}^{L} x_i(\alpha) = L - \frac{L - 1}{L^{1/\alpha} + 1}.
\]
Clearly, T (α) is a decreasing function of α. However, when α increases, the difference between x0 (α)
and xl (α), for any l = 1, . . . , L, becomes small, which means the allocation of throughputs is more
fair.
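To see the trade-off numerically, the closed-form allocation above can be tabulated for a few values of α. This is a small illustrative Python sketch for α > 0; the function name `linear_network_allocation` is not from the original text.

```python
def linear_network_allocation(num_links, alpha):
    """Alpha-fair rates on the linear network with unit link capacities.

    Returns (x0, xl, total): the rate of the long flow, the rate of each
    one-link flow, and the aggregate throughput T(alpha). Requires alpha > 0.
    """
    root = num_links ** (1.0 / alpha)      # L^(1/alpha)
    x0 = 1.0 / (root + 1.0)
    xl = root / (root + 1.0)
    total = x0 + num_links * xl            # equals L - (L - 1) / (L^(1/alpha) + 1)
    return x0, xl, total


if __name__ == "__main__":
    for a in (0.5, 1.0, 2.0, 5.0, 50.0):
        x0, xl, total = linear_network_allocation(3, a)
        print(f"alpha={a:5.1f}  x0={x0:.3f}  xl={xl:.3f}  T={total:.3f}")
```

For L = 3, the case α = 1 (proportional fairness) gives x0 = 1/4 and T = 2.5, while large α approaches the max-min allocation x0 = xl = 1/2 with T = (L + 1)/2 = 2.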
6.8
TCP steady state analysis
Consider the network in Fig. 29, where R1–R4 are routers, L1–L3 are links, S1–S3 are source hosts,
and T1–T3 are the corresponding destination hosts.
The link capacities of L1, L2 and L3 are 2500 packets/s. The one way propagation delay of each link
L1 – L3 is 10ms and assume there is no propagation delay between a host and a router. There are three
flows: flow 1 from S1 to T1, flow 2 from S2 to T2, and flow 3 from S3 to T3. Flow 1 starts at t=0,
flow 2 starts at t=10sec and flow 3 starts at t=20sec. All flows use TCP Fast, i.e., the window update
is W (t + 1) = (RTTmin /RTT) W (t) + α, with α = 50.
(a) 5 points. Calculate the steady-state throughput of each flow and queue length of each link, during
0s–10s, 10s–20s and after 20s, assuming each flow knows its RTTmin (round-trip propagation delay)
accurately. Assume before flow 2 starts, all packets are buffered at L1.
(b) 5 points. Repeat (a) but with each flow measuring its RTTmin that includes queueing delay due
to other flows that started before it does. Assume before flow 2 starts, all packets are buffered at L1.
Figure 29: Network topology for TCP steady state control, P6.8.
[Solution]
(a) For each TCP source, the steady-state window size is W = α RTT/(RTT − RTTmin ), which implies the
steady-state rate is x = W/RTT = α/(RTT − RTTmin ). Indeed, for TCP flow i, RTTi = di + qi where di is its
round-trip propagation delay and qi is its queueing delay. When each flow knows its RTTmin,i as di , we
have xi = α/qi .
Between 0s–10s, there is only flow 1 in the network. It will use up all the capacity of links L1–L3, i.e.,
x1 = 2500 pkts/s. The queueing delay is q1 = 50/2500 s = 20ms and the queue length of flow 1, which is
only on link L1, is 50 packets. The queue lengths on L2 and L3 are both 0.
Between 10s–20s, flows 1 and 2 share link L1, which becomes the bottleneck. Then we have x1 + x2 =
2500pkts/s. Let pi denote the queueing delay on link Li, i = 1, 2, 3. Since links L2, L3 are underutilized,
there are no queues on them and hence p2 = p3 = 0. Therefore x1 = α/p1 = x2 = 1250pkts/s. The
queueing delay on L1 is p1 = 50/1250 s = 40ms, and the queue length on L1 is 2500pkts/s × 0.04s = 100pkts.
After 20s, flows 1 and 2 share link L1, and flows 1 and 3 share link L3, which both become bottleneck
links. Then we have x1 + x2 = 2500pkts/s, x1 + x3 = 2500pkts/s. Since L2 is underutilized, there is no
queue on L2 and hence p2 = 0. We have x2 = α/p1 = α/p3 = x3 , and x1 = α/(p1 + p3 ) = x2 /2. Hence
x1 = 2500 × 1/3 = 833.3pkts/s, and x2 = x3 = 1666.7pkts/s. Therefore p1 = p3 = 50/1666.7 s = 30ms, and
the queue length is 2500pkts/s × 0.03s = 75pkts on L1 and L3 each.
(b) Again, consider x = α/(RTT − RTTmin ).
Between 0s–10s, there is only flow 1 in the network, and the result is the same as (a).
Between 10s–20s, flows 1 and 2 share link L1, which becomes the bottleneck. There are no queues
on L2 and L3, and hence p2 = p3 = 0. Flow 2 knows its RTTmin,2 as d2 + 20ms, and hence
x2 = α/(q2 − 0.02) = α/(p1 − 0.02). We still have x1 = α/p1 . By solving
\[
\frac{50}{p_1 - 0.02} + \frac{50}{p_1} = 2500
\]
we have p1 = 0.03 + 0.01√5 ≈ 0.0524sec = 52.4ms. The flow rates are x1 = 50/0.0524 ≈ 955pkts/s, and
x2 = 50/0.0324 ≈ 1545pkts/s. The queue length on L1 is 2500 × 0.0524 ≈ 131 pkts.
After 20s, flows 1 and 2 share link L1, and flows 1 and 3 share link L3, which both become bottleneck
links. There is no queue on L2 and hence p2 = 0. Flow 3 knows its RTTmin,3 as d3 , and hence x3 = α/p3 .
Solving
\[
\frac{\alpha}{p_1 + p_3} + \frac{\alpha}{p_1 - 0.02} = 2500
\quad \text{and} \quad
\frac{\alpha}{p_1 + p_3} + \frac{\alpha}{p_3} = 2500,
\]
we have p1 = 0.03 + 0.01√3 ≈ 0.0473s = 47.3ms, and p3 = 0.01 + 0.01√3 ≈ 0.0273s = 27.3ms.
The flow rates are x1 = 50/(0.0473 + 0.0273) ≈ 670pkts/s
and x2 = x3 = 50/0.0273 ≈ 1830pkts/s. The queue length on L1 is 2500 × 0.0473 ≈ 118pkts and the
queue length on L3 is 2500 × 0.0273 ≈ 68pkts.
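The two systems in (b) each reduce to a quadratic in one queueing delay, which the following Python sketch solves in closed form. It assumes the setting used above (α = 50 pkts, link capacity 2500 pkts/s, and an RTTmin overestimate of 20 ms for flow 2); function and parameter names are illustrative.

```python
import math


def fast_two_flows(alpha=50.0, capacity=2500.0, offset=0.02):
    """10s-20s in part (b): solve alpha/(p1 - offset) + alpha/p1 = capacity.

    Clearing denominators gives
        capacity*p1^2 - (capacity*offset + 2*alpha)*p1 + alpha*offset = 0,
    and the larger root is the one with p1 > offset (positive rate for flow 2).
    """
    a = capacity
    b = -(capacity * offset + 2.0 * alpha)
    c = alpha * offset
    p1 = (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)
    return p1, alpha / p1, alpha / (p1 - offset)


def fast_three_flows(alpha=50.0, capacity=2500.0, offset=0.02):
    """After 20s in part (b): x2 = x3 forces p3 = p1 - offset, and then
    alpha/(2*p3 + offset) + alpha/p3 = capacity, i.e.
        2*capacity*p3^2 + (capacity*offset - 3*alpha)*p3 - alpha*offset = 0.
    """
    a = 2.0 * capacity
    b = capacity * offset - 3.0 * alpha
    c = -alpha * offset
    p3 = (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)
    p1 = p3 + offset
    x1 = alpha / (p1 + p3)
    x2 = alpha / (p1 - offset)
    x3 = alpha / p3
    return p1, p3, x1, x2, x3


if __name__ == "__main__":
    print(fast_two_flows())
    print(fast_three_flows())
```

Multiplying each queueing delay by the link capacity gives the queue lengths quoted above.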
7
Dynamical systems, stability of TCP
7.1
Asymptotic stability
4 points. Consider the dynamical system
\[
\begin{aligned}
\dot{x}_1 &= -x_2 - x_1 \sin(x_1^2 + x_2^2), \\
\dot{x}_2 &= x_1 - x_2 \sin(x_1^2 + x_2^2).
\end{aligned}
\]
Prove that the origin is asymptotically stable.
[Solution] Consider the Lyapunov function candidate V (x) = (1/2)(x1^2 + x2^2 ), which is zero at (0, 0) and positive
everywhere else. We have the derivative of V along any system trajectory:
\[
\begin{aligned}
\dot{V}(x) &= x_1 \dot{x}_1 + x_2 \dot{x}_2 \\
&= x_1 \left(-x_2 - x_1 \sin(x_1^2 + x_2^2)\right) + x_2 \left(x_1 - x_2 \sin(x_1^2 + x_2^2)\right) \\
&= -(x_1^2 + x_2^2) \sin(x_1^2 + x_2^2)
\end{aligned}
\]
which, in a neighborhood {(x1 , x2 ) | x1^2 + x2^2 ≤ π − ε} (ε is a small positive constant) of the origin, is
negative everywhere except the origin and zero at the origin. Hence V is a Lyapunov function, which is
sufficient to show the origin is asymptotically stable.
7.2
Stability of TCP
Consider the network in Fig. 30.
Figure 30: The network in P7.2. (Flow x1 uses link 1, with capacity c1 and price p1 ; flow x2 uses link 2, with capacity c2 and price p2 ; flow x3 uses both links.)
Suppose the TCP algorithms are given by
\[
x_1(t) = \frac{1}{p_1(t)}, \qquad
x_2(t) = \frac{1}{\sqrt{p_2(t)}}, \qquad
x_3(t) = \frac{2}{(p_1(t) + p_2(t))^{1/3}}
\]
and the queue management algorithms are given by
\[
\frac{d}{dt} p_1(t) = \gamma\,(x_1(t) + x_3(t) - c_1), \qquad
\frac{d}{dt} p_2(t) = \gamma\,(x_2(t) + x_3(t) - c_2).
\]
(a) 6 points. Find the utility functions of the 3 flows and write down the network utility maximization
problem implicitly solved by this algorithm. Hint: Write down the equilibrium condition and interpret
that as the optimality condition of a network utility maximization problem.
[Solution] Using the optimality condition of the utility maximization problem, we have
\[
U_1'(x_1) = p_1 = \frac{1}{x_1}, \qquad
U_2'(x_2) = p_2 = \frac{1}{x_2^2}, \qquad
U_3'(x_3) = p_1 + p_2 = \frac{8}{x_3^3}.
\]
Hence we have
\[
U_1(x_1) = \log x_1, \qquad
U_2(x_2) = -\frac{1}{x_2}, \qquad
U_3(x_3) = -\frac{4}{x_3^2}.
\]
The utility maximization problem is
\[
\max_{x_i \ge 0,\ i = 1, 2, 3} \ \sum_{i=1}^{3} U_i(x_i)
\quad \text{s.t.} \quad x_1 + x_3 \le c_1, \quad x_2 + x_3 \le c_2.
\]
(b) 4 points. Is the equilibrium point (x∗ , p∗ ) unique? Explain.
[Solution] Yes. The primal objective function is strictly concave and hence x∗ = [x∗1 , x∗2 , x∗3 ]T is unique.
Then [U1′(x∗1 ), U2′(x∗2 ), U3′(x∗3 )]T = RT p∗ is also unique, where the routing matrix
\[
R = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \end{pmatrix}
\]
is of full row rank. Hence p∗ is also unique.
(c) 6 points. Prove that the equilibrium point (x∗ , p∗ ) is asymptotically stable. Hint: Try the dual
objective function as a candidate Lyapunov function.
[Solution] Consider the candidate Lyapunov function:
\[
\begin{aligned}
V(p) &:= \sum_{i=1}^{3} U_i(x_i(p)) - p_1 (x_1(p) + x_3(p) - c_1) - p_2 (x_2(p) + x_3(p) - c_2) \\
&= \sum_{i=1}^{3} \left[ U_i(x_i(p)) - x_i(p) \sum_{l=1}^{2} R_{li} p_l \right] + \sum_{l=1}^{2} c_l p_l
\qquad (3)
\end{aligned}
\]
where xi (p) for i = 1, 2, 3 are given in the problem description, and are also the maximizers of the
Lagrangian given p. As the dual objective function, V is minimized at p∗ , and V (p) > V (p∗ ) for any
p ≠ p∗ . The derivative of V along any trajectory of the system is
\[
\begin{aligned}
\dot{V}(p) &= \sum_{i=1}^{3} \left[ U_i'(x_i(p)) \sum_{l=1}^{2} \frac{\partial x_i}{\partial p_l} \dot{p}_l
 - \left( \sum_{l=1}^{2} \frac{\partial x_i}{\partial p_l} \dot{p}_l \right) \sum_{l=1}^{2} R_{li} p_l
 - x_i(p) \sum_{l=1}^{2} R_{li} \dot{p}_l \right] + \sum_{l=1}^{2} c_l \dot{p}_l \\
&= \sum_{i=1}^{3} \left[ -x_i(p) \sum_{l=1}^{2} R_{li} \dot{p}_l \right] + \sum_{l=1}^{2} c_l \dot{p}_l \\
&= -\sum_{l=1}^{2} \dot{p}_l \left( \sum_{i=1}^{3} R_{li} x_i(p) - c_l \right) \\
&= -\gamma \sum_{l=1}^{2} \left( \sum_{i=1}^{3} R_{li} x_i(p) - c_l \right)^2
\end{aligned}
\]
where the second equality is due to Ui′(xi (p)) = Σ_{l=1}^{2} Rli pl , and the last equality is due to the
queue management algorithms in the problem description. Hence V̇ (p) ≤ 0, with equality if and only if
p = p∗ (due to the uniqueness of the dual optimal solution p∗ ). Hence V is a Lyapunov function and
(x∗ , p∗ ) = (x(p∗ ), p∗ ) is asymptotically stable.
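The convergence claimed in (c) can be illustrated by integrating the price dynamics numerically. The sketch below uses forward Euler and assumes hypothetical values c1 = c2 = 2, γ = 1 and initial prices p1 = p2 = 1, none of which come from the problem statement; it is an illustration of the dynamics, not a substitute for the Lyapunov argument.

```python
def source_rates(p1, p2):
    """Rates chosen by the three TCP sources for given link prices."""
    x1 = 1.0 / p1
    x2 = p2 ** -0.5
    x3 = 2.0 * (p1 + p2) ** (-1.0 / 3.0)
    return x1, x2, x3


def simulate_dual_dynamics(c1=2.0, c2=2.0, gamma=1.0, dt=0.01, steps=10000,
                           p_init=(1.0, 1.0)):
    """Forward-Euler integration of dp_l/dt = gamma * (link load - capacity).

    c1, c2, gamma, dt, steps and p_init are assumed values for illustration.
    Returns the final prices (p1, p2) and the final rates (x1, x2, x3).
    """
    p1, p2 = p_init
    for _ in range(steps):
        x1, x2, x3 = source_rates(p1, p2)
        p1 += dt * gamma * (x1 + x3 - c1)
        p2 += dt * gamma * (x2 + x3 - c2)
    return (p1, p2), source_rates(p1, p2)


if __name__ == "__main__":
    prices, rates = simulate_dual_dynamics()
    print("prices:", prices)
    print("rates: ", rates)
    print("link loads:", rates[0] + rates[2], rates[1] + rates[2])
```

With these assumed capacities both links end up fully utilized, so the final prices satisfy the equilibrium conditions x1 + x3 = c1 and x2 + x3 = c2.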
8
Queueing systems
8.1
B&G, P3.9
A communication line with link rate 50 Kbps is used to serve 10 flows, each generating Poisson traffic
at a rate 150 packets/min. Packet lengths are exponentially distributed with mean 1 Kbits.
(a) 6 points. Find the average number of packets in queue waiting, the average number of packets in
the system, and the average packet delay (time spent in the system), when:
(i) the link is divided into 10 independent equal-capacity channels. Each channel has an independent
buffer and serves one flow.
(ii) the link is shared as a single channel by the 10 flows via statistical multiplexing.
(b) 4 points. Repeat (a) if 5 of the flows generate packets at a rate of 250 packets/min, while the
other 5 flows generate packets at a rate of 50 packets/min.
[Solution]
(a) For each session of (i) and (ii), the arrival rate of one flow is λ1 = 150/60 = 2.5 pkts/sec.
(i) When the link is divided into 10 equal-capacity channels, each channel has a service rate µ1 =
50Kbps/10/1Kbits = 5 pkts/sec.
Then, the average packet delay (for the whole system since it is the same for every channel) is T =
1/(µ1 − λ1 ) = 0.4 s. By Little’s theorem, the average number of packets in every channel is N1 =
λ1 T = 2.5 × 0.4 = 1 pkt. Therefore, the average number of packets in the queue for every channel is
Q1 = N1 − λ1 /µ1 = 0.5 pkt. Therefore, total number of packets in the system is N = 10 × 1 = 10 pkts,
and total number of packets waiting in the queue is Q = 10 × 0.5 = 5 pkts.
(ii) When the link is not divided, the total arrival rate is λ = 25 pkts/sec and the service rate is µ =
50 pkts/sec. Then, the average packet delay is T = 1/(µ − λ) = 0.04 s. By Little’s theorem, the average
number of packets in the system is N = λT = 25 × 0.04 = 1 pkt, and the average number of packets in
the queue is Q = N − λ/µ = 0.5 pkt.
(b) For a flow (indexed by 1) that generates packets at a rate of 250 pkts/min, its arrival rate is
λ1 = 250/60 = 4.17 pkts/sec. Then, T1 = 1/(µ1 − λ1 ) = 1/(5 − 4.17) = 1.2 s, and N1 = λ1 T1 =
4.17 × 1.2 = 5 pkts, and Q1 = N1 − λ1 /µ1 = 4.17 pkts.
For a flow (indexed by 2) that generates packets at a rate of 50 pkts/min, its arrival rate is λ2 = 50/60 =
0.83 pkts/sec. Then, T2 = 1/(µ2 − λ2 ) = 1/(5 − 0.83) = 0.24 s, and N2 = λ2 T2 = 0.83 × 0.24 = 0.2 pkt,
and Q2 = N2 − λ2 /µ2 = 0.03 pkt.
Therefore, the average number of packets in the system is N = 5(N1 + N2 ) = 5 × (5 + 0.2) = 26 pkts,
the average packet delay is T = 26/(5λ1 + 5λ2 ) = 1.04 s, and the average number of packets in the
queue is Q = 5(Q1 + Q2 ) = 5 × (4.17 + 0.03) = 21 pkts.
When statistical multiplexing is used, the total arrival rate is λ = 5(λ1 + λ2 ) = 5(4.17 + 0.83) =
25 pkts/sec, and therefore the results are the same as in (a)(ii).
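All of the numbers in this problem come from the same three M/M/1 formulas, so they can be recomputed with a short helper. This is an illustrative Python sketch; the function name `mm1_metrics` is not from the original text.

```python
def mm1_metrics(arrival_rate, service_rate):
    """Return (Q, N, T) for a stable M/M/1 queue.

    T = 1 / (mu - lambda) is the mean time in system,
    N = lambda * T is the mean number in system (Little's theorem),
    Q = N - lambda / mu is the mean number waiting in queue.
    """
    if arrival_rate >= service_rate:
        raise ValueError("unstable queue: need arrival_rate < service_rate")
    delay = 1.0 / (service_rate - arrival_rate)
    in_system = arrival_rate * delay
    in_queue = in_system - arrival_rate / service_rate
    return in_queue, in_system, delay


if __name__ == "__main__":
    # (a)(i): ten separate channels, each 5 pkts/s serving 2.5 pkts/s
    q, n, t = mm1_metrics(2.5, 5.0)
    print("split:  Q =", 10 * q, " N =", 10 * n, " T =", t)
    # (a)(ii): one shared channel, 50 pkts/s serving 25 pkts/s
    print("shared:", mm1_metrics(25.0, 50.0))
    # (b): five fast flows (250 pkts/min) and five slow flows (50 pkts/min)
    q1, n1, _ = mm1_metrics(250.0 / 60.0, 5.0)
    q2, n2, _ = mm1_metrics(50.0 / 60.0, 5.0)
    n_total = 5 * (n1 + n2)
    print("uneven split: Q =", 5 * (q1 + q2), " N =", n_total, " T =", n_total / 25.0)
```

Comparing the split and shared cases shows the statistical multiplexing gain: the delay drops by a factor of 10 when the ten flows share one fast channel.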
8.2
B&G, P3.13
6 points. Persons arrive at a taxi stand with room for W = 5 taxis according to a Poisson process
with rate λ. A person boards a taxi upon arrival if one is available and otherwise waits in a line. Taxis
arrive at the stand according to a Poisson process with rate µ. An arriving taxi that finds the stand
full departs immediately; otherwise, it picks up a customer if at least one is waiting, or else joins the
queue of waiting taxis. Use an M/M/1 queue formulation to obtain the steady-state distribution of the
taxi queue when λ = 1 and µ = 2 per minute. Compute the average number of people waiting, and
the average number of taxis waiting in steady state.
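[Solution] Let the state X := #passengers − #taxis. Then X is a Markov chain taking values in
{−5, −4, ..., 0, 1, 2, ...}: it increases by one at rate λ (a person arrives) and decreases by one at rate µ
(a taxi arrives), except at X = −5, where an arriving taxi departs immediately. Hence X + 5 has the
same transition behavior as the number of customers in an M/M/1 queue with arrival rate λ and service
rate µ.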
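The birth-death structure can be checked numerically. The sketch below is illustrative only: the function name and the truncation level n_max are assumptions, not part of the original solution. It builds the stationary distribution from detailed balance and then averages over the negative states (taxis waiting) and the positive states (people waiting).

```python
def taxi_stand_distribution(lam, mu, max_taxis=5, n_max=200):
    """Stationary distribution of X = #passengers - #taxis on {-max_taxis..n_max}.

    X goes up at rate lam (a person arrives) and down at rate mu (a taxi
    arrives), so detailed balance gives pi[n+1] = (lam/mu) * pi[n].
    The infinite chain is truncated at n_max, an assumption of this sketch.
    """
    rho = lam / mu
    states = list(range(-max_taxis, n_max + 1))
    weights = [rho ** (n + max_taxis) for n in states]
    total = sum(weights)
    return {n: w / total for n, w in zip(states, weights)}


pi = taxi_stand_distribution(lam=1.0, mu=2.0)
avg_taxis = sum(-n * p for n, p in pi.items() if n < 0)   # taxis waiting
avg_people = sum(n * p for n, p in pi.items() if n > 0)   # people waiting
print(avg_taxis, avg_people)
```

With λ/µ = 1/2 the tail beyond n_max = 200 is negligible, so the truncation does not visibly affect either average.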
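Therefore, the stationary distribution is
\[ P(X = n) = \left(\frac{\lambda}{\mu}\right)^{n+5} \left(1 - \frac{\lambda}{\mu}\right), \qquad n = -5, -4, \ldots \]
Therefore, since λ/µ = 1/2, the average number of taxis waiting is
\[ \sum_{k=1}^{5} k P(X = -k) = 5 \times \frac{1}{2} + 4 \times \left(\frac{1}{2}\right)^2 + 3 \times \left(\frac{1}{2}\right)^3 + 2 \times \left(\frac{1}{2}\right)^4 + 1 \times \left(\frac{1}{2}\right)^5 = \frac{129}{32} = 4 + \frac{1}{32}, \]
and the average number of people waiting is
\[ \sum_{k=1}^{\infty} k P(X = k) = \left(\frac{\lambda}{\mu}\right)^6 \left(1 - \frac{\lambda}{\mu}\right) \frac{d}{d\rho} \sum_{k=1}^{\infty} \rho^k \Big|_{\rho = \lambda/\mu} = \left(\frac{\lambda}{\mu}\right)^6 \left(1 - \frac{\lambda}{\mu}\right) \frac{1}{(1 - \rho)^2} \Big|_{\rho = 1/2} = \frac{1}{32}. \]
8.3
Jackson network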
6 points. Packets arrive at a processor/transmitter according to a Poisson process with rate α pkts/sec.
Each packet takes an i.i.d. exponential time to process and transmit (with an average of 1/µ0 sec). It
is then sent onto one of two networks towards its destination, with probabilities p01 and p02 = 1 − p01 .
A packet that is sent onto Network 1 will incur a processing/transmission time that is i.i.d. and
exponentially distributed (with an average of 1/µ1 sec), after which it either arrives at the destination
correctly (with probability 1 − p10 ), or in error (with probability p10 ). In the latter case, the packet
must be retransmitted by the processor/transmitter. Similarly, a packet that is sent onto Network 2
will incur a processing/transmission time that is i.i.d. and exponentially distributed (with an average of 1/µ2
sec), and then either arrives at the destination correctly (with probability 1 − p20 ), or in error (with
probability p20 ). In the latter case, it must be retransmitted by the processor/transmitter.
Draw a queueing model of the system, and find the expected numbers of packets at the processor/
transmitter, in Networks 1 and 2, and the expected time it takes for a packet to reach the destination
correctly.
[Solution]
The network model is shown in Fig. 31.
Figure 31: P8.3, the network model.
The arrival rates at the processor/transmitter (node 0) and Networks 1 and 2 are respectively λ0 , λ1
and λ2 , which satisfy
\[ \lambda_0 = \alpha + p_{10} \lambda_1 + p_{20} \lambda_2, \qquad \lambda_1 = \lambda_0 p_{01}, \qquad \lambda_2 = \lambda_0 p_{02}, \]
which imply
\[ \lambda_0 = \frac{\alpha}{1 - p_{01} p_{10} - p_{02} p_{20}}, \qquad \lambda_1 = \frac{p_{01} \alpha}{1 - p_{01} p_{10} - p_{02} p_{20}}, \qquad \lambda_2 = \frac{p_{02} \alpha}{1 - p_{01} p_{10} - p_{02} p_{20}}. \]
By Jackson’s Theorem, the average numbers of packets N0 at the processor/transmitter and N1 , N2 in
Networks 1 and 2 can be calculated independently at every node as in an M/M/1 queue. Therefore
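\[ N_0 = \frac{\lambda_0}{\mu_0 - \lambda_0} = \frac{\alpha}{\beta \mu_0 - \alpha}, \]
\[ N_1 = \frac{\lambda_1}{\mu_1 - \lambda_1} = \frac{p_{01} \alpha}{\beta \mu_1 - p_{01} \alpha}, \]
\[ N_2 = \frac{\lambda_2}{\mu_2 - \lambda_2} = \frac{p_{02} \alpha}{\beta \mu_2 - p_{02} \alpha}, \]
where β := 1 − p01 p10 − p02 p20 . By Little’s Theorem, the expected time a packet spends in the system
is
\[ T = \frac{N_0 + N_1 + N_2}{\alpha} = \frac{1}{\beta \mu_0 - \alpha} + \frac{p_{01}}{\beta \mu_1 - p_{01} \alpha} + \frac{p_{02}}{\beta \mu_2 - p_{02} \alpha}. \]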
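The formulas above can be evaluated for concrete parameters. The following is a minimal sketch: the function name and all numeric values are hypothetical, chosen only so that every node is stable. It solves the traffic equations, treats each node as an independent M/M/1 queue, and applies Little's theorem with the external rate α.

```python
def jackson_two_networks(alpha, mu0, mu1, mu2, p01, p10, p20):
    """Mean occupancies [N0, N1, N2] and mean delay T for the P8.3 model."""
    p02 = 1.0 - p01
    beta = 1.0 - p01 * p10 - p02 * p20
    # Solution of the traffic equations.
    lam0 = alpha / beta
    lam1 = p01 * lam0
    lam2 = p02 * lam0
    nodes = ((lam0, mu0), (lam1, mu1), (lam2, mu2))
    for lam, mu in nodes:
        if lam >= mu:
            raise ValueError("unstable node: arrival rate >= service rate")
    # By Jackson's theorem each node looks like an independent M/M/1 queue.
    N = [lam / (mu - lam) for lam, mu in nodes]
    T = sum(N) / alpha  # Little's theorem with the external arrival rate
    return N, T


# Hypothetical parameter values, for illustration only.
N, T = jackson_two_networks(alpha=1.0, mu0=4.0, mu1=3.0, mu2=3.0,
                            p01=0.5, p10=0.2, p20=0.4)
print(N, T)
```

Dividing by α rather than λ0 matters here: T counts the whole sojourn of a packet, including every retransmission, so the relevant rate is that of new packets entering the system.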
8.4
Optimal stochastic routing
8 points. Consider sending packets from Caltech to MIT through an intermediate node, as shown in
Fig. 32.
Packets are generated by the sender at CIT according to a Poisson process with rate λ packets/sec.
The packet sizes are statistically independent and exponentially distributed with mean length 1 kbits.
All packets are destined for MIT. They are randomly routed through Chicago, with probability p, or
Atlanta, with probability 1 − p. The nodes at Chicago and Atlanta have infinite buffers. The Chicago
node has a constant transmission rate of µ1 kbps and the Atlanta node has a constant transmission rate
of µ2 kbps. Suppose the signal propagation and processing delays are negligible so that the total delay
involves just queueing and transmission times at the intermediate nodes.
Figure 32: P8.4, the network model. CIT sends each packet to Chicago with probability p or to Atlanta with probability 1 − p; both nodes forward to MIT.
If µ2 = 4µ1 , prove that p = (λ − 2µ1 )/(3λ) minimizes the expected total delay, assuming that 2µ1 < λ < 5µ1 .
Argue carefully why this is the unique minimizing routing probability.
[Solution]
Each node, Chicago or Atlanta, can be modeled as an M/M/1 queue with arrival rates λp and λ(1 − p)
pkts/sec and service rates µ1 and µ2 pkts/sec respectively. Hence, the average number of packets at
Chicago is
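\[ N_1 = \frac{\lambda p}{\mu_1 - \lambda p}, \]
and the average number of packets at Atlanta is
\[ N_2 = \frac{\lambda (1 - p)}{\mu_2 - \lambda (1 - p)}, \]
and the total expected delay (which is a function of p) is
\[ T(p) = \frac{N_1 + N_2}{\lambda} = \frac{p}{\mu_1 - \lambda p} + \frac{1 - p}{4\mu_1 - \lambda (1 - p)}. \]
Setting T'(p) = 0 yields
\[ p = \frac{\lambda - 2\mu_1}{3\lambda} \quad \text{or} \quad p = \frac{6\mu_1 - \lambda}{\lambda}. \]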
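The two candidate values come from a quadratic condition. As a worked step that is not spelled out in the original solution, differentiating each term of T(p) gives:

```latex
\[
T'(p) = \frac{\mu_1}{(\mu_1 - \lambda p)^2} - \frac{4\mu_1}{\bigl(4\mu_1 - \lambda(1 - p)\bigr)^2},
\]
so $T'(p) = 0$ if and only if
\[
\bigl(4\mu_1 - \lambda(1 - p)\bigr)^2 = 4(\mu_1 - \lambda p)^2
\quad\Longleftrightarrow\quad
4\mu_1 - \lambda(1 - p) = \pm 2(\mu_1 - \lambda p).
\]
The $+$ sign gives $3\lambda p = \lambda - 2\mu_1$, and the $-$ sign gives $\lambda p = 6\mu_1 - \lambda$.
```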
Note that for queues at Chicago and Atlanta to be stable, we require
λp < µ1 and λ(1 − p) < 4µ1 ,
but one can check that p = (6µ1 − λ)/λ does not satisfy this stability condition (it gives λp = 6µ1 − λ > µ1
because λ < 5µ1 ), and hence the expected delay will be unbounded with this routing probability.
On the other hand, if 2µ1 < λ < 5µ1 then the routing probability p = (λ − 2µ1 )/(3λ) > 0, and it satisfies the
stability condition above. Moreover, within the range of p for which both queues are stable, if
p < (λ − 2µ1 )/(3λ) then T'(p) < 0 and if p > (λ − 2µ1 )/(3λ) then T'(p) > 0.
Therefore, p = (λ − 2µ1 )/(3λ) is the unique routing probability that minimizes the expected delay T (p).
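The result can be sanity-checked by brute force. The sketch below is illustrative only: the function names, the grid resolution, and the values λ = 3.5 and µ1 = 1 are assumptions chosen to satisfy 2µ1 < λ < 5µ1. It evaluates T(p) on a fine grid, treats unstable routing probabilities as infinite delay, and compares the grid minimizer with the closed-form p.

```python
def total_delay(p, lam, mu1):
    """Expected delay T(p) with mu2 = 4*mu1; infinite if either queue is unstable."""
    mu2 = 4.0 * mu1
    if lam * p >= mu1 or lam * (1.0 - p) >= mu2:
        return float("inf")
    return p / (mu1 - lam * p) + (1.0 - p) / (mu2 - lam * (1.0 - p))


def grid_minimizer(lam, mu1, steps=100_000):
    """Brute-force search over a uniform grid on [0, 1] for the best p."""
    candidates = (i / steps for i in range(steps + 1))
    return min(candidates, key=lambda p: total_delay(p, lam, mu1))


lam, mu1 = 3.5, 1.0  # hypothetical values with 2*mu1 < lam < 5*mu1
p_closed = (lam - 2.0 * mu1) / (3.0 * lam)
p_grid = grid_minimizer(lam, mu1)
print(p_closed, p_grid)
```

A grid search is used here only because it makes no assumption about the shape of T(p); it is a check on the derivation, not a replacement for it.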