>> Ratul Mahajan: It's my great pleasure to introduce Arun Venkataramani from
the University of Massachusetts in Amherst. I've known Arun now for a few
years, and I've had the privilege of working with him on a couple of interesting
projects as well.
So before Arun graduated he was at UT Austin where he did work on
background TCP transfers, which I believe have now made their way into at least a couple of IBM products about background transfers, and since
then he's had a stellar record, including two [inaudible] in consecutive years and
gone on to do some interesting stuff.
Today he's going to describe some of his recent work on block switching
networks.
>> Arun Venkataramani: Thanks, Ratul, for the introduction. I'm not sure if the
microphone is active, but I assume you can all hear me.
So let me, first of all, apologize for starting late. My excuse is that I spent over an hour in a cab. I thought I knew Seattle traffic, and I thought 40 minutes would be enough to come from Seattle, but it turns out the driver took the wrong exit, and then we pulled out the GPS and I was helping him figure out his GPS to get here [laughter]. So there.
Okay. My talk today is about block switching. The title is Towards a Robust Protocol Stack For Diverse Wireless Networks. Much of the talk is based on an NSDI 2009 paper from this year. All the work that I talk about is in collaboration with Ming Li and Devesh Agrawal, students who did the Hop work, faculty [inaudible] Deepak Ganesan, and students Aruna Balasubramanian and Xiaozheng Tie, who are doing some of the more open-ended and exploratory stuff that I'm going to touch upon at the end of this talk.
Okay. So what's the motivation for this? There's a big gap between what's advertised as the max capacity of a wireless technology and what applications actually get. So this is an example. I'll describe our testbed in more detail later, but these are experiments conducted on our testbed.
We chose three different links. One good link, one average link, and one bad
link. And we simply measured the goodput achieved by UDP and TCP over
these links.
So UDP is a lower bound on what you can achieve. This is an 802.11b network, but similar results can be observed for g and n as well.
So UDP places -- yes?
>>: Are you only looking over 802.11 or are you looking over other wireless --
>> Arun Venkataramani: Much of this talk is only going to be about 802.11, but to the extent possible, I'm going to give the motivational arguments and justify Hop's design with minimum recourse to 802.11's deficiencies, as little as possible, and focus more on TCP's fundamental problems over wireless. But the [inaudible] design is for 802.11 as it is.
Okay. So UDP places a lower bound. You can at least get as much as what UDP gets, but what most applications, which use TCP, see is the TCP throughput, which shaves off a huge factor.
What's particularly noteworthy is that on bad links -- and that's not really the worst link, that's somewhere in the bottom third or bottom quartile of the links -- there's an order of magnitude, almost two orders of magnitude, of difference between what you can get, or what you should reasonably expect, and what TCP gets.
>>: [inaudible].
>> Arun Venkataramani: High-loss links.
>>: [inaudible].
>> Arun Venkataramani: High loss rate as observed by UDP.
>>: [inaudible].
>> Arun Venkataramani: There is no congestion in this. These are experiments done with just the channel itself. So poor RSSI is the main reason -- attenuation, whatever, to the extent multi [inaudible] exists in the building. This is an in-building testbed. Okay?
Feel free to interrupt me with questions. It's a very small audience, and the more interactive the talk, the better. It's all very simple stuff that I'm going to talk about.
Okay. So why does TCP perform so badly over 802.11? I'll describe some fundamental problems not just with TCP but with the TCP/IP stack itself -- the fundamental assumptions that are embodied in this architecture and this [inaudible] of functionality in the TCP/IP stack.
So the first fundamental problem is the use of end-to-end techniques in wireless transport. What do I mean by end-to-end? End-to-end means that the transport treats the network as a black box. The network does not explicitly assist transport in its objectives. So I'm calling these end-to-end techniques.
So the first problem with end-to-end techniques is rate control. Rate control is fundamental to TCP, but rate control is problematic in wireless environments for two reasons. One, the environment itself fluctuates a lot. TCP's behavior is completely defined by the loss and delay signals observed by each packet. But these fluctuate a lot, and TCP's performance decreases with this fluctuation.
The second is that TCP can hardly distinguish between losses caused by congestion and losses caused by other effects, like channel effects or interference. These are well-known problems with rate control over wireless networks. They have been asserted for over a decade.
Okay. The second fundamental -- sorry. The second problem with end-to-end techniques in wireless is with retransmissions. End-to-end retransmissions are wasteful. For end-to-end reliability you need end-to-end retransmissions, but what end-to-end retransmissions do is, if a packet goes, say, two hops into the network and then gets dropped -- because, let's say, 802.11 gave up or a node crashed or whatever reason -- then you're going to retransmit the packet along the whole path. Right?
But ideally you could have retransmitted just from the point P all the way to the destination. We don't do this because packets are small, they're harmless. We just retransmit them. But these overheads catch up over time.
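To make the waste concrete, here is a back-of-envelope model -- my illustration, not from the talk -- assuming independent losses, a per-link delivery probability q, and a route of h links:

```latex
% Expected link transmissions to deliver one packet over h links.
% Hop-by-hop recovery retries only the failed link:
E_{\mathrm{hop}} = \frac{h}{q}
% End-to-end recovery repeats the whole path until one attempt
% survives every link:
E_{\mathrm{e2e}} \approx \frac{h}{q^{h}}
% e.g. q = 0.8, h = 4 gives E_hop = 5 but E_e2e ~ 4/0.8^4 ~ 9.8,
% and the gap grows exponentially with path length.
```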
The final problem is the very assumption -- yes?
>>: The retransmission point actually is -- I mean, there is a lot of [inaudible] transmissions. The thing is, end-to-end retransmission is there as a safety mechanism that I'm guessing nobody would argue for taking away. So I guess what you're probably arguing for is a different balance between, like, [inaudible] retransmissions. So end-to-end retransmissions alone, for instance, are not enough.
>> Arun Venkataramani: Correct. We don't know how to set this number exactly. You can make this a hundred and there are problems with that. You could make it 102 and there are problems with that. We don't know where to set this number. There are problems with each setting.
>>: Sure. So, actually, I don't know -- have people looked into, for a single wireless hop, just doing adaptive link layer retransmissions?
>> Arun Venkataramani: Right. So this is not an easy problem, because when you say adaptive, presumably the number of retransmissions on each link changes according to some network-wide objective. And this, as far as I know, is not an easy problem, and I don't know if people are trying to do this for link layer ARQ retransmissions.
What's done in practice commonly is that people accept that TCP is great -- it's the universal transport protocol -- and that by itself, without link layer ARQ retransmissions, it's going to suck terribly in wireless, and so we're just going to make the wireless environment good enough for TCP to work as it would over wired. And so they just pick a number like 10 or 11 in typical implementations.
Setting it to a really high value doesn't help, because you're going to end up underutilizing resources -- you end up doing too many retransmissions on bad links -- and setting it to a low value obviously is not enough, because then TCP performance will suck. Okay?
So the third problem with end-to-end techniques is the very assumption that an end-to-end route exists between the source and destination. This may not be the case in mobile [inaudible] networks, or at an extreme point in the design space, disruption-tolerant networks, or DTNs.
In fact, this whole protocol, Hop, is primarily designed with mesh networks or WLANs in mind, but it's inspired by DTNs. By looking at an extreme point in the design space -- by looking at what works when there is no route ever available between the source and destination, that is, when the network is always partitioned -- we're able to come back to mesh networks and get significant performance improvements even in well-connected mesh environments.
So this is a reason to look at DTNs: not just because they're interesting, as [inaudible] would have it, for interplanetary communication, but for real benefits in real environments.
Okay. The second fundamental problem with wireless transport today is the high
overhead associated with treating a packet as a unit of control.
So what do I mean by this? A packet is a unit of control or reliability at the link
level. So ARQ is per packet. Channel access is per packet. So you're paying
some overhead for all of this.
And, yes, 802.11 is designed to keep these overheads low, but let's just look at 802.11 as an example and see how these overheads catch up.
Channel access overhead is non-trivial. You have to wait until you get access to the channel, there's a DIFS overhead, and then if the contiguous segment of DIFS sensing doesn't go through, you back off and then you try again, and then there may or may not be an RTS/CTS overhead per packet depending on how you've configured your network. So there's significant channel access overhead.
The second is link layer ARQ. Link layer ARQ is necessary to make TCP work on wireless networks as it does in wired networks, but ARQ introduces its own overhead. Every time a packet transmission fails, you're going to back off for increasing amounts of time and then try again. This increases the overhead per packet.
These overheads are not trivial. I'll show you some numbers. These add up to like 60 percent or 2x of overhead in real environments under load. Actually, both with and without load. Sorry.
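As a rough illustration of how these fixed costs add up, and of the amortization idea coming next, here is a back-of-envelope sketch with approximate textbook 802.11b timings; the constants and the airtime_us helper are illustrative assumptions, not Hop's code or measurements:

```python
# Approximate 802.11b (long preamble) timings, in microseconds.
DIFS_US     = 50                   # DIFS wait before each channel access
BACKOFF_US  = 310                  # mean initial backoff: 15.5 slots * 20 us
PREAMBLE_US = 192                  # PLCP preamble + header at 1 Mbps
SIFS_US     = 10
ACK_US      = 192 + 14 * 8 / 2.0   # ACK preamble + 14-byte ACK at 2 Mbps
PAYLOAD_US  = 1500 * 8 / 11.0      # 1500-byte frame at 11 Mbps

def airtime_us(frames_per_access):
    """Mean airtime per frame when frames_per_access frames share one
    channel access (TXOP-style burst with per-frame ACKs removed)."""
    access = DIFS_US + BACKOFF_US
    frame = PREAMBLE_US + PAYLOAD_US
    if frames_per_access == 1:
        # stock 802.11: access + frame + SIFS + ACK for every frame
        return access + frame + SIFS_US + ACK_US
    # burst mode: one access per burst; per-frame ACKs replaced by a
    # per-block bitmap whose cost is ignored here
    return access / frames_per_access + frame

for n in (1, 8):
    t = airtime_us(n)
    print(f"{n} frame(s) per access: {t:.0f} us/frame, "
          f"goodput ~ {1500 * 8 / t:.1f} Mbps")
```

With per-packet access and per-frame ACKs this simple model lands near the roughly 6 Mbps goodput commonly quoted for 802.11b; amortizing the access cost over 8-frame bursts recovers much of the remaining gap, which is the effect blocks exploit.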
Okay. The third fundamental problem with wireless transport is the complex cross-layer interaction that it introduces. I'll show you many examples of such cross-layer interaction through this talk, but the simplest example is that of TCP rate control with the variable [inaudible] round trip time introduced by link layer ARQ. Every time a packet transmission fails you're going to again sense carrier, back off, and try again, and you will keep backing off exponentially. How much you back off depends on implementation-specific details, but it introduces a significant amount of variability.
You can tune TCP's retransmission timeout estimation to specific environments, but that leaves only so much to play with. I mean, you can't do much better than TCP once you take that approach.
So this kind of cross-layer interaction is a fundamental problem. ARQ is one example. Wireless bit rate control is another: you're doing rate control at two different levels and they interact negatively.
So that's the motivation for why we need to think from a clean slate perspective
about wireless transport.
I'll next show you a protocol, Hop. First its design and then experimental
evaluation over a mesh testbed we have at UMass in the computer science
building, and then I'll present some high-level arguments for why the key idea in
Hop, which is switching blocks or big chunks of data as opposed to small
packets, has fundamental benefits for the design of a network protocol stack
that's robust across a variety of different wireless environments.
Okay. So, the key ideas in Hop. Hop is a clean slate redesign of wireless transport. The three key ideas basically build upon the three motivational points I introduced: problems with end-to-end techniques, packets as a unit of control, and complex cross-layer interaction.
The solution for the problems with end-to-end techniques is to go to a hop-by-hop transport protocol. You don't have end-to-end rate control anymore. You have hop-by-hop congestion control, which I'll show is significantly more robust than end-to-end rate control.
To deal with the second problem, packets as a unit of control, that's where the key idea of blocks comes in. What is a block? A block is simply a contiguous segment of data. Just think of it as a big packet at a high level, but the difference is that it's not a big link layer frame; it's a bunch of packets over which the overhead of ARQ and channel access gets amortized. You're sending the packets in a block in a burst mode, and that's how the overhead is amortized. And this overhead is significant. It's one of the important points in this talk.
To address complex cross-layer interaction, for lack of a better word, I'm going to call this minimalism. We try to remove redundant functionality across layers -- like ARQ done redundantly at the link layer and at the transport layer, or rate control at both the link layer and the transport layer. So we try to minimize this to the extent possible.
Okay. So Hop consists of five main components. The bottom components are the per-hop components that give you single-hop performance benefits. The top two components give you performance benefits over multi-hop networks.
The most fundamental component of Hop is the building block that I'll call per-hop reliable block transfer: reliably transfer one block to the next hop along the route.
Here's where you get most of the benefits of aggregation, or blocks as opposed to packets.
For a protocol to have end-to-end reliability, it's not enough to just do per-hop reliability. You need some end-to-end acknowledgments as well. This Hop accomplishes using a novel scheme called virtual retransmission. Any reliable transport protocol must also do congestion control. Otherwise, it will just be inefficient. It will underutilize network capacity.
But instead of doing TCP-style AIMD end-to-end rate control, we do backpressure. Backpressure has some very attractive properties. It's been well studied, and it's much more robust because it's a hop-by-hop congestion control scheme, and it obviates the problem of rate estimation. I'll show you how it does that.
These two components are per-hop components. The ACK withholding component is a scheme that comes for free with the basic design of Hop itself and gives significant savings in the presence of hidden terminals. If you're blasting blocks -- a huge chunk of data -- without doing carrier sense in 802.11-like environments, you're going to exacerbate the hidden terminal problem, and ACK withholding is a simple solution that is more or less overhead free.
The last component is for delays. It's great to show significant throughput benefits for bulk transfer, but TCP is TCP because it's universal. It works for SSH, SMS, and what have you. And Hop achieves these delay benefits by doing prioritization based on block sizes.
Yes?
>>: So you're using backpressure [inaudible] complexity and make this more [inaudible] backpressure requires signaling [inaudible].
>> Arun Venkataramani: Good point. I don't have a good defense for that. As I said, for lack of a better term, I'm using the word minimalism. What I intended there was to remove redundant functionality to the extent possible and minimize interaction across layers. So wireless bit rate control and TCP rate control, ARQ at the link layer and reliability at the TCP layer.
These control loops at two different layers interact negatively, and that's the intention: trying to remove as many redundant features as possible. If you're going to do it at a higher level, then just do it at the higher level and don't worry about doing it at the lower level.
Yes, backpressure introduces some signaling. But I'll show you that this signaling gives us significant benefits even in a single hop network. The signaling overhead is really, really small, and that's one of the contributions of this work: making the signaling overhead negligible. Because backpressure as an idea has been around for almost two decades now, but the big stumbling block is translating the throughput optimality properties that exist on paper into a practical implementation. This is one such effort, and there have been other recent efforts trying to do the same.
Okay. So let's begin with the most basic building block of Hop, the per-hop
reliable block transfer component. The mechanism has two key ideas, burst
mode and a block-based ARQ scheme.
Burst mode just means that you send a batch of frames as a burst without doing carrier sense in between -- so without the channel access overhead in between. We call this a TXOP, or a transmit opportunity. This exists in standard 802.11 implementations. In fact, more recent 802.11 standards like n are particularly optimized for this kind of burst mode transfer.
The second idea, the block-based ARQ scheme, amortizes the overhead of ARQ by spreading it over a batch of packets inside a block, as opposed to paying that overhead for each frame.
Yes?
>>: [inaudible].
>> Arun Venkataramani: You're saying we will exacerbate the fate sharing problem by going to blocks? You also said something about partial packet recovery. Let me say that partial packet recovery techniques are orthogonal and can reside underneath a protocol like Hop.
Much of Hop's design is at the link layer or above. So pick whatever frames you want, make the link as good as you like using whatever techniques under the link layer you like, and we will reduce the overhead using Hop compared to the combination of 802.11 and TCP.
And as for the fate sharing point, it is true that wireless losses are often bursty: if one packet gets lost, it's quite likely that the high loss rate will remain for some time, or the channel effects will remain for some time. It's hard to quantify that other than by experimenting. And I'll show you that. It gives you significant improvements --
>>: [inaudible] confusion in your question I think. Once a packet in a block is
lost, only that packet is retransmitted [inaudible].
>> Arun Venkataramani: Maybe it will become clear when I tell you about the bitmap. Why don't we just go into the bitmap protocol.
>>: But there's an interesting question related to what you're saying. If your link
layer is so efficient because it's using partial packet recovery and you're using
[inaudible] that you're losing the least possible amount of data, yet you still need
all these things on top, but where is the line?
>> Arun Venkataramani: Good question. I don't know the answer. We won't know until we experiment with those techniques underneath Hop and see how much benefit we get out of that.
One thing I would like to point out is that the most significant benefits in Hop come under high load. Link level techniques will make a single link really good, but under high load, TCP has some very fundamental problems that Hop addresses. So that's orthogonal to optimizations that you introduce below the link. Hop can incorporate that --
>>: [inaudible].
>> Arun Venkataramani: Right. Backpressure gives you benefits at the network and transport layers. It's complementary to optimizations underneath the link.
>>: Basically you don't want to transmit more on a packet than a packet can
handle.
>> Arun Venkataramani: Yeah.
>>: How should I think about the tradeoff of doing burst mode where I have
perhaps a bunch of smaller packets being transmitted back to back versus simply
extending the maximum packet size?
>> Arun Venkataramani: Very good question. I'll touch upon this a little bit more
towards the end.
In Hop we use just standard 1500-byte frames through all the experiments. It's not a good idea to increase the frame size too much, because then you're exponentially increasing the likelihood of at least one bit getting corrupted --
>>: [inaudible].
>> Arun Venkataramani: Right. So you can use -- I'll touch upon this. I'll tell you how these techniques can be incorporated underneath Hop as well.
So if you do extend the frame size, then you need to have more error-correcting information inside it. And with that you can have smaller frames. So, again, inside a block, you can have smaller frames.
The point here is that the channel access and ARQ-like overheads, instead of being paid over something the size of one packet, are paid over a much larger chunk of data.
>>: Part of the reason I was asking was because one of the overheads you
didn't mention is the packet preamble, which is actually pretty substantial when
you're sending small data packets.
>> Arun Venkataramani: Right.
>>: This could potentially reduce --
>> Arun Venkataramani: Why don't we keep this question on the stack and we'll
come back to it in the end.
Okay. So what is the basic per-hop reliable block transfer itself? It starts with a handshake. There's a B-SYN packet and an acknowledgment packet, the B-ACK. The B-SYN is a request to send a block of packets.
An example of a maximum block size is one megabyte. That's the default in the Hop implementation.
The B-ACK tells the sender what packets, what frames, in that block have already been received. It's a bitmap. So the sender only has to send the remaining packets. This is a multi-round protocol: in every round there's a B-SYN/B-ACK handshake. The B just stands for block.
After the handshake, the transmission itself proceeds through a bunch of transmission opportunities, or TXOPs. Each TXOP, in our 802.11a/b/g implementation, lasts about six to eight frames without carrier sense, but there is carrier sense in between TXOPs.
We had to do this because we didn't have access to n, which allows a really big jumbo frame to go without any carrier sense in between. But this is how we implement it on a/b/g networks.
So there's a CSMA overhead, there's a TXOP that's about six to eight frames, and then there's a CSMA overhead between any two TXOPs, but there's no reliability traffic in between. There's no ARQ. There's a B-SYN/B-ACK and then one megabyte of data, with carrier sensing once every TXOP, which is once every six to eight typical-sized frames.
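A minimal sketch of this multi-round loop, written against a hypothetical one-hop interface (handshake_bsyn and burst are assumed names, not the Hop implementation):

```python
TXOP_FRAMES = 8   # frames per carrier-sense opportunity in the a/b/g setup

def send_block(link, frames):
    """Reliably push one block (a list of frames) over a single hop.
    handshake_bsyn() sends a B-SYN and returns the B-ACK's bitmap of
    frames the receiver already has; burst() sends frames back to back
    in one TXOP with no per-frame ARQ."""
    received = [False] * len(frames)
    while True:
        bitmap = link.handshake_bsyn(num_frames=len(frames))
        received = [r or b for r, b in zip(received, bitmap)]
        missing = [i for i, got in enumerate(received) if not got]
        if not missing:
            return                      # B-ACK says the block is complete
        for start in range(0, len(missing), TXOP_FRAMES):
            # losses in this burst simply show up in the next bitmap
            link.burst([frames[i] for i in missing[start:start + TXOP_FRAMES]])
```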
>>: [inaudible].
>> Arun Venkataramani: [inaudible].
>>: Oh, okay.
>> Arun Venkataramani: Right. So [inaudible] basically gives you optimizations called WMM optimizations. We used some of those, both the prioritization optimizations as well as these TXOPs. The protocol itself, or the [inaudible], is not fundamentally different from a/b/g, so I'm just grouping them together.
Okay. So that's it. I mean, this is at the heart of Hop. This is the building block. There's no rate control here. There's no ARQ here -- or, sorry, there's no per-packet ARQ here. The goal is still to transmit the block reliably over multiple rounds of these B-SYN/B-ACK handshakes and bursts.
Okay. So per-hop reliability is not enough for end-to-end reliability. Hop uses a novel retransmission scheme called virtual retransmissions. That, again, is possible because these blocks are big. You can afford to spend more overhead on big blocks.
Having designed the per-hop reliable block transfer component of Hop, virtual retransmission actually comes for free. The idea here is to leverage in-network caching. So suppose a block has gone multiple hops into the network and for whatever reason does not make it further -- let's say because of a node crash, right? The destination in this case is D. Then you want to retransmit it from the point B, and to do this, basically, what Hop does is: when there's a timeout at the source -- a timeout is simply a moving average of previously observed round-trip block transfer times. It doesn't have to be robust. Any simple estimate is good enough.
At the timeout, a special B-SYN packet is sent -- special because a virtual retransmission flag is set in the B-SYN packet. You get back a B-ACK saying, oh, I already have the block, because B already has the block. When B tries this with E, E doesn't have the block. So you only retransmit the block over the parts of the route that have changed.
So if this is a relatively static network, like our mesh network, then retransmission overheads and wasteful end-to-end retransmissions are significantly alleviated by this.
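Here is one way an intermediate node's handler for such a flagged B-SYN could behave; node, make_back, and the field names are hypothetical, and the probe-forwarding detail is my reading of the scheme rather than the paper's code:

```python
def on_vr_bsyn(node, bsyn):
    """Handle a B-SYN whose virtual-retransmission flag is set."""
    if bsyn.block_id in node.cache:
        # "I already have it": answer upstream without moving any data,
        # and push the probe one hop further toward the destination.
        node.send_upstream(make_back(bsyn, have_block=True))
        node.forward_downstream(bsyn)
    else:
        # Cache miss: the normal per-hop block transfer resumes from
        # the previous hop, which still holds the block.
        node.send_upstream(make_back(bsyn, have_block=False))
```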
Yes?
>>: [inaudible] complexity point. And I keep [inaudible] you're waiting for things
like a [inaudible].
>> Arun Venkataramani: The upper layer is the application, because we are the
transport layer.
Let's look at what metrics are important to applications. These are throughput
and delay. Throughput for large files, delay for small files. And that's -- those are
the metrics we evaluate.
So it's a good question. Let's wait for the experiments.
>>: [inaudible].
>> Arun Venkataramani: We'll look at that too. We will look at sort of the effect
on [inaudible] traffic.
Yes?
>>: How many blocks are in flight?
>> Arun Venkataramani: How many blocks are in flight? This will become clear
when I introduce the backpressure component, which is next. There's one block
buffer at each node. That's the default implementation in Hop. And I'll tell you
sort of what role that level of buffering plays in this protocol.
Okay. And finally, there is an end-to-end acknowledgment, because we want end-to-end reliability in this protocol. So that's it for end-to-end reliability.
Any reliable transport protocol must have congestion control. If you tried using UDP under load, the utilization of the network would simply collapse.
So Hop uses backpressure to do this. The mechanism is simply to limit the number of outstanding blocks for each flow at a forwarder. So in this example, the number of outstanding blocks is limited to two. The source, let's say, sends these blocks. There's a slow link -- the last link, between D and E, is slow. This will lead to buffering at the last [inaudible].
At this point, because the limit is two in this example, D will stop accepting any more blocks from its previous hop. And so the B-SYNs will keep getting retransmitted at coarse-grain intervals, but the B-ACK will not be sent. So there's no extra overhead in doing this -- this is just the per-hop reliable transfer protocol. Or there's minimal extra overhead in implementing backpressure itself.
And, similarly, this backpressure will propagate all the way back up to the source. So that's the basic backpressure mechanism.
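A sketch of the admission decision at a forwarder, with hypothetical names; the point is that the only control signal is silence -- a B-SYN that gets no B-ACK is retried at coarse-grain intervals, so pressure propagates upstream without any rate estimation:

```python
BLOCK_LIMIT = 1   # default per-flow, per-node buffering quoted in the talk

def on_bsyn_backpressure(node, bsyn):
    """Admit or silently refuse a block offered by the previous hop."""
    if len(node.block_queue[bsyn.flow_id]) >= BLOCK_LIMIT:
        return                      # withhold the B-ACK; sender retries later
    node.send_back_ack(bsyn)        # admit: the sender may burst the block
```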
Why is this useful? Backpressure can give you significant benefits in throughput. In fact, all the theoretical work on backpressure tells you that it can give you optimal throughput in a network. I'll define what that means a little later in the talk.
But here is an example of why it can significantly increase utilization in the network.
So the bottleneck here is the 10 Mbps link, and all the other links are 20 Mbps, except for that one bad link, the one Mbps link between F and G.
So if you didn't do anything at all, the blue flow, which is the top flow, and the red flow, which is the bottom flow, will keep sending packets. And the aggregate throughput is going to be six Mbps, because this flow is bottlenecked by the one Mbps link, and this flow, because it shares this link equally with the other flow, gets only five Mbps, and you'll just have infinite buffering at F.
You can drop those packets, which means you'll have to retransmit them. That
leads to wasted work.
So it's clear here why we need to bound the amount of buffering. If you bound the amount of buffering to a small value -- let's say in this example 1, which is the default in Hop -- then this is what will happen. Blue packets will keep going over the top links, there will be one packet of buffering at each hop, and the blue packets will get all of the 10 Mbps. So the aggregate utilization in this case is 10 Mbps.
TCP tries to get similar utilization benefits. The difference between what TCP does and what backpressure does is that TCP will simply back off end-to-end, because it will notice that there's a bottleneck link of 1 Mbps, whereas backpressure is doing this in a hop-by-hop manner.
TCP does rate estimation. Backpressure does not do any estimation. It's flow control at each hop: you can either say no or not say no. So there's no rate estimation problem like the one that AIMD tries to solve.
>>: But [inaudible] I mean, it's not really rate control. I mean, it is [inaudible] so there's no explicit rate estimation going on in TCP. You basically send when ACKs come back.
>> Arun Venkataramani: Right.
>>: So in some sense there is an end-to-end backpressure there: if your packets are not making it, or if it's really slow, because it's ACK based, it has the same effect. But [inaudible] that effect is kind of like backpressure, but it is end-to-end [inaudible] when you create that effect on a per-hop basis [inaudible] but those benefits are not related to explicit or implicit rate estimation, because those two properties are somewhat similar, it seems.
>> Arun Venkataramani: The difference between what TCP [inaudible] and backpressure [inaudible] do is that TCP tries to do rate estimation by translating it into a window estimation problem. AIMD tells you when to halve the window or when to increase the window, whereas backpressure does not have such drastic changes.
>>: [inaudible].
>> Arun Venkataramani: End-to-end flow control.
>>: Right. So [inaudible].
>> Arun Venkataramani: Correct. So the difference is end-to-end versus hop-by-hop. Hop-by-hop helps make it more robust, because if you try to do this end-to-end, you're aggregating all the fluctuation and noise in estimation over many links.
>>: [inaudible].
>> Arun Venkataramani: Large what?
>>: Ethernet networks and [inaudible].
>> Arun Venkataramani: So you sometimes --
>>: You end up doing unnecessary [inaudible]. So if you apply backpressure [inaudible] then you're basically blocking off all flows that are going through that switch, some of which may not be part of the bottleneck, but [inaudible] someplace else.
>>: [inaudible].
>>: Something like that.
>> Arun Venkataramani: Right. So backpressure as a concept simply refers to feedback propagated at each hop about how fast or how often you should send. There are many different ways of implementing this. In fact, the way we implement it in Hop is not how classical backpressure works. ATM tried to do it in a different way. There are many different implementations. We do it per flow. And you're right, if you don't do it per flow, there are head-of-line blocking issues. We do it per flow in Hop, which means you actually maintain flow state.
>>: [inaudible].
>> Arun Venkataramani: Yes. This is a wireless network. And, yes, again, this contradicts -- as I'm surprised no one pointed out, how can you call this minimalistic if you're maintaining state inside the network? -- which is why I will not dwell so much on the minimalism point. But we do it per flow because we're not worried about scalability in a wireless network as much as we are in the internet.
Yes?
>>: There's also the point that nobody really limits you [inaudible] so per flow or
anything that [inaudible].
>> Arun Venkataramani: Right.
>>: [inaudible].
>> Arun Venkataramani: All of Hop's design is --
>>: [inaudible].
>>: [inaudible].
>> Arun Venkataramani: So the short answer is that if you're trying to [inaudible], both TCP and Hop have their own different sets of problems. We don't address any of those here. Our goal is performance in a benign environment first.
Okay. So the next component is ACK withholding. The benefit here is that it significantly mitigates the hidden terminal problem. And again, like backpressure and virtual retransmission, it comes nearly for free once you have designed this per-hop reliable transport protocol.
The idea is somewhat similar to RTS/CTS, except RTS/CTS has its own problems. This is the reason why RTS/CTS is typically not turned on in production networks: it has a high per-packet overhead. It's also conservative; that is, it prevents some transmissions that could have happened.
The way ACK withholding works is as the name says. The receiver at each hop simply withholds the ACK. You don't send a B-ACK if there are multiple senders sending you B-SYNs. So you don't send multiple B-ACKs out.
This addresses the common case of single-receiver hidden terminals, the classic three-node hidden terminal problem. So A and B both try to send a B-SYN to C in this example, and C sees both of them. Let's say it successfully receives both, so the B-SYNs themselves haven't collided in this example. It's going to withhold one of the B-ACKs: send A's B-ACK but hold B's B-ACK, let A go first, and then let B go next. That's it.
So the B-SYN and B-ACK are playing roles somewhat similar to RTS and CTS, except this is not conservative, and the overhead comes for free and is amortized over a huge block of packets.
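A sketch of the receiver side, with hypothetical names: when several upstream senders have pending B-SYNs at one receiver, only the head of the queue gets its B-ACK, so hidden-terminal senders end up taking turns.

```python
def on_bsyn_withhold(node, bsyn):
    """Queue an incoming B-SYN; B-ACK only if no transfer is in progress."""
    node.pending.append(bsyn)
    if len(node.pending) == 1:       # nobody else is mid-transfer here
        node.send_back_ack(bsyn)

def on_block_done(node):
    """When the current block completes, release the next withheld B-ACK."""
    node.pending.pop(0)
    if node.pending:
        node.send_back_ack(node.pending[0])
```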
Yes?
>>: [inaudible].
>> Arun Venkataramani: Correct.
>>: [inaudible].
>> Arun Venkataramani: Correct.
>>: [inaudible].
>> Arun Venkataramani: Correct. The B-SYN is a small packet and it [inaudible]. So if a B-SYN fails, you have a timeout, and eventually you retransmit the B-SYN. So you'll corrupt some frames in a concurrently transmitted block -- you'll corrupt one, maybe two frames.
>>: Why don't you send a [inaudible].
>> Arun Venkataramani: Good question. This comes up. I mean, we've played with different designs. It didn't seem to reduce the overall signaling overhead or the chances of collisions. At some point you have to make a call on what you will send without trying to do an explicit channel access scheme there.
>>: [inaudible] because C doesn't have to [inaudible].
>> Arun Venkataramani: But the catch is that holds can also be lost. What the paper tells you -- what I didn't tell you in the talk -- is that B-SYNs and B-ACKs are sent with [inaudible] turned on, because they're more important than regular frames.
So you try to make B-SYN and B-ACK transmissions as reliable as you can, but they can still be lost. So eventually you need a timeout on the B-SYN, and you have to send the B-SYN again, because you don't know if your B-SYN was ever received by the receiver. Given that, yes, sending a B-hold or a [inaudible] message can help in some cases, but it didn't seem to help much in our experiments.
>>: So logically B-SYN and B-ACK are synchronous [inaudible] how do you
actually set the timeout for B-ACK?
>> Arun Venkataramani: There is no timeout for the B-ACK. There's a timeout for the B-SYN. So you send a B-SYN. Normally you would get a B-ACK right away if everything is good and if ACK withholding is not happening and backpressure is not happening. If not, there's the timeout -- I believe it's set to something like a couple hundred milliseconds, but I don't remember the exact value. But it's a coarse-grain timeout.
>>: Can you make the B-ACK [inaudible].
>> Arun Venkataramani: It's possible to do that, yes. Given that the per-hop block transfer needs a B-ACK to come before it can start sending the block, I mean, it's implicit. The B-ACK is there anyway. But in this specific example, you could get away with that optimization. There are other cases where it won't work. Okay?
The last component is for reducing delays -- delays for small transfers. This is done through prioritization based on block sizes. There are two components to this: sender side prioritization and receiver side prioritization.
So let's say SSH traffic is competing with FTP traffic. SSH is sending one byte at a time -- well, it says 64 bytes here, but it could well be a byte. If the size of the block is less than a threshold -- in our implementation this threshold is 16 kilobytes -- the sender will simply send the data along with the B-SYN. It doesn't wait for the B-SYN/B-ACK handshake.
At the receiver side, the receiver simply prioritizes small blocks over large blocks. And so you keep the delays of small blocks -- especially really small or single-packet blocks -- low.
Because now you have this explicit B-SYN/B-ACK handshake, you have the power -- both at the sender and the receiver -- to make a decision about how to do this kind of prioritization. So once you've defined the per-hop reliable transfer protocol for a single link, all of this incurs very little overhead, because the mechanism is in place already.
With a simple optimization, Hop gives you low delays for large file transfers, effectively meaning high throughput, as well as low delays for small file transfers or even single-packet transfers.
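A sketch of the two-sided prioritization, again with hypothetical names (send_block is the earlier per-hop sketch; the 16 KB threshold is the value quoted in the talk):

```python
import heapq

MICROBLOCK_BYTES = 16 * 1024

def send(link, block):
    """Sender side: small blocks ride on the B-SYN, skipping the
    handshake round trip; large blocks take the multi-round path."""
    if len(block.data) < MICROBLOCK_BYTES:
        link.send_bsyn_with_data(block.data)
    else:
        send_block(link, block.frames)

def drain_order(pending):
    """Receiver side: yield pending blocks smallest-first, so that
    SSH-like single-packet traffic sees low delay behind bulk flows."""
    heap = [(b.size, i, b) for i, b in enumerate(pending)]
    heapq.heapify(heap)
    while heap:
        yield heapq.heappop(heap)[2]
```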
Okay. So all of these ideas are extremely simple. You've probably encountered them in one way or another, at some place or other. And this would not be interesting if we were not able to show significant improvements over TCP. And that's what took us almost two years: showing consistent gains over TCP across a wide range of metrics, for a lot of different workloads and environments.
So that's our testbed. It's a 20-node testbed on the second floor of the CS building. It's a Mac Mini based testbed. I'll show you experiments with b and g in this talk, mostly with b, some with g.
The first experiment compares Hop's and TCP's throughput over a single link. So there's one flow, one link. No competing traffic, to the extent we can ensure that, because the experiments were done at night.
The way this experiment is conducted is that we randomly sample, with repetition, links in our testbed. There are about 60 unique links that are active -- that work at all -- in the testbed. Bit rate control is turned off. It's all set to the max bit rate, which is 11 Mbps in this case.
And this shows a CDF of the goodput across those links achieved by Hop and achieved by TCP. What it tells you is that Hop achieves 60 percent better throughput on the median link compared to TCP in our testbed, and less on good links. On good links TCP is really good. The median improvement is about 60 percent, which comes from the per-hop protocol.
This is largely attributable to reducing channel access and ARQ overhead and not doing rate control. There are no load effects here.
What's particularly noteworthy is the bottom quartile of links. The first quartile benefit is 28x. In the first slide I showed you like a 40x difference between UDP and TCP, and this is how Hop bridges most of that gap.
Yes?
>>: [inaudible].
>> Arun Venkataramani: Right. So that's a good question. To the extent
possible, we tried to pick the best variant of any protocol we were comparing
against. So thanks for bringing this up.
TCP here is also given the benefit of burst mode. So it's not like only Hop gets the benefit of using TXOPs. TCP also gets TXOPs. TCP's ARQ is turned up to the maximum possible limit. We set it to 100 on the card, but we noticed that the max only went up to 20 or so because of some implementation details in the firmware.
And we found that TCP consistently improves by using TXOPs, because they reduce channel access overhead, and by setting the ARQ limit to a high value. On a single link it helps. On a multi-hop route it helps simply because if a packet goes two or three hops and then gets dropped, it's more wasted work. So the more ARQ you give to TCP -- the better the links, the better the routes you present to TCP -- the better TCP performs. So it has a benefit.
Which variant of TCP? We tried using the best. I want to say this is CUBIC, but the details are in the paper.
>>: [inaudible].
>> Arun Venkataramani: We used Westwood. It actually performed about 10 percent worse, consistently, than the version used here. I'm not sure the version here is CUBIC, but I can look that up in a second.
Yeah. So there were different options available in Linux, and this is CUBIC, and CUBIC performed the best in our testbed.
And Westwood is really designed to give you a lot of benefit in the case where
you're going over a single-hop wireless link and then going to the internet. This
is not the environment. We're not worried about the internet here. So it doesn't
actually help us in this environment. We found it didn't help us.
Yes?
>>: [inaudible].
>> Arun Venkataramani: The traffic here is just a long transfer. We're measuring the throughput over a big file. So most of the experiments are throughput experiments unless I explicitly tell you that delay is the metric being quantified.
Okay. So the second experiment is again a single-flow experiment -- there are no load effects here -- but over multiple hops. We randomly picked source-destination pairs, and we only include source-destination pairs that are at least three hops away. The routes are chosen by running OLSR at the beginning of the experiment.
Here we see again that the first quartile and median benefits are significant, and more than in the single-hop case, but the lower quartile is not as impressive. This is basically because in this testbed we ran a routing protocol, OLSR. So unlike the previous experiment, which was a distribution over all working links in the testbed, this is using only the good links. So the benefit of Hop over TCP is lower.
Yes?
>>: How many hops and how many [inaudible] wireless?
>> Arun Venkataramani: At least three hops. And the number of pairs is, I believe, a hundred, with repetition. So what's being chosen is nodes that are at least three hops away, randomly.
Okay. So let's take a more detailed look into this robustness behavior. What exactly is happening in this lower quartile? Why does TCP degrade? What happens to grace under pressure for TCP and Hop?
So this is a very simple experiment where we modify the driver to drop a certain fraction of packets after 802.11 has happened -- above 802.11, below TCP.
This shows you the goodput as a function of the loss rate. Single link, no interference, no load. What this shows you is that at a 20 percent loss rate, TCP throughput goes down to zero. This is not surprising. This is expected. We all expect this to happen. The square root p relationship holds when the loss rates are small, but a 20 percent loss rate is disastrous for TCP.
Hop, interestingly, shows a near-linear degradation, which is about the best you can expect. I mean, the 1 minus p factor is about the best you can expect if a loss rate of p is what the channel gives you.
And so this is where most of the single-link or single-hop benefits of Hop are coming from: no rate control, removing unnecessary overhead.
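In standard notation, the two regimes being contrasted are roughly the following; the square-root law is the well-known Mathis et al. approximation, and the linear line is the ideal the measurement is being compared against:

```latex
% TCP throughput under random loss (Mathis et al. approximation):
T_{\mathrm{TCP}} \approx \frac{\mathrm{MSS}}{\mathrm{RTT}} \sqrt{\frac{3}{2p}}
% An ideal protocol with no end-to-end rate control degrades linearly
% with the channel's residual loss rate p, for link capacity C:
T_{\mathrm{ideal}} \approx (1 - p)\, C
% The square-root model is only meaningful for small p; it collapses
% well before p = 0.2, matching the measured near-zero TCP goodput.
```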
>>: This is a very strange place in the stack to induce loss, it seems, because you're not letting link layer recovery take over.
>> Arun Venkataramani: Correct. So the previous experiments actually showed you performance with link layer recovery. We also did some experiments where we tried to actually make links bad by causing interference using a jammer. I don't have those experiments right now. But this really brought out the transport-level point, which is: imagine a link that despite all of its optimizations gives you a certain loss rate. How does transport behavior depend on that loss rate?
>>: [inaudible] I've never seen like 10 percent loss rate at the transport
[inaudible].
>>: [inaudible].
>> Arun Venkataramani: Yes. That's for a single link. When you aggregate it over multiple links, over multiple paths, those losses are going to compound exponentially.
>>: So the jammer experiment, did it look like this or did it not look like this?
>> Arun Venkataramani: There's a lot of variance in the jammer experiment. That's one of the reasons why we didn't include it in the paper. It doesn't show you as clear-cut a difference, because it's hard to get the jammer to behave consistently across Hop and TCP. But Hop is significantly more robust than TCP with the jammer, yes.
>>: [inaudible].
>> Arun Venkataramani: Correct. The biggest numbers are, I guess, on the next slide. This is all about fairness. This is where Hop really shows its resilience over TCP, its performance and fairness improvement over TCP.
So let me tell you first how this experiment is conducted. This is a load experiment. We saw single flow single hop, then single flow multi hop, and this is multi flow multi hop: 30 flows over multi-hop paths.
And the CDF here is slightly different from the ones we saw before. This is a CDF of the 30 flows. So this is about the bottom seventh flow, that's the third flow, this is the 15th flow. It's plotting the distribution of flows, all run at the same time.
What this shows you is that the median flow in Hop gets a 20x improvement in throughput, and two orders of magnitude better throughput compared to TCP for the lower quartile.
The purple line here -- we included this line, as opposed to in the previous plots, because it performed better than TCP. This is the hop-by-hop version of TCP. It's more robust than TCP because instead of doing end-to-end rate control, it's stitching together a bunch of TCP connections. This is an idea that's been floating around for a while. It's a common optimization to make TCP rate control more robust.
We implemented this, and interestingly, this implementation also gets some benefit of backpressure, because the sockets at each hop are effectively doing flow control on each TCP connection. So it has some benefit of backpressure.
So it does give you improvement. It gives you about a 4x median improvement over TCP. Hop compared to TCP is actually 90x; Hop compared to hop-by-hop TCP is 20x. So it gives you improvement, but it's not quite enough. There's still a lot of room for improving fairness.
Why is this about fairness? Because it's a distribution of flow throughputs. So the question to ask is: what happens to the mean throughput or the aggregate throughput if some of the throughputs are [inaudible]?
Here it's very hard to beat TCP. This is where TCP's strengths are. I haven't seen a protocol that has been shown to beat TCP under heavy load. This is very difficult to do.
Hop gets a very slight improvement, 10 to 15 percent mean throughput improvement, which is not significant. And the reason is that TCP's rate control is very unfair. What it does is kill off most of the flows on bad paths and let a small number -- actually two or three flows out of the 30 -- go and blast at high speed. And so its aggregate throughput remains high, but it's very unfair. Many of the flows get near-zero throughput. This is a log scale, the x axis.
Questions?
So this is the centerpiece. This is where Hop shines, really. The biggest numbers come as fairness gains, not as mean or aggregate throughput gains under heavy load.
Yes?
>>: Aggregate retransmissions -- do you know if any analysis of that makes a difference?
>> Arun Venkataramani: Aggregate retransmissions?
>>: Just on a per -- if you added up all the devices [inaudible] how readily an
access -- is it showing the same kind of fairness? I'm just trying to understand if
-- [inaudible] I'll talk to you later.
>> Arun Venkataramani: Okay. So, the performance breakdown. Hop has five different components -- which component gives you how much of a benefit? I'll show you the microblock prioritization separately because it affects delay. But the main components in Hop are really ACK withholding, for hidden terminals, and backpressure, for congestion control.
What this experiment shows is the goodput for three different experiments: one with 10 concurrent flows, one with 20, and one with 30. ACK withholding and backpressure each give you about 20 percent, and they add up to a 40 percent improvement when the load is low.
So under low load, they're both important. When the load is really high, backpressure is the significant contributor. ACK withholding by itself gives you 2.5x. Backpressure by itself gives you 3.7x. Both together give you a 4.x improvement over the default Hop. This is not over TCP. This is over a default Hop with no congestion control and nothing for hidden terminals.
One thing I should mention here: we also did UDP, of course, and variants of UDP as well. It's not worth showing UDP on this graph. I mean, most flows get zero throughput because a congestion collapse happens.
So one lesson here is that when you show load experiments, even with three to five flows, UDP is not the protocol to compare against. If you want to compare throughput under load, TCP is the protocol to compare against, and it's very hard to beat TCP under load.
Okay. So under load, backpressure is the most important contributor.
The experiments so far were all for a mesh-like environment. I'll show you next the benefits in a typical WLAN environment. Here the experiment is that a bunch of senders are sending to a common AP. We compare Hop against TCP and against TCP with RTS/CTS turned on. There are seven senders here. Six pairs form hidden terminals in this example.
What you can see is that Hop's mean throughput is, again, not much more than TCP's mean throughput. TCP is unfair -- the TCP median is 244, Hop's median is 652 -- because TCP kills off most of the flows and lets one or two flows go ahead and blast.
TCP with RTS/CTS turned on reduces the mean throughput but makes it more fair, as expected.
What happens with hidden terminals -- actually, TCP has an interesting solution, if you will, for hidden terminals. It will just kill off some of the flows and very likely let one flow go ahead. This is TCP's solution for hidden terminals. So the throughput remains high, but the fairness is affected.
Low delays -- yes?
>>: I have a question on the previous slide [inaudible].
>> Arun Venkataramani: This experiment is just with one AP and seven senders.
>>: No, what I mean to say, if I were deploying [inaudible].
>> Arun Venkataramani: That's a good question.
>>: I think it's like --
>> Arun Venkataramani: There are a few on each floor in the building.
>>: [inaudible] done measurements off my deployed [inaudible] I find it really hard to actually come across hidden terminals, because the AP density tends to be high, they tend to be on different channels, and everybody's actually in some sense in the same contention domain in realistic deployments.
>> Arun Venkataramani: Right. So in our building with the default transmit power levels, it's about two to three hops of a diameter going from this end to that end, because there are walls in between and whatever other environmental effects come into the picture.
So hidden terminals -- we actually checked that six pairs form hidden terminals, meaning that when they send simultaneously, the throughput is really, really small --
>>: So what I meant to say is, if you actually do experiments with realistic [inaudible] you will not see such a high [inaudible] of hidden terminals, because [inaudible] that are far away are more likely to connect to different APs, and if they're close enough but not far enough, they're also likely to be on different channels.
>> Arun Venkataramani: I think we haven't experimented with those scenarios.
So channel assignment as well as multi AP settings we haven't experimented
with.
Okay. Small delays. So in this experiment there's one small transfer competing with four large transfers. The x axis here is the size of the small transfer, the y axis is the delay. The different bars are the delays. This is on a log scale.
The red line is Hop -- Hop consistently achieves lower delays for all sizes than TCP. So delay is not being hurt, and this is simply because of the microblock prioritization scheme: the sender just sends the data if it's less than 16 kilobytes, and the receiver prioritizes small blocks over large blocks.
Okay. So I'll skip a few of the other experiments.
Hop is designed for reliable transport. So we don't expect Hop to be used for VoIP traffic, but because Hop sends these bursts of packets, we did evaluate its impact on VoIP. It turns out that Hop affects VoIP quality slightly more than TCP does, but the effect is not significant.
And Hop's benefits -- these were all 802.11b experiments, but over 11g the benefits remain the same.
I will show you this experiment because it's relevant to the diverse networks
theme.
Hop is a hop-by-hop protocol. So if an end-to-end route is not available, Hop will continue to function. TCP will completely break down in a DTN environment.
This is an experiment where we have seven nodes. 2 and 4 cannot talk to each other without going through 3. Periodically we just bring down 3, and then 5, and then 3, and then 5, and so on. So the network is always partitioned -- 1 and 7 are always partitioned. So you need a disruption-tolerant protocol to accomplish a transfer from 1 to 7 in this case.
And Hop, with different levels of the backpressure threshold, gives you different levels of performance. Setting it to a higher value of buffering gives you better performance here, because these nodes are down for a longer time.
The comparison point is DTN, not TCP, here, because TCP doesn't work in this environment -- it's not designed for it. And DTN 2.5 is TCP-based: it uses TCP to transfer a file at each hop. So the benefits we see here are commensurate with the single-hop benefits we saw over TCP earlier.
What we need to do is look at high route-flux environments, not a simplistic experiment like this, and that's when we'll know if Hop really gives you significant benefits when a network is in a lot of flux.
What we did do is study network and link layer dynamics in the mesh testbed we have. So there's something interesting going on here. The red line again is Hop, and we turn on OLSR, the routing protocol. Instead of fixing static routes, we keep it on. So OLSR is working, recomputing, adapting to load all the time. So there are all these cross-layer interactions with the routing protocol going on now, as well as with bit rate control. The bit rate control algorithm is just the SampleRate algorithm that's the default in the [inaudible] drivers.
>>: What link metric is [inaudible] using?
>> Arun Venkataramani: ETX.
So interestingly, OLSR improves TCP fairness. We don't quite understand why this is happening, and this sort of brings out why analyzing cross-layer interactions is generally difficult. But we believe that OLSR periodically shuts off some flows -- some high-performing flows get shut off, so the other flows get to go forward. So this improves fairness.
Bit rate control slightly degrades performance. By turning on wireless bit rate control you do the search over bit rates and you end up choosing a lower bit rate. And this again shows you an example of an interaction between bit rate control and TCP rate control. So the mean of TCP is now lower. The two points on the left are with bit rate control turned on, the two points higher up are without bit rate control, with and without OLSR. And Hop is the red line, as before.
The benefits with all of these turned on are less: about 4x in the median and 13x for the lower quartile.
>>: But if your routes are changing and if you have a buffer [inaudible] changes,
you won't have the benefit of [inaudible].
>> Arun Venkataramani: You'll have the benefit only on the part of the route that's common. And the numbers you see here are inclusive of those effects.
So here, in b, bit rate control doesn't help much. In g, bit rate control helps, because g has a much wider range of bit rates to choose from. The purple line is TCP with bit rate control, this is default TCP, and that's Hop set to a fixed bit rate -- actually, the maximum bit rate.
So Hop doesn't get the benefit of bit rate control. It's just set to the maximum bit rate in this experiment. And the benefits are less compared to what we saw before, in the median and lower quartile.
The reason we didn't experiment with Hop with bit rate control turned on is that bit rate control depends on per-frame acknowledgments, and Hop doesn't use them. So we need to devise -- this is one of the next things to do -- a bit rate control scheme that can actually benefit from Hop's design.
And I will give you -- should I stop?
>> Ratul Mahajan: Let's say five more minutes or something.
>> Arun Venkataramani: Okay.
So one of the reasons -- let me skip over related work. There's nothing here to
say except all of the ideas are bread and butter ideas that people know about.
We just put them together into a package that performs significantly better.
So the reason I'm interested in this, like I said before, is to look at this extreme
point in the design space, which is DTNs, where there's a high level of mobility,
and then come back and design a network protocol stack that gives you high
performance and maintains the robustness of that performance across a variety of
environments. So WLANs and meshes at one end of the spectrum,
DTNs at the other end of the spectrum, and all points in between.
Why might you want something like this? I don't know. This is a thought
experiment where you have a town mesh network, let's call this town Amherst,
you have a network of busses, let's call this DieselNet, and then you have a
sensor network deployed along the river that's close to a bus stop, and the
busses have 802.11 and even 3G on them. This is along the Fort River. And
you may just want to hook all of these up together. One of the goals of the DieselNet
deployment and the town mesh deployment is to save on bills for monitoring town
equipment. So instead of using something like 3G, you can
save a lot of individual phone bills by using a network like this if you could just put
them all together.
But how do you put them all together if in this environment you have a whole
range of protocols and metrics -- ETX, ETT, EDR, WCETT, RTT [inaudible]? In many
environments there's geographic routing -- or just do a Wikipedia search and it will give you a
hundred different routing protocols. And do the same for DTNs. It will give you
tens of protocols for DTNs. How do you maintain this network?
Yes?
>>: [inaudible].
>> Arun Venkataramani: So the benefits that I showed for the single-hop
experiments, it's about a 2x to 2.5x benefit. That's the benefit you can expect over
WLANs. There are some optimizations that blocks allow for in terms of how to
do bit rate control better. That's a single-hop benefit. But the most important --
the biggest numbers come from multi-hop. So mesh-like environments,
which are seeing increasing levels of deployment -- it's anyone's guess whether this will
keep growing or not.
And DTNs -- I believe that from a research perspective, it's useful to look at this
extreme point in design space. And an example of why it's useful is Hop. I
mean, the design of Hop is -- Hop is a DTN protocol. It's a better DTN 2.5, if you
will. Except it also gives you significant performance improvements over WLANs
and meshes.
That's the reason to look at it from a research perspective even if DTNs never
see the light of day.
>>: [inaudible].
>>: [inaudible] I think the Amherst town mesh, a lot of people are using it, and it
is [inaudible].
>> Arun Venkataramani: To give you some idea of the numbers, there's 40
busses. If you had to monitor like 50 different pieces of town equipment and you
have a 3G card for each of them, you're paying 50 bucks a month. 50 bucks a
month for 40 to 50 devices adds up to about $25,000 to $30,000 a year. Setting up a
mesh network is a one-time expenditure of a few thousand dollars. So there are
applications for which even today there's value. How far they'll go is anyone's
guess. It depends on how the technologies evolve.
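The arithmetic behind those figures, as a quick sketch (the one-time mesh cost is only the rough "few thousand dollars" figure from above):

```python
# Back-of-the-envelope: per-device 3G subscriptions vs. a one-time mesh.
devices = 50                      # monitored pieces of town equipment
monthly_3g_per_device = 50        # dollars per device per month
annual_3g = devices * monthly_3g_per_device * 12
print(annual_3g)                  # 30000 dollars/year, recurring
                                  # (40 devices gives 24000/year)

mesh_one_time = 5000              # assumed "few thousand dollars"
print(mesh_one_time / (annual_3g / 12))  # pays back in ~2 months
```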
>>: But that comparison is not quite fair because [inaudible].
>>: Keep in mind most of the data is actually generated in the urban core -- the
argument is not that you do away with 3G, the argument is about reducing per-byte
costs.
>> Arun Venkataramani: To answer [inaudible] question, I will not argue in favor
of mesh networks or even single-hop WiFi access points over 3G. In fact, I don't
think that's the argument to make here. There is value to looking at how these
two can co-exist. There are problems with 3G. 3G is not the solution, and we're
looking at some of these problems too. And we find -- some of this work is in
collaboration with Ratul as well -- that WiFi can augment 3G in a
practical, useful way.
So I'll skip over some of the research and just realize this vision of a network
protocol stack that works over, across environments, and I will go to just bit rate
control, like I said.
So there is an interaction going on between transport-level rate control, like in
TCP, and bit rate control at a lower layer. And this is a negative interaction. As
we can see in the experiments, in 802.11b it hurts and in 802.11g it helps, and we don't know
what will happen in 802.11n. It seems like bit rate control will be useful as opposed to
not having bit rate control.
But this is the problem. Let's look at what bit rate control is ideally trying
to do. You're trying to find a modulation rate r that maximizes r * (1 - loss(r)),
where loss(r) is the loss rate achieved at rate r. That's the problem you're trying to solve.
But if you have ARQ in the system, whose overhead is significant, you want to
keep the loss rate low, because at high loss rates the ARQ overhead catches up. So
the space of rates you can explore shrinks.
If you have TCP on top of this, you really want to keep loss rates low. So the
space of rates you can explore shrinks further.
This is an example of an interaction. So what you could ideally do with bit rate
control is now constrained by the inefficiencies introduced by ARQ and TCP.
And so doing away with these two overheads, ARQ and transport rate
control, like blocks do, is an opportunity for doing better bit rate control,
which is work in progress. It can also help detect interference versus channel
losses. So here's a normal block.
Like you guys pointed out in the beginning, you can make these frames much
smaller and get all the benefits of having small frames, namely the exponential
reduction in the probability of frame corruption. So think of each fragment or
frame here as just something that's associated with a CRC or some kind of
error-detection mechanism.
Once you go to this, once you go to the smallest possible frame size, there's
no benefit to going to smaller frames. Now you have to pick the best
wireless bit rate that works for you, and that itself is a hard problem. It's the
well-known problem of distinguishing interference from channel loss: you want
to adapt the bit rate to channel losses, not to interference losses.
If you go down to small fragments inside a block, nothing in the protocols above
changes. You keep all the higher-level network layer and transport layer benefits,
and you gain these kinds of benefits as well. You can look at the pattern of
corruptions -- so the dark ones are the corrupted frames, the light ones are the
non-corrupted frames -- and you can distinguish between interference and channel
loss and use this in your bit rate selection algorithm. All these kinds of things come
for free in Hop.
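One simple heuristic along these lines, as a sketch -- this is my illustration, not the algorithm from the talk: a colliding transmission tends to wipe out a run of consecutive fragments, while channel noise corrupts fragments more uniformly across the block.

```python
# Sketch: classify a block's per-fragment corruption pattern.
# 'corrupted' is one boolean per CRC-checked fragment in the block.

def longest_burst(corrupted):
    best = run = 0
    for bad in corrupted:
        run = run + 1 if bad else 0
        best = max(best, run)
    return best

def classify(corrupted, burst_threshold=4):
    # Heuristic: a long run of corrupted fragments suggests a
    # colliding transmission (interference); isolated corruptions
    # scattered across the block suggest channel noise.
    if not any(corrupted):
        return "clean"
    if longest_burst(corrupted) >= burst_threshold:
        return "interference"   # don't drop the bit rate for this
    return "channel loss"       # dropping the bit rate may help

print(classify([0, 0, 1, 1, 1, 1, 1, 0]))  # interference
print(classify([0, 1, 0, 0, 1, 0, 1, 0]))  # channel loss
```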
Not new ideas, again, but when put inside this package of blocks, they fit well
into a stack that can give you benefits in single-hop networks as well as multi-hop
mesh, DTN, and MANET-like environments.
So I'll conclude with that. I put this up because a bunch of people here are working
on white space stuff.
Hop source code is available at hop.cs.umass.edu. I welcome you to use it.
Please download it. Any transport protocol or any transport optimizations you come
up with, compare them against this. We're trying to make this as usable as we can.
>> Ratul Mahajan: Thank you.
[applause]