PPSP Protocol Considerations
and Tracker Protocol
draft-gu-ppsp-tracker-protocol-01
Y. J. Gu, David A. Bryan, Y. Zhang, H. Liao
IETF-78 Maastricht, PPSP Session
My Thoughts…
• Trying to share a picture of what a PPSP
deployment might look like
– When possible, want to reuse protocols. Where
can we and what?
• Given an architecture, what does the tracker
protocol for that look like?
– Short overview of the protocol details of
BitTorrent
– These are our own (very early!) thoughts, so
may be wrong, but hope to stimulate discussion
1
What are we trying to do?
• There are two basic protocols (or protocol
operations)
– Tracker protocol
• How to find other peers that share the information
• Focus of this talk, but need to discuss peer protocol a bit to make
sense…
– Peer protocol
• How to get information from those peers once you have found
them
• But we seem to be looking at two different tasks
– Offline/timeshifted media (essentially file sharing)
– Streaming/realtime media
2
Tracker Protocol
• Need some way to locate the peers that are sharing same
content
• Don’t have a direct protocol to reuse from IETF
• Basic idea is that tracker functionality is on a single server,
but could be distributed (DHT)
– Note that this is essentially a client-server protocol. Could distribute
as a DHT underneath, but the tracker-peer operation is basically C/S
– I’ve heard proposal to use RELOAD, this works if the tracker is made
up of a distributed set of peers
– Anaheim meeting indicated interest in tracker as a server, so it seems
RELOAD’s only application here is possibly as an implementation
detail underneath or as an alternate distributed implementation
3
BitTorrent as a Model?
• Approach seems well suited to offline use case
• May not have all the information we want/need for
streaming
– Need to find peers “nearby” in the stream
– Should tracker attempt to do this?
– If not, many peers may need to be contacted to find one
in “right place” (depending on window size, pause, etc)
• Possible issues
– Security (or lack of)
– HTTP approach is somewhat heavy (but easy)
– Do we want to incorporate metadata into tracker (not
offline in torrent file)?
• Need to specify syntax for these metadata files
4
Some BT Basics
• BT’s primary purpose/design is for file
sharing (not originally designed for live
streaming)
• Peers that share a particular file cluster
together to exchange portions of it, forming a
swarm
5
BitTorrent Entities
• Peers: Hosts that hold some portion of the file shared in the swarm are
peers. Peers exchange blocks, and a set of blocks makes
up a piece of the file
– A Seed is a peer with the entire file
• Tracker: A central server that stores a mapping between a
swarm and the peers participating in that swarm
– Tracker doesn’t store which peers have which pieces, just list of the
peers
– Tracker is located offline (its address comes from the torrent file)…
6
File and Metadata
• Original person sharing the file splits it up
into pieces, and performs a SHA-1 hash on
each piece
• The list of pieces and their hashes, and the
location of a tracker that will serve peers
sharing this file are placed into a metadata
file called a torrent
• Original user places torrent on a web server,
and subscribes to tracker with all chunks, as
the initial seed
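The splitting and hashing step can be sketched as follows. This is a minimal illustration, not BitTorrent's actual file format: the dictionary layout and the names `piece_hashes` and `build_metadata` are assumptions made for this example (a real torrent is bencoded). The default digest is SHA-1, which is what BitTorrent's metainfo uses for pieces; the `algorithm` parameter is there only so other digests can be tried.

```python
import hashlib


def piece_hashes(data, piece_length, algorithm="sha1"):
    """Split data into fixed-size pieces and return one hex digest per piece."""
    if piece_length <= 0:
        raise ValueError("piece_length must be positive")
    return [
        hashlib.new(algorithm, data[offset:offset + piece_length]).hexdigest()
        for offset in range(0, len(data), piece_length)
    ]


def build_metadata(data, piece_length, tracker_url):
    """Collect the tracker location and per-piece hashes (hypothetical layout)."""
    return {
        "tracker": tracker_url,
        "piece_length": piece_length,
        "length": len(data),
        "pieces": piece_hashes(data, piece_length),
    }
```

A peer that later downloads a piece recomputes the digest and compares it with the entry in the metadata before accepting the piece.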
7
Example Startup
1) File is chopped up into chunks (Chunk1, Chunk2, Chunk3, …), SHA-1 hash generated for each
2) Torrent file lists chunks, hashes, and tracker to use
for swarm
3) Torrent file is stored on web server
4) Peer connects to Tracker with entire file
as seed
[Figure: file chunks, torrent file, web server, seeding peer, and Tracker]
8
Example Join Swarm
1) Peer connects to web site and obtains
torrent file to locate tracker
2) Peer connects to Tracker to find other
peers
3) Peer connects to other peers in swarm
and exchanges chunks
[Figure: joining peer, web server holding the torrent file, Tracker, and peers already in the swarm]
9
Peer Exchange
• Peers exchange blocks or chunks
– Swap smaller bits than described in metafile, SHA-1 verify
assembled chunks
• Simple gossip protocol
– Generally unstructured, not a DHT
– Connect to peers that may have desired chunks
– Exchange bitfield msgs indicating which chunks each peer has
– Request msg asks for a chunk, which is returned in a piece msg
– Once a piece is downloaded, it is advertised with a have msg
– Interested/not interested and choked/not choked are
used in flow control
– keep-alive message
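A rough sketch of the bitfield exchange: each peer packs the set of pieces it holds into a bit string, and the receiver compares it with its own to decide what to request. The function names are made up for this example; the bit order (piece 0 is the high bit of the first byte) follows the BitTorrent wire format.

```python
def encode_bitfield(have, num_pieces):
    """Pack piece indices into bytes; piece 0 is the high bit of byte 0."""
    field = bytearray((num_pieces + 7) // 8)
    for index in have:
        if not 0 <= index < num_pieces:
            raise ValueError(f"piece index {index} out of range")
        field[index // 8] |= 0x80 >> (index % 8)
    return bytes(field)


def decode_bitfield(field, num_pieces):
    """Return the set of piece indices whose bit is set."""
    return {
        index
        for index in range(num_pieces)
        if field[index // 8] & (0x80 >> (index % 8))
    }


def wanted_pieces(ours, theirs, num_pieces):
    """Pieces the remote peer has that we still lack."""
    return decode_bitfield(theirs, num_pieces) - decode_bitfield(ours, num_pieces)
```

After the initial bitfield, a have msg only needs to carry one piece index, so the receiver can set a single bit rather than re-reading the whole field.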
10
BitTorrent Protocol Details
• Regular HTTP is used to obtain the torrent
from a web server
• The tracker protocol is also HTTP, essentially
GET to ask for a list of peers/join swarm
• The peer protocol is a TCP wire protocol
(binary)
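As a sketch of what the tracker GET looks like, the helper below builds an announce URL without sending it. The query keys are those of the BitTorrent specification; the function name, the tracker URL, and the example values are placeholders chosen for illustration.

```python
from urllib.parse import urlencode


def build_announce_url(tracker_url, info_hash, peer_id, port, left):
    """Build a BitTorrent-style tracker GET URL (no network I/O)."""
    if len(info_hash) != 20 or len(peer_id) != 20:
        raise ValueError("info_hash and peer_id must be 20 bytes each")
    query = urlencode(
        {
            "info_hash": info_hash,  # raw bytes are percent-encoded
            "peer_id": peer_id,
            "port": port,
            "uploaded": 0,
            "downloaded": 0,
            "left": left,            # bytes still needed; 0 for a seed
        }
    )
    return f"{tracker_url}?{query}"
```

The same request both joins the swarm and returns the peer list, which is one reason the HTTP approach is easy but somewhat heavy.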
11
Our (Strawman) Proposal
• New -01 version coming soon
– Yingjie Gu, David Bryan, Yunfei Zhang, Hongluan Liao
• Currently propose binary protocol (but open)
– Light weight, aesthetic considerations
– Could also use HTTP with XML or something similar
• Messages to connect to a tracker/disconnect
– Credential verification, verify peer-ID later used by peer protocol
– Credential issuance/peer-ID assignment not (necessarily) by tracker
• Messages to join/leave a swarm (and get list of peers)
– Currently can store location in stream/get peers at this location…may
be hard to implement
• Diagnostics between peers and tracker, keep-alive
messages, query list of swarms from tracker
• Will describe in detail in later talk
12
Peer Protocol Considerations
• New transport is out of scope.
– Offline and Streaming scenarios
• Need to reuse existing protocols
• SRTP/RTP for streaming. ??? For offline
• Should the work of LEDBAT be leveraged here?
• Lightweight gossip protocol between peers
– Typical for BT is 20-50 peers, in an unstructured
way
– Is RELOAD suited for this, or will we need
something lighter?
• Try a RELOAD usage and find out?
13
What might this look like?
Offline/Time Shifted Scenario
[Figure: tracker (T) and peers (P), with three labeled links]
• Tracker to peer: New peer protocol (BT based?) to find peers, get metadata (not
specific chunks)
• Peer to peer: Existing transport protocol to obtain chunks from peer (leverage
LEDBAT?)
• Peer to peer: Lightweight gossip protocol between peers to find chunks
(RELOAD usage or new?)
14
What might this look like?
Streaming Scenario
[Figure: tracker (T) and peers (P), with three labeled links]
• Tracker to peer: New peer protocol (BT based?) to find peers,
get metadata, where in stream peer is?
• Peer to peer: Existing transport protocol to stream the
data (RTP/SRTP?)
• Peer to peer: Lightweight gossip protocol between peers to start/stop
stream to other peers
15
Protocol Reuse?
• May be many places where we can reuse
existing protocols, but in some cases, using
for things we haven’t done before
– LEDBAT?
– RELOAD for lightweight/gossip protocol (not
DHT)?
– New protocol for Tracker or HTTP with XML
bodies or something similar?
16
Tracker Protocol Proposal
• New -01 version released (quite a few changes)
– Yingjie Gu, David A. Bryan, Yunfei Zhang, Hongluan Liao
• Key changes:
– Changed names of several messages
• Name and semantic meaning not aligning
– Added XML/HTTP Encoding
• Authors don’t view encoding as that important right now -- set of messages and
semantic meaning is what is critical
• Encodings are proof of concept
• This is a basic overview, not a detailed description (can read draft
shortly when we iterate for all details)
• Still very early work -- primary focus is exploring problems through
design/early implementation
• A number of hard questions are being left open for WG input, and I’ll talk
about those today
• This is by no means complete right now -- lots of work left to do!!!
• Authors are very interested in suggestions
17
Messages
• CONNECT/DISCONNECT
– Associate with server, verify credentials
– DISCONNECT removes from all swarms, leaves system
• JOIN/LEAVE
– Participate in a particular swarm (streaming or file for
VoD)
– Possibly a JOIN_CHUNK to allow for specifying where in
a live stream to join (but big can of worms…)
• FIND
– Given a swarm, locate a number of peers
– Leaving room to specify criteria (where in stream (if we
allow JOIN_CHUNK), capacity, possibly certain layers)
– Quality (ALTO in the tracker, or do peers do ALTO?)
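The semantics of these messages can be sketched as an in-memory tracker. This is only an illustration under stated assumptions: the class and method names are invented, credential checking is omitted, and FIND here returns a random sample because the slides leave the selection criteria open.

```python
import random


class TrackerError(Exception):
    """Raised when a request arrives from a peer that has not connected."""


class Tracker:
    def __init__(self):
        self.connected = set()  # peer IDs that completed CONNECT
        self.swarms = {}        # swarm ID -> set of peer IDs

    def connect(self, peer_id):
        # Credential verification would happen here; omitted in this sketch.
        self.connected.add(peer_id)

    def disconnect(self, peer_id):
        # DISCONNECT removes the peer from all swarms and from the system.
        for members in self.swarms.values():
            members.discard(peer_id)
        self.connected.discard(peer_id)

    def join(self, peer_id, swarm_id):
        self._require_connected(peer_id)
        self.swarms.setdefault(swarm_id, set()).add(peer_id)

    def leave(self, peer_id, swarm_id):
        self.swarms.get(swarm_id, set()).discard(peer_id)

    def find(self, peer_id, swarm_id, count):
        # Return up to `count` other peers in the swarm, chosen at random.
        self._require_connected(peer_id)
        candidates = sorted(self.swarms.get(swarm_id, set()) - {peer_id})
        return random.sample(candidates, min(count, len(candidates)))

    def _require_connected(self, peer_id):
        if peer_id not in self.connected:
            raise TrackerError("peer must CONNECT first")
```

Criteria such as position in stream or capacity would become extra arguments to `find`, which is where a JOIN_CHUNK-style extension would also need per-peer state.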
18
Messages, Continued
• STATs messages
– STAT_QUERY/STAT_REPORT
– Send info to tracker / query about other peers
– Tracker can also poll peers
• KEEPALIVE
– Limit on live time for peers to tracker, so if no requests in
a certain time, refresh connections
– Another option is to expire either CONNECT or JOIN and
require a subsequent call…
• QUERY (do we need this?)
– Search for swarms/list swarms (Tracker protocol or
should this be something else?)
– Not currently in the draft
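One way to picture the KEEPALIVE limit is a table of last-seen times that the tracker sweeps periodically. The sketch below is an assumption about how that could work, not something the draft specifies; `LivenessTable` and its methods are hypothetical names, and the clock value is passed in so the logic stays deterministic.

```python
class LivenessTable:
    """Track when each peer was last heard from and expire silent peers."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.last_seen = {}  # peer ID -> time of last request or KEEPALIVE

    def touch(self, peer_id, now):
        # Any request (or an explicit KEEPALIVE) refreshes the peer.
        self.last_seen[peer_id] = now

    def expire(self, now):
        # Remove and return peers that have been silent longer than the TTL.
        stale = [p for p, seen in self.last_seen.items() if now - seen > self.ttl]
        for peer_id in stale:
            del self.last_seen[peer_id]
        return stale
```

The alternative on the slide, expiring CONNECT or JOIN and requiring a new call, amounts to the same table with the refresh done by those messages instead of a dedicated KEEPALIVE.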
19
Open Issues/Considerations
• Binary vs. text encoding
– Transport/security mechanism if not HTTP
• Need to define format for metadata
describing the file
• Peer-IDs used in many messages, but
assignment is offline. Do we want a version
of connect that issues IDs?
• Response of list of peers depends on peer
protocol used -- IP address vs. Peer-ID only
• NAT traversal needs to be considered
20
Metafile
• Differs slightly for streaming/VoD
• VoD:
– Needs to describe chunk format, number of
chunks, break down (sizing), support for layered
encodings, codec
– MD5 sums of blocks (or collections of blocks)
• Streaming
– Codec, chunk size, but likely some (number of
chunks, MD5 sums don’t make sense)
• It is very important to get this right, but hard!
21
Impact of Peer Prot. on Tracker Prot.
• If we use a DHT style peer protocol, with
lookup, then at CONNECT time, the peers
need to insert into the overlay
– Tracker then only needs return peer-ID (after
bootstrap to locate peer when connecting)
• If we use an unstructured/gossip protocol, not
clear this is ideal
– Random connection to 20 peers in a system of
millions likely means you need to provide an
address
– Or, just use a DHT for routing, nothing else?
22
NAT Traversal
• Easy for managed systems -- server is
placed in a reachable location.
• Issue: unmanaged systems:
– Can’t guarantee tracker is in public address space
• Bigger issue: distributed tracker
– Peers may very well be behind NATs
– Fully distributed RELOAD may solve some of
this
– If just a few “super-peers”, how do we decide
promotion (NAT detection is provably
difficult/impossible…)
23
Conclusion
• This is early work -- much still to do
• Decoupling the design of the peer and
tracker protocols may help with design, but
some aspects are intertwined (for example
peer list structure)
• Very much need advice drawn from existing
implementations
– Biggest question: Do we have the right
commands?
24
PPSP Peer Protocol
draft-gu-ppsp-peer-protocol-00
Y.J. Gu, David A. Bryan
IETF-78 Maastricht, PPSP Session
Overview
• Draft covers:
– Requirements for Peer Protocol
– Simple example of a possible flow
– Discussion of open issues
– Skeleton (currently empty) for new protocol
• Define binary and text strawman proposals such as in tracker
protocol?
• Very early work (less developed than tracker
protocol, but we will be extending/expanding)
– Conversations with some vendors with deployment
experience – including some since -00
• Main emphasis currently around requirements and
open issues
26
Peer Protocol Requirements
• Location/Connection Requirements
– Req 1: Once a client has a peerlist it SHOULD
be able to locate peers and connect to them with
no or minimal Tracker's assistance.
– Req 2: The Peer Protocol MUST provide a
mechanism for peers behind different
NATs/Firewalls to connect with each other.
27
Peer Protocol Requirements
• Information Exchange Requirements
– Req 3: The Peer Protocol SHOULD enable
peers to request/return/exchange peerlists.
– Req 4: The Peer Protocol SHOULD enable
peers to request/return/exchange data availability,
e.g. bitmap of chunks.
– Req 5: The Peer Protocol MUST be able to carry
different data structures for different applications.
28
Peer Protocol Requirements
• Transportation/Negotiation Requirements
– Req 6: The Peer Protocol MUST be able to
negotiate a transportation protocol that both
peers can support.
• Security Requirements
– Req 7: The Peer Protocol MUST guarantee
peers' privacy.
– Req 8: Peers SHOULD be able to verify the
identity of remote peers.
29
Protocol Overview
• Currently looking at peer protocol in two logical parts
– Location Portion
• Locate and connect with Remote Peers
– Signaling Portion
• Get additional peerlist (optional?)
• Exchange data availability (e.g. bitmap)
• Negotiate transport mechanism (e.g. protocol, port…)
– Could be two protocols…
– Actual transfer is yet another aspect, but may be out of scope here
(beyond negotiation using signaling portion)
30
Location Protocol
• Ties in intimately with tracker protocol
• The decision of identity type included in the
peerlists in the tracker protocol will influence
the choice of peer location mechanism
– IP address, Peer-ID w/DHT, other?
• Effectiveness/decision on these options will
influence the design of the tracker protocol.
31
Candidate Protocols
• Candidate protocols include RELOAD, SIP, a new protocol,
combination…
– RELOAD
• In theory, works for unstructured, but untested.
• Could use structured simply to connect peers/create connections?
– SIP
• Potentially heavier than RELOAD.
• No direct mechanism to support location by Peer-ID?
– RELOAD w/SIP
• Use RELOAD to locate/connect, SIP to negotiate?
– New protocol
• Single new protocol for location and negotiation? New for only some
functions with reuse for others?
32
NATs and Choice of Identifier
• Big issue in identifiers: NAT
• In current network, most peers are behind NAT.
• If IP addresses are used, ICE/STUN/TURN server functions
may need to be deployed in PPSP system:
– Tracker: Natural server/relay location, but will increase the burden on
tracker and need long-term connection between peers and tracker.
– 3rd party server?
• If Peer-IDs are used (particularly reusing RELOAD) can
potentially leverage ICE support, add tracker as a
relay/STUN server when needed?
33
Conclusions
• One of the biggest questions – and one that
has strong bearing on the tracker protocol is
the use of IP addresses or Peer-IDs (and a
mechanism to locate them)
• NAT traversal is critical
• Need to identify set of operations, determine
if reuse is possible
• Perhaps decide on identifier issue, then work
on tracker protocol first?
34
Acknowledgements
• We would like to acknowledge the following
for providing advice/suggestions/questions:
– Roni Even, Yunfei Zhang, Hongluan Liao, Ning
Zong, Daniel De Vera, Matias Barrios
35
Backup Slides
HTTP/XML Approach
• HTTP POST method used
– May not be best approach
• Bodies are encoded in XML
• Tags defined in draft
37
Encoding, Transport for Binary
• Currently, we have proposed a binary protocol
– Light weight, low bandwidth for mobile devices
– Basic ideas would apply to other encodings
• Right now, the transport is left unspecified -- looking for
feedback from group
– Since this is essentially a client-server like operation, preference is
on a persistent secure connection approach (TLS/DTLS)
• May want a different approach if we do a distributed tracker
– Is there a good reason to use DTLS (UDP)?
• Fragmentation mechanism borrowed from RELOAD
(first/last bits, offset)
38
Basic Shared Header
[Figure: header layout drawn 32 bits (bit positions 0-31) per row, fields in order]
• PPSP Tracker Protocol Token (one full row)
• Version | Reserved | Method (share one row)
• Transaction ID (spans two rows)
• Fragmentation (one full row)
• Message Length (one full row)
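A sketch of packing and parsing this header with Python's `struct` module. The slide gives the field order but not every width, so the widths here are assumptions: 32-bit token, 8-bit version, 8-bit reserved, 16-bit method, 64-bit transaction ID, 32-bit fragmentation, 32-bit message length. The token value and function names are likewise placeholders.

```python
import struct

# Assumed layout (network byte order); only the field order comes from the slide.
HEADER_FORMAT = "!IBBHQII"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)

HYPOTHETICAL_TOKEN = 0x50505350  # ASCII "PPSP"; not taken from the draft


def pack_header(version, method, transaction_id, fragmentation, length):
    """Serialize the shared header; the reserved field is sent as zero."""
    return struct.pack(
        HEADER_FORMAT,
        HYPOTHETICAL_TOKEN,
        version,
        0,
        method,
        transaction_id,
        fragmentation,
        length,
    )


def unpack_header(data):
    """Parse the shared header from the front of a message."""
    token, version, _reserved, method, txid, frag, length = struct.unpack(
        HEADER_FORMAT, data[:HEADER_SIZE]
    )
    if token != HYPOTHETICAL_TOKEN:
        raise ValueError("not a PPSP tracker message")
    return {
        "version": version,
        "method": method,
        "transaction_id": txid,
        "fragmentation": frag,
        "length": length,
    }
```

With these assumed widths the header is 24 bytes; a receiver would read that much first, then use the message length to read the method-specific body.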
39
Example Specific Request (JOIN)
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                     PeerID (160 bits...)                      |
                               ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                     SwarmID (128 bits...)                     |
                               ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Expiration Time                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
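The JOIN body can be sketched the same way. The 160-bit PeerID and 128-bit SwarmID come from the slide; treating Expiration Time as a 32-bit count of seconds is an assumption, as are the function names.

```python
import struct

# 20-byte PeerID, 16-byte SwarmID, 32-bit expiration (assumed to be seconds).
JOIN_FORMAT = "!20s16sI"
JOIN_SIZE = struct.calcsize(JOIN_FORMAT)


def pack_join(peer_id, swarm_id, expiration):
    """Serialize the JOIN request body that follows the shared header."""
    if len(peer_id) != 20:
        raise ValueError("PeerID must be 160 bits (20 bytes)")
    if len(swarm_id) != 16:
        raise ValueError("SwarmID must be 128 bits (16 bytes)")
    return struct.pack(JOIN_FORMAT, peer_id, swarm_id, expiration)


def unpack_join(body):
    """Parse a JOIN body into (peer_id, swarm_id, expiration)."""
    return struct.unpack(JOIN_FORMAT, body[:JOIN_SIZE])
```

The explicit length checks matter because `struct` would otherwise silently pad or truncate the byte strings to fit the format.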
40
Messages/Responses
• Transaction ID used for
correlation/retransmission
• Responses are numeric codes, optionally
with bodies (for example when requesting a
list of peers)
• Currently a pure request/response protocol
– No need for anything else so far
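A minimal sketch of how a client might use the Transaction ID to match responses to requests in a pure request/response exchange. The class name, the sequential ID allocation, and the numeric code in the usage are illustrative assumptions; the draft may allocate IDs differently.

```python
import itertools


class PendingRequests:
    """Match numeric responses to outstanding requests by transaction ID."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self._pending = {}  # transaction ID -> method name

    def send(self, method):
        # Allocate a fresh ID; a retransmission would reuse the same ID.
        txid = next(self._next_id)
        self._pending[txid] = method
        return txid

    def on_response(self, txid, code):
        # Return (method, code) for a known request, None for a duplicate
        # or unknown response so it can be ignored.
        method = self._pending.pop(txid, None)
        if method is None:
            return None
        return (method, code)
```

Because a retransmitted request keeps its ID, a late duplicate response is dropped by the `None` branch rather than being applied twice.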
41
BitTorrent References
• Official Protocol Specification (very limited!)
is at http://bittorrent.org/beps/bep_0003.html
• Unofficial Specifications (much more
detailed) at theory.org:
http://wiki.theory.org/BitTorrentSpecification
42