Protocols

SPL/2010
Application Level Protocol Design
● atomic units used by the protocol: "messages"
● encoding
● a reusable, protocol-independent TCP server
● LinePrinting protocol implementation
Protocol Definition
● set of rules governing the communication details between two parties (processes)
● different forms and levels:
  ● protocols for exchanging bits across a wire
  ● protocols governing the administration of supercomputers
  ● application-level protocols: define the interaction between computer applications
Protocol Communication Rules
● syntax: how do we phrase the information we exchange.
● semantics: what actions/responses follow from the information received.
● synchronization: whose turn it is to speak (given the semantics defined above).
Protocols Skeleton
● all protocols follow a simple skeleton.
● parties exchange information using messages, which define the syntax.
● the difference between protocols: the syntax used for messages, and the semantics of the protocol.
Protocol Initialization (hand-shake)
● communication begins when one party sends an initiation message to the other party.
● synchronization: each party sends one message in a round-robin fashion.
TCP 3-Way Handshake
● used to establish TCP socket connections (tear-down uses a similar exchange of FIN and ACK packets)
● computers attempting to communicate can negotiate the network TCP socket connection
● both ends can initiate and negotiate separate TCP socket connections at the same time
TCP 3-Way Handshake (SYN, SYN-ACK, ACK)
● A sends a SYNchronize packet to B
● B receives A's SYN
● B sends a SYNchronize-ACKnowledgement
● A receives B's SYN-ACK
● A sends ACKnowledge
● B receives ACK.
● TCP socket connection is ESTABLISHED.
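An application never sends SYN or ACK packets itself: the operating system performs the exchange when a socket connects. The sketch below shows this from the application's point of view, assuming both parties run on the loopback interface; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class HandshakeSketch {
    // Returns true when both A's and B's sockets report an established connection.
    static boolean connectOverLoopback() {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // B listens; port 0 asks the operating system for any free port.
        try (ServerSocket listener = new ServerSocket(0, 1, loopback);
             // A connects: the constructor returns once the OS has completed
             // the SYN / SYN-ACK / ACK exchange on A's side.
             Socket a = new Socket(loopback, listener.getLocalPort());
             // accept() hands B the connection the OS has already established.
             Socket b = listener.accept()) {
            return a.isConnected() && b.isConnected();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling `HandshakeSketch.connectOverLoopback()` opens and closes one connection; none of the three packets is visible in the code, only their result.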
HTTP (Hyper Text Transfer Protocol)
● protocol for exchanging special text files over the network.
● brief (not complete) protocol description:
  ● synchronization: the client initiates the connection, sends a single request, and receives a reply from the server.
  ● syntax: text based, see RFC 2616.
  ● semantics: the server either sends the client the page it asked for, or returns an error.
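To make the syntax and semantics parts concrete, here is a minimal sketch that builds the text of a GET request and reads the status code from the first line of a reply. It performs no network I/O, and the class and method names are hypothetical.

```java
class HttpSketch {
    // Syntax: a request line, header lines, then an empty line; every line ends with CRLF.
    static String buildGet(String host, String path) {
        return "GET " + path + " HTTP/1.1\r\n"
             + "Host: " + host + "\r\n"
             + "Connection: close\r\n"
             + "\r\n";
    }

    // Semantics: the status code on the first reply line tells the client
    // whether the page follows (e.g. 200) or an error occurred (e.g. 404).
    static int statusCode(String statusLine) {
        String[] parts = statusLine.split(" ", 3);
        return Integer.parseInt(parts[1]);
    }
}
```

For instance, `HttpSketch.buildGet("example.org", "/index.html")` starts with the request line `GET /index.html HTTP/1.1`, and `HttpSketch.statusCode("HTTP/1.1 404 Not Found")` yields 404. The host and path here are placeholders.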
What next?
● next: the syntax and semantics aspects of protocols.
● assume synchronization works in round robin, i.e., each party sends one message at a time.
Message Format
● Protocol syntax: a message is the atomic unit of data exchanged throughout the protocol.
● message = letter
● we concentrate on the delivery mechanism.
Framing
● streaming protocols: TCP
● we must separate the different messages:
  ● all messages are sent on the same stream, one after the other,
  ● the receiver should distinguish between different messages.
  ● Solution: message framing, i.e. taking the content of the message and encapsulating it in a frame (letter - envelope).
Framing – what is it good for?
● sender and receiver agree on the framing method beforehand
● framing is part of the message format/protocol
● it enables the receiver to discover where a message starts and ends in a stream of bytes
Framing – how?
● Simple framing protocol for strings:
  ● a special FRAMING character (e.g., a line break).
  ● each message is framed by two FRAMING characters, at its beginning and end.
  ● the message must not contain a FRAMING character
● framing protocol by adding a special tag at start and end.
  ● a message can be framed using <begin> / <end> strings.
  ● avoid having <begin> / <end> in the message body.
Framing – how?
● framing protocol by employing a variable-length message format
  ● a special tag marks the start of a frame
  ● the message contains information on the message's length
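A minimal sketch of such a variable-length format, assuming a one-byte start tag followed by a 4-byte length and then the body. The tag value 0x7E, the field sizes, and all names are illustrative assumptions, not taken from the slides.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class LengthPrefixedFraming {
    static final byte START_TAG = 0x7E; // hypothetical tag value

    static void writeFrame(DataOutputStream out, byte[] body) throws IOException {
        out.writeByte(START_TAG);   // tag marks the start of a frame
        out.writeInt(body.length);  // 4-byte big-endian length
        out.write(body);            // body may contain any bytes, even START_TAG
    }

    static byte[] readFrame(DataInputStream in) throws IOException {
        if (in.readByte() != START_TAG) {
            throw new IOException("stream is not positioned at a frame start");
        }
        int length = in.readInt();
        if (length < 0) {
            throw new IOException("negative frame length");
        }
        byte[] body = new byte[length];
        in.readFully(body);         // blocks until the whole body has arrived
        return body;
    }

    // Frames every body into one byte stream, then reads the frames back.
    static List<byte[]> roundTrip(byte[]... bodies) {
        try {
            ByteArrayOutputStream wire = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(wire);
            for (byte[] body : bodies) {
                writeFrame(out, body);
            }
            out.flush();
            DataInputStream in =
                new DataInputStream(new ByteArrayInputStream(wire.toByteArray()));
            List<byte[]> received = new ArrayList<>();
            for (int i = 0; i < bodies.length; i++) {
                received.add(readFrame(in));
            }
            return received;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the receiver knows the length in advance, it never scans the body for a delimiter, so the body needs no escaping.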
Textual data
● Many protocols exchange data in textual form
  ● strings of characters, in a character encoding (e.g., UTF-8)
  ● very easy to document/debug: print the messages
● Limitation: difficult to send non-textual data.
  ● how do we send a picture? video? audio file?
Binary Data
● non-textual data is called binary data.
● all data is eventually encoded in "binary" format, as a sequence of bits
● "binary data" = data that cannot be encoded as a readable string of characters?
Binary Data
● Sending binary data in raw binary format in a stream protocol is dangerous.
  ● it may contain any byte sequence, which may corrupt the framing protocol.
  ● one remedy: devising a variable-length message format.
Base64 Encoding Binary Data
● encode binary data using an encoding algorithm
● Base64 encoding: encodes binary data into a string
  ● converts every 3-byte sequence of the binary data into 4 ASCII characters.
  ● used by many "standard" protocols (e.g., email, to encode file attachments of any type of data).
Encoding using Poco
● In C++, the Poco library includes a module for encoding/decoding byte arrays into/from Base64-encoded ASCII data.
● the functionality is modeled as a stream "filter"
  ● performs encode/decode on all data flowing through the stream
  ● classes Base64Encoder / Base64Decoder.
Encoding in Java
● iharder library.
● modeled as stream filters (wrappers around Java Input/Output streams).
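The same stream-filter idea can be sketched with the `java.util.Base64` class that ships with the JDK since Java 8. This is a substitute for the iharder library named above, not that library's API; the class and method names below are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class Base64FilterSketch {
    // Encodes raw bytes by writing them through a Base64 filter stream.
    static String encode(byte[] raw) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // wrap() returns a filter: bytes written to it reach 'sink' Base64-encoded.
        // Closing the filter flushes the last (possibly padded) group of characters.
        try (OutputStream filter = Base64.getEncoder().wrap(sink)) {
            filter.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(sink.toByteArray(), StandardCharsets.US_ASCII);
    }

    // Decodes Base64 text back into the original bytes.
    static byte[] decode(String text) {
        return Base64.getDecoder().decode(text);
    }
}
```

Encoding 3 arbitrary bytes this way produces 4 ASCII characters, which is where the size overhead of Base64 comes from.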
Encoding binary data
● advantage: any stream of bytes can be "framed" as ASCII data regardless of the character encoding used by the protocol.
● disadvantage: the size of the message increases by about 33% (4 characters for every 3 bytes).
● (we will use the UTF-8 encoding scheme)
Protocol and Server Separation
● code reuse is one of our design goals!
● generic implementation of the server, which handles all communication details
● generic protocol interface:
  ● handles incoming messages
  ● implements the protocol's semantics
  ● generates the reply messages.
Protocol-Server Separation: protocol object
● the protocol object is in charge of implementing the expected behavior of our server:
  ● what actions should be performed upon the arrival of a request.
  ● requests may be correlated with one another, meaning the protocol should save appropriate state per client.
Example: authenticated session
● some protocols require user authentication (login),
● only authorized users can perform certain actions.
● the protocol is stateful: serving a client's requests can be in at least 2 distinct states:
  1. authenticated (user has already logged in)
  2. non-authenticated (user has not provided login).
● the behavior of the protocol object differs according to its state
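A minimal sketch of such a stateful protocol object follows. The `LOGIN` command and the reply texts are invented for illustration, and no credentials are actually checked.

```java
class LoginProtocolSketch {
    // Per-client state: one protocol object is created for each connection.
    private boolean authenticated = false;

    // Returns the reply for one incoming message; behavior depends on the state.
    String processMessage(String message) {
        if (message.startsWith("LOGIN ")) {
            // A real protocol would verify the credentials here.
            authenticated = true;
            return "OK logged in";
        }
        if (!authenticated) {
            return "ERROR login required";
        }
        return "OK " + message;
    }
}
```

The same request, for example `PRINT hi`, is rejected before the login message arrives and accepted after it, which is exactly the two-state behavior described above.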
Protocol and Server Separation
separate the different tasks the server must perform:
● Accept new connections from new clients.
● Receive new bytes from connected clients.
● Parse incoming bytes from clients into messages ("de-serialization" / "unframing").
● Dispatch each message to the right method on the server side to execute the requested operation.
● Send back an answer to the connected client after the action has been executed.
a software architecture that separates tasks into separate interfaces
● The key participants in this architecture are:
  ● Tokenizer: syntax, tokenizing a stream of data into messages.
  ● MessagingProtocol: semantics, handling received messages and generating responses.
● implementations of the interfaces:
  ● generic server
  ● MessageTokenizer
  ● LinePrintingProtocol
Interfaces
● implement the separation between protocol and server. Define:
  1. message (can be encoded in various ways: Base64, XML, text).
     ● our messages are encoded as plain UTF-8 text.
  2. framing of messages: delimiters between messages sent in the stream.
  3. protocol interface which handles each individual message.
ConnectionHandler
● the server has accepted a new connection from a client.
● the server creates a ConnectionHandler, which will handle all incoming messages from this client.
● the ConnectionHandler maintains the state of the connection for the specific client
  ● Ex: a user performs "login"; the ConnectionHandler object remembers this in its state
ConnectionHandler - Socket
● ConnectionHandler has access to the Socket connecting the server to the client process.
  ● TCP server: the Socket connection is viewed as a pair of InputStream and OutputStream.
  ● streams of bytes: the client and the server exchange a bunch of bytes.
Tokenizer - in charge of parsing a stream of bytes into a stream of messages
● Tokenizer interface: a filter between the Socket input stream and the protocol
● the Protocol accesses the input stream only through the tokenizer.
  ● instead of "seeing" a stream of bytes, it sees a stream of messages.
● Many libraries model such "filters" on streams as wrappers around a lower-level stream.
  ● OutputStreamWriter: wraps a byte stream and encodes the characters written to it into bytes, using a given character encoding
  ● BufferedReader: adds a layer of buffering around a non-buffered character input stream (a Reader).
Tokenizer
● splits incoming bytes from the socket into messages.
● for simplicity, we model the Tokenizer as an iterator…
● the protocol will see the input stream from the socket as an iterator over messages (instead of an iterator over bytes).
Messaging Protocol
● protocol interface
● wraps together: socket and Tokenizer
● incoming messages are passed to the MessagingProtocol, which executes the action requested by the client:
  ● look at the message and decide on an action
  ● the decision may depend on the state
● once the action is performed, the answer comes back from the MessagingProtocol.
● We use a String to pass data from the Tokenizer to the Protocol, and back from the Protocol.
● Serialization/deserialization (encoding/decoding parameters to/from Strings) is performed by the Protocol, not by the Tokenizer.
  ● the Tokenizer is only in charge of deframing (splitting bytes into messages).
Implementations
Connection Handler
● active object:
  ● handles one connection to one client for the whole period during which the client is connected (from the moment the connection is accepted until one of the sides decides to close it).
  ● modeled as a Runnable class.
Connection Handler
● holds references to:
  ● the TCP socket connected to the client,
  ● the Tokenizer,
  ● an instance of the MessagingProtocol.
● the connection handler is generic: it works with any implementation of a messaging protocol.
● it assumes the data exchanged between client and server is in the form of encoded strings
● the encoder is passed to the constructor as an Encoder interface.
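A minimal sketch of a generic connection handler along these lines. To stay runnable without a network it takes a `Writer` in place of the socket's output stream and leaves out the Encoder; the two interfaces and their method names are assumptions, since the course's actual signatures are not shown here.

```java
import java.io.IOException;
import java.io.Writer;

// Semantics: turns one incoming message into a reply (null means no reply).
interface MessagingProtocol {
    String processMessage(String message);
}

// Syntax: returns the next complete message, or null when the stream has ended.
interface Tokenizer {
    String nextMessage() throws IOException;
}

class ConnectionHandlerSketch implements Runnable {
    private final Tokenizer tokenizer;
    private final MessagingProtocol protocol;
    private final Writer out;
    private final char framing;

    ConnectionHandlerSketch(Tokenizer tokenizer, MessagingProtocol protocol,
                            Writer out, char framing) {
        this.tokenizer = tokenizer;
        this.protocol = protocol;
        this.out = out;
        this.framing = framing;
    }

    @Override
    public void run() {
        try {
            String message;
            // Serve this one client until its stream ends.
            while ((message = tokenizer.nextMessage()) != null) {
                String reply = protocol.processMessage(message);
                if (reply != null) {
                    out.write(reply);
                    out.write(framing); // frame the reply for the client's tokenizer
                    out.flush();
                }
            }
        } catch (IOException e) {
            System.err.println("connection error: " + e.getMessage());
        } finally {
            try {
                out.close();
            } catch (IOException ignored) {
                // nothing more can be done for this connection
            }
        }
    }
}
```

The handler never inspects message contents: swapping in a different `Tokenizer` or `MessagingProtocol` changes the syntax or the semantics without touching this class.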
What’s left?
● we only need to implement:
  ● the specific framing handler (tokenizer)
  ● the specific protocol we wish to use.
● continue our line printing example…
Message Tokenizer
● we use a framing method based on a single-character delimiter.
● assume a stream of messages delimited by a FRAMING character; we will use the character '\0'
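A minimal sketch of a tokenizer for this framing method. It treats '\0' as a terminator that closes each message; the class name and the `tokenize` helper are hypothetical and not the course's MessageTokenizer.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class NullDelimitedTokenizer {
    static final char FRAMING = '\0';
    private final Reader in;

    NullDelimitedTokenizer(Reader in) {
        this.in = in;
    }

    // Returns the next message without its delimiter, or null if the stream
    // ends first (an unterminated trailing fragment is discarded).
    String nextMessage() throws IOException {
        StringBuilder message = new StringBuilder();
        int c;
        while ((c = in.read()) != -1) {
            if (c == FRAMING) {
                return message.toString();
            }
            message.append((char) c);
        }
        return null;
    }

    // Convenience helper: splits an in-memory stream into its complete messages.
    static List<String> tokenize(String stream) {
        NullDelimitedTokenizer tokenizer =
            new NullDelimitedTokenizer(new StringReader(stream));
        List<String> messages = new ArrayList<>();
        try {
            for (String m = tokenizer.nextMessage(); m != null; m = tokenizer.nextMessage()) {
                messages.add(m);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return messages;
    }
}
```

With a real connection the `Reader` would typically be an `InputStreamReader` over the socket's input stream, created with the UTF-8 charset.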
● an important part is connection termination and exception handling at any moment
● most of the code in low-level input/output and socket manipulation relates to error handling and connection termination.
Line Printing Protocol
● implement a specific protocol on the server side.
● when it receives a message, it prints it on the server-side screen and adds a line number.
● the line number is the state of the protocol.
  ● each client has its own line number. Two clients connected at the same time each see their own version of the line number.
● when the protocol processes a message, it sends back a message to the client: ": printed" + the date-time value when the message was processed (on the server side).
  ● timestamped acknowledgments.
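A minimal sketch of this protocol, assuming one protocol object per client. The exact reply format (`<message>: printed at <date-time>`) and the class name are guesses based on the description above.

```java
import java.time.LocalDateTime;

class LinePrintingProtocolSketch {
    // State of the protocol: each client's protocol object counts its own lines.
    private int lineNumber = 0;

    // Prints the message on the server side and returns a timestamped acknowledgment.
    String processMessage(String message) {
        lineNumber++;
        System.out.println(lineNumber + ". " + message);
        return message + ": printed at " + LocalDateTime.now();
    }

    int linesPrinted() {
        return lineNumber;
    }
}
```

Because the counter is an instance field, two clients served by two separate protocol objects never see each other's line numbers.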
A Client
● before the ConnectionHandler, review the code of a compatible TCP client for the protocol we have just described.
● no new idea: it is similar to the TCP client we reviewed in the previous section.
Concurrency Models of TCP Servers
Server quality criteria:
● Scalability: capability to serve a large number of concurrent clients.
● Low accept latency: short wait time until a connection is accepted.
● Low reply latency: short wait time for a reply after a message is received.
● High efficiency: use few resources on the server (RAM, number of threads, CPU usage).
● model the concurrency model of the server:
● define an interface which controls how each connection handler is applied (executed) concurrently
● Given:
  ● Encoder
  ● Tokenizer
  ● Protocol
  ● ServerConcurrencyModel
we define the MessagingServer
● To obtain good quality, a TCP server will most often use multiple threads.
● 3 simple models of server concurrency
● 3 implementations of the ServerConcurrencyModel interface
Server Model 1: Single Thread
● 1 thread for:
  ● accepting a new client
  ● handling its requests, by calling the run method of the passive ConnectionHandler object.
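A minimal sketch of this model, assuming the ServerConcurrencyModel interface has a single method that receives the connection handler; the method name `apply` and the class name are assumptions.

```java
// Assumed shape of the interface: one method that decides how a handler is run.
interface ServerConcurrencyModel {
    void apply(Runnable connectionHandler);
}

class SingleThreadModel implements ServerConcurrencyModel {
    @Override
    public void apply(Runnable connectionHandler) {
        // Plain method call: the handler runs in the caller's (accepting) thread,
        // so no other client can be accepted until it returns.
        connectionHandler.run();
    }
}
```

Calling `run()` directly, rather than starting a thread, is what makes the ConnectionHandler a passive object in this model.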
Single Thread Model: Quality
● no scalability: at any given moment, it can serve at most one client.
● high accept latency: a second client must wait until the first client disconnects
● low reply latency: all resources are concentrated on serving one client.
● good efficiency: the server uses exactly the resources needed to serve one client
When is this model appropriate?
● when the time to process a full connection from one client is guaranteed to remain small.
● Example: a server that provides the date and time value on the server machine.
  ● it sends one string to the client, then disconnects.
Server Model 2: Thread per Client
● assigns a new thread to each connected client, by wrapping the runnable ConnectionHandler object in a Thread and invoking its 'start' method.
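A minimal sketch of this model. The class and method names are assumptions, and the method returns the new Thread only so that callers and tests can wait for it.

```java
class ThreadPerClientModel {
    // Starts a dedicated thread for the handler and returns immediately,
    // so the accepting thread can go back to waiting for the next client.
    Thread apply(Runnable connectionHandler) {
        Thread thread = new Thread(connectionHandler);
        thread.start();
        return thread;
    }
}
```

Note that `start()` belongs to `Thread`, not to `Runnable`: calling `run()` on the handler directly would execute it in the accepting thread instead.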
Model Quality: Scalability
● the server can serve several concurrent clients, up to the maximum number of threads running in the process.
  ● RAM of the host is used: each thread allocates a stack and thus consumes RAM
  ● approx. 500 - 1000 threads can become active in a single process.
  ● the process does not defend itself: it keeps creating new threads, which is dangerous for the host.
Model Quality: Latency
● Low accept latency: the time from one accept to the next ~ the time to create a new thread
  ● short compared to the delay between incoming client connections.
● Reply latency: the resources of the server are spread among the concurrent connections.
  ● as long as the number of active connections is reasonable (~hundreds) and the requested load is relatively low in CPU and RAM, reply latency stays low.
Model Quality: Efficiency
● Low efficiency: the server creates a full thread per connection,
  ● the connection may be bound to Input/Output operations.
  ● the ConnectionHandler thread will be blocked waiting for IO, yet still uses the resources of the thread (RAM and Thread).
● Reactor architecture …
Server Model 3: Constant Number of Threads
● constant number of 10 threads (given by the Executor interface of Java)
● adds the runnable ConnectionHandler object to the task queue of a thread pool executor
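A minimal sketch of this model using `Executors.newFixedThreadPool` from `java.util.concurrent`. The class name, the `apply` method, and the `shutdownAndWait` helper are assumptions added for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class FixedThreadPoolModel {
    private final ExecutorService pool;

    FixedThreadPoolModel(int threads) {
        // The pool never runs more than 'threads' handlers at the same time.
        pool = Executors.newFixedThreadPool(threads);
    }

    // Queues the handler; it starts as soon as one of the pool's threads is free.
    void apply(Runnable connectionHandler) {
        pool.execute(connectionHandler);
    }

    // Stops accepting handlers and waits for the queued ones to finish.
    boolean shutdownAndWait(long seconds) {
        pool.shutdown();
        try {
            return pool.awaitTermination(seconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

With `new FixedThreadPoolModel(10)`, the eleventh concurrent handler waits in the executor's queue until one of the ten threads finishes its current client.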
Model Quality
● avoids the server causing a host crash when too many clients connect at the same time
● up to N concurrent client connections, the server behaves as "thread-per-connection"
● above N, accept latency will grow
● scalability is limited to the number of concurrent connections we believe we can support.