Sockets

Networked Applications: Sockets
Assembled by Ossi Mokryn, based on slides by Jennifer Rexford (Princeton)
and on material from Beej's guide: http://beej.us/guide/bgnet
Socket programming
Goal: learn how to build client/server applications that communicate using sockets
Socket API
  introduced in BSD4.1 UNIX, 1981
  explicitly created, used, released by apps
  client/server paradigm
  two types of transport service via socket API:
    unreliable datagram
    reliable, byte stream-oriented
socket
  a host-local, application-created, OS-controlled interface (a “door”)
  into which an application process can both send and receive messages
  to/from another application process

Application Layer
Clients and Servers
Client program
  Running on end host
  Requests service
  E.g., Web browser
Server program
  Running on end host
  Provides service
  E.g., Web server
[Figure: the client sends “GET /index.html”; the server replies “Site under construction”]

Client-Server Communication
Client “sometimes on”
  Initiates a request to the server when interested
  E.g., Web browser on your laptop or cell phone
  Doesn’t communicate directly with other clients
  Needs to know the server’s address
Server is “always on”
  Services requests from many client hosts
  E.g., Web server for the www.cnn.com Web site
  Doesn’t initiate contact with the clients
  Needs a fixed, well-known address

Client and Server Processes
Program vs. process
  Program: collection of code
  Process: a running program on a host
Communication between processes
  Same end host: inter-process communication
    Governed by the operating system on the end host
  Different end hosts: exchanging messages
    Governed by the network protocols
Client and server processes
  Client process: process that initiates communication
  Server process: process that waits to be contacted

Socket: End Point of Communication
Sending message from one process to another
  Message must traverse the underlying network
Process sends and receives through a “socket”
  In essence, the doorway leading in/out of the house
Socket as an Application Programming Interface
  Supports the creation of network applications
[Figure: two user processes, each connected through a socket to its operating system]

Identifying the Receiving Process
Sending process must identify the receiver
  Name or address of the receiving end host
  Identifier that specifies the receiving process
Receiving host
  Destination address that uniquely identifies the host
  An IP (IPv4) address is a 32-bit quantity
Receiving process
  Host may be running many different processes
  Destination port that uniquely identifies the socket
  A port number is a 16-bit quantity

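The 32-bit address and 16-bit port can be made concrete with a small sketch. It assumes the POSIX call inet_pton() rather than the Winsock API shown later in these slides; the helper name parse_ipv4 is hypothetical and only for illustration.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <sys/socket.h>

/* Hypothetical helper: parse a dotted-quad IPv4 string into a 32-bit value
   in host byte order. Returns 1 on success, 0 if the text is not valid. */
int parse_ipv4(const char *text, uint32_t *out)
{
    struct in_addr addr;

    if (inet_pton(AF_INET, text, &addr) != 1)
        return 0;
    *out = ntohl(addr.s_addr);   /* inet_pton() stores network byte order */
    return 1;
}
```

With the server address used on the next slide, parse_ipv4("128.2.194.242", &ip) yields 0x8002C2F2: one byte per dotted field (128, 2, 194, 242). A port such as 80 fits in a uint16_t.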
Using Ports to Identify Services
[Figure: server host 128.2.194.242 runs a Web server (port 80) and an echo server (port 7) on top of its OS]
  A client host sends a service request for 128.2.194.242:80 (i.e., the Web server)
  A client host sends a service request for 128.2.194.242:7 (i.e., the echo server)

Knowing What Port Number To Use
Popular applications have well-known ports
  E.g., port 80 for Web and port 25 for e-mail
  Well-known ports listed at http://www.iana.org
Well-known vs. ephemeral ports
  Server has a well-known port (e.g., port 80)
    Between 0 and 1023
  Client picks an unused ephemeral (i.e., temporary) port
    Between 1024 and 65535
Uniquely identifying the traffic between the hosts
  Two IP addresses and two port numbers
  Underlying transport protocol (e.g., TCP or UDP)

Delivering the Data: Division of Labor
Network
  Deliver data packet to the destination host
  Based on the destination IP address
Operating system
  Deliver data to the destination socket
  Based on the protocol and destination port #
Application
  Read data from the socket
  Interpret the data (e.g., render a Web page)

Windows Socket API
Socket interface
  Originally provided in Berkeley UNIX
  Later adopted by all popular operating systems
  Simplifies porting applications to different OSes
In UNIX, everything is like a file
  All input is like reading a file
  All output is like writing a file
  File is represented by an integer file descriptor
System calls for sockets
  Client: create, connect, write, read, close
  Server: create, bind, listen, accept, read, write, close
Winsock Programmer's FAQ:
http://tangentsoft.net/wskfaq/newbie.html#interop

Typical Client Program
Prepare to communicate
  Create a socket
  Determine server address and port number
  Initiate the connection to the server
Exchange data with the server
  Write data to the socket
  Read data from the socket
  Do stuff with the data (e.g., render a Web page)
Close the socket

Socket programming with UDP
UDP: no “connection” between client and server
  no handshaking
  sender explicitly attaches IP address and port of destination to each packet
  server must extract IP address, port of sender from received packet
UDP: transmitted data may be received out of order, or lost
application viewpoint
  UDP provides unreliable transfer of groups of bytes (“datagrams”)
  between client and server

Client/server socket interaction: UDP
Server (running on hostid)
  create socket, port=x, for incoming request: serverSocket = DatagramSocket()
  read request from serverSocket
  write reply to serverSocket, specifying client host address, port number
Client
  create socket: clientSocket = DatagramSocket()
  create datagram addressed to (hostid, port=x), send datagram request using clientSocket
  read reply from clientSocket
  close clientSocket

Creating a Socket: socket()
Operation to create a socket
  SOCKET WSAAPI socket ( int af, int type, int protocol);
af: an address family specification
  The format used here is PF_INET, which is the ARPA Internet (IPv4) address format.
type: semantics of the communication
  SOCK_STREAM: reliable byte stream
  SOCK_DGRAM: message-oriented service
protocol: specific protocol
  0: unspecified
  (PF_INET and SOCK_STREAM already imply TCP; PF_INET and SOCK_DGRAM imply UDP).

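A minimal sketch of the two socket types, written against the POSIX socket() call (an assumption: the Winsock call above takes the same three arguments but returns a SOCKET and needs WSAStartup() first). The helper names are hypothetical.

```c
#include <netinet/in.h>
#include <sys/socket.h>

/* Protocol 0 lets the OS pick the default: UDP for SOCK_DGRAM. */
int make_udp_socket(void)
{
    return socket(AF_INET, SOCK_DGRAM, 0);
}

/* Protocol 0 lets the OS pick the default: TCP for SOCK_STREAM. */
int make_tcp_socket(void)
{
    return socket(AF_INET, SOCK_STREAM, 0);
}

/* Ask the OS which type an open socket has (SOCK_STREAM or SOCK_DGRAM).
   Returns -1 if the query fails. */
int socket_type(int fd)
{
    int type = 0;
    socklen_t len = sizeof(type);

    if (getsockopt(fd, SOL_SOCKET, SO_TYPE, &type, &len) != 0)
        return -1;
    return type;
}
```

Both constructors return a small non-negative integer descriptor on success and -1 on failure, so callers should check the result before using it.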
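Establishing the server’s name and port
struct sockaddr_in server;
memset(&server, 0, sizeof(server));             /* clear unused fields */
server.sin_family = AF_INET;                    /* IPv4 address family */
server.sin_addr.s_addr = inet_addr(IPstring);   /* dotted-quad string to network-order address */
server.sin_port = htons(ServerPort);            /* port in network byte order */
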
Sending Data
int WSAAPI sendto ( SOCKET s, const char FAR * buf, int len,
                    int flags, const struct sockaddr FAR * to, int tolen );
s      A descriptor identifying a (possibly connected) socket.
buf    A buffer containing the data to be transmitted.
len    The length of the data in buf.
flags  Specifies the way in which the call is made.
to     An optional pointer to the address of the target socket.
tolen  The size of the address in to.

Receiving Data
int WSAAPI recvfrom ( SOCKET s, char FAR* buf, int len, int flags,
                      struct sockaddr FAR* from, int FAR* fromlen );
s        A descriptor identifying a bound socket.
buf      A buffer for the incoming data.
len      The length of buf.
flags    Specifies the way in which the call is made.
from     An optional pointer to a buffer which will hold the source address upon return.
fromlen  An optional pointer to the size of the from buffer.

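The send and receive calls can be tried together in one process over the loopback interface. This is a sketch under assumptions: it uses the POSIX sendto() and recvfrom() (same argument order as the Winsock declarations), the function name udp_loopback_once is hypothetical, and a 2-second receive timeout is added so that a lost datagram cannot hang the caller.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Send msg from a "client" socket to a "server" socket on 127.0.0.1 and
   copy what the server received into out. Returns bytes received, or -1. */
ssize_t udp_loopback_once(const char *msg, char *out, size_t outlen)
{
    struct sockaddr_in addr, from;
    socklen_t alen = sizeof(addr), fromlen = sizeof(from);
    struct timeval tv = { 2, 0 };            /* assumed 2 s receive timeout */
    ssize_t n = -1;
    int server = socket(AF_INET, SOCK_DGRAM, 0);
    int client = socket(AF_INET, SOCK_DGRAM, 0);

    if (server < 0 || client < 0)
        goto done;

    /* Server side: bind to loopback, port 0 = let the OS pick a free port. */
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);
    if (bind(server, (struct sockaddr *)&addr, sizeof(addr)) != 0)
        goto done;
    if (getsockname(server, (struct sockaddr *)&addr, &alen) != 0)
        goto done;                           /* learn the chosen port */
    setsockopt(server, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));

    /* Client side: the destination address is attached to each datagram. */
    if (sendto(client, msg, strlen(msg), 0,
               (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto done;

    /* Server side: from/fromlen receive the sender's address and port. */
    n = recvfrom(server, out, outlen, 0, (struct sockaddr *)&from, &fromlen);

done:
    if (server >= 0) close(server);
    if (client >= 0) close(client);
    return n;
}
```

After the call, the from structure holds the client's ephemeral port, which is what a real server would pass back to sendto() when replying.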
Byte Ordering: Little and Big Endian
Hosts differ in how they store data
  E.g., four-byte number (byte3, byte2, byte1, byte0)
Little endian (“little end comes first”), e.g., Intel PCs
  Low-order byte stored at the lowest memory location
  Byte0, byte1, byte2, byte3
Big endian (“big end comes first”)
  High-order byte stored at lowest memory location
  Byte3, byte2, byte1, byte0
IP is big endian (aka “network byte order”)
  Use htons() and htonl() to convert to network byte order
  Use ntohs() and ntohl() to convert to host order

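A short sketch of the conversion functions in use. The helper names (host_is_little_endian, put_port, get_port) are hypothetical; htons() and ntohs() are the standard calls named on the slide.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if this host stores the low-order byte at the lowest address. */
int host_is_little_endian(void)
{
    uint16_t probe = 1;
    unsigned char first;

    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Write a 16-bit port into a 2-byte wire buffer in network byte order. */
void put_port(unsigned char wire[2], uint16_t port)
{
    uint16_t net = htons(port);

    memcpy(wire, &net, sizeof(net));
}

/* Read a 16-bit port back from the wire buffer into host byte order. */
uint16_t get_port(const unsigned char wire[2])
{
    uint16_t net;

    memcpy(&net, wire, sizeof(net));
    return ntohs(net);
}
```

For port 80 (0x0050) the wire buffer holds 0x00 then 0x50 on every host, whatever host_is_little_endian() reports; that is the point of converting before sending.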
Why Can’t Sockets Hide These Details?
Dealing with endian differences is tedious
  Couldn’t the socket implementation deal with this
  … by swapping the bytes as needed?
No, swapping depends on the data type
  Two-byte short int: (byte 1, byte 0) vs. (byte 0, byte 1)
  Four-byte long int: (byte 3, byte 2, byte 1, byte 0) vs. (byte 0, byte 1, byte 2, byte 3)
  String of one-byte characters: (char 0, char 1, char 2, …) in both cases
Socket layer doesn’t know the data types
  Sees the data as simply a buffer pointer and a length
  Doesn’t have enough information to do the swapping

Servers Differ From Clients
Passive open
  Prepare to accept connections
  … but don’t actually establish one
  … until hearing from a client
Hearing from multiple clients
  Allow a backlog of waiting clients
  ... in case several try to start a connection at once
Create a socket for each client
  Upon accepting a new client
  … create a new socket for the communication

Typical Server Program
Prepare to communicate
  Create a socket
  Associate local address and port with the socket
Wait to hear from a client (passive open)
  In TCP: Indicate how many clients-in-waiting to permit
  Accept an incoming connection from a client
Exchange data with the client over new socket
  Receive data from the socket
  Do stuff to handle the request (e.g., get a file)
  Send data to the socket
  Close the socket
Repeat with the next connection request

Server Preparing its Socket
Bind socket to the local address and port number
(Associate a local address with a socket)
#include <winsock2.h>
int WSAAPI bind ( SOCKET s, struct sockaddr* name, int namelen);
Arguments: socket descriptor, server address, address length
s        A descriptor identifying an unbound socket.
name     The address to assign to the socket. Often, you bind your listening socket
         to the special IP address INADDR_ANY. This allows your program to work
         without knowing the IP address of the machine it is running on, or, in the case of
         a machine with multiple network interfaces, it allows your server to receive packets
         destined to any of the interfaces. When sending, a socket bound with
         INADDR_ANY binds to the default IP address, which is that of the lowest-numbered interface.
namelen  The length of the name.
Returns 0 on success, and -1 (SOCKET_ERROR) if an error occurs

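Binding to INADDR_ANY can be sketched as follows. The code assumes the POSIX bind() and getsockname() calls rather than Winsock, and bind_any is a hypothetical helper name. Passing port 0 asks the OS to choose a free port, which the helper then reads back.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a TCP socket bound to INADDR_ANY on the requested port
   (0 = let the OS choose). Stores the actual port in *bound_port.
   Returns the socket descriptor, or -1 on failure. */
int bind_any(uint16_t port, uint16_t *bound_port)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);   /* any local interface */
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) != 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) != 0) {
        close(fd);
        return -1;
    }
    *bound_port = ntohs(addr.sin_port);
    return fd;
}
```

A server with a well-known port would pass that port instead of 0; binding to ports below 1024 usually requires elevated privileges on UNIX-like systems.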
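Putting it All Together
Server: socket() -> bind() -> listen() -> accept() (block until a client connects) -> read() -> process request -> write()
Client: socket() -> connect() -> write() -> read()
[Figure: the client's connect() completes the server's accept(); the client's write() feeds the server's read(), and the server's write() feeds the client's read()]
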
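The call sequence above can be walked through in a single process over loopback. This is a sketch, not a real server: it assumes POSIX sockets, the name tcp_roundtrip is hypothetical, the "processing" step just echoes the request, and each message is assumed small enough to arrive in one read() (real code loops, since TCP is a byte stream).

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Run one request/reply exchange; returns the reply length, or -1. */
ssize_t tcp_roundtrip(const char *request, char *reply, size_t replylen)
{
    struct sockaddr_in addr;
    socklen_t alen = sizeof(addr);
    char buf[256];
    ssize_t n = -1, got;
    int listener = -1, client = -1, conn = -1;

    /* Server: socket() -> bind() -> listen() */
    listener = socket(AF_INET, SOCK_STREAM, 0);
    if (listener < 0) goto done;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);                 /* OS picks a free port */
    if (bind(listener, (struct sockaddr *)&addr, sizeof(addr)) != 0) goto done;
    if (listen(listener, 4) != 0) goto done;  /* backlog of 4 waiting clients */
    if (getsockname(listener, (struct sockaddr *)&addr, &alen) != 0) goto done;

    /* Client: socket() -> connect(); the connection waits in the backlog */
    client = socket(AF_INET, SOCK_STREAM, 0);
    if (client < 0) goto done;
    if (connect(client, (struct sockaddr *)&addr, sizeof(addr)) != 0) goto done;

    /* Server: accept() returns a new socket dedicated to this client */
    conn = accept(listener, NULL, NULL);
    if (conn < 0) goto done;

    /* Client write() -> server read() -> server write() -> client read() */
    if (write(client, request, strlen(request)) < 0) goto done;
    got = read(conn, buf, sizeof(buf));
    if (got <= 0) goto done;
    if (write(conn, buf, (size_t)got) < 0) goto done;   /* echo as the reply */
    n = read(client, reply, replylen);

done:
    if (conn >= 0) close(conn);
    if (client >= 0) close(client);
    if (listener >= 0) close(listener);
    return n;
}
```

Note that the listening socket is never used for data: all reads and writes on the server side go through the socket returned by accept().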
Serving One Request at a Time?
Serializing requests is inefficient
  Server can process just one request at a time
  All other clients must wait until previous one is done
Need to time share the server machine
  Alternate between servicing different requests
    Do a little work on one request, then switch to another
    Small tasks, like reading HTTP request, locating the associated file,
    reading the disk, transmitting parts of the response, etc.
  Or, start a new process to handle each request
    Allow the operating system to share the CPU across processes
  Or, some hybrid of these two approaches

TCP info for later in the course
Socket-programming using TCP
Socket: a door between application process and end-to-end transport protocol (UDP or TCP)
TCP service: reliable transfer of bytes from one process to another
[Figure: on each host or server, the process is controlled by the application developer; its socket leads to TCP with buffers and variables, controlled by the operating system; the two hosts are connected through the internet]

Blocking and non-blocking
"block" is techie jargon for "sleep"
Lots of functions block. accept() blocks. All the recv() functions block.
Why? Because sockets start out that way:
  When you first create the socket with socket(), the kernel sets it to blocking.
Can we set the socket to be non-blocking? What does it mean?
  Yes. It means you have to poll it for data (check on it)
  If there’s no data yet, you get -1 (err) and have to try again.
  Isn’t that CPU intensive? Yes!
  So, you can use select()…

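A minimal sketch of switching a socket to non-blocking mode and polling it once. It assumes the POSIX fcntl() interface; on Winsock the corresponding call is ioctlsocket() with FIONBIO. The helper names set_nonblocking and poll_once are hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Turn on O_NONBLOCK for fd. Returns 0 on success, -1 on failure. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Try one recv() on a non-blocking socket.
   Returns 0 and sets *got if data (or end-of-stream) was read,
   1 if there is no data yet and the caller should try again later,
   -1 on any other error. */
int poll_once(int fd, char *buf, size_t len, ssize_t *got)
{
    ssize_t n = recv(fd, buf, len, 0);

    if (n >= 0) {
        *got = n;
        return 0;
    }
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return 1;
    return -1;
}
```

Calling poll_once() in a tight loop is the CPU-intensive polling the slide warns about; it is shown here only to make the -1 "try again" result visible.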
Reading or writing to multiple sockets
A server wants to listen for incoming connections as well
as keep reading from the connections it has.
select(): a way to monitor several sockets at the same
time, and know their state.
select(): Synchronous I/O Multiplexing
  Manipulates sets of sockets and lets you know:
    which ones are ready for reading
    which are ready for writing
  How? By creating and handling sets of sockets, and obtaining
  additional information for each.

Manipulating Sets for the select() function
FD_SET(int fd, fd_set *set);
Add fd to the set.
FD_CLR(int fd, fd_set *set);
Remove fd from the set.
FD_ISSET(int fd, fd_set *set);
Return true if fd is in the set.
FD_ZERO(fd_set *set);
Clear all entries from the set.
Note: in Windows, fd has type SOCKET rather than int

Additional select() info


What happens if a socket in the read set closes the
connection? Well, in that case, select() returns with that
socket descriptor set as "ready to read". When you
actually do recv() from it, recv() will return 0. That's
how you know the client has closed the connection.
One more note of interest about select(): if you have a
socket that is listen()ing, you can check to see if there is
a new connection by putting that socket's file descriptor
in the readfds set.
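The set macros and the "recv() returns 0" rule can be sketched together. This assumes the POSIX select() signature (the first argument is the highest descriptor plus one; Winsock ignores it); wait_readable and peer_closed are hypothetical helper names.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

/* Wait up to timeout_ms for fd to become readable.
   Returns 1 if readable, 0 on timeout, -1 on error. */
int wait_readable(int fd, int timeout_ms)
{
    fd_set readfds;
    struct timeval tv;
    int rc;

    FD_ZERO(&readfds);              /* start with an empty set */
    FD_SET(fd, &readfds);           /* watch just this descriptor */
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    rc = select(fd + 1, &readfds, NULL, NULL, &tv);
    if (rc < 0)
        return -1;
    return (rc > 0 && FD_ISSET(fd, &readfds)) ? 1 : 0;
}

/* Once select() reports fd readable, a recv() result of 0 means the
   peer closed the connection. MSG_PEEK leaves any real data in place. */
int peer_closed(int fd)
{
    char c;

    return recv(fd, &c, 1, MSG_PEEK) == 0;
}
```

A server would put its listening socket and every accepted socket into the same read set: a readable listener means accept() will not block, and a readable client socket means either data or a closed connection.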
Wanna See Real Clients and Servers?
Apache Web server
  Open source server first released in 1995
  Name derives from “a patchy server” ;-)
  Software available online at http://www.apache.org
Mozilla Web browser
  http://www.mozilla.org/developer/
Sendmail
  http://www.sendmail.org/
BIND Domain Name System
  Client resolver and DNS server
  http://www.isc.org/index.pl?/sw/bind/
…

A final note: what is Peer-to-Peer Communication
No always-on server at the center of it all
  Hosts can come and go, and change addresses
  Hosts may have a different address each time
Example: peer-to-peer file sharing
  Any host can request files, send files, query to find
  where a file is located, respond to queries, and
  forward queries
  Scalability by harnessing millions of peers
  Each peer acting as both a client and server