Sockets and intro to IO multiplexing

Goals
We are going to study socket programming as a means to introduce the IO multiplexing problem.
We will revisit socket programming in lecture 3.

Socket Reference
Check out Beej’s guide to network programming for free help (http://beej.us/guide/bgnet).
For the homework you will also need Beej’s guide for IPC.
Chapter 16 (16.1-16.5) of APUE.

Before we begin
All the system calls described here are UNIX system calls.
Microsoft Windows has system calls with nearly identical names and prototypes (since socket programming was pretty much borrowed from UNIX).
Windows programming requires different #include and link options. This is beyond our scope but is covered in Beej’s guide.

Socket - OS view
Whenever a process wants to open a communication channel, it opens a socket.
A socket is literally one side of a communication (a 1-to-1 communication involves two sockets).
In UNIX, sockets are described by file descriptors, just like open files.

The types of sockets
There are many types of sockets.
We will deal with 2 types.
We will deal with 2 domains of sockets.
So in total we will deal with 4 kinds of sockets.
There are LOTS more socket types.

Socket types
STREAM socket - data is transferred as an ordered stream of bytes. No packet boundaries (data may be merged or split on receive). No losses.
DGRAM (short for datagram) socket - data is transferred as packets (no splitting or merging). May arrive out of order. May have losses.

Domain of sockets
INTERNET - communication using TCP/UDP/other Internet protocols. Can communicate between hosts. Has some overhead. May have losses.
UNIX DOMAIN - communication using similar commands between processes on the same host. Removes network overhead. No losses.

Summary
Internet stream sockets - use TCP. Allow stream communication between processes on different or the same hosts. No losses.
Internet datagram sockets - use UDP. Allow datagram communication between processes on different or the same hosts. Prone to packet loss.
UDS - stream - allows stream communication on the same host. No losses.
UDS - dgram - allows datagram communication on the same host. No losses.

Socket(2)
SOCKET(2)                BSD System Calls Manual                SOCKET(2)
NAME
     socket -- create an endpoint for communication
SYNOPSIS
     #include <sys/socket.h>
     int
     socket(int domain, int type, int protocol);

The parameters
domain - in our case we will deal with only TWO domains: AF_UNIX (for UDS) and AF_INET (for Internet).
type - SOCK_STREAM or SOCK_DGRAM.
protocol - in our case we will always use 0, which selects the default protocol (a nonzero value is needed only when several protocols exist for the same domain and type).
The return value is an int (a socket file descriptor). It is just an int, the same kind of int we get as a file descriptor from open. It contains no information in itself. It is actually an index into the OS file descriptor table (so the OS knows which fd we mean) - or, as we said, just an int.

What socket does and does not do
socket creates the OS “name” for a communication endpoint.
socket creates only the endpoint of communication, not the communication itself.

Server side programming
The first thing we do in a TCP (AF_INET, SOCK_STREAM) server is create a listening socket.
The listening socket is used to accept new connections (only for that!).
The listening socket will NOT be used for communication with clients. We will have a new socket for that.

Port
A port is a logical address within the machine for communication.
Your machine can have hundreds of processes waiting for communication on different ports (a single process can also use multiple ports).
In layman's terms: the IP is the address and the port is the mailbox number.

bind
BIND(2)                  BSD System Calls Manual                  BIND(2)
NAME
     bind -- bind a name to a socket
SYNOPSIS
     #include <sys/socket.h>
     int
     bind(int socket, const struct sockaddr *address, socklen_t address_len);

What does bind do?
bind attaches the socket to a port.
If the port is available, bind succeeds.
Future binds to that port will fail (unless the socket is released).
The socket can now receive information on that port.

Struct sockaddr
bind(int socket, const struct sockaddr *address, socklen_t address_len);
struct sockaddr is the “base class” for all socket addresses.
We never fill in a struct sockaddr directly; we fill in a domain-specific structure and cast its address to struct sockaddr * when calling bind.
We use struct sockaddr_in for Internet sockets.
We use struct sockaddr_un for UDS.

Using bind for TCP
    my_addr.sin_family = AF_INET;
    my_addr.sin_port = htons(MYPORT);          // short, network byte order
    my_addr.sin_addr.s_addr = INADDR_ANY;      // accept connections on any local address
    memset(my_addr.sin_zero, '\0', sizeof my_addr.sin_zero);
    bind(sockfd, (struct sockaddr *)&my_addr, sizeof my_addr);

Endianness
How are the bytes of an integer ordered? (If our integer consists of 4 bytes, do we store it as ABCD or DCBA?)
There are two answers to that question: most significant byte first or most significant byte last.
Network byte order is most significant byte first (big-endian). Hosts are often most significant byte last (little-endian), depending on the vendor.
The function htons (host to network short) rearranges the bytes if needed (or does nothing).

Listen(2)
After attaching the socket to a port using bind, we call listen(2).
listen is a system call that tells the OS to open a queue for new connections.

man 2 listen
LISTEN(2)                BSD System Calls Manual                LISTEN(2)
NAME
     listen -- listen for connections on a socket
SYNOPSIS
     #include <sys/socket.h>
     int
     listen(int socket, int backlog);

What does listen do
Allocates memory for the queue (enough for backlog connections).
Tells the OS to put incoming connections in the queue.

Accept(2)
accept is a BLOCKING function.
accept tells the machine to wait until a new connection is available.
accept is used to create a connection with an incoming client.
accept returns a NEW FILE DESCRIPTOR for the new connection!

man 2 accept
ACCEPT(2)                BSD System Calls Manual                ACCEPT(2)
NAME
     accept -- accept a connection on a socket
SYNOPSIS
     #include <sys/socket.h>
     int
     accept(int socket, struct sockaddr *restrict address, socklen_t *restrict address_len);

Blocking vs.
non-Blocking functions
Some system calls can take a long time because they depend on some external condition, for example when we receive data or wait for clients to connect.
We call those calls blocking. Blocking functions are functions that do not return until that condition is met.

Communication
We communicate using the send(2) and recv(2) functions, which do just that.

man 2 send
SEND(2)                  BSD System Calls Manual                  SEND(2)
NAME
     send, sendmsg, sendto -- send a message from a socket
SYNOPSIS
     #include <sys/socket.h>
     ssize_t send(int socket, const void *buffer, size_t length, int flags);
     ssize_t sendmsg(int socket, const struct msghdr *message, int flags);
     ssize_t sendto(int socket, const void *buffer, size_t length, int flags,
                    const struct sockaddr *dest_addr, socklen_t dest_len);
DESCRIPTION
     Send(), sendto(), and sendmsg() are used to transmit a message to another
     socket. Send() may be used only when the socket is in a connected state,
     while sendto() and sendmsg() may be used at any time.

man 2 recv
RECV(2)                  BSD System Calls Manual                  RECV(2)
NAME
     recv, recvfrom, recvmsg -- receive a message from a socket
LIBRARY
     Standard C Library (libc, -lc)
SYNOPSIS
     #include <sys/socket.h>
     ssize_t
     recv(int socket, void *buffer, size_t length, int flags);

Coding a TCP client
The client calls socket just like a server.
However, the client does not need a well-known port and uses connect(2) to create the connection instead of bind(2)+listen(2)+accept(2).
Once the connection is open we use send(2)/recv(2) normally.

man 2 connect
CONNECT(2)               BSD System Calls Manual               CONNECT(2)
NAME
     connect -- initiate a connection on a socket
SYNOPSIS
     #include <sys/types.h>
     #include <sys/socket.h>
     int
     connect(int socket, const struct sockaddr *address, socklen_t address_len);

One more thing
To terminate a connection we use close(2), just like closing a file.

Coding silly client and silly server
Silly client connects to the server and sends “hello world\n”.
Silly server waits for connections from silly clients and prints what the silly client sends.

silly client
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    /* Error checking is omitted to keep the example short. */
    int main() {
        struct sockaddr_in sin;
        int sockfd = socket(AF_INET, SOCK_STREAM, 0);
        sin.sin_family = AF_INET;
        sin.sin_port = htons(1234);                    /* server port, network byte order */
        sin.sin_addr.s_addr = inet_addr("127.0.0.1");  /* server runs on this host */
        memset(&sin.sin_zero, 0, sizeof(sin.sin_zero));
        connect(sockfd, (struct sockaddr *)&sin, sizeof(sin));
        send(sockfd, "hello world\n", 12, 0);
        close(sockfd);
        return 0;
    }

silly server
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    /* Error checking is mostly omitted to keep the example short. */
    int main()
    {
        struct sockaddr_in sin, theirsin;
        socklen_t len = sizeof(theirsin);   /* in: size of theirsin, out: size of client address */
        int newfd;
        char buf[1000];
        int sockfd = socket(AF_INET, SOCK_STREAM, 0);
        sin.sin_family = AF_INET;
        sin.sin_port = htons(1234);
        sin.sin_addr.s_addr = inet_addr("127.0.0.1");
        memset(&sin.sin_zero, 0, sizeof(sin.sin_zero));
        bind(sockfd, (struct sockaddr *)&sin, sizeof(sin));
        listen(sockfd, 10);
        newfd = accept(sockfd, (struct sockaddr *)&theirsin, &len);
        int numbytes = recv(newfd, buf, sizeof(buf) - 1, 0);   /* leave room for '\0' */
        if (numbytes < 0)
            numbytes = 0;                   /* recv failed: print an empty message */
        buf[numbytes] = '\0';
        printf("client says : %s", buf);
        close(newfd);
        close(sockfd);
        return 0;
    }

Multiple clients?
Silly server supports only the first client.
Even if we loop and accept new connections, since RECEIVE and ACCEPT are both blocking we cannot (at least until we learn something new) do both at the same time.

Multi tasking and IO multiplexing
In the 2nd half of the lecture we will discuss doing things in parallel: either handling multiple I/O channels (I/O multiplexing) or doing several things together (for example, handling I/O while we do some complex math).
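
To make the multiple-clients limitation concrete, here is a minimal sketch of a silly server that loops over accept(2) and recv(2) sequentially. The helper names make_listener, serve_one and serve_forever are made up for illustration and are not part of any library. The loopback address, backlog of 10 and buffer size of 1000 are assumptions carried over from the silly server.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a TCP socket listening on 127.0.0.1:port.
 * Returns the listening descriptor, or -1 on error. */
int make_listener(unsigned short port)
{
    struct sockaddr_in sin;
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&sin, 0, sizeof sin);
    sin.sin_family = AF_INET;
    sin.sin_port = htons(port);                   /* network byte order */
    sin.sin_addr.s_addr = inet_addr("127.0.0.1");

    if (bind(fd, (struct sockaddr *)&sin, sizeof sin) < 0 ||
        listen(fd, 10) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: serve exactly one client.
 * Blocks in accept() until a client connects, then blocks in recv()
 * until that client sends data or closes the connection.
 * Stores the NUL-terminated message in buf and returns its length,
 * or -1 on error. */
ssize_t serve_one(int listen_fd, char *buf, size_t bufsize)
{
    assert(bufsize > 0);                          /* need room for '\0' */

    int newfd = accept(listen_fd, NULL, NULL);    /* blocking call #1 */
    if (newfd < 0)
        return -1;

    ssize_t n = recv(newfd, buf, bufsize - 1, 0); /* blocking call #2 */
    if (n >= 0)
        buf[n] = '\0';
    close(newfd);
    return n;
}

/* Sequential loop: clients are handled strictly one after another.
 * Stops on the first accept() or recv() error. */
void serve_forever(int listen_fd)
{
    char buf[1000];
    for (;;) {
        ssize_t n = serve_one(listen_fd, buf, sizeof buf);
        if (n < 0)
            break;
        if (n > 0)
            printf("client says : %s", buf);
    }
}
```

A main function would call serve_forever(make_listener(1234)) after checking the descriptor. While serve_one is blocked in recv waiting for a slow client, later clients just sit in the listen queue, and while it is blocked in accept it cannot read from anyone. This is the problem that I/O multiplexing addresses.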