TELE 402 Lecture 5: Socket options

Overview
• Last Lecture
  – I/O multiplexing
  – Source: Chapter 6 of Stevens’ book
• This Lecture
  – Socket options
  – Source: Chapter 7 of Stevens’ book
  – Elementary UDP sockets
  – Source: Chapter 8 of Stevens’ book
• Next Lecture
  – Name and address conversions
  – Source: Chapter 11 of Stevens’ book

Socket options 1
• There are many socket options that programmers can set for fine control of the underlying system and protocols
  – Generic socket options
  – IPv4 socket options
  – IPv6 socket options (to be discussed later)
  – ICMPv6 socket options (to be discussed later)
  – TCP socket options

Socket options 2
• Options can be manipulated with the following functions
  – getsockopt and setsockopt
  – fcntl (recommended by POSIX)
  – ioctl

getsockopt and setsockopt 1
• These two functions apply only to sockets
    int getsockopt(int sockfd, int level, int optname, void *optval, socklen_t *optlen);

getsockopt and setsockopt 2
    int setsockopt(int sockfd, int level, int optname, const void *optval, socklen_t optlen);

getsockopt and setsockopt 3
  – Both return: 0 if OK, -1 on error
  – sockfd must refer to an open socket descriptor
  – level specifies the code in the system that interprets the option (the general socket code or protocol-specific code). See Figures 7.1 and 7.2
  – optname is an integer identifying the specific option. Use the macro definitions in Figures 7.1 and 7.2
  – optval is a pointer to a variable holding the option value (the new value for setsockopt; getsockopt stores the current value there)
  – optlen is the size of the variable that optval points to. For setsockopt it is passed by value; for getsockopt it is a value-result argument (a pointer)

Option values 1
• There are two types of option values
  – Binary options that enable or disable a certain feature (flags, marked with • in Figures 7.1 and 7.2)
    • 0 to disable
    • nonzero to enable
  – Options that fetch and return specific values that we can either set or examine (values). The actual values are passed between the kernel and user space.

Option values 2
• Data types for values
  – Most option values are integers
  – Some are structures, such as timeval and linger, or character arrays
  – Refer to the Datatype column in Figures 7.1 and 7.2 for details

Socket states 1
• For some socket options, when the option may be set or fetched depends on the state of the socket
• The following options are inherited by a connected TCP socket from the listening socket
  – SO_DEBUG, SO_DONTROUTE, SO_KEEPALIVE, SO_LINGER, SO_OOBINLINE, SO_RCVBUF, SO_SNDBUF, SO_RCVLOWAT, SO_SNDLOWAT, TCP_MAXSEG, and TCP_NODELAY

Socket states 2
  – For TCP, accept does not return the connected socket to the server until the TCP layer has completed the three-way handshake.
  – To ensure that one of the above options is already set on the connected socket when the three-way handshake completes, we must set that option on the listening socket.

Generic socket options 1
• SO_BROADCAST
  – Enables or disables the ability of the socket to send broadcast messages
• SO_DEBUG
  – Supported only by TCP. When enabled, the kernel keeps track of detailed information about all the packets sent or received by the socket
• SO_DONTROUTE
  – Outgoing packets bypass the normal routing mechanism of the underlying protocol

Generic socket options 2
• SO_ERROR
  – Fetches the pending error and clears it
• When an error occurs on a socket, the kernel sets the socket’s so_error variable to one of the Exxx values
• This is called the pending error
• There are two ways for the process to be notified immediately
  – If blocked in a call to select, select returns
  – If using signal-driven I/O, the SIGIO signal is generated
• The process can obtain the value of so_error by fetching the SO_ERROR option.
• If so_error is nonzero when the process calls read or write
  – If read is called and there is no data to return, -1 is returned and errno is set to so_error
  – If read is called and data is queued, the data is returned instead of the error condition
  – If write is called, -1 is returned and errno is set to so_error

Generic socket options 3
• SO_KEEPALIVE
  – If this option is set and no data has been exchanged in either direction for 2 hours, TCP sends a keepalive probe to the peer. One of three scenarios results:
    • Peer responds with an ACK
    • Peer responds with a RST
    • No response

Ways to detect TCP conditions [figure]

Linger 1
• SO_LINGER
  – Specifies how close operates for a connection-oriented protocol
  – The following structure is used:
      struct linger {
          int l_onoff;    /* 0 = off; nonzero = on */
          int l_linger;   /* linger time in seconds */
      };
  – Three scenarios:

Linger 2
• SO_LINGER (cont)
  – If l_onoff is 0, close returns immediately. If any data remains in the socket send buffer, the system will try to deliver it to the peer. The value of l_linger is ignored.
  – If l_onoff is nonzero and l_linger is 0, TCP aborts the connection when close is called. TCP discards any data in the send buffer and sends an RST to the peer.

Linger 3
• SO_LINGER (cont)
  – If l_onoff is nonzero and l_linger is nonzero, the kernel lingers when close is called.
    • If there is any data in the send buffer, the process is put to sleep until either:
      – the data is sent and acknowledged, or
      – the linger time expires (for a nonblocking socket the process will not wait for close to complete)
    • When using this feature, the return value of close must be checked. If the linger time expires before the remaining data is sent and acknowledged, close returns -1 with errno set to EWOULDBLOCK, and any data remaining in the send buffer is discarded.
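The three SO_LINGER scenarios map directly onto the two fields of struct linger. The following is a minimal sketch, not code from the course or from Stevens' library: the helper names set_linger, get_linger, and close_checked are hypothetical, and only setsockopt, getsockopt, and close are real system calls.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: enable lingering on close for `seconds` seconds.
 * seconds == 0 selects the abortive close (RST) scenario for TCP. */
int set_linger(int sockfd, int seconds)
{
    struct linger lg;

    assert(seconds >= 0);          /* a negative linger time makes no sense */
    lg.l_onoff = 1;                /* nonzero = on */
    lg.l_linger = seconds;         /* linger time in seconds */
    return setsockopt(sockfd, SOL_SOCKET, SO_LINGER, &lg, sizeof(lg));
}

/* Hypothetical helper: fetch the current SO_LINGER setting into *out. */
int get_linger(int sockfd, struct linger *out)
{
    socklen_t len = sizeof(*out);  /* value-result: buffer size on input */

    return getsockopt(sockfd, SOL_SOCKET, SO_LINGER, out, &len);
}

/* Hypothetical helper: close the socket and report a failed close,
 * for example when the linger time expires before the data is acknowledged. */
int close_checked(int sockfd)
{
    if (close(sockfd) == -1) {
        perror("close");
        return -1;
    }
    return 0;
}
```

A caller would typically run set_linger(fd, 5) once after creating the socket and use close_checked(fd) instead of a bare close, so that an expired linger time is not silently ignored.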
close scenarios 1 [figure]
close scenarios 2a [figure]
close scenarios 2b [figure]
close scenarios 3 [figure]

close scenarios 4
• An application-level ACK confirms the receipt of data

close and shutdown [figure]

Generic socket options 4
• SO_OOBINLINE
  – Out-of-band data is placed in the normal input queue
• SO_RCVBUF and SO_SNDBUF
  – Get/set the receive buffer size and the send buffer size
  – Suitable sizes are related to the capacity of the connection
• SO_RCVLOWAT and SO_SNDLOWAT
  – Set the low-water marks that determine when a socket is considered readable or writable

Generic socket options 5
• SO_RCVTIMEO and SO_SNDTIMEO
  – Place a timeout on socket receives and sends
  – They affect the read and write families of functions
• SO_TYPE
  – Returns the socket type, such as SOCK_STREAM or SOCK_DGRAM
• SO_USELOOPBACK
  – Applies only to routing sockets

Generic socket options 6
• SO_REUSEADDR
  – Allows a listening server to start and bind its well-known port even if previously established connections exist that use this port as their local port
  – Allows multiple instances of the same server to be started on the same port, as long as each instance binds a different local IP address
  – Allows a single process to bind the same port to multiple sockets, as long as each bind specifies a different local IP address
  – Allows completely duplicate bindings: a bind of an IP address and port when that same IP address and port are already bound to another socket (normally only to support multicasting)
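The first SO_REUSEADDR case only helps if the option is set before bind is called. A minimal sketch of a listening-socket setup follows, assuming IPv4 and the wildcard address; make_listener and option_enabled are hypothetical helper names, not functions from the course library.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

/* Hypothetical helper: 1 if a binary option is enabled, 0 if disabled,
 * -1 on error. */
int option_enabled(int sockfd, int level, int optname)
{
    int val = 0;
    socklen_t len = sizeof(val);   /* value-result argument */

    if (getsockopt(sockfd, level, optname, &val, &len) == -1)
        return -1;
    return val != 0;               /* any nonzero value means "enabled" */
}

/* Hypothetical helper: create a TCP listening socket on `port`
 * (host byte order, 0 = let the kernel choose) bound to the wildcard
 * address, with SO_REUSEADDR set before bind. */
int make_listener(unsigned short port, int backlog)
{
    int on = 1;
    struct sockaddr_in addr;
    int fd;

    assert(backlog > 0);
    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd == -1)
        return -1;
    /* Must happen before bind to have any effect on the bind. */
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) == -1)
        goto fail;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) == -1)
        goto fail;
    if (listen(fd, backlog) == -1)
        goto fail;
    return fd;
fail:
    close(fd);
    return -1;
}
```

Options from the inherited list (for example SO_KEEPALIVE) could be set at the same point in make_listener, so that sockets returned by accept already carry them.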
IPv4 socket options 1
• IP_HDRINCL
  – Lets a raw IP socket build its own IP header
  – We build the complete header, except that the kernel computes the checksum, sets the ID field, and fills in the source IP address if we set it to INADDR_ANY
• IP_OPTIONS
  – Allows us to set IP options in the IP header
• IP_RECVDSTADDR
  – Causes the destination IP address of a received UDP datagram to be returned as ancillary data by recvmsg

IPv4 socket options 2
• IP_RECVIF
  – Causes the index of the interface on which a UDP datagram is received to be returned as ancillary data by recvmsg
• IP_TOS
  – Get/set the type-of-service field in the IP header
• IP_TTL
  – Get/set the default TTL

TCP socket options 1
• TCP_KEEPALIVE (new with POSIX)
  – Specifies the idle time in seconds for the connection before TCP starts sending keepalive probes
  – It is effective only when the SO_KEEPALIVE socket option is enabled
• TCP_MAXRT (new with POSIX)
  – Specifies the amount of time before a connection is broken once TCP starts retransmitting data

TCP socket options 2
• TCP_MAXSEG
  – Get/set the maximum segment size for a TCP connection
• TCP_NODELAY
  – Disables TCP’s Nagle algorithm

Nagle algorithm [figure]

fcntl function 1
• fcntl
    int fcntl(int fd, int cmd, ... /* int arg */);
  – Returns: depends on cmd if OK, -1 on error

fcntl and ioctl [figure]

fcntl function 2
• fcntl provides the following features related to network programming
  – Nonblocking I/O (every fcntl call can fail, so check each return value)
      if ((flags = fcntl(fd, F_GETFL, 0)) == -1)
          return -1;                      /* F_GETFL failed */
      if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
          return -1;                      /* F_SETFL failed */
  – Signal-driven I/O
      if ((flags = fcntl(fd, F_GETFL, 0)) == -1)
          return -1;                      /* F_GETFL failed */
      if (fcntl(fd, F_SETFL, flags | O_ASYNC) == -1)
          return -1;                      /* F_SETFL failed */
  – Set the socket owner to receive SIGIO signals
      if (fcntl(fd, F_SETOWN, getpid()) == -1)
          return -1;                      /* F_SETOWN failed */

UDP client/server [figure]

recvfrom and sendto 1
    ssize_t recvfrom(int sockfd, void *buff, size_t nbytes, int flags, struct sockaddr *from, socklen_t *addrlen);
    ssize_t sendto(int sockfd, const void *buff, size_t nbytes, int flags, const struct sockaddr *to, socklen_t addrlen);
• Both return: number of bytes read or written if OK, -1 on error

recvfrom and sendto 2
• sockfd, buff, and nbytes are identical to the first three arguments of read and write
• flags is normally set to 0, but can be set for advanced functions
• The final two arguments to recvfrom are similar to the final two arguments to accept. Both can be NULL if we do not need the sender’s address.
• The final two arguments to sendto are similar to the final two arguments to connect
• Sending a datagram of 0 bytes is OK; likewise, a return value of 0 from recvfrom is OK

Simple UDP C/S
• Refer to udpcliserv/udpserv01.c, lib/dg_echo.c, udpcliserv/udpcli01.c, and lib/dg_cli.c for details
• Sockets are created with type SOCK_DGRAM

UDP C/S with two clients
• If a datagram is lost, the client will wait forever
• A timeout is needed, but is not enough on its own (duplicate problem)

Comparison with TCP C/S [figure]

Verifying Received Response
• Any process that knows the client’s ephemeral port could send datagrams to the client.
• Change the call to recvfrom so that it returns the IP address and port of the sender of each reply.
• Check that each reply comes from the intended server.

Server Not Running
• What happens and why?
• An ARP request may be sent first
• The server host responds with an ICMP “port unreachable” message
• The client blocks forever in its call to recvfrom
• This is called an asynchronous error
• Basic rule: asynchronous errors are not returned for a UDP socket unless it is connected.
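Two of the client-side problems above (a lost datagram blocks recvfrom forever, and any process can send to the client's ephemeral port) can be handled together. The sketch below is only illustrative and assumes IPv4: recv_from_server sets SO_RCVTIMEO and then discards datagrams whose source address or port differs from the server's. Both helper names are hypothetical, and comparing raw addresses is a simplification that does not cope with a multihomed server replying from another address.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

/* Hypothetical helper: UDP socket bound to 127.0.0.1 on an ephemeral port.
 * The address actually assigned by the kernel is written to *self. */
int udp_loopback_socket(struct sockaddr_in *self)
{
    socklen_t len = sizeof(*self);
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd == -1)
        return -1;
    memset(self, 0, sizeof(*self));
    self->sin_family = AF_INET;
    self->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    self->sin_port = htons(0);                 /* kernel picks the port */
    if (bind(fd, (struct sockaddr *)self, sizeof(*self)) == -1 ||
        getsockname(fd, (struct sockaddr *)self, &len) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: receive one datagram that really comes from `server`.
 * Returns its length, or -1 on error or timeout (errno is then
 * EAGAIN/EWOULDBLOCK). The timeout applies to each recvfrom call. */
ssize_t recv_from_server(int fd, void *buf, size_t nbytes,
                         const struct sockaddr_in *server, int timeout_sec)
{
    struct timeval tv;

    assert(timeout_sec > 0);
    tv.tv_sec = timeout_sec;
    tv.tv_usec = 0;
    if (setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) == -1)
        return -1;

    for (;;) {
        struct sockaddr_in from;
        socklen_t len = sizeof(from);
        ssize_t n = recvfrom(fd, buf, nbytes, 0,
                             (struct sockaddr *)&from, &len);

        if (n == -1)
            return -1;                         /* error or timeout */
        if (from.sin_family == AF_INET &&
            from.sin_addr.s_addr == server->sin_addr.s_addr &&
            from.sin_port == server->sin_port)
            return n;                          /* reply from the server */
        /* Otherwise: datagram from some other sender, ignore it. */
    }
}
```

A timeout alone still leaves the duplicate problem open: the client cannot tell whether the request or the reply was lost, so a retransmission scheme needs more than this.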
Server not running
• If the server is not running when the client sends a datagram to it, the server host responds with an ICMP “port unreachable” message
• But the client just blocks in the call to recvfrom, with no error returned
  – This error is called an asynchronous error, since it is caused by a previous sendto
• The basic rule is that asynchronous errors are not returned for a UDP socket unless the socket is connected
  – The reason is that, if a socket is not connected and has sent to multiple destinations, it is not clear which destination is unreachable

Summary of UDP C/S [figure]
Summary of UDP C/S (cont.) [figure]
Socket address information [figure]

Connected UDP sockets 1
• Three things change with connected UDP sockets
  – We can no longer specify the destination IP address and port for an output operation
  – There is no point in using recvfrom, since every datagram received comes from the peer the socket is connected to
  – Asynchronous errors are returned to the process

Connected UDP sockets 2 [figure]
Connected UDP sockets 3 [figure]

Connected UDP sockets 4
• A UDP client or server can call connect only if that process uses the UDP socket to communicate with exactly one peer

Multiple calls to connect
• We can call connect again on a connected UDP socket to
  – Specify a new IP address and port
  – Unconnect the socket, by setting the family member of the socket address structure to AF_UNSPEC. This might return an error of EAFNOSUPPORT, but that is OK
• Performance
  – When an application needs to send multiple datagrams to the same peer, it is more efficient to use a connected socket than an unconnected socket.
• Outgoing interface for connected UDP sockets
  – The outgoing interface (local IP address and port) can be determined using getsockname

Flow control in UDP
• There is no flow control in UDP
  – If a client sends UDP datagrams very fast to a slow server, the number of datagrams received may be smaller than the number of datagrams sent.
  – The reason is that the receive buffer of the UDP socket is limited. When it is full, arriving datagrams are dropped.
• Increasing the socket receive buffer may help to some extent, but cannot fix the problem
  – You can try to change the size of the UDP socket receive buffer with the socket option SO_RCVBUF
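The connected-UDP operations and the receive-buffer adjustment from the last slides can be sketched as three small helpers. This is an illustration under assumptions, not the course's code: udp_connect, udp_unconnect, and set_rcvbuf are hypothetical names, IPv4 is assumed, and the buffer size the kernel reports back may differ from the size requested (some systems round or cap it).

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

/* Hypothetical helper: connect a UDP socket to `peer` and report the local
 * address and port the kernel chose (the outgoing interface) in *local.
 * No datagram is sent by connect on a UDP socket. */
int udp_connect(int fd, const struct sockaddr_in *peer,
                struct sockaddr_in *local)
{
    socklen_t len = sizeof(*local);

    assert(peer != NULL && local != NULL);
    if (connect(fd, (const struct sockaddr *)peer, sizeof(*peer)) == -1)
        return -1;
    return getsockname(fd, (struct sockaddr *)local, &len);
}

/* Hypothetical helper: unconnect a UDP socket by connecting to AF_UNSPEC.
 * EAFNOSUPPORT is treated as success, as noted on the slide. */
int udp_unconnect(int fd)
{
    struct sockaddr_in unspec;

    memset(&unspec, 0, sizeof(unspec));
    unspec.sin_family = AF_UNSPEC;
    if (connect(fd, (struct sockaddr *)&unspec, sizeof(unspec)) == -1 &&
        errno != EAFNOSUPPORT)
        return -1;
    return 0;
}

/* Hypothetical helper: request a receive buffer of `bytes` and return the
 * size the kernel reports afterwards, or -1 on error. */
int set_rcvbuf(int fd, int bytes)
{
    int actual = 0;
    socklen_t len = sizeof(actual);

    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes)) == -1)
        return -1;
    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &actual, &len) == -1)
        return -1;
    return actual;
}
```

Reading the value back with getsockopt is the only reliable way to learn the buffer size actually in effect; a larger buffer delays datagram loss under a fast sender but does not replace application-level flow control.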