Overview
• Last Lecture
  – Daemon processes and advanced I/O functions
• This Lecture
  – Unix domain protocols and non-blocking I/O
  – Source: Chapters 15, 16, and 17 of Stevens’ book
• Next Lecture
  – Advanced UDP sockets and Threads
  – Source: Chapters 22 and 26 of Stevens’ book

TELE 402 Lecture 10: Unix domain …

Unix domain sockets
• A way of performing client-server communication on a single host using the same socket API
• Two types: stream and datagram
• Why use Unix domain sockets?
  – Unix domain sockets are often twice as fast as TCP sockets when both peers are on the same host
    • Example: X Window System
  – Can be used to pass descriptors between processes on the same host
  – Can provide the client’s credentials (user ID and group IDs) to the server for an additional security check (newer implementations)

Unix domain socket protocol address
• Addresses are pathnames within the normal filesystem
• These files cannot be read from or written to except as a socket

Socket address structure
struct sockaddr_un {
    sa_family_t sun_family;
    char        sun_path[104];
};
• sun_family should be AF_LOCAL
• sun_path is a pathname string terminated with a \0. The unspecified address is indicated by a null string as the pathname.
• The pathname should be an absolute pathname, not a relative pathname.
• The macro SUN_LEN takes a pointer to a sockaddr_un structure and returns its length, including the non-null bytes of the pathname.

socketpair function 1
• Creates two sockets that are then connected together
int socketpair(int family, int type, int protocol, int sockfd[2]);
• family must be AF_LOCAL
• protocol must be 0
• type can be either SOCK_STREAM or SOCK_DGRAM.

socketpair function 2
• The two sockets created are returned as sockfd[0] and sockfd[1], which are unnamed.
• There is no implicit bind involved.
• They form a stream pipe if their type is SOCK_STREAM. The pipe is full-duplex.
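The full-duplex behaviour of a stream pipe can be checked with a short sketch. The function name socketpair_roundtrip and its buffer sizes are illustrative assumptions, not part of the lecture's example code. It sends a message from sockfd[0] to sockfd[1] and then back again over the same pair. For brevity it assumes each small message arrives in a single read, which a byte stream does not guarantee in general.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: send msg from sockfd[0] to sockfd[1], then echo it
 * back from sockfd[1] to sockfd[0]. Returns 0 on success, -1 on failure. */
int socketpair_roundtrip(const char *msg, char *reply, size_t replylen)
{
    int sockfd[2];
    char buf[128];
    size_t len = strlen(msg);
    ssize_t n;
    int rc = -1;

    /* Keep the sketch simple: the message must fit in both buffers. */
    assert(len > 0 && len < sizeof buf && len < replylen);

    /* family AF_LOCAL, protocol 0, as the slide requires. */
    if (socketpair(AF_LOCAL, SOCK_STREAM, 0, sockfd) < 0)
        return -1;

    /* Direction 1: sockfd[0] -> sockfd[1]. */
    if (write(sockfd[0], msg, len) != (ssize_t)len)
        goto done;
    n = read(sockfd[1], buf, sizeof buf);
    if (n != (ssize_t)len)
        goto done;

    /* Direction 2: the same pair carries data the other way. */
    if (write(sockfd[1], buf, (size_t)n) != n)
        goto done;
    n = read(sockfd[0], reply, replylen - 1);
    if (n != (ssize_t)len)
        goto done;

    reply[n] = '\0';
    rc = 0;
done:
    close(sockfd[0]);
    close(sockfd[1]);
    return rc;
}
```

Neither socket is bound to a pathname here, which matches the point that the pair is unnamed and no implicit bind takes place.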

Differences from inet sockets 1
– Default file permissions for a pathname created by bind should be 0777, modified by the umask value.
– The pathname associated with a Unix domain socket should be an absolute pathname, not a relative name.
– The pathname specified in a call to connect must be a pathname currently bound to an open Unix domain socket of the same type.
– A bind will fail if the pathname already exists (use unlink before bind).
– The permission testing for connect of a Unix domain socket is the same as if open had been called for write-only access.

Differences from inet sockets 2
– Unix domain stream sockets are similar to TCP sockets
  • They provide a byte stream with no record boundaries.
– If a call to connect finds that the listening socket’s queue is full, ECONNREFUSED is returned immediately.
– Unix domain datagram sockets are similar to UDP sockets
  • They provide an unreliable datagram service that preserves record boundaries.
– Sending a datagram on an unbound Unix domain datagram socket does not bind a pathname to the socket, so the receiver has no address to reply to unless the sender calls bind first.

Passing descriptors 1
• Descriptors can be shared between processes in the following ways
  – A child process shares all the open descriptors with the parent after a call to fork
  – All descriptors normally remain open when exec is called
  – Descriptors can be passed between processes over Unix domain sockets using sendmsg and recvmsg

Passing descriptors 2
• Steps involved in passing a descriptor
  – Create Unix domain sockets (preferably SOCK_STREAM) and connect them for communication between a server and a client
  – One process opens a descriptor. Any type of descriptor can be exchanged.
  – Sender builds a msghdr structure containing the descriptor to be passed, and calls sendmsg with the structure across one of the Unix domain sockets
  – Receiver calls recvmsg to receive the descriptor from the other Unix domain socket.
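The four steps above can be sketched in a single process. This is a minimal illustration under stated assumptions, not the lecture's lib/write_fd.c or lib/read_fd.c: the names send_fd, recv_fd, and pass_fd_demo are hypothetical, and the descriptor travels as SCM_RIGHTS ancillary data at level SOL_SOCKET. The demo passes the read end of a pipe across a socketpair and then reads through the received copy.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one descriptor. */
union fd_control {
    struct cmsghdr align;
    char buf[CMSG_SPACE(sizeof(int))];
};

/* Hypothetical helper: pass fd_to_send across the Unix domain socket sock. */
static int send_fd(int sock, int fd_to_send)
{
    char byte = 0;                       /* one byte of ordinary data */
    struct iovec iov = { &byte, 1 };
    union fd_control ctl;
    struct msghdr msg;
    struct cmsghdr *cm;

    memset(&msg, 0, sizeof msg);
    memset(&ctl, 0, sizeof ctl);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof ctl.buf;

    cm = CMSG_FIRSTHDR(&msg);
    cm->cmsg_level = SOL_SOCKET;
    cm->cmsg_type = SCM_RIGHTS;          /* ancillary data carries descriptors */
    cm->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cm), &fd_to_send, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one descriptor; returns it, or -1 on error. */
static int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { &byte, 1 };
    union fd_control ctl;
    struct msghdr msg;
    struct cmsghdr *cm;
    int fd;

    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof ctl.buf;

    if (recvmsg(sock, &msg, 0) <= 0)
        return -1;
    cm = CMSG_FIRSTHDR(&msg);
    if (cm == NULL || cm->cmsg_level != SOL_SOCKET ||
        cm->cmsg_type != SCM_RIGHTS || cm->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;
    memcpy(&fd, CMSG_DATA(cm), sizeof(int));
    return fd;
}

/* Pass the read end of a pipe over a socketpair, then read "hello"
 * through the received descriptor. Returns 0 on success, -1 on failure. */
int pass_fd_demo(char *out, size_t outlen)
{
    int sv[2] = { -1, -1 }, p[2] = { -1, -1 }, got = -1, rc = -1;
    ssize_t n;

    assert(outlen > 5);
    if (socketpair(AF_LOCAL, SOCK_STREAM, 0, sv) < 0 || pipe(p) < 0)
        goto done;
    if (send_fd(sv[0], p[0]) < 0 || (got = recv_fd(sv[1])) < 0)
        goto done;
    if (write(p[1], "hello", 5) != 5)
        goto done;
    n = read(got, out, outlen - 1);      /* read via the passed descriptor */
    if (n != 5)
        goto done;
    out[n] = '\0';
    rc = 0;
done:
    if (got >= 0) close(got);
    if (p[0] >= 0) close(p[0]);
    if (p[1] >= 0) close(p[1]);
    if (sv[0] >= 0) close(sv[0]);
    if (sv[1] >= 0) close(sv[1]);
    return rc;
}
```

In a real client and server the sender and receiver would be separate processes, and the single data byte sent alongside the descriptor is one simple way to mark when a descriptor is being passed.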
• Client and server must have an application protocol so they know when the descriptor is to be passed.

Example 1
• Refer to unixdomain/mycat.c, unixdomain/myopen.c, unixdomain/openfile.c, lib/read_fd.c, and lib/write_fd.c

Example 2

Passing user credentials 1
• User credentials (user ID, group IDs) can be passed along a Unix domain socket as the fcred structure
struct fcred {
    uid_t fc_ruid;                /* real user ID */
    gid_t fc_rgid;                /* real group ID */
    char  fc_login[MAXLOGNAME];   /* login name */
    uid_t fc_uid;                 /* effective user ID */
    short fc_ngroups;             /* number of groups */
    gid_t fc_groups[NGROUPS];     /* supplementary group IDs */
};

Passing user credentials 2
• The above information is always available on a Unix domain socket, subject to the following conditions
  – The credentials are sent as ancillary data when data is sent on the Unix domain socket, but only if the receiver of the data has enabled the LOCAL_CREDS socket option. The level for this option is 0.
  – On a datagram socket, the credentials accompany every datagram.
    On a stream socket, the credentials are sent only once (the first time data is sent)
  – Credentials cannot be sent along with a descriptor
  – Users are not able to forge credentials

Distributed Shared Memory
• Use local duplicate of the shared memory
• Consistency maintenance
  – Message passing based on UDP
  – Stop and wait protocol
  – Client/server model
  – Two connections between any pair of nodes

Blocking and nonblocking 1
• Input operations: read, readv, recv, recvfrom, and recvmsg
  – Blocking: if there is no data available in the socket receive buffer, the process is put to sleep
  – Nonblocking: if there is no data available, the process is returned an error of EWOULDBLOCK

Blocking and nonblocking 2
• Output operations: write, writev, send, sendto, and sendmsg
  – Blocking: if there is no room in the socket send buffer, the process is put to sleep
  – Nonblocking: if there is no room at all in the socket send buffer, the process is returned an error of EWOULDBLOCK
  – In general UDP does not block since it does not have a socket send buffer. Some implementations might block in the kernel due to buffering and flow control.

Blocking and nonblocking 3
• Accepting incoming connections: accept
  – Blocking: if there is no new connection available, the process is put to sleep
  – Nonblocking: if there is no new connection available, the process is returned an error of EWOULDBLOCK
• Initiating outgoing connections: connect
  – Blocking: the process is blocked for at least the round trip time (RTT) to the server
  – Nonblocking: if a connection cannot be established immediately, the connection establishment is initiated but the error of EINPROGRESS is returned
• Some connections can be established immediately, e.g.
  when the server and the client are on the same host

Example 1
• Nonblocking reads and writes for the str_cli function
• Refer to strclinonb.c

Example 1 (cont.)
• Buffer for data from standard input going to the socket

Example 1 (cont.)
• Buffer for data from the socket going to standard output

Example Nonblocking Timeline

Simple version of example 1
• Use fork to remove blocking factors
  – Refer to strclifork.c

Nonblocking connect 1
• To set a socket nonblocking, use fcntl to set the O_NONBLOCK flag
• Three uses for nonblocking connect
  – Overlap other processing with the three-way handshake
    • Should use select to test the connection later
  – Establish multiple connections at the same time
  – Shorten the timeout for connect using select with a specified time limit
• Example
  – Overlap other processing with the three-way handshake
  – Use select to shorten the timeout.

Nonblocking connect 2
• There are a couple of details to attend to if we use this technique:
  – If the server is on the same host, the connection is normally established immediately. We need to handle this.
  – Berkeley-derived implementations have the following rules:
    • The descriptor is writable when the connection completes successfully
    • If connection establishment encounters an error, the descriptor becomes both readable and writable

Web client
• Use multiple connections to send requests and to receive responses (refer to nonblock/web.c)
• Control flow
  – Associate each request with a nonblocking socket whose connection is initiated, depending on the maximum allowable connections.
  – Use select to wait for any socket to be ready
  – Scan the request array to find out if their sockets are readable or writable, and react to each situation accordingly
    • Writable: send request.
    • Readable: receive response.
  – Repeat the above until all requests are processed

Nonblocking accept
• Normally nonblocking accept is not necessary if we use select, since when select returns there must be a completed connection
• However, if the server is busy doing something else between the call to select and the call to accept, the client may send a RST to abort the connection in that window. The completed connection is then removed from the queue, so accept blocks until some other connection arrives
• To fix the problem
  – Always set a listening socket nonblocking
  – Ignore the following errors on the call to accept: EWOULDBLOCK, ECONNABORTED, EPROTO, and EINTR

ioctl
• ioctl has traditionally been the system interface used for everything that did not fit into some other nicely defined category
• POSIX is getting rid of ioctl for functionality it standardizes, by defining specific wrapper functions instead
• However, numerous ioctls remain for implementation-dependent features related to network programming
  – Obtaining the interface information
  – Accessing the routing table
  – Accessing the ARP cache
• The ioctls introduced here are implementation dependent and may not be supported by Linux

ioctl
int ioctl(int fd, int request, void *arg)
• Requests can be divided into six categories
  – Socket operations
  – File operations
  – Interface operations
  – ARP cache operations
  – Routing table operations
  – STREAMS system

Interface configuration
• Get interface configuration information
  – Use SIOCGIFCONF, SIOCGIFFLAGS, and SIOCGIFBRDADDR requests
  – And others
• The ifconf structure is used as the argument for SIOCGIFCONF (the per-interface requests take an ifreq structure)

SIOCGIFCONF