Operating Systems: IPC

Inter-Process Communication : Message Passing

• Processes can communicate through shared areas of memory
  – the Mutual Exclusion problem and Critical Sections
• Semaphores - a synchronisation abstraction
• Monitors - a higher level abstraction
• Inter-Process Message Passing much more useful for information transfer
  – can also be used just for synchronisation
  – can co-exist with shared memory communication
• Two basic operations : send(message) and receive(message)
  – message contents can be anything mutually comprehensible
    » data, remote procedure calls, executable code etc.
  – usually contains standard fields
    » destination process ID, sending process ID for any reply
    » message length
    » data type, data etc.

• Fixed-length messages:
  – simple to implement - can have pool of standard-sized buffers
    » low overheads and efficient for small lengths
      - copying overheads if fixed length too long
  – can be inconvenient for user processes with variable amount of data to pass
    » may need a sequence of messages to pass all the data
    » long messages may be better passed another way e.g. FTP
      - copying probably involved, sometimes multiple copying into kernel and out
• Variable-length messages:
  – more difficult to implement - may need a heap with garbage collection
    » more overheads and less efficient, memory fragmentation
  – more convenient for user processes

• Communication links between processes
  – not concerned with physical implementation e.g. shared memory, processor bus, network etc.
  – rather with issues of its logical implementation
• Issues:
  – how are links established?
  – can a link be associated with more than two processes?
  – how many links can there be between every pair of processes?
  – what is the capacity of a link?
    » i.e. buffer space and how much
  – fixed v. variable length messages
  – unidirectional v. bidirectional ?
    » can messages flow in one or both directions between two linked processes
    » unidirectional if each linked process can either send or receive but not both and each link has at least one receiver process connected to it

• Naming of links - direct and indirect communications
• Direct:
  – each process wanting to communicate must explicitly name the recipient or sender of the communication
  – send and receive primitives defined:
        send ( P, message )       : send a message to process P
        receive ( Q, message )    : receive a message from process Q
  – a link established automatically between every pair of processes that want to communicate
    » processes only need to know each other’s identity
  – link is associated with exactly two processes
  – link is usually bidirectional but can be unidirectional
  – Process A
        while (TRUE) {
            produce an item
            send ( B, item )
        }
  – Process B
        while (TRUE) {
            receive ( A, item )
            consume item
        }

• Asymmetric addressing:
  – only the sender names the recipient
  – recipient not required to name the sender - need not know the sender
        send ( P, message )       : send message to process P
        receive ( id, message )   : receive from any process, id set to sender
• Disadvantage of direct communications :
  – limited modularity - changing the name of a process means changing every sender and receiver process to match
  – need to know process names
• Indirect communications :
  – messages sent to and received from mailboxes (or ports)
    » mailboxes can be viewed as objects into which messages placed by processes and from which messages can be removed by other processes
  – each mailbox has a unique ID
  – two processes can communicate only if they have a shared mailbox
        send ( A, message )       : send a message to mailbox A
        receive ( A, message )    : receive a message from mailbox A
  – a communications link is only established between a pair of processes if they have a shared mailbox
  – a pair of processes can communicate via several
    different mailboxes if desired
  – a link can be either unidirectional or bidirectional
  – a link may be associated with more than two processes
    » allows one-to-many, many-to-one, many-to-many communications
  – one-to-many : any of several processes may receive from the mailbox
    » e.g. a broadcast of some sort
    » which of the receivers gets the message?
      - arbitrary choice of the scheduling system if many waiting?
      - only allow one process at a time to wait on a receive
  – many-to-one : many processes sending to one receiving process
    » e.g. a server providing service to a collection of processes
    » file server, network server, mail server etc.
    » receiver can identify the sender from the message header contents

• many-to-many :
  – e.g. multiple senders requesting service and a pool of receiving servers offering service
    - a server farm
• Mailbox Ownership
  – process mailbox ownership :
    » only the process may receive messages from the mailbox
    » other processes may send to the mailbox
    » mailbox can be created with the process and destroyed when the process dies
      - process sending to a dead process’s mailbox will need to be signalled
    » or through separate create_mailbox and destroy_mailbox calls
      - possibly declare variables of type ‘mailbox’
  – system mailbox ownership :
    » mailboxes have their own independent existence, not attached to any process
    » dynamic connection to a mailbox by processes
      - for send and/or receive

• Buffering - the number of messages that can reside in a link temporarily
  – Zero capacity - queue length 0
    » sender must wait until receiver ready to take the message
  – Bounded capacity - finite length queue
    » messages can be queued as long as queue not full
    » otherwise sender will have to wait
  – Unbounded capacity
    » any number of messages can be queued - in virtual space?
    » sender never delayed
• Copying
  – need to minimise message copying for efficiency
  – copy from sending process into kernel message queue space and then into receiving process?
    » probably inevitable in a distributed system
    » advantage that communicating processes are kept separate
      - malfunctions localised to each process
  – direct copy from one process to the other?
    » from virtual space to virtual space?
    » message queues keep indirect pointers to message in process virtual space
    » both processes need to be memory resident i.e. not swapped out to disc, at time of message transfer
  – shared virtual memory
    » message mapped into virtual space of both sender and receiver processes
    » one physical copy of message in memory ever
    » no copying involved beyond normal paging mechanisms
    » used in MACH operating system

Aside : Mach’s Copy-on-Write mechanism (also used in Linux forks) :
    » single copy of shared material mapped into both processes’ virtual space
    » both processes can read the same copy in physical memory
    » if either process tries to write, an exception to the kernel occurs
    » kernel makes a copy of the material and remaps virtual space of writing process onto it
    » writing process modifies new copy and leaves old copy intact for other process

• Synchronised versus Asynchronous Communications
• Synchronised:
  – send and receive operations blocking
    » sender is suspended until receiving process does a corresponding read
    » receiver suspended until a message is sent for it to receive
  – properties :
    » processes tightly synchronised - the rendezvous of Ada
    » effective confirmation of receipt for sender
    » at most one message can be outstanding for any process pair
      - no buffer space problems
    » easy to implement, with low overhead
  – disadvantages :
    » sending process might want to continue after its send operation without waiting for confirmation of receipt
    » receiving process might want to do something else if no message is waiting
      to be received

• Asynchronous :
  – send and receive operations non-blocking
    » sender continues when no corresponding receive outstanding
    » receiver continues when no message has been sent
  – properties :
    » messages need to be buffered until they are received
      - amount of buffer space to allocate can be problematic
      - a process running amok could clog the system with messages if not careful
    » often very convenient rather than be forced to wait
      - particularly for senders
    » can increase concurrency
    » some awkward kernel decisions avoided
      - e.g. whether to swap a waiting process out to disc or not
  – receivers can poll for messages
    » i.e. do a test-receive every so often to see if any messages waiting
    » interrupt and signal programming more difficult
    » preferable alternative perhaps to have a blocking receive in a separate thread

• Other combinations :
  – non-blocking send + blocking receive
    » probably the most useful combination
    » sending process can send off several successive messages to one or more processes if need be without being held up
    » receivers wait until something to do i.e. take some action on message receipt
      - e.g. a server process
      - might wait on a read until a service request arrived,
      - then transfer the execution of the request to a separate thread
      - then go back and wait on the read for the next request
  – blocking send + non-blocking receive
    » conceivable but probably not a useful combination
  – in practice, sending and receiving processes will each choose independently
• Linux file access normally blocking :
  – to set a device to non-blocking (already opened with a descriptor fd) :
        fcntl ( fd, F_SETFL, fcntl ( fd, F_GETFL) | O_NDELAY )

• Missing messages ?
  – message sent but never received
    » receiver crashed?
    » receiver no longer trying to read messages?
  – waiting receiver never receives a message
    » sender crashed?
    » no longer sending messages?
• Crashing processes :
  – kernel knows when processes crash
  – can notify waiting process
    » by synthesised message
    » by signal
    » terminate process

• Time-outs
  – add a time-limit argument to receive
        receive ( mailbox, message, time-limit )
  – if no message received by time-limit after issuing the receive:
    » kernel can generate an artificial synthesised message saying time-out
    » could signal receiver process with a program-error
      - allows receiver to escape from current program context
  – sending process can also be protected using a handshake protocol :
  – sender :
        send (mailbox, message)
        receive (ackmail, ackmess, time-limit)
  – receiver :
        receive (mailbox, message, time-limit)
        if ( message received in time ) send (ackmail, ackmess)

• Lost messages
  – particularly relevant in distributed systems
  – OS’s need to be resilient i.e. not crash either the kernel or user processes
  – options:
    » kernel responsible for detecting message loss and resending
      - need to buffer message until receipt confirmed
      - can assume unreliability and demand receipt confirmation within a time limit
    » sending process responsible for detecting loss
      - receipt confirmation within time-limit and optional retransmit
    » kernel responsible for detecting loss but just notifies sender
      - sender can retransmit if desired
  – long transmission delays can cause problems
    » something delayed a message beyond its time limit but still in transit
      - if message retransmitted, multiple messages flying about
      - need to identify messages e.g. serial numbers, and discard any duplicates
    » scrambled message - checksums etc.
    » see Communications module!

• Message Queuing order
  – first-come-first-served
    » the obvious, simple approach
    » may not be adequate for some situations
      - e.g. a message to a print server from device driver saying printer gone off-line
      - could use signals instead
  – a priority system
    » some types of message go to head of the queue - e.g.
      exception messages
  – user selection
    » allow receiving process to select which message to receive next
      - e.g. from a particular process, to complete some comms protocol
    » to select a message type
      - e.g. a request for a particular kind of service (different mailbox better?)
    » to postpone a message type
      - an inhibit request (implemented in EMAS)
• Authentication
  – of sending and receiving processes - signatures, encryption etc.

Process Synchronisation using Messages

• A mailbox can correspond to a semaphore:
        non-blocking send + blocking receive
  – equivalent to :
        signal (V) by sender + wait (P) by receiver
• Mutual Exclusion :
  – initialise :
    » create_mailbox (mutex)
      send (mutex, null-message)
  – for each process :
    » while (TRUE) {
          receive (mutex, null-message);
          critical section
          send (mutex, null-message);
      }
  – mutual exclusion just depends on whether mailbox is empty or not
    » message is just a token, possession of which gives right to enter C.S.

• Producer / Consumer problem using messages :
  – Binary semaphores : one message token
  – General (counting) semaphores : more than one message token
  – message blocks used to buffer data items
  – scheme uses two mailboxes
    » mayproduce and mayconsume
  – producer :
    » get a message block from mayproduce
    » put data item in block
    » send message to mayconsume
  – consumer :
    » get a message from mayconsume
    » consume data in block
    » return empty message block to mayproduce mailbox

  – parent process creates message slots
    » buffering capacity depends on number of slots created
    » slot = empty message
      capacity = buffering capacity
      create_mailbox ( mayproduce );
      create_mailbox ( mayconsume );
      for (i=0; i<capacity; i++) send (mayproduce, slot);
      start producer and consumer processes
  – producer :
    » while (TRUE) {
          receive (mayproduce, slot);
          slot = new data item
          send (mayconsume, slot);
      }
  – consumer :
    » while (TRUE) {
          receive (mayconsume, slot);
          consume data item in slot
          send (mayproduce, slot);
      }

  – properties :
    » no shared global data accessed
    » all variables local to each process
    » works on a distributed network in principle
    » producers and consumers can be heterogeneous
      - written in different programming languages - just need library support
      - run on different machine architectures - ditto

• Examples of Operating Systems with message passing :
• EMAS - Edinburgh Multi-Access System
  – all interprocess comms done with messages
    » even kernel processes
  – system calls used messages
  – signals to processes used messages

• QNX Real-Time Op. Sys.
  – tightly synchronised send and receive
    » after send, sending process goes into send-blocked state
    » when receiving process issues a receive :
    » message block shared between the processes
      - no extra copying or buffering needed
    » sending process changed to reply-blocked state
    » had the receiver issued a receive before sender issued a send, receiver would have gone into receive-blocked state
    » when receiver has processed the data, it issues a reply, which unblocks the sender and allows it to continue
      - which process runs first is left to the scheduler
  – in error situations, blocked processes can be signalled
    » process is unblocked and allowed to deal with the signal
    » may need an extra monitoring process to keep track of which messages actually got through

• MACH Op. Sys.
  – two mailboxes (or ports) created with each process
    » kernel mailbox : for system calls
    » notify mailbox : for signals
  – message calls : msg_send, msg_receive and msg_rpc
    » msg_rpc (remote procedure call) is a combination of send and receive
      - sends a message and then waits for exactly one return message
  – port_allocate : creates a new mailbox
    » default buffering of eight messages
  – mailbox creator is its owner and is given receive permission to it
    » can only have one owner at a time but ownership can be transferred
  – messages copied into queue as they are sent
    » all with same priority, first-come-first-served
  – messages have : fixed length header, variable length data portion
    » header contains sender and receiver IDs
    » variable part a list of typed data items
      - ownership & receive rights, task states, memory segments etc.

  – when message queue full, sender can opt to :
    » wait indefinitely until there is space
    » wait at most n milliseconds
    » do not wait at all, but return immediately
    » temporarily cache the message
      - kept in the kernel, pending
      - one space for one such message
      - meant for server tasks e.g. to send a reply to a requester, even though the requester’s queue may be full of other messages
  – messages can be received from either one mailbox or from one of a set of mailboxes
  – port_status call returns number of messages in a queue
  – message transfer done via shared virtual memory if possible to avoid copying
    » only works for one system, not for a distributed system

• Linux Op. Sys. IPC
  – shared files, pipes, sockets etc.
• Shared files :
  – one process writes / appends to a file
  – other process reads from it
  – properties:
    » any pair of processes with access rights to the file can communicate
    » large amounts of data can be transferred
    » synchronisation necessary
• Pipes :
  – kernel buffers for info transfer, typically 4096 bytes long, used cyclically
  – e.g.
    used by command shells to pipe output from one command to input of another :
        $ ls | wc -w
    » command shell forks a process to execute each command
    » normally waits until sub-process terminates before continuing
      - but continues executing if ‘&’ appended to command line

    #include <unistd.h>
    #include <stdio.h>
    #include <stdlib.h>

    // helper not shown on the slides; assumed to print the message and exit
    void error (const char *msg) { fputs (msg, stderr); exit (1); }

    int main()
    {
        int fda[2];     // file descriptors [0] : read end, [1] write end
        char buf[1];    // data buffer

        if ( pipe(fda) < 0 ) error ("create pipe failed\n");
        switch ( fork() ) {                 // fork an identical sub-process
        case -1 : error ("fork failed\n");
        case 0:                             // child process is pipe reader
            close ( fda[1] );               // close write end of pipe
            read ( fda[0], buf, 1 );        // read a character
            printf ("%c\n", buf[0] );
            break;
        default:                            // parent process is pipe writer
            close ( fda[0] );               // close read end of pipe
            write (fda[1], "a", 1);         // write a character
            break;
        }
    }

  – fork returns 0 to child process, sub-process ID to parent
  – by default : write blocks when buffer full, read blocks when buffer empty

• Aside : I/O Redirection
  – used by a command shell
  – dup() system call duplicates a file descriptor using first available table entry
  – example : dup() file descriptor number 2 :
        [Figure: two file descriptor tables, entries 0 to 3, with pointers to open file descriptors]
  – example : $ ls | wc -w
    » ls executed by child process, wc executed by parent process
    » file descriptors manipulated with close() and dup() so that pipe descriptors become standard output to ls and standard input to wc
    » execlp replaces current process with the specified command

    #include <unistd.h>
    #include <stdio.h>
    #include <stdlib.h>

    // helper not shown on the slides; assumed to print the message and exit
    void error (const char *msg) { fputs (msg, stderr); exit (1); }

    int main()
    {
        int fda[2];     // file descriptors

        if ( pipe(fda) < 0 ) error ("create pipe failed\n");
        switch ( fork() ) {                         // fork an identical sub-process
        case -1 : error ("fork failed\n");
        case 0:                                     // run ls in child process
            close (1);                              // close standard output
            dup ( fda[1] );                         // duplicate pipe write end
            close ( fda[1] );                       // close pipe write end
            close ( fda[0] );                       // close pipe read end
            execlp ("ls", "ls", (char *) 0);        // execute ls command
            error ("failed to exec ls\n");          // should not get here
            break;
        default:                                    // run wc in parent
            close (0);                              // close standard input
            dup (fda[0] );                          // duplicate pipe read end
            close ( fda[0] );                       // close read end of pipe
            close (fda[1] );                        // close write end of pipe
            execlp ("wc", "wc", "-w", (char *) 0);  // execute wc command
            error ("failed to execute wc\n");
            break;
        }
    }

  – redirection to file similarly
    » closing standard input and output descriptors followed by opening the redirection files to re-use the standard I/O descriptor numbers
  – example : $ cat <file1 >file2

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/stat.h>
    #include <fcntl.h>

    #define WRFLAGS (O_WRONLY | O_CREAT | O_TRUNC)
    #define MODE600 (S_IRUSR | S_IWUSR)

    // helper not shown on the slides; assumed to print the message and exit
    void error (const char *msg) { fputs (msg, stderr); exit (1); }

    int main()
    {
        close (0);                              // close standard input
        if (open("file1", O_RDONLY) == -1) error("open input file failed\n");
        close (1);                              // close standard output
        if (open("file2", WRFLAGS, MODE600) == -1) error("open output failed\n");
        execlp ("cat", "cat", (char *) 0);      // execute cat command
        error("failed to execute 'cat'\n");
    }

• FIFOs - Named Pipes
  – can be used by any set of processes, not just forked from a common ancestor which created the anonymous pipe
  – FIFO files have names and directory links
  – created with a mknod command or system call :
        #define MYMODE (S_IFIFO | S_IRUSR | S_IWUSR)
        mknod ("myfifo", MYMODE, 0);
  – can be opened and used by any process with access permission
  – same functionality as anonymous pipes with blocking by default
  – concurrent writers are permitted
  – writes of up to 4096 bytes are guaranteed to be atomic
• System V Unix Compatibility
  – shared memory - shmget() creates a sharable memory segment identified by an ID which can be used by any other process by attaching it with a shmat() system call
  – semaphores (extremely messy!) and messages also available
    » see : Advanced Programming in the UNIX Environment by W.R.
      Stevens

• Sockets
  – intended for communications across a distributed network
  – uses the connectionless Internet Protocol and IP addresses at the lowest level e.g. 129.215.58.7
  – datagram packets transmitted to destination IP host
  – User Datagram Protocol (UDP) or Transmission Control Protocol (TCP) at the transport level
  – TCP is a connection based protocol, with order of packet arrival guaranteed
  – a socket is a communication end-point
  – once a TCP-socket connection between two processes is made, end-points made to act like ordinary files, using read() and write() system calls.

  – a Client-Server system :
    » creating a socket :
      - sd = socket ( family, type, protocol );
    » binding to a local address :
      - bind ( sd, IP address, addrlen );        // address includes port number
    » connection by client process :
      - connect ( sd, IP address, addrlen );     // server’s IP address
    » server listens for client connection requests :
      - listen ( sd, queuelen );                 // number of requests that can be queued
    » and accepts the request :
      - newsd = accept ( sd, IP address, addrlen );
  – accept() normally blocks if no client process waiting to establish a connection
  – can be made non-blocking for server to enquire whether any clients waiting
  – connectionless communication with UDP datagram sockets also possible using sendto() and recvfrom() system calls

    // Server-side socket demo program
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <strings.h>
    #include <unistd.h>
    #include <fcntl.h>
    #include <errno.h>
    #include <sys/socket.h>
    #include <netinet/in.h>

    void close_socket(int sd)
    {
        int cs;
        if ((cs = close(sd)) < 0) {
            printf("close socket failed: %s\n", strerror(errno));
            exit(1);
        }
    }

    #define SERVER (129u<<24 | 215<<16 | 58<<8 | 7)
    #define MESSAGELEN 1024
    #define SERVER_PORT 5000

    int main()
    {
        int ssd, csd;
        struct sockaddr_in server, client;
        socklen_t sockaddrlen, clientlen;
        int ca;
        char message[MESSAGELEN];
        int messagelen;

        sockaddrlen = sizeof(struct sockaddr_in);

        // create socket
        if ((ssd = socket
                (AF_INET, SOCK_STREAM, 0)) < 0) {
            printf("socket create failed: %s\n", strerror(errno));
            exit(1);
        }
        else printf("server socket created, ssd = %d\n", ssd);

        // bind socket to me
        server.sin_family = AF_INET;
        server.sin_port = htons(SERVER_PORT);       // big/little-endian conversion
        server.sin_addr.s_addr = htonl(SERVER);
        bzero(&server.sin_zero, 8);
        if (bind(ssd, (struct sockaddr *) &server, sockaddrlen) < 0) {
            printf("server bind failed: %s\n", strerror(errno));
            exit(1);
        }

        // listen on my socket for clients
        if (listen(ssd, 1) < 0) {
            printf("listen failed: %s\n", strerror(errno));
            close_socket(ssd);
            exit(1);
        }

        // make socket non-blocking
        fcntl(ssd, F_SETFL, fcntl(ssd, F_GETFL) | O_NDELAY);

        // accept a client (non-blocking)
        clientlen = sockaddrlen;
        while ((csd = accept(ssd, (struct sockaddr *) &client, &clientlen)) < 0) {
            if (errno == EAGAIN) {
                printf("no client yet\n");
                sleep(1);       // wait a sec
            }
            else {
                printf("accept failed: %s\n", strerror(errno));
                close_socket(ssd);
                exit(1);
            }
        }
        ca = ntohl(client.sin_addr.s_addr);
        printf("client accepted, csd = %d, IP = %d.%d.%d.%d\n", csd,
               (ca>>24)&255, (ca>>16)&255, (ca>>8)&255, ca&255);

        // send message to client
        sprintf(message, "Server calling client : hi!\n");
        messagelen = strlen(message)+1;
        if (write(csd, message, messagelen) != messagelen) {
            printf("write failed\n");
            close_socket(ssd);
            exit(1);
        }
        else printf("message sent to client\n");

        // receive message from client
        while (read(csd, message, MESSAGELEN) < 0) {
            if (errno == EAGAIN) {
                printf("no client message yet\n");
                sleep(1);
            }
            else {
                printf("read failed: %s\n", strerror(errno));
                close_socket(ssd);
                exit(1);
            }
        }
        printf("client message was:\n%s", message);
        close_socket(csd);
        close_socket(ssd);
    }

    // Client-side socket demo program
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    #include <fcntl.h>
    #include <errno.h>
    #include <sys/socket.h>
    #include <netinet/in.h>

    void close_socket(int sd)
    {
        int cs;
        if ((cs = close(sd)) < 0) {
            printf("close socket failed: %s\n", strerror(errno));
            exit(1);
        }
    }

    #define SERVER (129u<<24 | 215<<16 | 58<<8 | 7)
    #define MESSAGELEN 1024
    #define SERVER_PORT 5000

    int main()
    {
        int csd;
        struct sockaddr_in server;
        socklen_t sockaddrlen;
        char message[MESSAGELEN];
        int messagelen;

        sockaddrlen = sizeof(struct sockaddr_in);

        // server address
        server.sin_family = AF_INET;
        server.sin_port = htons(SERVER_PORT);
        server.sin_addr.s_addr = htonl(SERVER);

        for (;;) {
            // create socket
            if ((csd = socket(AF_INET, SOCK_STREAM, 0)) < 0) {
                printf("client socket create failed: %s\n", strerror(errno));
                exit(1);
            }
            else printf("client socket created, csd = %d\n", csd);

            // try to connect to server
            if (connect(csd, (struct sockaddr *) &server, sockaddrlen) < 0) {
                printf("connect failed: %s\n", strerror(errno));
                // need to destroy socket before trying to connect again
                close_socket(csd);
                sleep(1);
            }
            else break;
        }
        printf("connected to server\n");

        // make socket non-blocking
        fcntl(csd, F_SETFL, fcntl(csd, F_GETFL) | O_NDELAY);

        // receive a message from server
        while (read(csd, message, MESSAGELEN) < 0) {
            if (errno == EAGAIN) {
                printf("no server message yet\n");
                sleep(1);
            }
            else {
                printf("read failed: %s\n", strerror(errno));
                close_socket(csd);
                exit(1);
            }
        }
        printf("server message was:\n%s", message);

        // send a message to server
        sprintf(message, "Client calling server : ho!\n");
        messagelen = strlen(message)+1;
        if (write(csd, message, messagelen) != messagelen) {
            printf("write failed\n");
            close_socket(csd);
            exit(1);
        }
        else printf("message sent to server\n");
        close_socket(csd);
    }

• Signals
  – the mechanism whereby processes are made aware of events occurring
  – asynchronous - can be received by a process at any time in its execution
  – examples of Linux signal types:
    » SIGINT : interrupt from keyboard
    » SIGFPE : floating point exception
    » SIGKILL : terminate receiving process
    » SIGCHLD : child process stopped or terminated
    » SIGSEGV : segment access violation
  – default action is
    usually for kernel to terminate the receiving process
  – process can request some other action
    » ignore the signal - process will not know it happened
      - SIGKILL and SIGSTOP cannot be ignored
    » restore signal’s default action
    » execute a pre-arranged signal-handling function
      - process can register a function to be called
      - like an interrupt service routine
      - when the handler returns, control is passed back to the main process code and normal execution continues
    » to set up a signal handler:
        #include <signal.h>
        #include <unistd.h>
        void (*signal(int signum, void (*handler)(int)))(int);
    » signal is a call which takes two parameters
      - signum : the signal number
      - handler : a pointer to a function which takes a single integer parameter and returns nothing (void)
    » return value is itself a pointer to a function which:
      - takes a single integer parameter and returns nothing
  – example:
    » gets characters until a newline typed, then goes into an infinite loop
    » uses signals to count ctrl-c’s typed at keyboard until newline typed

    #include <stdio.h>
    #include <signal.h>
    #include <unistd.h>

    int ctrl_c_count = 0;
    void (* old_handler)(int);
    void ctrl_c(int);

    int main ()
    {
        int c;
        old_handler = signal (SIGINT, ctrl_c );
        while ((c = getchar()) != '\n' && c != EOF);    // stop at newline or end of input
        printf("ctrl_c count = %d\n", ctrl_c_count);
        (void) signal (SIGINT, old_handler);
        for (;;);
    }

    void ctrl_c(int signum)
    {
        (void) signal (SIGINT, ctrl_c);     // signals are automatically reset
        ++ctrl_c_count;
    }

• see also the POSIX sigaction() call - more complex but better