Categories of Processes

Process: a program in execution
• Independent processes do not affect others
• Cooperating processes affect each other
  – Share memory and/or resources
  – Overlap I/O and processing
  – Communicate via message passing
• I/O-bound processes
  – Short CPU bursts between I/O operations
• CPU-bound processes
  – Long CPU bursts, little I/O
Note: The term "job" normally refers to a batch-system process

[Figure: A Process in Memory]

Process States
• As a process executes, it changes state
  – new: The process is being created
  – running: Instructions are being executed
  – waiting: The process is waiting for some event to occur
  – ready: The process is waiting to be assigned to a processor
  – terminated: The process has finished execution

What makes up a process?
• Process Control Block
  – Process ID
  – Process state
  – Scheduling information
  – Child and parent data
  – Per thread
    • Program counter
    • CPU registers
    • Stack
  – Allocated resources (file list)
  – Accounting information
  – I/O or wait status information
  – Memory-management data
• Program instructions (text area)
• Program data and heap

[Figure: Process States]

Process Management
Involves migrating processes between queues
Queues: doubly linked lists
Goal: A mix of I/O-bound processes (filling the I/O queues) and CPU-bound processes (filling the ready queue)
• Job queue – All processes in the system
• Ready queue – All processes in memory that are ready to execute
• Device queues – All processes waiting for an I/O request to complete
• Processes migrate between the various queues depending on their state

Process Schedulers
• Long-term (What degree of multiprogramming?)
  – Can be slow; runs infrequently in the background, when processes terminate
• Medium-term (Which process to swap?)
  – Efficient; runs roughly every second, when a process needs memory
• Short-term (Which process to dispatch next?)
  – Must be fast; runs often (e.g.
every 100 ms), after every interrupt

Context Switches
Taking control from one process and giving it to another
• Save the state of the old process and load the saved state of the new process
• Context-switch time is pure overhead; no useful work is done
  – Cost depends on hardware support
• Time slicing gives CPU cycles to processes in a round-robin manner

Process Creation
• Parents spawn children; children spawn others
• Resource sharing options
  – Parent and children share all resources
  – Children share some of the parent's resources
  – No sharing
• Execution options
  – Parent and children execute concurrently
  – Parent waits until children terminate
• Address space options
  – Child is a duplicate of the parent
  – Child space has a program loaded into it

[Figure: A Solaris process spawning tree. Labels: root of all user processes; file management support; remote Telnet/FTP; memory management; X Windows]

POSIX Example
fork: The child is a clone of the parent
exec: Replaces the process's memory with a new program

[Figure: Win32 Example]

Java Example
Runs a separate process, which could be a separate JVM
Gets the process's System.out.println output
Note: A JVM is a separate application that supports multiple threads; only one process resides within a single JVM

Process Termination
• The process tells the OS that it is done (exit(0) or System.exit(0))
  – Resources are de-allocated
  – A status value (usually an integer) is returned
• A parent may terminate child processes (abort())
  – The child has exceeded its allocated resources
  – The child is no longer needed
  – Some operating systems perform cascading termination (automatic termination of child processes when a parent terminates)
• A parent may wait for a child to terminate (wait())
  – The child's pid is returned, facilitating process management
  – Solaris: if a parent terminates, init becomes the parent

Cooperating Processes
Definition: Processes that affect each other's state
• Reasons
  – Share data
  – Speedup: Break a task into pieces and execute them in parallel
  – Increase the modularity of a complex application
  – Enhance user convenience: Pipeline a series of algorithmic
steps
• Modes of Communication
  – Shared Memory:
    • A system call maps shared memory into the logical address space of each process
    • Mutual exclusion techniques are required
    • Faster: Processes access shared memory directly, without OS help
  – Message Passing:
    • Easier to program: No critical sections or locking mechanisms needed
    • Slower: System calls are required for send/receive operations

[Figure: Inter-process Communication. Panels: Message Passing; Shared Memory]

Producer-Consumer Problem
Example: Cooperating processes
• unbounded-buffer: No practical limit on buffer size
• bounded-buffer: A fixed-size buffer
Note: Java can share memory between threads, but not among separate JVM processes

Shared-Memory Solution
    public interface Buffer {
        void produce(Object item);
        Object consume();
    }

    public class BoundedBuffer implements Buffer {
        // Note: count is updated without synchronization; a correct
        // concurrent version needs mutual exclusion around these updates.
        private int count, in, out;
        private Object[] buffer;

        public BoundedBuffer(int size) {
            count = in = out = 0;
            buffer = new Object[size];   // array of 'size' slots
        }

        public void produce(Object item) {
            while (count == buffer.length) ;   // busy-wait while the buffer is full
            buffer[in] = item;
            in = (in + 1) % buffer.length;
            count++;
        }

        public Object consume() {
            while (count == 0) ;               // busy-wait while the buffer is empty
            Object item = buffer[out];
            out = (out + 1) % buffer.length;
            count--;
            return item;
        }
    }

Message Passing
• Message system: processes communicate by sending and receiving messages
  – send(P, message) // Send to process P
  – receive(Q, message) // Receive from process Q
• To communicate, P and Q must:
  – establish a communication link between them
  – exchange messages via send/receive
• The communication link can be:
  – physical (e.g., shared memory, hardware bus)
  – logical (e.g., direct or indirect, synchronous or asynchronous)

Implementation Questions
• How to establish links? Network, memory, hardware, etc.?
• Number of processes sharing a link?
• Number of links per process?
• Capacity? Zero-length (no buffering), bounded queue length, or unbounded?
• Variable or fixed-size messages?
• Unidirectional or bi-directional?
• Synchronous or asynchronous?
Blocking or non-blocking send/receive?
• Symmetry or asymmetry? Do senders and receivers name each other or not?
• Direct or indirect communication? By hard-coded pid, or by mailbox/port?
• Copy message data or remap memory?
• Communication standard? Protocol?
• Persistent or transient? Are messages lost after the owner terminates?

Direct Communication
• Processes name each other explicitly:
  – send(P, message) – send to process P
  – receive(Q, message) – receive from process Q
• Properties of the communication link
  – Links are established automatically
  – A link connects exactly one pair of processes
  – Between each pair there exists exactly one link
  – Links are usually bi-directional

Indirect Communication
• Messages are sent and received through mailboxes (also called ports)
  – A mailbox has a unique id; processes communicate using the id
  – Mailboxes may or may not persist when a process terminates
  – Mailboxes can be owned by a process or by the OS
• Properties of communication links
  – Processes share common mailbox links
  – A link may be associated with many processes
  – Processes may share many communication links
  – A link may be unidirectional or bi-directional
• Operations (OS system calls)
  – Create or destroy mailboxes
  – Send and receive messages through a mailbox
  – Delegate send and receive privileges
  – send(A, message) – send a message to mailbox A
  – receive(A, message) – receive a message from mailbox A

Indirect Communication Issue
• Mailbox sharing
  – P1, P2, and P3 share mailbox A
  – P1 sends; P2 and P3 receive
  – Who gets the message?
• Possible Solutions
  – Allow a link to be shared by at most two processes
  – Allow only one process to receive messages
  – Arbitrarily select the receiver and notify the sender

Synchronization
• Blocking: Synchronous
  – Blocking send: The sender blocks until the message is received
  – Blocking receive: The receiver blocks until a message is available
• Non-blocking: Asynchronous
  – Non-blocking send: The sender sends the message and continues
  – Non-blocking receive: The receiver gets either a valid message or null

Buffering
• Buffering: The link queues messages
• Queue length
  1. Zero capacity (queue length = 0): The sender waits for the receiver (rendezvous)
  2. Bounded capacity (queue length = n): The sender must wait if the link is full
  3. Unbounded capacity (infinite length): The sender never waits

Message Passing Solution
    import java.util.Date;
    import java.util.Vector;

    public class Unbounded {
        private Vector<Object> queue;
        public Unbounded() { queue = new Vector<Object>(); }
        public void send(Object item) { queue.addElement(item); }
        // Non-blocking receive: returns null when no message is queued
        public Object receive() {
            if (queue.size() == 0) return null;
            // remove(int) returns the removed element; removeElementAt(int) returns void
            else return queue.remove(0);
        }
    }

    // Producer fragment; send() and receive() refer to a shared Unbounded instance
    while (true) { send(new Date()); }

    // Consumer fragment
    while (true) {
        Date msg = (Date) receive();
        if (msg != null) System.out.println(msg);
    }

Message Passing in Windows XP
• The server establishes a port
• When a client wants to communicate, a handle is sent to the server and the client
• Small messages (up to 256 bytes) are copied; larger messages use shared memory (memory mapping)

Socket Communication
Definition: A socket is an endpoint for communication
• A socket is identified by an IP address concatenated with a port number (e.g., 161.25.19.8:1625)
• Local host for loopback: 127.0.0.1
• Port numbers below 1024 are considered well-known
• Java: the Socket class for connection-oriented (TCP) sockets; the DatagramSocket or MulticastSocket class for connectionless (UDP) sockets
Note: Communication occurs between a pair of sockets

Java Socket Server
1. Establish a listening port number
2. Accept a client connection
3. Get the connection's output stream
4. Write the data
5. Close the connection
6. Continue listening for requests

Java Socket Client
1. Establish a connection to the server
2. Get the connection's input stream
3. Read the data
4. Close the connection

Remote Procedure Calls (RPC)
• A remote procedure call (RPC) abstracts procedure calls between processes on networked systems
• Higher-level abstraction on top of socket communication
• The data is highly structured (function/method call data)
• Implementation
  – Client side (issues an RPC as if it were a local call)
    1. Calls a stub (a proxy for the server-side procedure)
    2. The stub marshals the parameters for communication
    3. The message is sent to the server
    4. The stub receives the result and returns it to the caller
  – Server side (listens on a designated RPC port)
    1. A stub receives the request, unmarshals the data, and calls the procedure
    2. The returned data is sent back to the waiting client

RPC Considerations
• Reliability: What happens when the server fails? Possible solution: provide redundant servers
• Multiple calls: Retries can trigger duplicate processing. Solution: timestamp messages so that duplicates are detected and each request is processed "at most once"
• Port binding: Solution: use fixed port numbers, or provide a rendezvous (matchmaker) operation to establish port connections
• Data representation
  – Big-endian (most significant byte first) versus little-endian (least significant byte first)
  – Size and format of various data types
  – Serializing memory addresses
  – Solution: machine-independent external data representation (XDR)
• Disadvantage: Overhead compared to local procedure calls
• Advantage: Flexibility and single-point hosting of data

[Figure: RPC Execution]

Remote Method Invocation (RMI)
• Remote Method Invocation: a Java-based mechanism similar to RPC
• Difference from RPC: Links to remote objects, passing serialized (java.io.Serializable) objects as parameters

[Figure: Marshalling Parameters]

Server Side: Define Remote Objects
• Declare an interface to:
  – Specify the signatures of the remotely accessible methods
  – The interface extends java.rmi.Remote, and each method declares that it throws java.rmi.RemoteException
• Implement a program that listens for RMI requests
  – A main method creates an instance of the remote object and registers it under an appropriate name so it can listen for RMI requests through Java's socket scheme
  – The main class implements the RMI interface and extends java.rmi.server.UnicastRemoteObject
  – A default constructor is necessary; it throws a RemoteException if a network failure occurs

[Code listings: RMI Server (the RMI interface and its implementation); RMI Client (uses the same interface as the server)]

Client RMI Call
1. Look up and connect to the host
2. Call the method as if it were local

Implementation Steps
1. Compile the source files
2. Generate the stub
   a. Before Java 1.5: rmic RemoteDateImpl
   b. Java 1.5 and later: this is done automatically
3. Start the RMI registry in the background
   a. Unix: rmiregistry &
   b. Windows: start rmiregistry
4. Connect the server's RMI objects to the registry: java RemoteDateImpl
5. Reference remote objects from the client: java RMIClient

Advantages compared to Socket Programming
1. RMI is a higher-level mechanism; the registry abstracts socket management
2.
Clients do not have to package the data or deal with data representation issues

Threads (Linux: tasks)
A path of execution through a process

Motivation for Threads
• Responsiveness: An application continues executing when a blocking call occurs
• Resource sharing: All threads share an application's resources
• Economy: Creating new heavyweight processes is expensive in time and consumes extra memory
• Parallel processing: Threads can execute simultaneously on different cores
Note: Internal kernel threads concurrently perform OS functions. Servers use threads to handle client requests efficiently

User and Kernel Threads
• User threads: Thread management is done by a user-level threads library without OS support. Fewer system calls, so more efficient
• Kernel threads: Thread management is directly supported by the kernel. More OS overhead
• Tradeoffs: Kernel thread handling incurs more overhead. Pure user-level threads stop the whole application on every blocking call
• Most modern operating systems support kernel threads (Windows, Solaris, Linux, UNIX, Mac OS)
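
The responsiveness motivation listed under "Motivation for Threads" can be illustrated with a small Java sketch. It is only an illustration under stated assumptions: the class ResponsivenessDemo and its methods are hypothetical names, the 200 ms sleep stands in for a real blocking call, and the counting loop stands in for useful work the application would otherwise do.

```java
// Hypothetical demo; class and method names are illustrative, not from the slides.
class ResponsivenessDemo {
    // Stand-in for a blocking call (for example, slow I/O): sleeps for 200 ms.
    static String blockingFetch() throws InterruptedException {
        Thread.sleep(200);
        return "data";
    }

    // Runs the blocking call on a worker thread and counts how many loop
    // iterations the calling thread completes while the worker is blocked.
    static long workWhileWaiting() throws InterruptedException {
        final String[] result = new String[1];
        Thread worker = new Thread(() -> {
            try {
                result[0] = blockingFetch();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // restore the interrupt status
            }
        });
        worker.start();
        long iterations = 0;
        while (worker.isAlive()) {
            iterations++; // placeholder for useful work, e.g. handling user input
        }
        worker.join(); // after join(), the worker's write to result[0] is visible here
        System.out.println("Fetched: " + result[0]);
        return iterations;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Iterations while waiting: " + workWhileWaiting());
    }
}
```

The calling thread keeps making progress while the worker is blocked. In a single-threaded version the same sleep would stall the whole application for the full 200 ms.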
Many-to-One Model
Threads are managed by a run-time library (more efficient)
• A blocking call suspends the whole application
• All threads run on a single core
• Examples: Green threads (on a virtual machine) and GNU Portable Threads (the OS is not aware of the threads)

One-to-One Model
Thread management is done in the kernel
• Disadvantages
  – Increased overhead
  – Upper limit on the total number of threads
• Advantage: Maximum concurrency; blocking OS calls don't suspend applications
• Examples: Windows XP, Linux, Solaris

Many-to-Many Model
Thread management is shared by the kernel and user-level libraries
• A pool of kernel threads is assigned to an application and managed by a thread library
• Advantages
  – Eliminates the limit on the number of user threads
  – Applications don't suspend on blocking OS calls
  – Increased thread efficiency while maintaining concurrency
• Disadvantage: Requires up-calls from the kernel to the thread library
• Example: Windows XP thread fiber library
Two-level threads: A variation of the many-to-many model. The kernel maps threads onto processors, and the run-time thread library maps user threads onto kernel threads

Many-to-Many Thread Scheduling
Issue: How many kernel threads?
• Too many means extra OS overhead
• Too few means the process may block
1. The kernel assigns a group of kernel threads to a process
2. Up-calls (kernel -> thread library) occur:
   a. If the process is about to block
   b. If a blocked kernel thread becomes ready
   c. If thread allocations are to be released (freed)
Note: A kernel-level entity sitting between user and kernel threads assists with thread management. These entities are often called lightweight processes (LWPs)

Threading Issues
• Does spawning a new process duplicate all threads or only the one that is executing? Example: some UNIX systems provide two versions of fork(), one that duplicates all threads and one that duplicates only the calling thread (POSIX fork() duplicates only the calling thread); exec() replaces the entire process, including all of its threads
• How do we cancel a thread? What if it is working with system resources? Approaches: asynchronous or deferred (synchronous) cancellation
• How do we minimize the overhead of continually creating threads?
Answer: Thread pools
• How do threads communicate? Linux: signals notify a thread of an event. Windows: asynchronous procedure calls that function like callbacks. Linux: clone() options determine which resources are shared between threads
• How do we create data that is local to a specific thread? Answer: Thread-specific data
• How are threads scheduled for execution? One possibility: lightweight processes

Pthreads Example (Sum numbers)
    #include <pthread.h>
    #include <stdio.h>
    #include <stdlib.h>

    static int threads;            /* number of threads, shared with addThem */
    static double sum = 0.0;       /* shared total, protected by mtx */
    static pthread_mutex_t mtx;

    void* addThem(void *rank);

    int main(int argc, char *argv[]) {
        pthread_t *handles;
        int *ranks;                /* one rank per thread, so no thread reads a changing loop variable */
        int t;
        threads = (int) strtol(argv[1], NULL, 10);
        pthread_mutex_init(&mtx, NULL);
        handles = malloc(threads * sizeof(pthread_t));
        ranks = malloc(threads * sizeof(int));
        for (t = 0; t < threads; t++) {
            ranks[t] = t;
            pthread_create(&handles[t], NULL, addThem, &ranks[t]);
        }
        for (t = 0; t < threads; t++) pthread_join(handles[t], NULL);
        printf("Total = %f\n", sum);
        free(handles);
        free(ranks);
        pthread_mutex_destroy(&mtx);
        return 0;
    }

    void* addThem(void *rank) {
        int myRank = *(int*) rank;
        double mySum = 0;
        int i, myN = 10000 / threads;   /* assumes threads divides 10000 evenly */
        int first = myN * myRank;
        int last = first + myN;
        for (i = first; i < last; i++) { mySum += i; }
        pthread_mutex_lock(&mtx);
        sum += mySum;                   /* critical section */
        pthread_mutex_unlock(&mtx);
        return NULL;
    }

OpenMP Example (Sum numbers)
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char *argv[]) {
        double sum = 0.0;
        int i, last = 10000;   /* upper bound; assumed to match the Pthreads example */
        int threads = (int) strtol(argv[1], NULL, 10);
        #pragma omp parallel for num_threads(threads) reduction(+:sum)
        for (i = 0; i < last; i++) { sum += i; }
        printf("Total = %f\n", sum);
        return 0;
    }
Note: OpenMP is a widely used industry standard for shared-memory parallel programming

Java Threads
• Java threads are managed by the JVM and generally use an OS-provided thread library
• Java threads may be created by:
  – Implementing the Runnable interface
  – Extending the Thread class
• Java threads start when the start method is called; start:
  – Allocates memory for the thread
  – Calls the run method
• Java threads terminate when they leave the run method

Java Thread States
• The isAlive() method returns true if a thread has been started and has not yet died
• Note: The getState() method returns a Thread.State value: NEW, RUNNABLE, BLOCKED, WAITING, TIMED_WAITING, or TERMINATED

Java Threads (Sum numbers)
    public class Driver extends Thread {
        private static double sum = 0.0;
        private int upper;
        public Driver(int upper) { this.upper = upper; this.start(); }
        public static void main(String[] args) throws InterruptedException {
            Driver thread = new Driver(Integer.parseInt(args[0]));
            thread.join();   // wait for the summing thread to finish
            System.out.println("Total = " + sum);
        }
        public void run() {
            sum = 0;
            for (int i = 0; i < upper; i++) sum += i;
        }
    }

Java Producer/Consumer
    import java.util.Date;
    import java.util.Vector;

    class Producer implements Runnable {
        private Vector<Date> mbox;
        public Producer(Vector<Date> mbox) { this.mbox = mbox; }
        public void run() {
            while (true) {
                try { Thread.sleep((long) (Math.random() * 100)); }
                catch (InterruptedException e) { return; }
                mbox.add(new Date());
            }
        }
    }

    class Consumer implements Runnable {
        private Vector<Date> mbox;
        public Consumer(Vector<Date> mbox) { this.mbox = mbox; }
        public void run() {
            while (true) {
                try { Thread.sleep((long) (Math.random() * 100)); }
                catch (InterruptedException e) { return; }
                if (!mbox.isEmpty()) System.out.println(mbox.remove(0));
            }
        }
    }

    public class Factory {
        public Factory() {
            Vector<Date> mailBox = new Vector<Date>();
            // A Runnable must be wrapped in a Thread to be started
            new Thread(new Producer(mailBox)).start();
            new Thread(new Consumer(mailBox)).start();
        }
        public static void main(String[] args) { Factory server = new Factory(); }
    }

Java Thread Cancellation
Example: Kill the loading of a Web page in a browser
• Asynchronous: Cancel the thread immediately. This can cause problems if the thread has only partially completed a critical operation, and resources owned by the thread may not be reclaimed (Java: the deprecated stop() method)
• Deferred: The thread periodically checks for termination at cancellation points and then cancels itself

Spawn the thread and then interrupt it
    Thread thread = new MyThread();
    thread.start();
    { … }
    thread.interrupt();

Periodically check if interrupted in the thread's run method
    Thread.currentThread().isInterrupted()   // preserves the interrupt status
    Thread.interrupted()                     // clears the interrupt status

Example
    Thread thread = new Thread(new InterruptibleThread());
    thread.start();
    . . .
    thread.interrupt();

    class InterruptibleThread implements Runnable {
        /** This thread runs till interrupted */
        public void run() {
            while (true) {
                /* do processing till a cancellation point is reached */
                if (Thread.currentThread().isInterrupted()) {
                    System.out.println("I'm interrupted!");
                    return;
                }
            }
        }
    }

Signal Handling
Example: Illegal memory access
• UNIX systems use signals to notify a process that an event has occurred
• Signal handlers process signals
  1. A signal is generated by a particular event
  2. It is delivered to a process
  3. It is handled
• Delivery options:
  – Deliver the signal to the thread waiting for the signal (synchronous)
  – Deliver the signal to every thread in the process (e.g., ctrl-C)
  – Deliver the signal to selected threads in the process
  – Assign a specific thread to receive all signals
• User processes can override the kernel's default handler

Thread Pools
A group of threads created in advance that await work
• Advantages
  – Eliminates the overhead of repeated thread creation and destruction
  – Applications can control the size of the pool
  – Avoids creating an unlimited number of threads, which can tax system resources
• Java Options
  – Single thread executor: pool of size 1.
    Executors.newSingleThreadExecutor()
  – Fixed thread executor: pool of fixed size.
    Executors.newFixedThreadPool(int nThreads)
  – Cached thread pool: pool of unbounded size.
    Executors.newCachedThreadPool()
Note: Dynamic thread pools adjust the number of threads based on system load

Thread Pool Example
    import java.util.Date;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    class SomeThread implements Runnable {
        public void run() { System.out.println(new Date()); }
    }

    public class Pool {
        public static void main(String[] args) {
            int numTasks = Integer.parseInt(args[0].trim());
            ExecutorService pool = Executors.newCachedThreadPool();
            for (int i = 0; i < numTasks; i++) pool.execute(new SomeThread());
            pool.shutdown();   // accept no new tasks; already submitted tasks still run
        }
    }

Thread-Specific Data
Each thread can have its own local data
• Motivation: Useful for avoiding critical sections in thread pools
• Java scoping facilities
  – Normal scoping: instance variables belong to instantiated objects
  – Static scoping: variables belong to the class
  – ThreadLocal: the contents of these objects belong to particular threads and are accessed through the get and set methods. They are usually declared static

    class Service {
        // Thread-local objects associate contents with thread instances
        private static ThreadLocal<Object> errorCode = new ThreadLocal<Object>();
        public static void transaction() {
            try { /* some operation */ }
            catch (Exception e) { errorCode.set(e); }
        }
        public static Object getErrorCode() { return errorCode.get(); }
    }
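
To make the per-thread behaviour of ThreadLocal concrete, here is a minimal sketch. It is not taken from the slides: the class ThreadLocalDemo and its methods are hypothetical, and it uses plain Thread objects rather than a pool so that each thread is guaranteed to start from a fresh initial value.

```java
// Hypothetical demo; class and method names are illustrative, not from the slides.
class ThreadLocalDemo {
    // Every thread that touches this variable gets its own copy, initially 0.
    private static final ThreadLocal<Integer> counter = ThreadLocal.withInitial(() -> 0);

    // Increments the calling thread's private counter 'times' times and returns it.
    static int countInThisThread(int times) {
        for (int i = 0; i < times; i++) {
            counter.set(counter.get() + 1);
        }
        return counter.get();
    }

    // Runs the count on two separate threads and returns both results.
    static int[] runOnTwoThreads() throws InterruptedException {
        final int[] results = new int[2];
        Thread a = new Thread(() -> results[0] = countInThisThread(3));
        Thread b = new Thread(() -> results[1] = countInThisThread(5));
        a.start();
        b.start();
        a.join(); // join() makes each thread's write to results visible here
        b.join();
        return results;
    }

    public static void main(String[] args) throws InterruptedException {
        int[] r = runOnTwoThreads();
        System.out.println(r[0] + " " + r[1]); // prints "3 5"
    }
}
```

No lock is needed around the counter because the two threads never share a value. Had counter been an ordinary static int, the two threads would be updating one shared variable and the increments would need mutual exclusion.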