Unix

D u k e S y s t e m s

CPS 310

Unix Broadly Defined

Jeff Chase

Duke University http://www.cs.duke.edu/~chase/cps310

The story so far: process and kernel

• A (classical) OS lets us run programs as processes. A process is a running program instance (with a thread).

– Program code runs with the core in untrusted user mode .

• Processes are protected/isolated.

– Virtual address space is a “fenced pasture”

– Sandbox : can’t get out. Lockbox : nobody else can get in.

• The OS kernel controls everything.

– Kernel code runs with the core in trusted kernel mode .

• OS offers system call APIs.

– Create processes

– Control processes

– Monitor process execution

Processes and the kernel

Programs run as independent processes. data data

Each process has a private virtual address space and one thread.

Protected system calls

...and upcalls

(e.g., signals)

Protected OS kernel mediates access to shared resources.

Threads enter the kernel for

OS services.

The kernel is a separate component/context with enforced modularity.

The kernel syscall interface supports processes, files, pipes, and signals.

Unix fork/exec/exit/wait syscalls

fork parent parent program initializes child context fork child exec wait exit int pid = fork();

Create a new process that is a clone of its parent.

exec*( “ program ” [argvp, envp]);

Overlay the calling process with a new program, and transfer control to it, passing arguments and environment.

exit(status);

Exit with status, destroying the process. int pid = wait*(&status);

Wait for exit (or other status change) of a child , and “ reap ” its exit status.

Recommended: use waitpid() .

Unix fork/exit syscalls

parent fork child int pid = fork();

Create a new process that is a clone of its parent, return new process ID ( pid ) to parent, return 0 to child.

exit(status);

Exit with status, destroying the process.

Note: this is not the only way for a process to exit!

time data data p exit exit pid: 5587 pid: 5588

fork

int pid; int status = 0;

} if (pid = fork ()) {

/* parent */

…..

} else {

/* child */

…..

exit(status);

The fork syscall returns twice:

It returns a zero in the context of the new child process.

It returns the new child process ID

(pid) in the context of the parent.

exit syscall

fork (original concept)

fork in action today

}

{ void dofork() int rc = fork() ; if (rc < 0) { perror("fork failed: "); exit(1);

} else if (rc == 0) { child();

}

} else { parent(rc);

Fork is conceptually difficult but syntactically clean and simple.

I don’t have to say anything about what the new child process “looks like”: it is an exact clone of the parent!

The child has a new thread executing at the same point in the same program.

The child is a new instance of the running program: it has a “copy” of the entire address space. The “only” change is the process ID and return code rc!

The parent thread continues on its way.

The child thread continues on its way.

A simple program: forkdeep

int count = 0; int level = 0;

} void child() { level++; output pids if (level < count) dofork(); if (level == count) sleep(3);

} void parent(int childpid) { output pids

wait for child to finish

How?

}

{ main(int argc, char *argv[]) count = atoi(argv[1]); dofork(); output pid

chase$ ./forkdeep 4

30866-> 30867

30867

30867-> 30868

30868

30868-> 30869

30869

30869-> 30870

30867

30866 chase$

30869

30868

30870

30870

wait

Note : in modern systems the wait syscall has many variants and options.

wait

int pid; int status = 0;

} if (pid = fork()) {

/* parent */

…..

pid = wait(&status);

} else {

/* child */

…..

exit(status);

Parent uses wait to sleep until the child exits; wait returns child pid and status.

Wait variants allow wait on a specific child, or notification of stops and other

“signals”.

Thread states and transitions

We will presume that these transitions occur only in kernel mode. This is true in classical Unix and in systems with pure kernel-based threads.

Before a thread can sleep , it must first enter the kernel via trap (syscall) or fault.

Before a thread can yield , it must enter the kernel, or the core must take an interrupt to return control to the kernel.

blocked sleep running yield preempt dispatch wakeup ready

On entry to the running state, kernel code decides if/when/how to enter user mode, and sets up a suitable context.

STOP wait

Kernel Stacks and Trap/Fault Handling

Threads execute user code on a user stack in the user virtual memory in the process virtual address space.

data stack stack

System calls and faults run in kernel mode on a kernel stack.

Each thread has a second kernel stack in kernel space (VM accessible only in kernel mode).

stack syscall dispatch table stack

Kernel code running in P’s process context has access to

P’s virtual memory.

The syscall handler makes an indirect call through the system call dispatch table to the handler registered for the specific system call.

Running a program

sections code (“text”) constants initialized data segments data

Program

Unix: fork/exec

Process

Thread virtual memory

When a program launches, the OS creates a process to run it, with a main thread to execute the code, and a virtual memory to store the running program’s code and data.

But how do I run a new program?

• The child, or any process really, can replace its program in midstream.

• exec* system call: “forget everything in my address space and reinitialize my entire address space with stuff from a named program file.”

• The exec system call never returns : the new program executes in the calling process until it dies (exits).

– The code from the parent program runs in the child process and controls its future. The parent program selects the program that the child process will run (via exec ), and sets up its connections to the outside world. The child program doesn’t even know!

exec (original concept)

A simple program: forkexec

}

… main(int argc, char *argv[]) { int status; int rc = fork(); if (rc < 0) { perror("fork failed: "); exit(1);

}

} else if (rc == 0) { argv++; execve(argv[0], argv, 0);

} else { waitpid(rc, &status, 0); printf("child %d exited with status %d\n", rc,

WEXITSTATUS(status));

A simple program: prog0

}

{ int main()

chase$ cc –o forkexec forkexec.c

chase$ cc –o prog0 prog0.c

chase$ ./forkexec prog0 child 19175 exited with status 0 chase$

Unix fork/exec/exit/wait syscalls

fork parent parent program initializes child process context fork child exec wait exit int pid = fork();

Create a new process that is a clone of its parent.

exec*( “ program ” [argvp, envp]);

Overlay the calling process with a new program, and transfer control to it, passing arguments and environment.

exit(status);

Exit with status, destroying the process. int pid = wait*(&status);

Wait for exit (or other status change) of a child , and “ reap ” its exit status.

Recommended: use waitpid() .

Mode Changes for Fork/Exit

• Syscall traps and “returns” are not always paired.

– fork “returns” (to child) from a trap that “never happened”

– exit system call trap never returns

– System may switch processes between trap and return parent

Fork call child

Fork entry to user space

Exit call

Fork return

Wait call

Wait return

Exec enters the child by doctoring up a saved user context to “return” through.

transition from user to kernel mode (callsys) transition from kernel to user mode (retsys)

But how is the first process made?

Init and Descendents

Kernel “ handcrafts ” initial process to run

“ init ” program.

Other processes descend from init, including one instance of the login program for each terminal.

Login runs user shell in a child process after user authenticates.

User shell runs user commands as child processes.

Processes: A Closer Look

virtual address space

The kernel must initialize the process memory for the program to run.

+ thread

+ process descriptor (PCB) user ID process ID parent PID sibling links children resources stack

The address space is a private name space for a set of memory segments used by the process.

Each process has a thread bound to the VAS.

The thread has a stack addressable through the

VAS.

The kernel can suspend/restart the thread wherever and whenever it wants.

The OS maintains some state for each process in the kernel’s internal data structures: a file descriptor table, links to maintain the process tree, and a place to store the exit status.

The Shell

• Users may select from a range of command interpreter

(“shell”) programs available. (

Or even write their own!)

– csh, sh, ksh, tcsh , bash: choose your flavor…

• Shells execute commands composed of program filenames, args, and I/O redirection symbols.

– Fork, exec, wait, etc., etc.

– Can coordinate multiple child processes that run together as a process group or job.

– Shells can run files of commands (scripts) for more complex tasks, e.g., by redirecting I/O channels (descriptors).

– Shell behavior is guided by environment variables, e.g., $PATH

– Parent may control/monitor all aspects of child execution.

Unix process: parents rule

Created with fork by parent program running in parent process.

Parent program running in child process, or exec ’d program chosen by parent program.

Inherited from parent process, or modified by parent program in child

Process

Thread

Program kernel state

Virtual address space

(Virtual Memory, VM) text data heap

Clone of parent VM.

Environment (argv[] and envp[]) is configured by parent program on exec.

Exec setup

(ABI)

Unix: upcalls

• The kernel directs control flow into user process at a fixed entry point: e.g., entry for exec() is _crt0 or “main”.

• Process may also register a signal handlers for events relating to the process, (generally) signalled by the kernel.

• To deliver a signal, kernel munges/redirects the thread context to execute the signal handler in user mode.

• Process lives until it exits voluntarily or fails

– “receives an unhandled signal that is fatal by default”.

data data


...and upcalls

(e.g., signals )

Unix: signals

• A signal is a typed upcall event delivered to a process by kernel.

– Process P may use kill* system call to request signal delivery to Q .

– Process may register signal handlers for signal types. A signal handler is a procedure that is invoked when a signal of a given type is delivered.

Runs in user mode, then returns to normal control flow.

– A signal delivered to a registered handler is said to be caught . If there is no registered handler then the signal triggers a default action (e.g., “die”).

• A process lives until it exits voluntarily or receives a fatal signal.

data data


...and upcalls (signals)

Kernel sends signals, e.g., to notify processes of faults.

Unix process view: data

tty pipe socket

I/O channels

(“file descriptors”) stdin stdout stderr

Process

VM mappings text data

Thread anon VM

Files

Program

Labels: uid etc.

etc.

Segments

Unix I/O and IPC

• I/O objects / kernel abstractions / channel types:

– Terminal : user interaction via “teletypewriter”: tty

– Pipe : Inter Process Communication ( IPC )

– Socket : networking

– File : storage

• Signals for IPC, process control, faults

• Some shellish grunge needed for Lab 2

“typewriter”

Unix process view: data

A process has multiple channels for data movement in and out of the process (I/O).

tty

The parent process and parent program set up and control the channels for a child (until exec ).

pipe socket

I/O channels

(“file descriptors”) stdin stdout stderr

Process

Thread

Program

Files

Standard I/O descriptors

I/O channels

(“file descriptors”) stdin stdout tty stderr

Open files or other I/O channels are named within the process by an integer file descriptor value.

Standard descriptors for primary input (stdin=0), primary output (stdout=1), error/status (stderr=2).

These are inherited from the parent process and/or set by the parent program.

By default they are bound to the controlling terminal.

count = read (0, buf, count); if (count == -1) { perror( “ read failed ” ); /* writes to stderr

*/ exit(1);

} count = write (1, buf, count); if (count == -1) { perror( “ write failed ” ); /* writes to stderr

*/ exit(1);

Files

Files

A file is a named, variable-length sequence of data bytes that is persistent: it exists across system restarts, and lives until it is removed.

fd = open (name, <options>); write (fd , “abcdefg”, 7); read (fd, buf, 7); lseek (fd, offset, SEEK_SET); close (fd); creat (name, mode); mkdir (name, mode); rmdir unlink

(name);

(name);

An offset is a byte index in a file. By default, a process reads and writes files sequentially . Or it can seek to a particular offset.

Files: hierarchical name space

root directory applications etc.

mount point user home directory external media volume or network storage

Unix “file descriptors” illustrated

user space kernel space int fd pointer per-process descriptor table system-wide open file table file pipe socket tty

Disclaimer: this drawing is oversimplified

Processes often reference OS kernel objects with integers that index into a table of pointers in the kernel. ( Why ?) Windows calls them handles .

In Unix, processes may share I/O objects (i.e., “files”: in Unix “everything is a file”). But the descriptor name space is per-process: fork clones parent descriptor table for child, but then they may diverge.

Processes reference objects

Files text data anon VM

Fork clones all references

Files text data anon VM

Cloned file descriptors share a read/write offset.

The kernel objects referenced by a process have reference counts .

They may be destroyed after the last ref is released, but not before.

Cloned references to VM segments are likely to be copy-on-write to create a lazy, virtual copy of the shared object.

What operations release refs?

Shell and child

tty

1 stdin stdout stderr dsh fork tcsetpgr p exec wait tty

2 stdin stdout stderr

3 tty stdin stdout stderr dsh

Child process inherits standard I/O channels to the terminal (tty).

If child is to run in the foreground :

Child receives/takes control of the terminal (tty) input (tcsetpgrp).

The foreground process receives all tty input until it stops or exits.

The parent waits for a foreground child to stop or exit.

Pipes

pipe int pfd[2] = {0, 0}; pipe(pfd);

/*pfd[0] is read, pfd[1] is write */ b = write(pfd[1], "12345\n", 6); b = read(pfd[0], buf, b); b = write(1, buf, b);

12345

A pipe is a bounded kernel buffer for passing bytes.

• The pipe () system call creates a pipe object.

• A pipe has one read end and one write end: unidirectional .

• Bytes placed in the pipe with write are returned by read in order.

• The read syscall blocks if the pipe is empty.

• The write syscall blocks if the pipe is full.

• Write fails (SIGPIPE) if no process has the other end open.

A key idea: Unix pipes

[http://www.bell-labs.com/history/unix/philosophy.html]

Unix programming environment

Standard unix programs read a byte stream from standard input (fd==0).

They write their output to standard output (fd==1).

stdin stdout

Stdin or stdout might be bound to a file, pipe, device, or network socket.

If the parent sets it up, the program doesn’t even have to know.

That style makes it easy to combine simple programs using pipes or files.

The processes may run concurrently and are automatically synchronized

Shell pipeline example

tty stdin stdout stderr dsh chase$ who | grep chase chase console Jan 13 21:08 chase ttys000 Jan 16 11:37 chase ttys001 Jan 16 15:00 chase$ tty tty stderr stdin who

Job stdout pipe stdin grep stderr stdout tty tty

But how to rewire the pipe?

1

P creates pipe.

P 2 P forks C1 and C2.

Both children inherit both ends of the pipe, and stdin/stdout/stderr.

Parent closes both ends of pipe after fork.

stdout tty stdin

C1

3A

C1 closes the read end of the pipe, closes its stdout,

“dups” the write end onto stdout, and execs. stdin stdout tty

C2

3B

C2 closes the write end of the pipe, closes its stdin,

“dups” the read end onto stdin, and execs.

Unix dup* syscall

int fd2 = 0 int fd1 = 5 tty in tty out per-process descriptor table pipe out pipe in

Hypothetical initial state before dup2(fd1, fd2) syscall.

dup2(fd1, fd2) . What does it mean?

Yes, fd1 and fd2 are integer variables. But let’s use “fd” as shorthand for “the file descriptor whose number is the value of the variable fd ”.

Then fd1 and fd2 denote entries in the file descriptor table of the calling process.

The dup2 syscall is asking the kernel to operate on those entries in a kernel data structure. It doesn’t affect the values in the variables fd1 and fd2 at all!


int fd2 = 0 int fd1 = 5

>

X tty in tty out per-process descriptor table pipe out pipe in

Final state after dup2(fd1, fd2) syscall.

Then dup2(fd1,fd2) means: “close(fd2), then set fd2 to refer to the same underlying I/O object as fd1.”

It results in two file descriptors referencing the same underlying I/O object. You can use either of the descriptors to read/write .

But you should probably just close(fd1) .


int fd2 = 0 int fd1 = 5 tty in tty out

X per-process descriptor table pipe out pipe in

Final state after dup2(fd1, fd2) syscall.

Then dup2(fd1,fd2); close(fd1) means: “remap the object referenced by file descriptor fd1 to fd2 instead”.

It is convenient for remapping descriptors onto stdin, stdout, stderr , so that some program will use them “by default” after exec *.

Note that we still have not changed the values in fd1 or fd2. Also, changing the values in fd1 and fd2 can never affect the state of the entries in the file descriptor table. Only the kernel can do that.

Simpler example

Feeding a child through a pipe

Parent int ifd[2] = {0, 0}; pipe(ifd); cid = fork(); stdin parent stdout stderr

Child close(0); close(ifd[1]); dup2 (ifd[0],0); close(ifd[0]); execve (…); pipe stdin child cid stdout stderr

Parent close(ifd[0]); count = read(0, buf, 5); count = write(ifd[1], buf, 5); waitpid(cid, &status, 0); printf ("child %d exited…”); chase$ man dup2 chase$ cc -o childin childin.c

chase$ ./childin cat5

12345

12345

5 bytes moved child 23185 exited with status 0 chase$

Notes on pipes

• All the processes in a pipeline run concurrently . The order in which they run is not defined. They may execute at the same time on multiple cores.

• Their execution is “synchronized by producer/consumer bounded buffer”. (More about this next month.) It means that a writer blocks if there is no space to write , and a reader blocks if there are no bytes to read . Otherwise they may both run: they might not, but they could.

• The pipe itself must be created by a common parent. The children inherit the pipe’s file descriptors from the parent on fork . Unix has no other way to pass a file descriptor to another process.

• How does a reader “know” that it has read all the data coming to it through a pipe? Answer : there are no bytes in the pipe, and no process has the write side open. Then the read syscall returns EOF.

By convention a process exits if it reads EOF from stdin .

Pipe reader and writer run concurrently

concept reality context switch

Context switch occurs when one process is forced to sleep because the pipe is full or empty, and maybe at other times.

Platform abstractions

•

Platforms provide “building blocks”…

•

…and APIs to use them.

– Instantiate/create/allocate

– Manipulate/configure

– Attach/detach

– Combine in uniform ways

– Release/destroy

The choice of abstractions reflects a philosophy of how to build and organize software systems.

MapReduce/

Hadoop

Hadoop

Dataflow programming

MapReduce is a filter-and-pipe model for data-intensive cluster computing . Its programming model is “like” Unix pipes. It adds “parallel pipes” (my term) that split output among multiple downstream children.

Sockets

socket

A socket is a buffered channel for passing data over a network.

client int sd = socket (<internet stream>); gethostbyname( “www.cs.duke.edu

”);

<make a sockaddr_in struct>

<install host IP address and port> connect (sd, <sockaddr_in>); write (sd , “abcdefg”, 7); read (sd , ….);

• The socket () system call creates a socket object.

• Other socket syscalls establish a connection (e.g., connect ).

• A file descriptor for a connected socket is bidirectional .

• Bytes placed in the socket with write are returned by read in order.

• The read syscall blocks if the socket is empty.

• The write syscall blocks if the socket is full.

• Both read and write fail if there is no valid connection.

A simple, familiar example

request

“ GET /images/fish.gif HTTP/1.1

” reply client (initiator) sd = socket(…); connect(sd, name ); write(sd , request…); read(sd , reply…); close(sd); server s = socket(…); bind(s, name ); sd = accept(s); read(sd , request…); write(sd , reply…); close(sd);

Unix: simple abstractions?

•

•

users files

•

processes

•

pipes

– which “look like” files

These persist across reboots.

They have symbolic names (you choose it) and internal IDs (the system chooses).

These exist within a running system, and they are transient: they disappear on a crash or reboot. They have internal IDs.

Unix supports dynamic create/destroy of these objects.

It manages the various name spaces.

It has system calls to access these objects.

It checks permissions.

Unix: a lesson here

• Classical Unix programs are packaged software components: specific functions with narrow, text-oriented interfaces.

• Shell is a powerful (but character-based) user interface.

• It is also a programming environment for composing and coordinating programs interactively.

• So: it’s both an interactive programming environment and a programmable user interface .

– Both ideas were “new” at the time (late 1960s).

– Unix has inspired a devoted (druidic?) following over 40+ years.

• Its powerful scripting environment is widely used in large-scale system administration and services.

• And it’s inside your laptop, smartphone, server, cloud.

Unix defines uniform, modular ways to combine programs to build up more complex functionality .

Other application programs cc comp cpp as nroff sh who

Kernel a.out date

Hardware wc ld grep vi ed

Other application programs

Unix: A lasting achievement?

“ Perhaps the most important achievement of Unix is to demonstrate that a powerful operating system for interactive use need not be expensive …it can run on hardware costing as little as $40,000.

”

DEC PDP-11/24

The UNIX Time-Sharing System*

D. M. Ritchie and K. Thompson

1974 http://histoire.info.online.fr/pdp11.html

Let ’s pause a moment to reflect...

Note log scale

10000

1000

From Hennessy and Patterson,

Computer Architecture: A

Quantitative Approach , 4th edition,

2006

52%/year

100

Core Rate

(SPECint)

10

25%/year

??%/year

1

1978 1980 1982 1984 1986 1988 1990 1992 1994 1996 1998 2000 2002 2004 2006

Today Unix runs embedded in devices costing < $100.

Small is beautiful?

The UNIX Time-Sharing System*

D. M. Ritchie and K. Thompson

1974

Unix, looking backward: UI+IPC

• Conceived around keystrokes and byte streams

– User-visible environment is centered on a text-based command shell.

• Limited view of how programs interact

– files: byte streams in a shared name space

– pipes: byte streams between pairs of sibling processes

X Windows (1985)

Big change: GUI.

1.

Windows

2.

Window server

3.

App events

4.

Widget toolkit

Unix, looking backward: security

• Presumes multiple users sharing a machine.

• Each user has a userID.

– UserID owns all files created by all programs user runs.

– Any program can access any file owned by userID.

• Each user trusts all programs it chooses to run.

– We “deputize” every program.

– Some deputies get confused.

– Result: decades of confused deputy security problems.

• Contrary view: give programs the privileges they need, and nothing more.

– Principle of Least Privilege

These slides are relevant to the shell lab, but they weren ’t discussed in class and the material won ’t be tested (unless it also appeared somewhere else).

Jobs and job control

• A job is a set of processes that run together.

– The processes are joined in a process group .

– The process group leader is the first process in the set, e.g., the leftmost process in a pipeline (other jobs have only one process).

• A foreground job receives/takes control of the tty.

– The parent passes control; the job ’s process group takes control; parent waits on processes in group. Call this “ set-fg ”.

• When a foreground job stops or completes, the parent awakes and receives/takes back control of the tty.

• These mechanisms exist to share the controlling tty among multiple processes/jobs that are active at the same time.

• Before the advent of GUIs, each user had only one tty! Job control was an important innovation that enabled users to “multitask”.

Process goups

• The process(es) of a job run as a process group .

– Process group ID ( pgid ) is pid of leader.

– Every process is in exactly one process group.

– Exactly one process group controls the tty.

– The members of the controlling process group are foreground .

• If a user types ctrl-c ( cancel ) or ctrl-z ( snooze ):

– The kernel (tty driver) sends a signal to all members of the controlling process group. They die or stop if they don ’t catch it.

– The shell keeps its own process group separate from its children, so it doesn ’t receive their signals.

• If a process does a read from the tty:

– If it is not in the foreground then it receives a signal (SIGTTIN). It stops if it doesn ’t catch it. Optionally, write generates a signal too (SIGTTOU).

Job states and transitions

User can send a

STOP signal to a foreground process/job by typing ctrl-z on tty.

fork + set-fg

SIGCONT set-fg fg ctrl-z

(SIGTSTP) ctrl-c:

(SIGINT) exit

EXIT exit

Continue a stopped process or job by sending it a SIGCONT signal with “ kill ” or

“ killpg ” syscall.

SIGCONT tty in tty out bg

P receives SIGTTIN if it attempts to read from tty and is not in controlling process group. Optionally, P receives SIGTTOU if it attempts to write to tty and is not in controlling process group. Default action of these signals is STOP .

Unix signals

• Unix signals are “musty”. We don’t teach/test the details.

– Sometimes you can ’t ignore them!

• A signal is an “upcall” event delivered to a process.

– Sent by kernel, or by another process via kill* syscall.

– An arriving signal transfers control to a handler routine in receiving process, designated to handle that type of signal.

– The handler routine returns to normal control flow when done.

• Or: if no designated handler, take a standard action.

– e.g., DEFAULT, IGNORE, STOP, CONTINUE

– Default action is often for receiving process to exit .

– An exit from an unhandled signal is reported to parent via wait* .

Signals and dsh

• Your shell must set up itself and its children properly for

“stuff to work like it should”.

– We have provided you code to initialize shell, parse, etc.

– You do not need to understand that code

– You should not need to mess with that code.

– Please don ’t mess with that code.

– If you are messing with that code, please talk to us.

– What that code does: transfer control of terminal to foreground child job, using process groups and signals, prepare for proper handling of various events that might occur.

You shouldn ’t have to know

}

How to seize control of the terminal if ( tcsetpgrp (0, getpid()) < 0) { perror("tcsetpgrp failure"); exit(EXIT_FAILURE);

Translation : “Terminal control set process group for stdin

(file descriptor 0) to the process group of which I am a leader. Stdin is a tty and I want control of it.

”

You can also use tcsetpgrp to pass control of the terminal to some other process group named by pgid.

Shell and children

dsh

Any of these child jobs/processes could exit or stop at any time.

P1A

&

P1B

Job 1

&

P2A

Job 2

P3A

Job 3

How should a process wait for the next event, if there are many possible events to wait for?

If the process blocks to wait for one event, could it miss another event that happens first?

Monitoring background jobs

dsh

Parent can use a non-blocking wait syscall to poll (query) the status of a child.

waitpid (pid, &status, WNOHANG);

“echo y&”

“jobs” fork exec wait

But how should a parent learn of status changes to a child if parent is busy with something else and does not poll?

“Do you know where your children are?”

EXIT


dsh

What if a child changes state while the shell is blocked on some other event, e.g., waiting for input, or waiting for a foreground job to complete?

“echo y&” coffee ….

fork exec

EXIT

A “real” shell should notice and inform the user immediately. But dsh will not.

Parent is waiting for read to return the next input command.

How should parent learn of the child exit event?


dsh

A “real” shell receives SIGCHLD signals from the kernel when a child changes state.

“echo y&” coffee….

If the shell is blocked on some other system call (e.g., read), then

SIGCHLD interrupts the read.

fork

Event notifications

SIGCHLD

Key point: many programs need to be able to receive and handle multiple event types promptly, regardless of what order they arrive in. Unix has grown some funky mechanisms to deal with that.

SIGCHLD exec

STOP

EXIT

Unix I/O: the basics

char buf[BUFSIZE]; size_t count = 5; count = read (0, buf, count); if (count == -1) { perror( “ read failed ” ); exit(1);

} count = write (1, buf, count); tty input: output: if (count == -1) { perror( “ write failed ” ); exit(1);

} fprintf(stderr, “\n%zd bytes moved\n”, count); chase$ cc –o cat5 cat5.c

chase$ ./cat5

<waits for tty input>

12345

12345

<enter>

5 bytes moved chase$

TTY read blocks until

<enter> or <EOF>, then returns all bytes entered on the line. A read returns 0 after <EOF>.


char buf[BUFSIZE]; size_t count = 5; count = read (0, buf, count); if (count == -1) { perror( “ read failed ” ); exit(1);

} count = write (1, buf, count); if (count == -1) { perror( “ write failed ” ); exit(1);

} fprintf(stderr,

“\n%zd bytes moved\n”, count); chase$ ./cat5

1234

1234

5 bytes moved chase$ ./cat5

123

123


123456

12345

5 bytes moved chase$ 6

-bash: 6: command not found chase$


char buf[BUFSIZE]; size_t count = 5; chase$ ./cat5

1234

1234

TTY read returns the

<enter> as ‘\n’ char.

And prints it.

count = read (0, buf, count); if (count == -1) { perror( “ read failed ” ); exit(1);

} count = write (1, buf, count); if (count == -1) { perror( “ write failed ” ); exit(1);

} fprintf(stderr,

“\n%zd bytes moved\n”, count);


123

123 read syscall returns the number of bytes actually read.


123456

12345

Any bytes not consumed by read are returned by the next read

(in shell process).

5 bytes moved chase$ 6

-bash: 6: command not found chase$

Simple I/O: args and printf

#include <stdio.h>

{ int main(int argc, char* argv[]) int i;

}

} printf("arguments: %d\n", argc); for (i=0; i<argc; i++) { printf("%d: %s\n", i, argv[i]); chase$ cc –o prog1 prog1.c

chase$ ./forkexec prog1 arguments: 1

0: prog1 child 19178 exited with status 0 chase$ ./forkexec prog1 one 2 3 arguments: 4

0: prog1

1: one

2: 2

3: 3 child 19181 exited with status 0

Environment variables (rough)

#include <stdio.h>

#include <stdlib.h> int

{ main(int argc, char* argv[], char* envp[]) int i; int count = atoi(argv[1]);

}

} for (i=0; i < count; i++) { printf("env %d: %s\n", i, envp[i]);

Environment variables (rough)

chase$ cc –o env0 env0.c

chase$ ./env0

Segmentation fault: 11 chase$ ./env0 12 env 0: TERM_PROGRAM=Apple_Terminal env 1: TERM=xterm-256color env 2: SHELL=/bin/bash env 3: TMPDIR=/var/folders/td/ng76cpqn4zl1wrs57hldf1vm0000gn/T/ env 4: Apple_PubSub_Socket_Render=/tmp/launch-OtU5Bb/Render env 5: TERM_PROGRAM_VERSION=309 env 6: OLDPWD=/Users/chase/c210-stuff env 7: TERM_SESSION_ID=FFCE3A14-1D4B4B08… env 8: USER=chase env 9: COMMAND_MODE=unix2003 env 10: SSH_AUTH_SOCK=/tmp/launch-W03wn2/Listeners env 11: __CF_USER_TEXT_ENCODING=0x1F5:0:0 chase$

Environment variables (safer)

#include <stdio.h>

#include <stdlib.h>

{ int main(int argc, char* argv[], char* envp[]) int i; int count; if (argc < 2) { fprintf(stderr,

"Usage: %s <count>\n", argv[0]); exit(1);

} count = atoi(argv[1]);

} for (i=0; i < count; i++) { if (envp == 0) { printf("env %d: nothing!\n", i); exit(1);

} else if (envp[i] == 0) { printf("env %d: null!\n", i); exit(1);

} else printf("env %d: %s\n", i, envp[i]);

}

Where do environment variables come from?

chase$ cc –o env env.c

chase$ ./env chase$ ./forkexec env

Usage: env <count> child 19195 exited with status 1 chase$ ./forkexec env 1 env 0: null!

child 19263 exited with status 1 chase$

forkexec revisited

} char *lala = "lalala\ n"; char *nothing = 0;

… main(int argc, char *argv[]) { int status; int rc = fork(); if (rc < 0) {

…

} else if (rc == 0) { argv++; execve(argv[0], argv, &lala );

} else {

… chase$ cc –o fel forkexec-lala.c

chase$ ./fel env 1 env 0: lalala child 19276 exited with status 0 chase$

forkexec revisited again

}

… main(int argc, char *argv[], char *envp[] ) { int status; int rc = fork(); if (rc < 0) {

…

} else if (rc == 0) { argv++; execve(argv[0], argv, envp );

} else {

… chase$ cc –o fe forkexec1.c

chase$ ./fe env 3 env 0: TERM_PROGRAM=Apple_Terminal env 1: TERM=xterm-256color env 2: SHELL=/bin/bash child 19290 exited with status 0 chase$

How about this?

chase$ ./fe fe fe fe fe fe fe fe fe fe fe fe env 3

<???>

How about this?

chase$ ./fe fe fe fe fe fe fe fe fe fe fe fe env 3 env 0: TERM_PROGRAM=Apple_Terminal env 1: TERM=xterm-256color env 2: SHELL=/bin/bash It is easy for children to inherit environment variables from their child 19303 exited with status 0 child 19302 exited with status 0 child 19301 exited with status 0 child 19300 exited with status 0 child 19299 exited with status 0 child 19298 exited with status 0 child 19297 exited with status 0 child 19296 exited with status 0 child 19295 exited with status 0 child 19294 exited with status 0 child 19293 exited with status 0 child 19292 exited with status 0 chase$ parents.

Exec* enables the parent to control the environment variables and arguments passed to the children.

The child process the parent passes the environment variables program

“to itself” but controls it.