System Programming
Chapters 1 & 2

UNIX History
• Developed in the late 1960s and 1970s at Bell Labs
• UNICS – a pun on MULTICS (Multiplexed Information and Computing Service), which was supposed to support 1000 online users but handled only a few (barely 3). (Multiplexed -> Uniplexed)
• Thompson writes the first version of UNICS in assembler for a PDP-7 in one month. It contains a new type of file system, plus the kernel, shell, editor, and assembler (one week each).
• 1969: Thompson writes the interpretive language B, based on BCPL; Ritchie improves on B and calls the result "C"
• 1972: UNIX is rewritten in C to facilitate porting

UNIX History (cont.)
• 1973: UNIX philosophy developed:
  – Write programs that do one thing and do it well
  – Write programs that work together
  – Write programs that handle text streams, because that is the universal interface

UNIX
[Figure: UNIX family tree. Bell Labs Research: First Edition, Sixth Edition, Seventh Edition. Berkeley Software Distributions: 1BSD, …, 4.0BSD, 4.3BSD, 4.3BSD Tahoe, 4.3BSD Reno, 4.4BSD. UNIX System Laboratories (USG/USDL/ATTIS/DSG/USO/USL): System V Release 2, 3; UNIX System V Release 4. Others: XENIX, Chorus, Mach, SunOS, Solaris, Solaris 2.]
* POSIX.1 (IEEE, ISO) standard!

UNIX Today
• Supports many users running many programs at the same time, all sharing the same computer system
• Information sharing (Ken's original goal in 1969)
• Geared towards facilitating the job of creating new programs
• UNIX system
  – Sun: SunOS
• UNIX-compatible systems
  – Solaris; SGI: IRIX; FreeBSD; Hewlett-Packard: HP-UX; Apple: OS X (Darwin); GNU/Linux

UNIX Architecture
[Figure: layered architecture of UNIX, from top to bottom:
  • Users
  • User interface
  • Shells, compilers, X, application programs, etc.
  • System call interface
  • Kernel: CPU scheduling, signal handling, virtual memory, paging, swapping, file system, disk drivers, caching/buffering, etc.
  • Kernel interface to the hardware
  • Hardware: terminal controller, terminals, physical memory, device controller, devices such as disks, memory, etc.]
Introduction
• Objective
  – Briefly describe the services provided by various versions of the UNIX operating system.
• Logging In
  – /etc/passwd – local machine or NIS DB
    • root:x:0:1:Super-User:/root:/bin/tcsh
    • Login name, encrypted password, numeric user ID, numeric group ID, comment, home directory, shell program
  – /etc/shadow – used when the password field shows "x"

Introduction
• Shell
  – Command interpreter
    • Built-in commands, e.g., umask
    • (External) commands, e.g., ls
[Figure: process-control flow. The shell process calls fork; the child process calls execve() and later exit, becoming a zombie process; the shell process calls wait and then continues.]

Introduction
• Shells
  – Bourne shell, /bin/sh
    • Steve Bourne at Bell Labs
  – C shell, /bin/csh
    • Bill Joy at Berkeley
    • Command-line editing, history, job control, etc.
  – KornShell, /bin/ksh
    • David Korn (successor to the Bourne shell)
    • Command-line editing, job control, etc.
  – .cshrc

Introduction
• Filesystem
  – A hierarchical arrangement of directories and files, starting at the root /
• File
  – No / or null char in filenames
  – . and ..
  – BSD: 255-char filenames (14 in the past)
  – Attributes – stat()
    • Type, size, owner, permissions, modification/access time, etc.
• Directory
  – Files with directory entries, e.g., { (filename, inode) }

Introduction – Files and Dir
• File
  – A sequence of bytes
• Directory
  – A file that includes info on how to find other files.
[Figure: directory tree rooted at /, containing vmunix, dev (console, lp0, …), bin (csh, …), lib (libc.a, …), usr (include, …), etc (passwd, …).]
* Use the command "mount" to show all mounted file systems!

Introduction
• Path name
  – Absolute path name
    • Starts at the root / of the file system
    • /user/john/fileA
  – Relative path name
    • Starts at the "current directory", which is an attribute of the process accessing the path name.
    • ./dirA/fileB
• Links
  – Symbolic link – 4.3BSD
    • A file containing the path name of another file; it can cross file-system boundaries. (home/prof/cshih>ln -s ../cshih cshih-test)
  – Hard link
    • . or ..
Introduction
• Directories
  – /vmunix - binary boot image of UNIX
  – /dev - device special files, e.g., /dev/console
  – /bin - binaries of UNIX system programs
    • /usr/ucb - written by Berkeley instead of AT&T
    • /usr/local/bin - written at the local site
  – /lib - library files, e.g., those for C
  – /user - directories for users, e.g., /user/john
  – /etc - administrative files and programs, e.g., passwd
  – /tmp - temporary files

Introduction
• Program 1.1 – Page 5
  – List all the files in a directory
• Note
  – "apue.h", err_sys() and err_quit() in Appendix B
  – opendir(), readdir(), closedir()
    • No ordering of directory entries
• Working directory
  – Goes with each process
• Home directory

Introduction – Input/Output
• Operations
  – open, close, read, write, truncate, lseek, dup, rename, chmod, chown, fcntl, ioctl, mkdir, cd, opendir, readdir, closedir, etc.
• File descriptor
[Figure: file-descriptor data structures. Read(4, …) -> tables of opened files (per process) -> system open file table -> in-core i-node list -> (sync) i-nodes and data blocks.]

Introduction
• File descriptor
  – Standard input (stdin), standard output (stdout), standard error (stderr)
  – Connected to the terminal if nothing special is done.
• I/O redirection
  – ls > file.list
• Unbuffered I/O: open, close, read, write, lseek
  – Program 1.2 – Page 8
  – Copies stdin to stdout
  – STDIN_FILENO, STDOUT_FILENO in <unistd.h>, POSIX.1
  – ls | a.out > datafile

Introduction
• Advantages of standard I/O functions such as fgets() and printf()
  – No need to worry about the optimal block size – a buffered interface
  – Handling of line input
• Misc.
  – <stdio.h>
  – Figure 1.5 – Page 9
    • Copy stdin to stdout using standard I/O

Introduction
• Programs and Processes
  – Program – an executable file residing in a disk file – exec(), etc.
  – Process – an executing instance of a program
    • Unique process ID
    • Figure 1.6 – Page 11 – Print process ID // getpid()

Introduction
• Process Control
  – Three primary functions: fork(), exec(), waitpid()
  – Figure 1.7 – Page 12
    • Read commands from stdin and execute them
• Note
  – End-of-file: ^D
  – fork(), execlp(), waitpid()
  – No parsing of the input line
  – execlp(file, arg0, …, argn, (char *) 0)
    • If file does not contain a slash character, the path prefix for this file is obtained by searching the directories listed in the PATH environment variable.

Introduction
• ANSI C Features
  – Function prototypes
    • ssize_t read(int, void *, size_t);
    • void *malloc(size_t);
  – Generic pointers
    • void * - avoids type casting
  – Primitive system data types
    • ssize_t, pid_t, etc.
    • <sys/types.h> included in <unistd.h>
    • Prevent programs from using specific data types – each implementation chooses its proper data types by "typedef"!

Introduction
• Error Handling
  – errno in <errno.h> (sys/errno.h)
    • E.g., 15 error numbers for open()
    • #define ENOTTY 25 /* Inappropriate ioctl for device */
    • Never cleared if no error occurs
    • No value 0 for any error number
• Functions
  – char *strerror(int errnum) (<string.h>)
  – void perror(const char *msg) (<stdio.h>)
• Figure 1.8 – Page 15 – demo perror and strerror

Introduction
• UNIX – A Layered Architecture
  – System Calls
    • Programmer interface to UNIX
    • Trap 40 – VAX 4.2BSD
    • R0 – error code
  – Categories
    • File Manipulation – Devices are special files under "/dev"!
    • Process Control
    • Information Manipulation

Introduction
• User Identification
  – Numeric value in /etc/passwd
    • 0 for root/superuser
  – Unchangeable and for access permission control
• Group ID
  – Numeric value in /etc/passwd
  – /etc/group
• Supplementary Group IDs
  – /etc/group (4.2BSD allows 16 additional groups.)
  – adm::4:root,adm,daemon
• "ls -l" uses /etc/passwd and /etc/group

Introduction
• Signals
  – To notify a process that some condition has occurred
• Action
  – Ignore the signal
  – Execute the default action
    • E.g., for SIGFPE (divide by zero)
  – Provide a function
• Signal Generation
  – Terminal keys (^C -> SIGINT), kill – owner-only
• Program 1.8 – Page 19
  – Read commands and exec + signal SIGINT

Introduction
• Time Values
  – Calendar time
    • In seconds since the Epoch (00:00:00 January 1, 1970, Coordinated Universal Time, i.e., UTC)
    • type time_t
  – Process time
    • In clock ticks (divided by CLK_TCK -> secs)
    • type clock_t
    • Clock time, user/system CPU time

    > time grep _POSIX_SOURCE */*.h > /dev/null
    0.25u 0.25s 0:03.51 14.2%

Introduction
• System Calls vs Library Functions
  – System Calls
    • 50 for Unix Ver 7, 110 for 4.3+BSD, 120 for SVR4
  – Unix Technique
    • Same function names for system calls
• Differences
  – Fixed set vs. more elaborate functionality
    • malloc() calls sbrk(): better allocated-space management
    • gmtime() calls time(): seconds into broken-down time!
    • fgets() calls read(): unbuffered I/O -> buffered I/O
• Misc
  – Process control: fork(), exec(), wait() invoked directly from application code (vs system()).

Contents
1. Preface/Introduction
2. Standardization and Implementation
3. File I/O
4. Standard I/O Library
5. Files and Directories
6. System Data Files and Information
7. Environment of a Unix Process
8. Process Control
9. Signals
10. Inter-process Communication

Ariane 5
• A European rocket designed to launch commercial payloads (e.g., communications satellites) into Earth orbit.
• Successor to the successful Ariane 4 launchers.
• Ariane 5 can carry a heavier payload than Ariane 4.
• On June 4, 1996, Flight 501 was launched.
• Approximately 37 seconds after a successful lift-off, the Ariane 5 launcher lost control.

Flight 501 Failure
• The attitude and trajectory of the rocket are measured by a computer-based inertial reference system.
  This system transmits commands to the engines to maintain attitude and direction.
• The software failed, and both this system and the backup system shut down.
• Diagnostic commands were transmitted to the engines, which interpreted them as real data and swiveled to an extreme position, resulting in unforeseen stresses on the rocket.

What Caused the Failure?
• The software failure occurred when an attempt to convert a 64-bit floating-point number to a signed 16-bit integer caused the number to overflow.
• Why not Ariane 4?
  – The physical characteristics of Ariane 4 (a smaller vehicle) are such that it has a lower initial acceleration and a slower build-up of horizontal velocity than Ariane 5.
  – The value of the variable on Ariane 4 could never reach a level that caused overflow during the launch period.

What Lesson Can We Learn?
• When porting software from one platform to another, we must keep an eye on the platform's assumptions and capabilities.
• Software reliability cannot be sacrificed.
  – An exact copy of the failed software was used as the backup.
  – The exception handler was removed to save computing power, a scarce resource on a space mission.
• How do you make sure your program functions correctly from one UNIX platform to another?

Standardization and Implementation
• Why Standardization?
  – Proliferation of UNIX versions
• What should be done?
  – The specifications of limits that each implementation must define!

UNIX Standardization
• ANSI C
  – American National Standards Institute
  – ISO/IEC 9899:1990
    • International Organization for Standardization (ISO)
    • Syntax/semantics of C, a standard library
  – Purpose:
    • Provide portability of conforming C programs to a wide variety of OS's.
  – 15 areas: Fig 2.1 – Page 27

UNIX Standardization
• ANSI C
  – <assert.h> - verify program assertion
  – <ctype.h> - char types
  – <errno.h> - error codes
  – <float.h> - floating-point constants
  – <limits.h> - implementation constants
  – <locale.h> - locale catalogs
  – <math.h> - mathematical constants
  – <setjmp.h> - nonlocal goto
  – <signal.h> - signals
  – <stdarg.h> - variable argument lists
  – <stddef.h> - standard definitions
  – <stdio.h> - standard I/O library
  – <stdlib.h> - utility functions
  – <string.h> - string operations
  – <time.h> - time and date

UNIX Standardization
• POSIX.1 (Portable Operating System Interface) developed by IEEE
  – Not restricted to UNIX-like systems, and no distinction between system calls and library functions
  – Originally IEEE Standard 1003.1-1988
  – 1003.2: shells and utilities, 1003.7: system administration, > 15 other committees
  – Published as IEEE Std 1003.1-1990, ISO/IEC 9945-1:1990
  – New: the inclusion of symbolic links
  – No superuser notion

UNIX Standardization
• POSIX.1
  – <cpio.h> - cpio archive values
  – <dirent.h> - directory entries
  – <fcntl.h> - file control
  – <grp.h> - group file
  – <pwd.h> - passwd file
  – <tar.h> - tar archive values
  – <termios.h> - terminal I/O
  – <unistd.h> - symbolic constants
  – <utime.h> - file times
  – <sys/stat.h> - file status
  – <sys/times.h> - process times
  – <sys/types.h> - primitive system data types
  – <sys/utsname.h> - system name
  – <sys/wait.h> - process control

UNIX Standardization
• X/Open
  – An international group of computer vendors
  – Volume 2 of X/Open Portability Guide, Issue 3 (XPG3)
    • XSI System Interface and Headers
    • Based on IEEE Std. 1003.1-1988 (text displaying in different languages)
    • Built on the draft of ANSI C
      – Some are out-of-date.
  – Solaris 2.4 – compliance to XPG4V2
    • man xpg4

UNIX Standardization
• FIPS (Federal Information Processing Standard) 151-1
  – IEEE Std. 1003.1-1988 & ANSI C
  – For the procurement of computers by the US government.
  – Required Features:
    • JOB_CONTROL, SAVED_IDS, NO_TRUNC, CHOWN_RESTRICTED, VDISABLE,
    • NGROUPS_MAX >= 8, group IDs of new files and directories equal to those of their parent directory, environment variables HOME and LOGNAME defined for a login shell, interrupted read/write functions return the number of bytes transferred.

UNIX Implementation
[Figure: UNIX family tree. Bell Labs Research: First Edition, Sixth Edition (1976), Seventh Edition (1979). Berkeley Software Distributions: 1BSD, …, 4.0BSD, 4.3BSD, 4.3BSD Tahoe, 4.3BSD Reno, 4.4BSD. UNIX System Laboratories (USG/USDL/ATTIS/DSG/USO/USL): System V Release 2, 3; UNIX System V Release 4. Others: XENIX, Chorus, Mach, SunOS, Solaris, Solaris 2.]
* POSIX.1 (IEEE, ISO) standard!

UNIX Implementation
• System V Release 4 - 1989
  – POSIX 1003.1 & X/Open XPG3
  – Merging of SVR3.2, SunOS, 4.3BSD, Xenix
  – SVID (System V Interface Definition)
    • Issue 3 specifies the functionality qualified for SVR4.
  – Contains a Berkeley compatibility library
    • For 4.3BSD counterparts

UNIX Implementation
• 4.2BSD - 1983
  – DARPA (Defense Advanced Research Projects Agency) wanted a standard research operating system for the VAX.
  – Networking support - remote login, file transfer (ftp), etc. Support for a wide range of hardware devices, e.g., 10Mbps Ethernet.
  – Higher-speed file system.
  – Revised virtual memory to support processes with large sparse address spaces (not part of the release).
  – Inter-process communication facilities.

UNIX Implementation
• 4.3BSD - 1986
  – Improvement of 4.2BSD
    • Loss of performance because of many new facilities in 4.2BSD.
    • Bug fixing, e.g., TCP/IP implementation.
    • New facilities such as TCP/IP subnet and routing support.
  – Backward compatibility with 4.2BSD.
  – Second version - 4.3BSD Tahoe
    • Supports machines besides the VAX
  – Third version - 4.3BSD Reno
    • Freely redistributable implementation of NFS, etc.

UNIX Implementation
• 4.4BSD - 1992
  – POSIX compatibility
  – Remedies for deficiencies of 4.3BSD
    • Support for numerous architectures such as 68K, SPARC, MIPS, PC.
    • New virtual memory better for large memory and less dependent on VAX architecture – Mach.
    • TCP/IP performance improvement and implementation of new network protocols.
    • Support of an object-oriented interface for numerous filesystem types, e.g., SUN NFS.

UNIX Implementation - Major UCB CSRG Distributions
• Major new facilities:
  – 3BSD, 4.0BSD, 4.2BSD, 4.4BSD
• Bug fixes and efficiency improvement:
  – 4.1BSD, 4.3BSD
• BSD Networking Software, Release 1.0 (from 4.3BSD Tahoe, 1989), 2.0 (from 4.3BSD Reno, 1991)
• Remark:
  – Standards define a subset of any actual system – compliance and compatibility

Limits – ANSI C, POSIX, XPG3, FIPS 151-1
• Compile-time options and limits (headers)
  – Job control?
  – Largest value of a short?
• Run-time limits related to file/dir
  – pathconf and fpathconf, e.g., the max # of bytes in a filename
• Run-time limits not related to file/dir
  – sysconf, e.g., the max # of opened files per process
• Remark: implementation-related

ANSI C Limits
• All compile-time limits - <limits.h>
  – Minimum acceptable values
    • E.g., CHAR_BIT, INT_MAX
  – Implementation-related
    • char (limits.h), float (FLT_MAX in float.h)
    • open (FOPEN_MAX & TMP_MAX in stdio.h)

    #if defined(_CHAR_IS_SIGNED)
    #define CHAR_MAX SCHAR_MAX
    #elif defined(_CHAR_IS_UNSIGNED)
    #define CHAR_MAX UCHAR_MAX
    #endif

POSIX Limits
• 33 limits and constants
  – Invariant minimum values (POSIX defined in Figure 2.3 – Page 33, limits.h)
  – Corresponding implementation (limits.h)
    • Invariant: SSIZE_MAX
    • Run-time increasable value: NGROUPS_MAX
    • Run-time invariant values, e.g., CHILD_MAX
    • Pathname variable values, e.g., LINK_MAX
  – Compile-time symbolic constants, e.g., _POSIX_JOB_CONTROL
  – Execution-time symbolic constants, e.g., _POSIX_CHOWN_RESTRICTED
  – Obsolete constant: CLK_TCK

POSIX Limits
• Limitation of POSIX
  – E.g., _POSIX_OPEN_MAX in <limits.h>
  – sysconf(), pathconf(), fpathconf() at run-time
  – Possibly indeterminate on some systems
    • E.g., OPEN_MAX under SVR4

XPG3 Limits
• 7 constants in
  <limits.h> - what POSIX.1 calls invariant minimum values
  – Dealing with message catalogs
    • NL_MSGMAX – 32767
• PASS_MAX – <limits.h>
  – What POSIX.1 calls a run-time invariant value – sysconf()

Run-Time Limits
• #include <unistd.h> (Figure 2.7 – Page 40: compile/run time limits)
• long sysconf(int name);
  – _SC_CHILD_MAX, _SC_OPEN_MAX, etc.
• long pathconf(const char *pathname, int name);
• long fpathconf(int filedes, int name);
  – _PC_LINK_MAX, _PC_PATH_MAX, _PC_PIPE_BUF, _PC_NAME_MAX, etc.
• Various names and restrictions on arguments (Page 35 and Figure 2.5)
• Return -1 and set errno if any error occurs.
  – EINVAL if the name is incorrect.

Run-Time Limits
• Example Program 2.1 – Page 38
  – Print sysconf and pathconf values (Fig 2.6 – Page 39)

  Limit              SunOS 4.1.1   SVR4      4.3+BSD
  CHILD_MAX          133           30        40
  OPEN_MAX           64            64        64
  LINK_MAX           32767         1000      32767
  NAME_MAX           255           14/255    255
  _POSIX_NO_TRUNC    1             nodef/1   1

Indeterminate Run-Time Limits – Two Cases
• Pathname
  – 4.3BSD: MAXPATHLEN in <sys/param.h>, PATH_MAX in <limits.h>
  – Program 2.2 – Page 42
    • Allocate space for a pathname vs getcwd
    • _PC_PATH_MAX is for relative pathnames
• Max # of Open Files – POSIX run-time invariant
  – NOFILE (<sys/param.h>), _NFILE (<stdio.h>)
  – sysconf(_SC_OPEN_MAX) – POSIX.1
    • getrlimit() & setrlimit() for SVR4 & 4.3+BSD
  – Program 2.3 – Page 43: OPEN_MAX!

MISC
• Feature Test Macro
  – POSIX only
    • cc -D_POSIX_SOURCE file.c
    • Or, #define _POSIX_SOURCE 1
  – ANSI C only

    #ifdef __STDC__
    void *myfunc(const char *, int);
    #else
    void *myfunc();
    #endif

MISC
• Primitive System Data Types
  – Figure 2.8 – Page 45
    • Implementation-dependent data types
    • E.g., caddr_t, pid_t, ssize_t, off_t, etc.
  – <sys/types.h>
    • E.g., major_t, minor_t, caddr_t, etc.
• Examples:
  – typedef char * caddr_t;
  – typedef ulong_t major_t; (SVR4: 14 bits for the major device number, and 18 bits for the minor device number.
    Traditionally they were stored in a short: 8 bits each.)

MISC
• Conflicts Between Standards
  – clock() in ANSI C and times() in POSIX.1
    • clock_t divided by CLOCKS_PER_SEC in <time.h> (while CLK_TCK became obsolete) – different values for clock_t
  – Implementation of POSIX functions
    • No assumption on the host operating system.
    • signal() in SVR4 is different from sigaction() in POSIX.1
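The clock() versus times() conflict noted above comes down to two different units for the same clock_t type. The following is a minimal sketch, assuming a POSIX system: values from clock() are divided by CLOCKS_PER_SEC, while values from times() are divided by the tick rate that sysconf(_SC_CLK_TCK) reports at run time (the replacement for the obsolete CLK_TCK). The helper names ticks_to_seconds, user_cpu_seconds_posix, and cpu_seconds_ansi are hypothetical, not taken from the slides.

```c
#include <sys/times.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: convert a tick count returned by times() into
 * seconds, using the run-time tick rate. Returns -1.0 if the rate is
 * unavailable. */
double ticks_to_seconds(clock_t ticks)
{
    long per_sec = sysconf(_SC_CLK_TCK);

    if (per_sec <= 0)
        return -1.0;
    return (double)ticks / (double)per_sec;
}

/* Hypothetical helper: user CPU time of this process via POSIX times(),
 * in seconds, or -1.0 on error. */
double user_cpu_seconds_posix(void)
{
    struct tms t;

    if (times(&t) == (clock_t)-1)
        return -1.0;
    return ticks_to_seconds(t.tms_utime);
}

/* Hypothetical helper: processor time of this process via ANSI C clock(),
 * in seconds, or -1.0 on error. Note the different divisor. */
double cpu_seconds_ansi(void)
{
    clock_t c = clock();

    if (c == (clock_t)-1)
        return -1.0;
    return (double)c / (double)CLOCKS_PER_SEC;
}
```

Mixing the two divisors is the portability trap: a clock_t from times() divided by CLOCKS_PER_SEC, or one from clock() divided by the sysconf() rate, gives a wrong number of seconds whenever the two rates differ.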