chapters1_2

System Programming
Chapters 1 & 2
1
UNIX History
• Developed in the late 1960s and 1970s at Bell Labs
• UNICS – a pun on MULTICS (Multiplexed Information and
Computing Service), which was supposed to support 1000 online
users but handled only a few (barely 3); "Multi" became "Uni".
• Thompson writes the first version of UNICS in assembler for a
PDP-7 in one month; it contains a new type of file
system, plus a kernel, shell, editor and assembler (about one week each).
• 1969 Thompson writes the interpreted language B, based on BCPL; Ritchie
improves on B and calls the result “C”
• 1972 UNIX is re-written in C to facilitate porting
2
UNIX History (cont)
• 1973 UNIX philosophy developed:
– Write programs that do one thing and do it well
– Write programs that work together
– Write programs that handle text streams, because
that is the universal interface
3
UNIX
[Figure: UNIX family tree]
• Bell Labs Research: First Edition, Sixth Edition, Seventh Edition
• Berkeley Software Distributions: 1BSD, …, 4.0BSD, 4.3BSD, 4.3BSD Tahoe, 4.3BSD Reno, 4.4BSD
• UNIX System Laboratories (USG/USDL/ATTIS/DSG/USO/USL): System V Release 2, 3; UNIX System V Release 4
• Related systems: XENIX, Chorus, Mach, SunOS, Solaris, Solaris 2
* POSIX.1 (IEEE, ISO) standard!
4
UNIX Today
• Supports many users running many programs at the
same time, all sharing the same computer system
• Information sharing (Ken Thompson’s original goal in
1969)
• Geared towards facilitating the job of creating new
programs
• UNIX system
– Sun: SunOS
• UNIX-compatible systems
– Solaris; SGI: IRIX; FreeBSD; Hewlett-Packard: HP-UX;
Apple: OS X (Darwin); GNU/Linux
5
UNIX Architecture
[Figure: layered UNIX architecture, top to bottom]
• Users
• (user interface)
• Shells, compilers, X, application programs, etc.
• (system call interface)
• UNIX kernel: CPU scheduling, signal handling, virtual memory, paging, swapping, file system, disk drivers, caching/buffering, etc.
• (kernel interface to the hardware)
• Hardware: terminal controller, terminals, physical memory, device controller, devices such as disks, memory, etc.
6
Introduction
• Objective
– Briefly describe services provided by various versions of
the UNIX operating system.
• Logging In
– /etc/passwd – local machine or NIS DB
• root:x:0:1:Super-User:/root:/bin/tcsh
• Fields: login name, encrypted password, numeric user ID, numeric group ID,
comment, home directory, shell program
– /etc/shadow – holds the encrypted password when the passwd field shows “x”
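A minimal sketch of splitting a passwd-style entry into its seven colon-separated fields. The names split_passwd_line and PW_NFIELDS are made up for illustration; real programs would normally call getpwnam() from <pwd.h> rather than parse the file themselves.

```c
#include <string.h>

#define PW_NFIELDS 7   /* name, passwd, uid, gid, comment, home, shell */

/* Copy one passwd-style line into buf and split it at the colons.
 * field[i] then points at the i-th field inside buf.
 * Returns 0 on success, -1 if the line does not fit in buf or
 * does not have exactly PW_NFIELDS fields. */
int split_passwd_line(const char *line, char *buf, size_t buflen,
                      char *field[PW_NFIELDS])
{
    size_t len = strcspn(line, "\n");   /* ignore a trailing newline */
    int n = 0;
    char *p;

    if (len >= buflen)
        return -1;
    memcpy(buf, line, len);
    buf[len] = '\0';

    field[n++] = buf;
    for (p = buf; *p != '\0'; p++) {
        if (*p == ':') {
            if (n == PW_NFIELDS)
                return -1;              /* too many fields */
            *p = '\0';                  /* terminate the previous field */
            field[n++] = p + 1;
        }
    }
    return n == PW_NFIELDS ? 0 : -1;
}
```

Empty fields (for example, an empty comment) come back as empty strings, which a scanf-style parser would not handle.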
7
Introduction
• Shell
– Command interpreters
• Built-in commands, e.g., umask
• (External) commands, e.g., ls
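[Figure: the shell process calls fork to create a child process and then waits; the child calls execve() to run the command and later calls exit, remaining a zombie process until the shell's wait collects it, after which the shell process continues.]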
8
Introduction
• Shells
– Bourne shell, /bin/sh
• Steve Bourne at Bell Labs
– C shell, /bin/csh
• Bill Joy at Berkeley
• Command-line editing, history, job control, etc.
– KornShell, /bin/ksh
• David Korn (successor of the Bourne shell)
• Command-line editing, job control, etc.
– Startup files, e.g., .cshrc for the C shell
9
Introduction
• Filesystem
– A hierarchical arrangement of directories and files – starting in root /
• File
– No / or null char in filenames
– . (current directory) and .. (parent directory)
– BSD: 255-char filenames (14 in the past)
– Attributes – stat()
• Type, size, owner, permissions, modification/access time, etc.
• Directory
– Files with directory entries, e.g., { (filenames, inode) }
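As a small illustration of reading file attributes with stat(), the sketch below checks the file type stored in st_mode. The helper name is_directory is an assumption made for this example.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Returns 1 if path names a directory, 0 if it names something else,
 * and -1 if stat() fails (errno is left as set by stat()). */
int is_directory(const char *path)
{
    struct stat sb;

    if (stat(path, &sb) < 0)
        return -1;
    return S_ISDIR(sb.st_mode) ? 1 : 0;
}
```

The same struct stat also carries the size (st_size), owner (st_uid), permission bits (in st_mode) and the modification and access times (st_mtime, st_atime).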
10
Introduction – Files and Dir
• File
– A sequence of bytes
• Directory
– A file that includes info on how to find other files.
[Figure: example directory tree]
/
– vmunix
– dev: console, lp0
– bin: csh, …
– lib: libc.a, …
– usr: include, …
– etc: passwd, …
* Use command “mount” to show all mounted file systems!
11
Introduction
• Path name
– Absolute path name
• Start at the root / of the file system
• /user/john/fileA
– Relative path name
• Start at the “current directory” which is an attribute of the process accessing the
path name.
• ./dirA/fileB
• Links
– Symbolic link – 4.3BSD
• A file containing the path name of another file; it can cross file-system
boundaries, e.g., at the prompt home/prof/cshih> run: ln -s ../cshih cshih-test
– Hard link
• A directory entry that refers directly to a file's i-node, e.g., . and ..
12
Introduction
• Directories
– /vmunix - binary root image of UNIX
– /dev - device special files, e.g., /dev/console
– /bin - binaries of UNIX system programs
• /usr/ucb - written by Berkeley instead of AT&T
• /usr/local/bin - written at the local site
– /lib - library files, e.g., those for C
– /user - directories for users, e.g., /user/john
– /etc - administrative files and programs, e.g., passwd
– /tmp - temporary files
13
Introduction
• Program 1.1 – Page 5
– List all the files in a directory
• Note
– “apue.h”, err_sys() and err_quit() in Appendix B
– opendir(), readdir(), closedir()
• No ordering of directory entries
• Working directory
– Goes with each process
• Home Directory
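The opendir(), readdir(), closedir() sequence can be sketched as below. This is not the book's Program 1.1; the function name count_dir_entries is hypothetical, and it counts entries instead of printing them so the result is easy to check.

```c
#include <dirent.h>
#include <stddef.h>

/* Count the entries readdir() reports for dirname (on most systems this
 * includes "." and ".."). Returns -1 if the directory cannot be opened. */
long count_dir_entries(const char *dirname)
{
    DIR *dp = opendir(dirname);
    long n = 0;

    if (dp == NULL)
        return -1;
    while (readdir(dp) != NULL)   /* entries arrive in no particular order */
        n++;
    closedir(dp);
    return n;
}
```

To list names instead, keep the struct dirent pointer returned by readdir() and print its d_name member.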
14
Introduction – Input/Output
• Operations
– open, close, read, write, truncate, lseek, dup, rename, chmod, chown,
fcntl, ioctl, mkdir, cd, opendir, readdir, closedir, etc.
• File descriptor
[Figure: Read(4, …) uses descriptor 4 as an index into the per-process table of opened files; that entry refers to the system open file table, which refers to the in-core i-node list; i-nodes and data blocks on disk are updated by sync.]
15
Introduction
• File descriptor
– Standard input (stdin), standard output (stdout), standard error (stderr)
– Connected to the terminal if nothing special is done.
• I/O redirection
– ls > file.list
• Unbuffered I/O: open, close, read, write, lseek
– Program 1.2 – Page 8
– Copies stdin to stdout
– STDIN_FILENO, STDOUT_FILENO in <unistd.h>, POSIX.1
– ls | a.out > datafile
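A sketch of an unbuffered copy loop built only on read() and write(). It is written in the spirit of the copy program referenced above, not copied from it; copy_fd and COPY_BUFSIZE are names assumed for this example.

```c
#include <unistd.h>

#define COPY_BUFSIZE 8192   /* assumed buffer size; any reasonable value works */

/* Copy everything readable from descriptor in to descriptor out.
 * Returns the number of bytes copied, or -1 on a read or write error. */
long copy_fd(int in, int out)
{
    char buf[COPY_BUFSIZE];
    ssize_t n;
    long total = 0;

    while ((n = read(in, buf, sizeof buf)) > 0) {
        ssize_t off = 0;
        while (off < n) {               /* write() may accept fewer bytes */
            ssize_t w = write(out, buf + off, (size_t)(n - off));
            if (w < 0)
                return -1;
            off += w;
        }
        total += n;
    }
    return n < 0 ? -1 : total;
}
```

Calling copy_fd(STDIN_FILENO, STDOUT_FILENO) from main() gives a filter that works with shell redirection and pipes.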
16
Introduction
• Advantages of standard I/O functions such as fgets()
and printf()
– No need to worry about the optimal block size – a
buffered interface
– Handling of line input
• Misc.
– <stdio.h>
– Figure 1.5 – Page 9
• Copy stdin to stdout using standard I/O
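For comparison, a line-oriented copy using the standard I/O library might look like the sketch below. The buffer size MAXLINE and the function name copy_lines are assumptions; the library chooses its own internal block size.

```c
#include <stdio.h>

#define MAXLINE 4096   /* assumed size of the line buffer */

/* Copy stream in to stream out with fgets() and fputs().
 * Returns the number of successful fgets() calls (one per line when
 * every line fits in MAXLINE), or -1 on an I/O error. */
long copy_lines(FILE *in, FILE *out)
{
    char line[MAXLINE];
    long chunks = 0;

    while (fgets(line, sizeof line, in) != NULL) {
        if (fputs(line, out) == EOF)
            return -1;
        chunks++;
    }
    return ferror(in) ? -1 : chunks;
}
```

copy_lines(stdin, stdout) is the typical call; no block size appears anywhere in the caller's code.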
17
Introduction
• Programs and Processes
– Program – an executable file residing in a disk file
– exec(), etc.
– Process – an executing instance of a program
• Unique Process ID
• Figure 1.6 – Page 11
– Print process ID // getpid()
18
Introduction
• Process Control
– Three primary functions: fork(), exec(), waitpid()
– Figure 1.7 – Page 12
• Read commands from stdin and execute them
• Note
– End-of-file: ^D
– fork(), execlp(), waitpid()
– No parsing of the input line
– execlp(file, arg0, …, argn, (char *) 0)
• If file does not contain a slash character, the path prefix for this file is
obtained by a search of the directories passed in the PATH
environment variable.
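The fork(), execlp(), waitpid() pattern can be sketched as one helper. The name run_command and the exit code 127 for a failed exec (a common shell convention) are choices made for this example. As in the slide, the command is a single word with no argument parsing.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run cmd (one word, no arguments) in a child process and wait for it.
 * Returns the child's exit status, 127 if the command could not be
 * executed, or -1 if fork()/waitpid() failed or the child was killed
 * by a signal. */
int run_command(const char *cmd)
{
    pid_t pid;
    int status;

    if ((pid = fork()) < 0)
        return -1;
    if (pid == 0) {                        /* child */
        execlp(cmd, cmd, (char *)0);       /* searches PATH if cmd has no slash */
        _exit(127);                        /* reached only if execlp() failed */
    }
    if (waitpid(pid, &status, 0) != pid)   /* parent */
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A tiny shell is then a loop that reads a line, strips the newline and calls run_command() on it until end-of-file.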
19
Introduction
• ANSI C Features
– Function Prototypes
• ssize_t read(int, void *, size_t);
• void *malloc(size_t)
– Generic Pointers
• void * - avoid type casting
– Primitive System Data Types
• ssize_t, pid_t, etc.
• <sys/types.h> included in <unistd.h>
• Keep programs from depending on specific data types – each implementation
chooses its proper data types with “typedef”!
20
Introduction
• Error Handling
– errno in <errno.h> (sys/errno.h)
• E.g., 15 error numbers for open()
• #define ENOTTY 25 /* Inappropriate ioctl for device */
• Never cleared if no error occurs
• No value 0 for any error number
• Functions
– char *strerror (int errnum) (<string.h>)
– void perror(const char *msg) (<stdio.h>)
• Figure 1.8 – Page 15
– demo perror and strerror
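A short sketch of the errno convention: check the return value first, then save errno before calling anything else. The helper name open_errno is hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Try to open path read-only.
 * Returns 0 on success, otherwise the errno value set by open(). */
int open_errno(const char *path)
{
    int fd = open(path, O_RDONLY);

    if (fd < 0) {
        int saved = errno;   /* save it: later calls may overwrite errno */
        fprintf(stderr, "open %s: %s\n", path, strerror(saved));
        return saved;
    }
    close(fd);
    return 0;
}
```

perror(path) would print a similar message in one call, using the current value of errno.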
21
Introduction
• UNIX – A Layer Architecture
– System Calls
• Programmer Interface to UNIX
• Trap 40 – VAX 4.2BSD
• R0 – error code
– Categories
• File Manipulation
– Devices are special files under “/dev”!
• Process Control
• Information Manipulation
22
Introduction
• User Identification
– Numeric value in /etc/passwd
• 0 for root/superuser
– Cannot be changed by the user; used for access permission control
• Group ID
– Numeric value in /etc/passwd
– /etc/group
• Supplementary Group ID’s
– /etc/group (4.2BSD allows 16 additional groups.)
– adm::4:root,adm,daemon
• “ls –l” uses /etc/passwd and /etc/group
23
Introduction
• Signals
– To notify a process that some condition has occurred
• Action
– Ignore the signal
– Execute the default action
• E.g., for SIGFPE (division by zero)
– Provide a function (signal handler)
• Signal Generation
– Terminal keys (^C -> SIGINT), kill() – restricted to the process owner or superuser
• Program 1.8 – Page 19
– Read commands and exec + signal SIGINT
24
Introduction
• Time Values
– Calendar time
• In seconds since the Epoch (00:00:00 January 1, 1970, Coordinated
Universal Time, i.e., UTC)
• type time_t
– Process time
• In clock ticks (divided by CLK_TCK -> secs)
• type clock_t
• Clock time, user/system CPU time
> time grep _POSIX_SOURCE */*.h > /dev/null
0.25u 0.25s 0:03.51 14.2%
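The two kinds of time value can be sketched as follows: calendar time converted to a broken-down UTC date, and clock ticks converted to seconds. The sketch queries the tick rate with sysconf(_SC_CLK_TCK) instead of the CLK_TCK constant; the function names are hypothetical.

```c
#include <time.h>
#include <unistd.h>

/* Convert a calendar time (seconds since the Epoch) to its UTC year.
 * Returns -1 if gmtime() fails. */
int utc_year(time_t t)
{
    struct tm *tm = gmtime(&t);

    return tm != NULL ? tm->tm_year + 1900 : -1;
}

/* Convert a count of clock ticks to seconds.
 * Returns -1.0 if the tick rate is unavailable. */
double ticks_to_seconds(long ticks)
{
    long hz = sysconf(_SC_CLK_TCK);   /* clock ticks per second */

    return hz > 0 ? (double)ticks / (double)hz : -1.0;
}
```

The values returned by times() in <sys/times.h> are in these clock ticks.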
25
Introduction
• System Calls vs Library Functions
– System Calls
• 50 for Unix Ver 7, 110 for 4.3+BSD, 120 for SVR4
– Unix Technique
• Each system call has a C library function of the same name
• Differences
– System calls: a fixed, minimal set; library functions: more elaborate functionality
• malloc() calls sbrk() -> better management of allocated space
• gmtime() calls time() -> seconds into broken-down time!
• fgets() calls read() -> unbuffered I/O into buffered I/O
• Misc
– Process control: fork(), exec(), wait() invoked directly from application
code (vs system()).
26
Contents
1. Preface/Introduction
2. Standardization and Implementation
3. File I/O
4. Standard I/O Library
5. Files and Directories
6. System Data Files and Information
7. Environment of a Unix Process
8. Process Control
9. Signals
10. Inter-process Communication
27
Ariane 5
• A European rocket designed to launch commercial payloads
(e.g. communications satellites, etc.) into Earth orbit.
• Successor to the successful Ariane 4 launchers.
• Ariane 5 can carry a heavier payload than Ariane 4.
• On June 4th, 1996, Flight 501 was launched.
• Approximately 37 seconds after a successful lift-off, the Ariane
5 launcher lost control.
28
Flight 501 Failure
• The attitude and trajectory of the rocket are measured by a
computer-based inertial reference system. This transmits
commands to the engines to maintain attitude and direction.
• The software failed and this system and the backup system
shut down.
• Diagnostic commands were transmitted to the engines which
interpreted them as real data and which swiveled to an
extreme position resulting in unforeseen stresses on the
rocket.
29
What Caused the Failure?
• Software failure occurred when an attempt to
convert a 64-bit floating point number to a signed
16-bit integer caused the number to overflow.
• Why not Ariane 4?
– The physical characteristics of Ariane 4 (a smaller vehicle)
are such that it has a lower initial acceleration and a slower build-up
of horizontal velocity than Ariane 5.
– The value of the variable on Ariane 4 could never reach a
level that caused overflow during the launch period.
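To make the failure mode concrete, here is a C sketch of a range-checked conversion from a 64-bit floating-point value to a signed 16-bit integer. It is only an illustration, not the flight software; the function name to_int16_checked is hypothetical.

```c
#include <stdint.h>

/* Convert x to int16_t only if it is within range.
 * Returns 0 and stores the result in *out, or returns -1 (leaving *out
 * untouched) if x is out of range or NaN. Values between 32767 and 32768
 * are rejected too, which errs on the safe side. */
int to_int16_checked(double x, int16_t *out)
{
    /* The comparison is false for NaN, so NaN is rejected as well. */
    if (!(x >= (double)INT16_MIN && x <= (double)INT16_MAX))
        return -1;
    *out = (int16_t)x;   /* in range: truncates toward zero */
    return 0;
}
```

In C, an unchecked cast of an out-of-range double to a 16-bit integer is undefined behavior, so the check has to come before the cast.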
30
What Lesson Can We Learn?
• When porting software from one platform to
another, we have to keep our eyes on the
platform assumptions and capabilities.
• Software reliability cannot be sacrificed.
– An exact copy of the failing software was used as the backup
copy.
– An exception handler was removed to save computing
power, which is a scarce resource on a space mission.
• How do you make sure your program functions
correctly from one Unix platform to another Unix
platform?
31
Standardization and Implementation
• Why Standardization?
– Proliferation of UNIX versions
• What should be done?
– The specifications of limits that each
implementation must define!
32
UNIX Standardization
• ANSI C
– American National Standards Institute
– ISO/IEC 9899:1990
• International Organization for Standardization (ISO)
• Syntax/Semantics of C, a standard library
– Purpose:
• Provide portability of conforming C programs to a wide
variety of OS’s.
– 15 areas: Fig 2.1 – Page 27
33
UNIX Standardization
• ANSI C
– <assert.h> - verify program assertion
– <ctype.h> - char types
– <errno.h> - error codes
– <float.h> - floating-point constants
– <limits.h> - implementation constants
– <locale.h> - locale categories
– <math.h> - mathematical constants
– <setjmp.h> - nonlocal goto
– <signal.h> - signals
– <stdarg.h> - variable argument lists
– <stddef.h> - standard definitions
– <stdio.h> - standard I/O library
– <stdlib.h> - utility functions
– <string.h> - string operations
– <time.h> - time and date
34
UNIX Standardization
• POSIX.1 (Portable Operating System Interface)
developed by IEEE
– Not restricted to UNIX-like systems, and no distinction between
system calls and library functions
– Originally IEEE Standard 1003.1-1988
– 1003.2: shells and utilities, 1003.7: system administration, more than
15 other committees
– Published as IEEE std 1003.1-1990, ISO/IEC9945-1:1990
– New: the inclusion of symbolic links
– No superuser notion
35
UNIX Standardization
• POSIX.1
– <cpio.h> - cpio archive values
– <dirent.h> - directory entries
– <fcntl.h> - file control
– <grp.h> - group file
– <pwd.h> - passwd file
– <tar.h> - tar archive values
– <termios.h> - terminal I/O
– <unistd.h> - symbolic constants
– <utime.h> - file times
– <sys/stat.h> - file status
– <sys/times.h> - process times
– <sys/types.h> - primitive system data types
– <sys/utsname.h> - system name
– <sys/wait.h> - process control
36
UNIX Standardization
• X/Open
– An international group of computer vendors
– Volume 2 of X/Open Portability Guide, Issue 3 (XPG3)
• XSI System Interface and Headers
• Based on IEEE Std. 1003.1-1988, with additions (e.g., displaying text in different
languages)
• Built on the draft of ANSI C
– Some parts are out of date.
– Solaris 2.4 – compliant with XPG4V2
• man xpg4
37
UNIX Standardization
• FIPS (Federal Information Processing Standard) 151-1
– IEEE Std. 1003.1-1988 & ANSI C
– For the procurement of computers by the US government.
– Required Features:
• JOB_CONTROL, SAVED_IDS, NO_TRUNC, CHOWN_RESTRICTED, VDISABLE,
• NGROUPS_MAX >= 8, group IDs of new files and directories equal to those of their parent directory,
environment variables HOME and LOGNAME defined for a login shell, interrupted read/write
functions return the number of bytes transferred.
38
UNIX Implementation
[Figure: UNIX family tree]
• Bell Labs Research: First Edition, Sixth Edition (1976), Seventh Edition (1979)
• Berkeley Software Distributions: 1BSD, …, 4.0BSD, 4.3BSD, 4.3BSD Tahoe, 4.3BSD Reno, 4.4BSD
• UNIX System Laboratories (USG/USDL/ATTIS/DSG/USO/USL): System V Release 2, 3; UNIX System V Release 4
• Related systems: XENIX, Chorus, Mach, SunOS, Solaris, Solaris 2
* POSIX.1 (IEEE, ISO) standard!
39
UNIX Implementation
• System V Release 4 - 1989
– POSIX 1003.1 & X/Open XPG3
– Merging of SVR3.2, SunOS, 4.3BSD, Xenix
– SVID (System V Interface Definition)
• Issue 3 specifies the functionality qualified for SVR4.
– Contains a Berkeley compatibility library
• For 4.3BSD counterparts
40
UNIX Implementation
• 4.2BSD - 1983
– DARPA (Defense Advanced Research Projects Agency) wanted a standard
research operating system for the VAX.
– Networking support - remote login, file transfer (ftp), etc. Support for
a wide range of hardware devices, e.g., 10Mbps Ethernet.
– Higher-speed file system.
– Revised virtual memory to support processes with large sparse
address space (not part of the release).
– Inter-process-communication facilities.
41
UNIX Implementation
• 4.3 BSD - 1986
– Improvement of 4.2 BSD
• Loss of performance because of many new facilities in 4.2 BSD.
• Bug fixing, e.g., TCP/IP implementation.
• New facilities such as TCP/IP subnet and routing support.
– Backward compatibility with 4.2 BSD.
– Second Version - 4.3 BSD Tahoe
• Supports machines besides the VAX
– Third Version - 4.3 BSD Reno
• freely redistributable implementation of NFS, etc.
42
UNIX Implementation
• 4.4 BSD - 1992
– POSIX compatibility
– Remedies deficiencies of 4.3 BSD
• Support for numerous architectures such as 68K, SPARC, MIPS, PC.
• New virtual memory better for large memory and less dependent
on VAX architecture – Mach.
• TCP/IP performance improvement and implementation of new
network protocols.
• Support of an object-oriented interface for numerous filesystem
types, e.g., SUN NFS.
43
UNIX Implementation - Major UCB CSRG Distributions
• Major new facilities:
– 3BSD, 4.0BSD, 4.2BSD, 4.4 BSD
• Bug fixes and efficiency improvement:
– 4.1 BSD, 4.3BSD
• BSD Networking Software, Release 1.0 (from 4.3BSD
Tahoe, 1989), 2.0 (from 4.3BSD Reno, 1991)
• Remark:
– Standards define a subset of any actual system –
compliance and compatibility
44
Limits – ANSI C, POSIX, XPG3, FIPS 151-1
• Compile-time options and limits (headers)
– Job control?
– Largest value of a short?
• Run-time limits related to file/dir
– pathconf and fpathconf, e.g., the max # of bytes in a
filename
• Run-time limits not related to file/dir
– sysconf, e.g., the max # of opened files per process
• Remark: implementation-related
45
ANSI C Limits
• All compile-time limits - <limits.h>
– Minimum acceptable values
• E.g., CHAR_BIT, INT_MAX
– Implementation-related
• char (limits.h), float (FLT_MAX in float.h)
• open (FOPEN_MAX & TMP_MAX in stdio.h)
#if defined(_CHAR_IS_SIGNED)
#define CHAR_MAX SCHAR_MAX
#elif defined(_CHAR_IS_UNSIGNED)
#define CHAR_MAX UCHAR_MAX
#endif
46
POSIX Limits
• 33 limits and constants
– Invariant minimum values (POSIX defined in Figure 2.3 – Page 33,
limits.h)
– Corresponding implementation (limits.h)
• Invariant value: SSIZE_MAX
• Run-time increasable value: NGROUPS_MAX
• Run-time invariant values, e.g., CHILD_MAX
• Pathname variable values, e.g., LINK_MAX
– Compile-time symbolic constants, e.g., _POSIX_JOB_CONTROL
– Execution-time symbolic constants, e.g., _POSIX_CHOWN_RESTRICTED
– Obsolete constant: CLK_TCK
47
POSIX Limits
• Limitation of POSIX
– E.g., _POSIX_OPEN_MAX in <limits.h>
– sysconf(), pathconf(), fpathconf() at run-time
– Possibly indeterminate on some systems
• E.g., OPEN_MAX under SVR4
48
XPG3 Limits
• 7 constants in <limits.h> - what POSIX.1 calls invariant
minimum values
– Dealing with message catalogs
• NL_MSGMAX – 32767
• PASS_MAX
– <limits.h>
• What POSIX.1 calls a run-time invariant value
– sysconf()
49
Run-Time Limits
• #include <unistd.h> (Figure 2.7 – Page 40: compile/run time limits)
• long sysconf(int name);
– _SC_CHILD_MAX, _SC_OPEN_MAX, etc.
• long pathconf(const char *pathname, int name);
• long fpathconf(int filedes, int name);
– _PC_LINK_MAX, _PC_PATH_MAX, _PC_PIPE_BUF,
_PC_NAME_MAX, etc.
• Various names and restrictions on arguments (Page 35 and
Figure 2.5)
• Return –1 and set errno if any error occurs.
– EINVAL if the name is incorrect.
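Because -1 is returned both for an error and for a limit that is simply indeterminate, callers usually clear errno first. A sketch of that pattern follows; the wrapper name get_sysconf and its 0/1/-1 return convention are assumptions of this example.

```c
#include <errno.h>
#include <unistd.h>

/* Query sysconf(name).
 * Returns 0 and stores the limit in *value if it is defined,
 *         1 if sysconf() returned -1 without setting errno
 *           (limit indeterminate or option unsupported),
 *        -1 on error, e.g., errno == EINVAL for an unknown name. */
int get_sysconf(int name, long *value)
{
    long v;

    errno = 0;                  /* lets us tell an error from "no limit" */
    v = sysconf(name);
    if (v == -1)
        return errno != 0 ? -1 : 1;
    *value = v;
    return 0;
}
```

The same errno-clearing pattern applies to pathconf() and fpathconf().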
50
Run-Time Limits
• Example Program 2.1 – Page 38
– Print sysconf and pathconf values (Fig 2.6 – Page 39)
Limit             SunOS 4.1.1   SVR4      4.3+BSD
CHILD_MAX         133           30        40
OPEN_MAX          64            64        64
LINK_MAX          32767         1000      32767
NAME_MAX          255           14/255    255
_POSIX_NO_TRUNC   1             nodef/1   1
51
Indeterminate Run-Time Limits –
Two Cases
• Pathname
– 4.3BSD: MAXPATHLEN in <sys/param.h>, PATH_MAX in <limits.h>
– Program 2.2 – Page 42
• Allocate space for a pathname vs getcwd
• _PC_PATH_MAX is for relative pathnames
• Max # of Open Files – POSIX run-time invariant
– NOFILE (<sys/param.h>), _NFILE (<stdio.h>)
– sysconf(_SC_OPEN_MAX) – POSIX.1
• getrlimit() & setrlimit() for SVR4 & 4.3+BSD
– Program 2.3 – Page 43: OPEN_MAX!
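When the pathname limit is indeterminate, a program has to guess. The sketch below follows that idea but is not the book's program; PATH_MAX_GUESS (1024) and the name path_alloc as used here are assumptions.

```c
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

#define PATH_MAX_GUESS 1024   /* assumed fallback when the limit is indeterminate */

/* Allocate a buffer intended to hold any pathname and store its size
 * in *sizep. Returns NULL if pathconf() or malloc() fails. */
char *path_alloc(size_t *sizep)
{
    long max;
    size_t size;
    char *p;

    errno = 0;
    max = pathconf("/", _PC_PATH_MAX);
    if (max < 0) {
        if (errno != 0)
            return NULL;         /* real error */
        max = PATH_MAX_GUESS;    /* indeterminate: fall back to a guess */
    } else {
        max++;                   /* the limit is relative to "/" */
    }
    size = (size_t)max + 1;      /* room for the terminating null byte */
    p = malloc(size);
    if (p != NULL)
        *sizep = size;
    return p;
}
```

A caller can pass the buffer and size straight to getcwd() and must free() the buffer afterwards; if getcwd() fails with ERANGE, the guess was too small.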
52
MISC
• Feature Test Macro
– POSIX only
• cc –D_POSIX_SOURCE file.c
• Or, #define _POSIX_SOURCE 1
– ANSI C only
#ifdef __STDC__
void *myfunc(const char *, int);
#else
void *myfunc();
#endif
53
MISC
• Primitive System Data Types
– Figure 2.8 – Page 45
• Implementation-dependent data types
• E.g., caddr_t, pid_t, ssize_t, off_t, etc.
– <sys/types.h>
• E.g., major_t, minor_t, caddr_t, etc.
• Examples:
– typedef char *caddr_t;
– typedef ulong_t major_t;
(SVR4: 14 bits for the major device number and 18 bits for the minor
device number.
Traditionally both fit in a short: 8 bits each.)
54
MISC
• Conflicts Between Standards
– clock() in ANSI C and times() in POSIX.1
• clock_t divided by CLOCKS_PER_SEC in <time.h>
(while CLK_TCK became obsolete) – different values
for clock_t
– Implementation of POSIX functions
• No assumption on the host operating system.
• signal() in SVR4 is different from sigaction() in POSIX.1
55