Advanced UNIX programming
Fall 2002
Instructor: Ashok Srinivasan
Lecture 5
Acknowledgements: The syllabus and PowerPoint presentations are modified versions of those by T. Baker and X. Yuan

Announcements
• Reading assignment
  – APUE Chapter 1
  – APUE Chapter 3
    • Pages 47-56, 56-62
  – APUE Chapter 4
    • Pages 77-81, 92-95
  – APUE Chapter 5

Review
• Review some features of C
  – Header files
  – Command line arguments
  – Utilities
• Review some UNIX system calls
  – system, etc
• Portability
  – Standards: ANSI, POSIX, etc
  – 32 bit vs 64 bit
  – Byte order: Little endian vs big endian

Week 2 Topics
• Review some features of C
  – Header files
  – Command line arguments
  – Utilities
• Review some UNIX system calls
  – system, etc
• Portability
  – Standards: ANSI, POSIX, etc
  – 32 bit vs 64 bit
  – Byte order: Little endian vs big endian

Week 2 Topics ... continued
• Introduction to the UNIX API
  – Environment variables
  – Exit status
  – Process ID
  – User ID
• UNIX file system
  – File system abstraction
  – Directories
  – File descriptors

Review some features of C
• Header files
• Macros
• Command line arguments
  – See example1.c
• Utilities

Some Unix System Calls
• You may use these in your first assignment
  – system
  – mkstemp
  – See example2.c

system
    #include <stdlib.h>
    int system(const char *string);
  – Works as if string is typed into the shell at a terminal
  – Returns the termination status of the shell, in the format described in the man page for waitpid
  – Usually -1 is returned if there is an error

mkstemp
    #include <stdlib.h>
    int mkstemp(char *template);
  – template should end in XXXXXX
  – It replaces XXXXXX with characters that make the file name unique, creates the file, and returns an open file descriptor for it, available for reading and writing

Portability
• Standards
  – Source code portability: ANSI/ISO C
  – UNIX standards: POSIX, The Open Group
  – Internet Engineering Task Force (IETF)
• 32 bit vs 64 bit
• Byte order
  – Little endian vs big endian

Source Code Portability
• Standard programming language
  – Example: ANSI/ISO C
    • ISO C90 is in use; C99 is the latest. Should it be used?
• Standard libraries
• Standard API to operating system
  – Example: POSIX.1
• Auto-configuration mechanisms
• Programmer discipline

Unix Standards
• POSIX (IEEE Stds 1003.x and ISO/IEC 9945)
  – POSIX.1: System API for C language
  – POSIX.2: Shell and utilities
  – POSIX.5: System API for Ada language
  – POSIX.9: System API for Fortran language
• See also http://www.pasc.org and http://www.standards.ieee.org

Unix Standards ... continued
• The Open Group
  – A consortium of vendors and user organizations
  – Consolidation of X/Open and the Open Software Foundation
  – Controls the UNIX trademark
  – The Austin Group combined the IEEE, TOG, and ISO standards
• See also http://www.opengroup.org and http://www.opengroup.org/onlinepubs/007904975

IETF
• Internet Engineering Task Force (IETF)
  – Network designers, operators, vendors, researchers
  – Deals with the Internet
  – Issues RFCs
• See also http://www.ietf.org

64-bit vs. 32-bit architecture
• Pointers cannot be stored as int
• size_t cannot be stored as int
• long may not be long enough for size_t and off_t

Sizes in bits:

    Datatype      ILP32   LP64
    char            8       8
    short          16      16
    int            32      32
    long           32      64
    pointer        32      64
    (long long)    64      64

Note: ILP32 and LP64 are not the only two models

Byte order
• Little-Endian
  – Low-order byte is stored at lowest address
• Big-Endian
  – High-order byte is stored at lowest address

Introduction to the UNIX API
• Environment variables
• Exit status
• Process ID
• User ID

Environment Variables
• Associated with each process is a set of environment variables, each having a value
  – These values are passed to the process when it is executed
  – Along with argc and argv, this is another way of passing parameters to a program
  – Example: when in the Bourne shell you execute
        export DISPLAY=mysystem:0.0
    the environment variable DISPLAY has the value mysystem:0.0
  – A process can access its environment variables using either (i) the external variable environ, or (ii) the function getenv()
• See example3.c

Exit status
• Each process (except the system's initial process) has a parent process
  – When a process terminates, its parent can find out what caused the termination, via a status value
• The termination status contains a small integer exit status code (only the low-order 8 bits of the value are available to the parent)
• The function exit() allows a process to terminate itself and specify the exit status code that it wants to return to the parent
  – When a command is run from a shell, the exit status code is returned to the shell, which allows it to be used in shell control statements such as if and while
  – What is the difference between _exit() and exit()?

Process ID
• Each process has a unique identifier, of signed integer type pid_t
• The process ID of the current process can be obtained by calling the function getpid()

User ID
• Associated with each process are two user IDs
  – A fixed real user ID, and
  – A changeable effective user ID, which allows privileges to be modified
• The real user ID can be obtained by calling getuid(), and the effective user ID by calling geteuid()
  – See example4.c

UNIX file system
• File system abstraction
• Directories
• File descriptors

File system abstraction
• File: a sequence of bytes of data
• Filesystem: a space in which files can be stored
• Link: a named logical connection from a directory to a file
• Directory: a special kind of file that can contain links to other files
• Filename: the name of a link
• Pathname: a chain of one or more filenames, separated by /'s

File system abstraction ... continued
• inode: a segment of data in a filesystem that describes a file, including how to find the rest of the file in the system
• File descriptor: a non-negative integer, with a per-process mapping to an open file description
• Open file description: an OS internal data structure, shareable between processes

Directories
• Names belong to links, not to files
• There may be multiple hard links to one file
• Renaming only renames one link to that file
• Unix allows both hard and soft (symbolic) links
• A file will exist even after the last hard link to it has been removed, as long as there are references to it from open file descriptions
  – Soft links do not prevent deletion of the file
• A directory may have multiple (hard) links to it
  – But this capability is usually restricted, to prevent creation of directory cycles

File Descriptors
• Each open file is associated with an open file description
  – Each process has a (logical) array of references to open file descriptions
  – Logical indices into this array are file descriptors
    • These integer values are used to identify the files for I/O operations
  – File descriptor 0 is reserved for standard input, file descriptor 1 for standard output, and file descriptor 2 for standard error

File Descriptors ... continued
• The POSIX standard defines the following
  – File descriptor: A per-process, unique, nonnegative integer used to identify an open file for the purposes of file access
  – Open file description: A record of how a process or group of processes are currently accessing a file
    • Each file descriptor refers to exactly one open file description, but an open file description may be referred to by more than one file descriptor
    • A file offset, file status, and file access modes are attributes of an open file description
  – File access modes: Specification of whether the file can be read and written

File Descriptors ... continued
  – File offset: The byte position in the file where the next I/O operation through that open file description begins
    • Each open file description associated with a regular file, block special file, or directory has a file offset
    • A character special file that does not refer to a terminal device may have a file offset
    • There is no file offset specified for a pipe or FIFO (described later)
  – File status: Includes the following information
    • Append mode or not
    • Blocking/nonblocking
    • Etc

File Descriptors ... continued
  – FIFO special file: A type of file with the property that data written to such a file is read on a first-in-first-out basis
  – Pipe: An object accessed by one of the pair of file descriptors created by the pipe() function
    • Once created, the file descriptors can be used to manipulate the pipe, and it behaves identically to a FIFO special file when accessed this way
    • It has no name in the file hierarchy

File Descriptors ... continued
• Important points
  – A file descriptor does not describe a file
    • It is just a number that is ephemerally associated with a particular open file description
  – An open file description describes a past "open" operation on a file; it does not describe the file
  – The description of the file is in the inode
    • There may be several different open file descriptions (or none) referring to it at any given time