Operating systems (Syst`emes d'exploitation)

advertisement
Operating systems
(Systèmes d’exploitation)
Stefan Schwoon
Cours L3, 2012/13, ENS Cachan
November 26, 2012
File system
The file system is another “abstraction” provided by the operating system, for the
following purposes:
share information between processes (more flexible/comfortable) than
signals;
store information beyond the life of a process;
store large amounts of information.
File = an object from which data can be read and written
2
Types of files
ordinary files: meant for permanent storage, typically on hard drive, USB stick,
other mass storage media, (typically) random access
pipes, network sockets: temporary, for information sharing, sequential access
directories: hierarchial organisation of files
symbolic links
special files: block or character devices, for accessing hardware devices
3
Good and bad
As with all such abstractions, file systems
enable good things: programmer’s life is facilitated by uniform interface for all
those objects, no need to worry about physical details
prevent bad things: prevent conflicts on resources, protect information flows
between processes/users
4
Virtual and concrete file systems
In Unix, the term file system can be slightly ambiguous:
the entire management of file i/o on multiple physical or logical devices
the concrete, physical organisation on one specific hardware device
Unix manages a virtual file system (i.e., a data structure in memory) in which
certain directories are mapped to other virtual or concrete file systems. (These
are called mount points.)
Example for virtual file system: procfs mounted in /proc.
Example for concrete file systems: hard drive partition, USB stick, . . .
See mount for a list of “mounted” devices. Each device is managed by a driver.
5
Organisation of a file system
Entries in the file system are referenced by a path:
absolute path: starting with /, path of directories starting at root, separated
by slashes
relative path: interpreted relative to the current directory attribute of a
process, can be changed by chdir (cd in the shell).
Note: . means the current directory, .. the directory above.
A file system must manage static and dynamic aspects:
static: a “database” of all permanently stored files
dynamic: handle the current read/write accesses to the file system
In Unix: i-nodes and handles
6
Inodes
(origin of term unknown, possibly index nodes)
data structure typically used in Unix for permanent files, e.g., hard disk
device partitioned into logical blocks of a fixed, chosen size
a set of these blocks is reserved for storing inodes
an inode contains information about a file:
type, owner, group, access rights, number of pointers to the file, block numbers
where data is stored, . . . , but not the name.
7
Relation between files and inodes
An inode represents a block of data on disk;
a file is a named reference to an inode.
In general: many-to-one relation from files to inodes
(but often one-to-one, except for directories).
The ls -i command lists the inode number of files;
stat displays information about the inode associated with a file.
8
Inodes and the directory structure
Directories are special inodes that contain a list of files/directories:
their names
their inode numbers
The system of files on a disk forms a tree-like structure (a DAG).
Note that the name of a file is not stored in the file’s inode but in the directory
containing it.
Indeed, the same file (= inode) can be referenced by multiple directory entries
(see ln command, “hard” links).
File is physically removed (the inode is freed) when the last link to it is lost
(hence unlink(2) for removing a file).
9
Inodes and physical storage
An inode contains information on the physical location of its data, using a mixture
of so-called direct and indirect blocks.
A direct block contains the number of a (logical) block on the device.
An indirect block contains the number of a logical block where direct blocks are
stored.
A double indirect block contains the number of a logical block where indirect
blocks are stored.
A triple indirect block . . .
Example: 12 direct blocks, 1 indirect, 1 double indirect, 1 triple indirect
10
Visualization of the indirect blocks concept:
11
Access rights in Unix
Each user has a numerical identifier (user id, or uid).
Each user also belongs to one primary group and possibly multiple other groups.
Each group also has a numerical identifier (gid).
Each file (more precisely: each inode) belongs to some user and some group.
By default, the user who created the file imparts his uid and his primary gid on
the file.
Access rights on a file are governed by the uid and gid of a file, see next slide.
12
9 “ugo” bits: (user,group,others) × (read,write,execute)
3 other bits: setuid, setgid, sticky
setuid, setgid: when file is executed, set the user/group id to that of the owner
(“effective” uid, C functions: getuid, geteuid)
sticky bit (depends on file system)
Shell commands: chmod, chown, chgrp
13
Symbolic links
A special file that contains simply the name of another file in the file system.
Each read/write access to the symbolic link will be redirected to the file it “points
to”.
Note: A symbolic link is a different inode (referencing an entry in the file system)
while a hard link is a reference to the same inode.
Therefore, the ‘linked’ file can be deleted, while the link still exists.
14
Dynamic aspects
Apart from the logical/physical file system, Unix maintains a table of “open files”.
An open file is a data structure that permits access to a file:
inode, access mode, position in file, buffered data, . . .
This is a system-wide structure (there’s only one of it).
Multiple open file nodes may refer to the same inode (but, e.g., with different
access modes, positions, . . . ).
An open-file entry is treated similarly to a hard link to its inode – the inode is
not removed until the last entry is gone.
Open files are created by Unix functions such as creat, open, pipe, which
return file descriptors to the user.
15
File descriptors
Each process has its set of file descriptors.
A file descriptor is a reference to an entry in the open-file table; the file
descriptors owned by a process (at a given moment) are the files to which the
process can input/output.
open (2) creates a new open-file entry and a file descriptor to it in the calling
process.
fork duplicates a process and its file descriptors.
dup duplicates a file descriptor within the same process.
close removes a file descriptor from the process; the corresponding open-file
entry persists as long as there are file descriptors referencing it.
16
Standard file descriptors
Three file descriptors in each process are special:
0 is the “standard input” (e.g., getch or scanf use it)
1 is the “standard output” (e.g., printf uses it)
2 is the “standard error” (often points to same as standard output)
For processes running on the terminal, standard input is from keyboard and
standard output is the screen (special device files).
Can be changed by ‘redirecting’ output to file (done by shell before launching the
process).
17
I/O: C standard vs ANSI
C typically provides two families of functions for I/O:
open, write, read, . . .
System calls, defined by POSIX standard (may not exist on other OS)
work on file descriptors (0, 1, 2, . . . )
unbuffered I/O
fopen, printf, scanf, . . .
Defined by ANSI-C standard (exist in (practically) all C implementations)
work on streams (stdin, stdout, stderr, . . . )
buffered I/O: flush using fflush(3) or by newline (or: use setvbuf(3))
Mixing these two may produce strange effects . . .
18
Pipes
Pipes are files that are not stored permanently but serve to facilitate
communication between two processes.
Data is held in a (fixed-size) buffer provided by the file system.
read on empty buffer will block until data is available
write on full buffer will block until space becomes available
Additionally, data can only be written as long as there is at least one process
reading from the pipe.
Reading from pipes will yield end-of-file only when there is no process left writing
to the file.
Pipes are not seekable (= sequential access), whereas “normal” files are usually
random-access (see lseek(3) or fseek(3)).
19
Download