Chapter 5 - Files and directories

Chapter 5
Files and Directories
Source: Robbins and Robbins, UNIX Systems Programming, Prentice Hall, 2003.
5.1 UNIX File System Navigation
Navigating the UNIX File System
• Operating systems organize physical disks into file systems to provide high-level
  logical access to the actual bytes of a file
• A file system is a collection of files and attributes such as location and name
  – The location is stated in terms of an offset, which the operating system translates into a
    physical location on a disk
• A directory is a file containing directory entries that associate a filename with the
  physical location of a file on disk
• Most file systems organize their directories in a tree structure (see the next slide)
  – The forward slash ( / ), when by itself or as the first character in a path name,
    designates the root directory of the file system; the root is at the top of the file system
    tree, and every file and subdirectory is located at a node somewhere underneath it
  – dirB contains the file named m3.dat
  – dirA contains the files named my1.dat and my2.dat
  – dirA also contains a subdirectory named dirB
• A file or subdirectory can be specified by either an absolute path name or a relative
  path name
  – An absolute path name specifies all of the nodes in the path from the root node down to
    the file or subdirectory in the tree; successive node names are separated by slashes
    • Example: /dirA/dirB/my1.dat
  – A relative path name is based on the current working directory (covered later)
Tree Structure of a File System
Current Working Directory
• Every process has a current working directory associated with it
• This directory is used for path resolution of any relative path names
  – If a path name does not start with a slash, the current working directory is
    prepended to the path name to form an absolute path
• In a directory listing, dot ( . ) specifies the current directory and dot-dot ( .. )
  specifies the parent of the current directory (see next slide)
  – In the root directory, both dot and dot-dot refer to the root directory itself
• The cd command can be used in a command shell to set (i.e., change to) a new
  current working directory
  – Example: cd /dirC
• The PWD environment variable specifies the current working directory as shown
  in the example below
    PWD=/home/smithj/COSC4153
Example Directory Contents
uxb2% ls -fl
total 10
drwx--x--x   7 jjt107  faculty  1024 Nov  9 20:35 .
dr-xr-xr-x   4 root    root        4 Nov  9 19:57 ..
drwx------   2 jjt107  faculty  2048 Nov  9 15:28 Mail
drwx------  14 jjt107  faculty  2048 Nov  8 16:17 Files
-rw-------   2 jjt107  faculty    12 Jul 13 16:59 .mailboxlist
drwx--x--x  21 jjt107  faculty  1024 Nov  6 12:26 http
-rw-------   1 jjt107  faculty     0 Aug 28 11:20 .gopherrc
drwxr-xr-x   2 jjt107  faculty    96 Jan 27  2005 .jpiu
-rw-------   1 jjt107  mail      161 Aug 30 22:40 .procmailrc
drwx--x--x   2 jjt107  faculty  1024 Nov  6 12:40 Articles
getcwd() Function
• The getcwd() function returns the path name of the current working
  directory of a process
    #include <unistd.h>
    char *getcwd(char *buffer, size_t size);
  – The buffer parameter represents a user-supplied buffer for holding the path name
  – The size parameter specifies the maximum length of path name that the buffer can
    accommodate, including the trailing string terminator
• If successful, the function returns a pointer to buffer; otherwise, it returns
  NULL and sets errno
chdir() Function
• The chdir() function causes the directory specified by the path parameter
  to become the current working directory for the calling process
    #include <unistd.h>
    int chdir(const char *path);
• If successful, the function returns zero; otherwise, it returns -1 and sets
  errno
Example use of getcwd() and chdir()
#include <stdio.h>
#include <unistd.h>

#define MAX_SIZE 500

// ********************************************
int main(int argc, char *argv[])
{
    int status;
    char *pathPtr;
    char buffer[MAX_SIZE];

    if (argc != 2)
    {
        printf("\nUsage: a.out directory_path_name\n");
        return 1;
    } // End if

    // Make the directory named on the command line the current one
    status = chdir(argv[1]);
    if (status == -1)
    {
        // perror() appends ": <error text>" and a newline itself
        perror("Problem occurred in changing current working directory");
        return 1;
    } // End if

    // Ask the system what the current working directory is now
    pathPtr = getcwd(buffer, MAX_SIZE);
    if (pathPtr == NULL)
    {
        perror("Problem obtaining current working directory");
        return 1;
    } // End if

    printf("\nThe current working directory is now %s\n", buffer);
    return 0;
} // End main
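The example above uses a fixed 500-byte buffer, so getcwd() fails with ERANGE when the path name is longer than that. As a minimal sketch of one way around this, the helper below retries with a larger buffer each time. The name get_cwd_dynamic and the starting size of 64 are illustrative assumptions, not part of the original slides.

```c
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: return a malloc'd string holding the current
 * working directory, or NULL on failure (errno is set).
 * The caller is responsible for calling free() on the result. */
char *get_cwd_dynamic(void)
{
    size_t size = 64;                 /* deliberately small starting size */

    for (;;)
    {
        char *buffer = malloc(size);
        if (buffer == NULL)
            return NULL;

        if (getcwd(buffer, size) != NULL)
            return buffer;            /* success */

        int savedErrno = errno;       /* free() may change errno */
        free(buffer);
        if (savedErrno != ERANGE)     /* a real error, not "buffer too small" */
        {
            errno = savedErrno;
            return NULL;
        }
        size *= 2;                    /* ERANGE: retry with a larger buffer */
    }
}
```

A caller would use it as `char *cwd = get_cwd_dynamic();`, test the result against NULL, print it, and then free it.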
Search Path
• When the name of an executable program is entered in a command shell,
  the UNIX operating system systematically looks for the location of the
  corresponding file
  – If only the name of the file is given, the command shell searches for the
    executable file in the directories listed in the PATH environment variable,
    in the order in which they are listed
  – The PATH variable contains the fully qualified path names of particular
    directories, separated by colons, as shown in the example below (a single
    line, wrapped here for display)
    PATH=/opt/local/bin:/usr/openwin/bin:/opt/local/common/bin:/usr/bin:
         /usr/ccs/bin:/usr/ucb:/opt/local/common/lib/wp51/wpbin:
         /opt/local/common/lib/wp51/shbin:/opt/local/uxb1/bin:
         /opt/SUNWspro/bin:/opt/local/prog/bin:.:/opt/lpp/SPSS/bin:
         /opt/sas/utilities/bin
The which Command
• The which command can be used within a command shell
– It takes a list of one or more names and looks for the files that would be
executed had these names been given as commands
– It does this by searching the directories listed in the PATH variable as shown
in the example below
uxb3% which csh
/usr/bin/csh
uxb3% which xwd
/usr/openwin/bin/xwd
uxb3%
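The search that the shell and the which command perform can be sketched in a few lines of C. This is only an illustration under stated assumptions: the function name find_in_path is hypothetical, an empty PATH component is treated as the current directory, and a candidate counts as a match when it is a regular file that access() reports as executable. Real shells apply further rules (for example, names containing a slash are not searched for at all).

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: search the directories in PATH, in order, for an
 * executable regular file called name. On success the full path is
 * written into result and 0 is returned; otherwise -1 is returned. */
int find_in_path(const char *name, char *result, size_t size)
{
    const char *path = getenv("PATH");
    struct stat statusBuffer;

    if (path == NULL || name == NULL || *name == '\0')
        return -1;

    for (;;)
    {
        size_t length = strcspn(path, ":");   /* length of this directory */
        int written;

        if (length == 0)                      /* empty component: use "." */
            written = snprintf(result, size, "./%s", name);
        else
            written = snprintf(result, size, "%.*s/%s",
                               (int) length, path, name);

        if (written > 0 && (size_t) written < size &&
            stat(result, &statusBuffer) == 0 &&
            S_ISREG(statusBuffer.st_mode) &&
            access(result, X_OK) == 0)
            return 0;                         /* first match wins */

        if (path[length] == '\0')             /* no more directories */
            break;
        path += length + 1;                   /* skip past the colon */
    }
    return -1;
}
```

Because the loop stops at the first match, the order of the directories in PATH decides which file runs when two directories contain a program with the same name.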
5.2a Accessing the Contents of a Directory
opendir() Function
• The contents of a directory are accessed through the use of three functions:
  opendir(), readdir(), and closedir()
• The opendir() function provides a handle of type DIR * to a directory
  stream that is positioned at the first entry in the directory
    #include <dirent.h>
    DIR *opendir(const char *directoryName);
• If successful, the function returns a pointer to a directory object; otherwise, it
  returns a NULL pointer and sets errno
• The DIR type represents a directory stream, which is an ordered sequence (not
  necessarily alphabetical) of all of the directory entries in a particular directory
readdir() Function
• The readdir() function reads a directory by returning successive entries in
  the directory stream pointed to by directoryPtr
    #include <dirent.h>
    struct dirent *readdir(DIR *directoryPtr);
• The function returns a pointer to a struct dirent structure containing
  information about the next directory entry
• The function moves the stream to the next position (i.e., directory entry) after
  each call
• If successful, the function returns a pointer to a struct dirent; otherwise,
  it returns a NULL pointer and sets errno
• The function also returns NULL to indicate the end of the directory, but in this
  case it does not change the value of errno
Conceptual View of Directory Entry Storage
[Figure: DIR *directoryPtr refers to a DIR record, which heads a chain of struct dirent
records; each record holds a d_name field and a struct dirent *next pointer, and the last
next pointer is NULL]
struct dirent structure
The following are the fields in the struct dirent structure
    struct dirent
    {
        ino_t  d_ino;      // i-number
        off_t  d_off;      // Offset into directory file
        ushort d_reclen;   // Length of record
        char   d_name[1];  // Entry name (null-terminated character array, not a char *)
    };
closedir() and rewinddir() Functions
• The closedir() function closes a directory stream
    #include <dirent.h>
    int closedir(DIR *directoryPtr);
  – If successful, the function returns zero; otherwise, it returns -1 and sets errno
• The rewinddir() function repositions the directory stream at its beginning
    #include <dirent.h>
    void rewinddir(DIR *directoryPtr);
  – The function does not return a value and has no errors defined
Example use of Directory Functions
#include <stdio.h>
#include <dirent.h>

// *******************************
int main(int argc, char *argv[])
{
    DIR *directoryPtr;
    struct dirent *entryPtr;

    if (argc != 2)
    {
        fprintf(stderr, "Usage: a.out directory_name\n");
        return 1;
    } // End if

    directoryPtr = opendir(argv[1]);
    if (directoryPtr == NULL)
    {
        perror("Failed to open directory");
        return 1;
    } // End if

    // Print the name of each entry until readdir() returns NULL
    entryPtr = readdir(directoryPtr);
    while (entryPtr != NULL)
    {
        printf("%s\n", entryPtr->d_name);
        entryPtr = readdir(directoryPtr);
    } // End while

    closedir(directoryPtr);
    return 0;
} // End main
5.2b Accessing File Status Information
stat() Function
• The stat() function accesses a file by name and retrieves status
information about the file
#include <sys/stat.h>
int stat(const char *path, struct stat *buffer);
• The path parameter contains the directory path and name of the file
• The buffer parameter points to a user-supplied buffer into which the
function stores the status information
• If successful, the function returns zero; otherwise, it returns -1 and sets
  errno
struct stat structure
The following are the fields in the struct stat structure
    dev_t    st_dev;    // Device ID of device containing file
    ino_t    st_ino;    // File serial number
    mode_t   st_mode;   // File mode (access permissions and file type)
    nlink_t  st_nlink;  // Number of hard links
    uid_t    st_uid;    // User ID of file
    gid_t    st_gid;    // Group ID of file
    off_t    st_size;   // File size in bytes
    time_t   st_atime;  // Time of last access
    time_t   st_mtime;  // Time of last data modification
    time_t   st_ctime;  // Time of last file status change
Example use of stat()
#include <stdio.h>
#include <time.h>
#include <sys/stat.h>

// *******************************
int main(int argc, char *argv[])
{
    struct stat statusBuffer;
    int status;

    if (argc != 2)
    {
        fprintf(stderr, "Usage: a.out file_name\n");
        return 1;
    } // End if

    status = stat(argv[1], &statusBuffer);
    if (status == -1)
    {
        perror("Failed to get file status");
        return 1;
    } // End if

    // off_t may be wider than int, so cast to long long for printing
    printf("File size    : %lld\n", (long long) statusBuffer.st_size);
    // The string returned by ctime() already ends with a newline
    printf("Last accessed: %s", ctime(&statusBuffer.st_atime));
    return 0;
} // End main
fstat() Function
• The fstat() function reports status information of a file associated with the
  open file descriptor fileDescriptor
    #include <sys/stat.h>
    int fstat(int fileDescriptor, struct stat *buffer);
• The buffer parameter points to a user-supplied buffer into which the
  function stores the status information
• If successful, the function returns zero; otherwise, it returns -1 and sets errno
Example use of fstat()
#include <stdio.h>
#include <time.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>    // For close()

// *******************************
int main(int argc, char *argv[])
{
    struct stat statusBuffer;
    int status;
    int inFile;

    if (argc != 2)
    {
        fprintf(stderr, "Usage: a.out file_name\n");
        return 1;
    } // End if

    inFile = open(argv[1], O_RDONLY);
    if (inFile == -1)
    {
        perror("Failed to open file");
        return 1;
    } // End if

    // Query the open descriptor rather than the path name
    status = fstat(inFile, &statusBuffer);
    if (status == -1)
    {
        perror("Failed to get file status");
        close(inFile);
        return 1;
    } // End if

    // off_t may be wider than int, so cast to long long for printing
    printf("File size    : %lld\n", (long long) statusBuffer.st_size);
    // The string returned by ctime() already ends with a newline
    printf("Last accessed: %s", ctime(&statusBuffer.st_atime));
    close(inFile);
    return 0;
} // End main
Determining the File Type
• The st_mode field in the struct stat structure specifies the access
  permissions of the file and the type of the file
• The slides for Chapter 4 UNIX I/O cover the symbolic names (e.g., S_IRUSR)
  for the access permissions
• The next slide specifies the macros for testing the st_mode field for the type
  of the file
• A regular file is a randomly accessible sequence of bytes with no further
  structure imposed by the system
  – UNIX stores data and programs as regular files
• Directories are files that associate file names with locations
• Special files specify peripheral devices
  – Character special files represent devices such as terminals
  – Block special files represent disk devices
• The S_ISFIFO macro tests for pipes and FIFOs that are used for inter-process
  communication
Macros for Testing the File Type
    S_ISBLK(mode)     Block special file
    S_ISCHR(mode)     Character special file
    S_ISDIR(mode)     Directory
    S_ISFIFO(mode)    Pipe or FIFO special file
    S_ISLNK(mode)     Symbolic link
    S_ISREG(mode)     Regular file
    S_ISSOCK(mode)    Socket
mode is of type mode_t
Each macro returns a nonzero value if the test is true and zero otherwise
Example use of Macros
#include <stdio.h>
#include <sys/stat.h>

// *******************************
int main(int argc, char *argv[])
{
    struct stat statusBuffer;
    int status;

    if (argc != 2)
    {
        fprintf(stderr, "Usage: a.out entry_name\n");
        return 1;
    } // End if

    status = stat(argv[1], &statusBuffer);
    if (status == -1)
    {
        perror("Failed to get entry status");
        return 1;
    } // End if

    if (S_ISDIR(statusBuffer.st_mode))
        printf("\n%s is a directory\n", argv[1]);
    else
        printf("\n%s is not a directory\n", argv[1]);
    return 0;
} // End main
lstat() Function
• The lstat() function accesses a file by name and retrieves status information
  about the file, without following a symbolic link named by path
    #include <sys/stat.h>
    int lstat(const char *path, struct stat *buffer);
• The path parameter contains the directory path and name of the file
• The buffer parameter points to a user-supplied buffer into which the function
  stores the status information
• If successful, the function returns zero; otherwise, it returns -1 and sets errno
• If path does not correspond to a symbolic link, then the stat() and lstat()
  functions both return the same results
• When path is a symbolic link, the lstat() function returns information about
  the link itself whereas the stat() function returns information about the file
  referred to by the link
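The difference between stat() and lstat() is easiest to see when testing for a symbolic link, because only lstat() can ever report one. A minimal sketch follows; the helper name is_symbolic_link is hypothetical.

```c
#include <sys/stat.h>

/* Hypothetical helper: return 1 if path names a symbolic link, 0 if it
 * names something else, and -1 if lstat() fails (errno is set). */
int is_symbolic_link(const char *path)
{
    struct stat statusBuffer;

    /* lstat() examines the link itself; stat() would follow the link,
     * so S_ISLNK would never be true on its result. */
    if (lstat(path, &statusBuffer) == -1)
        return -1;
    return S_ISLNK(statusBuffer.st_mode) ? 1 : 0;
}
```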
5.3 UNIX File System Implementation
UNIX File System
• Disk formatting divides a physical disk into regions called partitions
• Each partition can have its own file system associated with it
• A particular file system can be mounted at any node in the tree of another file
  system
• The topmost node in a file system is called the root (or the root directory) of
  the file system
• In UNIX, a single slash ( / ) is used to denote the root directory
• The next slide shows the subdirectory layout in the top level of a typical UNIX
  file system
  – /dev holds specifications for the devices (i.e., special files) on the system
  – /etc holds files containing information regarding the network, accounts, and other
    databases that are specific to the computer
  – /home is the default directory containing the subdirectories for user accounts
  – /opt is a standard location for applications
  – /usr contains files shared among applications (e.g., /usr/include contains
    include files)
  – /var contains system files that vary and can grow arbitrarily large (e.g., log files,
    incoming mail)
Typical UNIX Directory Structure
Root Directory on uxb2
lrwxrwxrwx   1 root  root        9 Jun  2  2004 bin -> ./usr/bin
drwxr-xr-x   4 root  root      512 Jun  2  2004 cache
drwxr-xr-x   2 root  other     512 Jun 18  2004 cdrom
drwxr-xr-x  14 root  sys      4096 Apr 29 09:33 dev
drwxr-xr-x   7 root  sys       512 Jun  2  2004 devices
drwx------   4 root  root      512 Mar 17  2005 dns
drwxr-xr-x  60 root  sys      3584 May 20 17:02 etc
drwxr-xr-x   2 root  sys       512 Jun  2  2004 export
drwxr-xr-x   2 root  other     512 Mar 11  2005 ftp
dr-xr-xr-x   3 root  root        3 Jul 12 18:56 home
drwxr-xr-x  10 root  sys       512 Apr  5  2005 kernel
lrwxrwxrwx   1 root  root        9 Jun  2  2004 lib -> ./usr/lib
drwx------   2 root  root     8192 Jun  2  2004 lost+found
drwxr-xr-x   4 root  bin       512 Jun  2  2004 mail
drwxr-xr-x   8 root  root     1024 Mar 17  2005 mnt
drwxr-xr-x  10 root  sys       512 Jun 30  2004 opt
drwx------   2 root  root      512 Jun  2  2004 patch
drwxr-xr-x  43 root  sys      1536 Jun  2  2004 platform
dr-xr-xr-x  71 root  root   480032 Jul 12 19:29 proc
drwx------   2 root  root      512 Jun  2  2004 radmin
drwxr-xr-x   2 root  sys      1024 Apr  5  2005 sbin
drwxr-xr-x   2 root  other     512 Jun 21  2004 src
drwxrwxrwt  21 root  sys      7680 Jul 12 19:29 tmp
drwxr-xr-x  36 root  sys      1024 Jun 18  2004 usr
drwxr-xr-x  37 root  sys      1024 Mar 11  2005 var
dr-xr-xr-x   6 root  root      512 May 20 17:02 vol
UNIX Inode
• Traditionally, UNIX files have been implemented with a modified tree structure
• Directory entries contain a file name and a reference to a fixed-length structure
  called an inode (see the figure on the next slide)
• The inode contains information about the file size, the file location, the owner of
  the file, the time of last access, the time of last data modification, the time of
  last file status change, permissions, etc.
• The inode also contains pointers to the first few data blocks of a file
  – If the file is large, the indirect pointer points to a block of pointers that point to
    additional blocks
  – If the file is still larger, the double indirect pointer is a pointer to a block of indirect
    pointers
  – If the file is huge, the triple indirect pointer contains a pointer to a block of double
    indirect pointers
• A block is the smallest unit of storage allocated for a file system and is always a
  quantity of bytes equal to some power of 2
• When a system administrator creates a file system on a physical disk partition,
  the raw bytes are organized into data blocks and inodes
  – Each partition has its own pool of inodes that are uniquely numbered
  – Files created on that partition use inodes from that partition's pool
  – The relative layout of the disk blocks and inodes has been optimized for performance
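The direct, indirect, double indirect, and triple indirect pointers put an upper bound on file size that is simple to compute. The sketch below works through that arithmetic. The function name max_file_size is hypothetical, and the sample values in the usage note (8192-byte blocks, 4-byte block pointers, 12 direct pointers) are illustrative assumptions, not figures from the slides; actual values depend on the file system.

```c
#include <stdint.h>

/* Hypothetical helper: largest file size, in bytes, that an inode can
 * address given the block size, the size of one block pointer, and the
 * number of direct pointers held in the inode itself. */
uint64_t max_file_size(uint64_t blockSize, uint64_t pointerSize,
                       uint64_t directPointers)
{
    /* Number of block pointers that fit in one block */
    uint64_t perBlock = blockSize / pointerSize;

    uint64_t blocks = directPointers                   /* direct          */
                    + perBlock                         /* single indirect */
                    + perBlock * perBlock              /* double indirect */
                    + perBlock * perBlock * perBlock;  /* triple indirect */

    return blocks * blockSize;
}
```

With the assumed values, each block holds 2048 pointers, so max_file_size(8192, 4, 12) addresses 12 + 2048 + 2048^2 + 2048^3 blocks, which is 70,403,120,791,552 bytes (a little over 64 TiB). Nearly all of that capacity comes from the triple indirect pointer.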
UNIX Inode Structure
[Figure: layout of a UNIX inode. Figure note: the inode's time fields are the time of last
access, the time of last data modification, and the time of last file status change]
Directory Implementation
• A directory is a file containing a correspondence between file names and file
  locations
  – UNIX has traditionally implemented a file location as an inode number
• The inode itself does not contain the file name
  – When a program references a file by path name, the operating system traverses the file
    system tree to find the file name and inode number in the appropriate directory
  – Once it has obtained the inode number from the directory, the operating system can
    determine other information about the file by accessing the inode
• Advantages of this implementation
  – Changing the file name requires only changing the directory entry
  – A file can be moved from one directory to another just by moving the directory entry,
    as long as the partition doesn't change
  – Only one physical copy of the file needs to exist on the disk; however, the file may
    have several names or the same name in different directories (all on the same
    partition)
  – Directory entries are of variable length because the file name is of variable length;
    manipulating small variable-length structures can be done efficiently
  – Directory entries are small, since most of the information about each file is kept in its
    inode
5.4a Links in Directories
Hard and Soft Links
• UNIX directories contain two types of links: links and symbolic links
• A link, often called a hard link, is a directory entry that associates a file name with a
  file location
• A symbolic link, sometimes called a soft link, is a file that stores a string used to
  modify the path name when it is encountered during path name resolution
• The behavioral differences between hard and soft links are often not readily
  obvious in practice
• A directory entry corresponds to a single link, but an inode may be the target of several
  of these links
• Each inode contains the count of the number of links to the inode (i.e., the total number
  of directory entries that contain the inode number)
• When a program uses open() to create a file, the operating system makes a new
  directory entry and assigns a free inode to represent the newly created file
• The figure on the next slide shows a directory entry for a file called name1 in the
  /dirA directory
Directory Entry, Inode, and Data Block
Creating and Removing a Link
• A user can create additional links to a file by using the ln shell command or
  by calling the link() function from a program
• The creation of the new link allocates a new directory entry and increments the
  link count of the corresponding inode; the link uses no additional disk space
• A user can delete a file by using the rm shell command or by calling the
  unlink() function from a program
• When either approach is used, the operating system deletes the corresponding
  directory entry and decrements the link count in the inode
• It does not free the inode and the corresponding data blocks unless the
  operation causes the link count to be decremented to zero
Directory Entry, Inode, and Data Block
link() and unlink() Functions
• The link() function creates a new directory entry, named by newPath, for the
  existing file specified by currentPath
    #include <unistd.h>
    int link(const char *currentPath, const char *newPath);
  – If successful, the function returns zero; otherwise, it returns -1 and sets errno
• The unlink() function removes the directory entry specified by path
    #include <unistd.h>
    int unlink(const char *path);
  – If the removal drops the file's link count to zero and no process has the file open,
    the function frees the space occupied by the file (i.e., it deletes the file)
  – If successful, the function returns zero; otherwise, it returns -1 and sets errno
Example use of link() Function and ln Command
• The figure on the next slide shows the result of creating an entry called name2 in
  /dirB for the existing name1 entry in /dirA
• This can be done using the ln command as shown in the following command line
    ln /dirA/name1 /dirB/name2
• This can also be done using the link() function as shown in the following code
  segment
    #include <stdio.h>
    #include <unistd.h>
    // . . .
    int status;
    status = link("/dirA/name1", "/dirB/name2");
    if (status == -1)
        perror("Failed to make a new link in /dirB");
Two Links to the Same File
[Figure: two directory entries, labeled Current Path and New Path, that refer to the same
inode]
Example use of link() and unlink()
#include <stdio.h>
#include <string.h>    // For strcmp()
#include <unistd.h>

// *******************************
int main(int argc, char *argv[])
{
    int status;

    // "L" makes a new link: a.out L current_path new_path
    if ( (argc == 4) && strcmp(argv[1], "L") == 0)
    {
        status = link(argv[2], argv[3]);
        if (status == -1)
            perror("Failed to link file");
        else
            printf("Successfully linked %s to %s\n", argv[2], argv[3]);
    }
    // "U" removes a link: a.out U current_path
    else if ( (argc == 3) && strcmp(argv[1], "U") == 0)
    {
        status = unlink(argv[2]);
        if (status == -1)
            perror("Failed to unlink file");
        else
            printf("Successfully unlinked %s\n", argv[2]);
    }
    else
    {
        fprintf(stderr, "Usage: a.out L current_path new_path\n");
        fprintf(stderr, "       a.out U current_path\n");
        return 1;
    } // End if
    return 0;
} // End main
5.4b Symbolic Links in Directories
Creating and Removing a Symbolic Link
• A symbolic link is a file containing the name of another file or directory
• A reference to the name of a symbolic link causes the operating system to
  locate the inode corresponding to that link
• The operating system assumes that the data blocks of the corresponding inode
  contain another path name
• The operating system then locates the directory entry for that path name and
  continues to follow the chain until it finally encounters a hard link and a real
  file
• A user can create a symbolic link by using the ln shell command with the -s
  option or by calling the symlink() function from a program
• The symbolic link may be fully qualified or it may be relative to its own
  directory
• Unlike the single partition limitation placed on a hard link, a symbolic link
  allows the link to be created between two partitions
symlink() Function
• The symlink() function creates a symbolic link in a directory
• path1 contains the string that will be the contents of the link and path2
  contains the path name of the link
• In other words, path2 is the newly created link and path1 is what the new link
  points to
    #include <unistd.h>
    int symlink(const char *path1, const char *path2);
• If successful, the function returns zero; otherwise, it returns -1 and sets errno
• Unlike a hard link, a symbolic link uses a new inode
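The string stored in a symbolic link (path1 above) can be read back with the POSIX readlink() function, which the slides do not cover. The sketch below wraps it in a hypothetical helper named read_link_target; the main detail it illustrates is that readlink() does not add a string terminator.

```c
#include <unistd.h>

/* Hypothetical helper: copy the contents of the symbolic link linkPath
 * into buffer as a null-terminated string.
 * Returns 0 on success and -1 on failure (for example, if linkPath is
 * not a symbolic link). Contents longer than size - 1 bytes are
 * silently truncated. */
int read_link_target(const char *linkPath, char *buffer, size_t size)
{
    ssize_t length;

    if (size == 0)
        return -1;

    /* Leave room for the terminator that readlink() does not write */
    length = readlink(linkPath, buffer, size - 1);
    if (length == -1)
        return -1;

    buffer[length] = '\0';
    return 0;
}
```

Since the link only stores a string, it can be created and read even when the path it names does not exist; the error appears later, when a program tries to open the file through the link.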
Example use of symlink() Function and ln Command
• The figure on the next slide shows the result of creating a symbolic link called
  name2 in /dirB for the existing name1 entry in /dirA
• This can be done using the ln command as shown in the following command line
    ln -s /dirA/name1 /dirB/name2
• This can also be done using the symlink() function as shown in the following
  code segment
    #include <stdio.h>
    #include <unistd.h>
    // . . .
    int status;
    status = symlink("/dirA/name1", "/dirB/name2");
    if (status == -1)
        perror("Failed to create a symbolic link in /dirB");
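Ordinary File with a Symbolic Link to it
[Figure: a directory entry labeled Current Path1 for the ordinary file and a directory
entry labeled New Path2 for the symbolic link that refers to it]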
