Outline
• File Management
– Structured files
– Low-level file implementations
Operating System Components
5/29/2016
COP4610
Why Programmers Need Files
[Figure: an HTML editor and a web browser both access the file foo.html (<head> … </head> <body> … </body>) through the file manager]
• Persistent storage
• Shared device
• Structured information
• Can be read by any application
• Accessibility
• Protocol
File system context
Fig 13-2: The External View of the File Manager
[Figure: an application program calls the file manager, which sits beside the memory, process, and device managers above the hardware. UNIX calls: mount(), open(), close(), read(), write(), lseek(). Windows calls: CreateFile(), CloseHandle(), ReadFile(), WriteFile(), SetFilePointer().]
Levels in a file system
Information Structure
Logical structures in a file
Low-Level Files
File systems
• File system
– A data structure on a disk that holds files
• actually a file system is in a disk partition
• a technical term different from a “file system” as the
part of the OS that implements files
• File systems in different OSs have different
internal structures
A file system layout
File system descriptor
• The data structure that defines the file system
• Typical fields
– size of the file system (in blocks)
– size of the file descriptor area
– first block in the free block list
– location of the file descriptor of the root directory of the file system
– times the file system was created, last modified, and last used
File system layout variations
• MS-DOS uses a FAT (file allocation table) file system
– so does the Macintosh OS (although the MacOS layout is different)
• Newer UNIX file systems use cylinder groups (mini-file systems) to achieve better locality of file data
Locating file data
• The logical file is divided into logical blocks
• Each logical block is mapped to a physical
disk block
• The file descriptor contains data on how to
perform this mapping
– there are many methods for performing this
mapping
– we will look at several of them
Dividing a file into blocks
Contiguous Allocation
• Each file occupies a set of contiguous blocks on the disk
– Simple: only starting location and length are required
– Random access
– Wasteful of space (dynamic storage-allocation problem)
– Files cannot grow
• Mapping from logical to physical: divide the logical address LA by the block size (512 bytes)
– Q = quotient of LA/512, R = remainder of LA/512
– Block to be accessed = Q + starting address
– Displacement into block = R
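The LA/512 mapping above can be sketched in a few lines. This is only an illustration: the function name `contiguous_map` and its parameters are hypothetical, and the bounds check assumes the file length is recorded in blocks.

```python
from typing import Tuple

BLOCK_SIZE = 512  # bytes per block, matching the LA/512 mapping


def contiguous_map(logical_address: int, start_block: int,
                   length_blocks: int) -> Tuple[int, int]:
    """Map a byte offset in a contiguous file to (physical block, displacement)."""
    if logical_address < 0:
        raise ValueError("negative logical address")
    # Q is the logical block number, R the displacement inside that block.
    q, r = divmod(logical_address, BLOCK_SIZE)
    if q >= length_blocks:
        raise ValueError("address lies beyond the end of the file")
    return start_block + q, r
```

For a file that starts at block 100, logical address 1300 gives Q = 2 and R = 276, so `contiguous_map(1300, 100, 4)` returns `(102, 276)`.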
A contiguous file
A contiguous file – cont.
Keeping a file in pieces
• We need a block pointer for each logical
block, an array of block pointers
– block mapping indexes into this array
– Each file is a linked list of disk blocks
• But where do we keep this array?
– usually it is not kept as a contiguous array
– the array of disk pointers is like a second, related file (that is 1/1024 as big)
Block pointers in the file descriptor
Block pointers in contiguous disk blocks
Block pointers in the blocks
Block pointers in the blocks – cont.
Block pointers in an index block
Block pointers in an index block – cont.
Chained index blocks
Two-level index blocks
Two-level index blocks – cont.

[Figure: the primary index points to secondary index tables, whose entries point to the data blocks]
The UNIX hybrid method
The UNIX hybrid method – cont.
Inverted disk block index (FAT)
DOS FAT Files
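[Figure: two ways to chain the disk blocks of one file. In the first, the file descriptor points to disk block 43 and each disk block stores the number of the next block (blocks 43, 254, and 107 are shown). In the second, the file descriptor again points to block 43, but the links are kept in the File Allocation Table (FAT) instead of in the disk blocks.]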
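A minimal sketch of following a FAT chain, assuming the blocks in the figure are linked in the order 43, 254, 107. The end-of-chain marker `END_OF_FILE`, the dictionary representation of the table, and the function name `fat_chain` are all hypothetical choices made for illustration; a real FAT is an on-disk array with reserved marker values.

```python
END_OF_FILE = -1  # hypothetical end-of-chain marker


def fat_chain(fat, first_block):
    """Follow FAT links starting from the block named in the file descriptor."""
    blocks = []
    seen = set()
    block = first_block
    while block != END_OF_FILE:
        if block in seen:
            raise ValueError("cycle in FAT chain")  # guard against a corrupt table
        seen.add(block)
        blocks.append(block)
        block = fat[block]  # the FAT entry for a block names the next block
    return blocks


# Assumed reading of the figure: 43 -> 254 -> 107 -> end of file.
example_fat = {43: 254, 254: 107, 107: END_OF_FILE}
```

With this table, `fat_chain(example_fat, 43)` yields `[43, 254, 107]`. Because the links live in the table rather than in the data blocks, the chain can be walked without reading any data block.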
Free-Space Management
• Bit vector (n blocks, numbered 0, 1, 2, …, n-1)
– bit[i] = 1 ⇒ block[i] free
– bit[i] = 0 ⇒ block[i] occupied
• First free block number =
(number of bits per word) * (number of 0-value words) + offset of first 1 bit
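The first-free-block formula can be sketched as follows. The 32-bit word size and the convention that bit 0 of the vector is the most significant bit of the first word are assumptions made for this example, as is the function name.

```python
BITS_PER_WORD = 32  # assumed word size


def first_free_block(words):
    """Return the number of the first free block (bit = 1), or None if none is free.

    `words` is the bit vector as a list of unsigned 32-bit integers; bit 0 of
    the vector is taken to be the most significant bit of words[0].
    """
    for zero_words, word in enumerate(words):
        if word != 0:
            # Offset of the first 1 bit, counted from the most significant bit.
            offset = BITS_PER_WORD - word.bit_length()
            return BITS_PER_WORD * zero_words + offset
    return None
```

Skipping whole zero-valued words is what makes the scan cheap: only the first nonzero word is examined bit by bit.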
Free-Space Management - cont.
• Bit map requires extra space. Example:
– block size = 2^12 bytes
– disk size = 2^30 bytes (1 gigabyte)
– n = 2^30 / 2^12 = 2^18 bits (or 32K bytes)
• Easy to get contiguous files
• Linked list (free list)
– Cannot get contiguous space easily
– No waste of space
Free list organization
Free-Space Management - cont.
• Need to protect:
– Pointer to free list
– Bit map
• Must be kept on disk
• Copy in memory and disk may differ.
• Cannot allow block[i] to be in a state where bit[i] = 0 in memory and bit[i] = 1 on disk.
– Solution:
• Set bit[i] = 0 on disk.
• Allocate block[i].
• Set bit[i] = 0 in memory.
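The ordering of the three steps can be illustrated with a toy model. Everything here is hypothetical: two Python lists stand in for the on-disk and in-memory bit maps, and the `crash_after_step` parameter only mimics a failure partway through allocation.

```python
class FreeSpaceMap:
    """Toy model of a bit map (1 = free, 0 = occupied) with disk and memory copies."""

    def __init__(self, n_blocks):
        self.disk = [1] * n_blocks    # stand-in for the on-disk bit map
        self.memory = [1] * n_blocks  # cached copy in primary memory

    def allocate(self, i, crash_after_step=None):
        """Allocate block i in the order listed above; stop early to mimic a crash."""
        if self.memory[i] == 0:
            raise ValueError("block already allocated")
        self.disk[i] = 0              # step 1: mark occupied on disk
        if crash_after_step == 1:
            return False
        # step 2: the block would be handed to the file here
        if crash_after_step == 2:
            return False
        self.memory[i] = 0            # step 3: mark occupied in memory
        return True

    def forbidden_state(self, i):
        """True if block i is occupied in memory but still marked free on disk."""
        return self.memory[i] == 0 and self.disk[i] == 1
```

Because the disk copy is updated first, an interruption at any step leaves, at worst, a block marked occupied on disk that is not in use, never a block in use that the disk still lists as free.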
Implementing Low Level Files
• Secondary storage device contains:
– Volume directory (sometimes a root directory
for a file system)
– External file descriptor for each file
– The file contents
• Manages blocks
– Assigns blocks to files (descriptor keeps track)
– Keeps track of available blocks
• Maps to/from byte stream
Disk Organization
[Figure: the boot sector and volume directory come first on the disk. Blocks Blk 0, Blk 1, …, Blk k-1 lie on track 0, cylinder 0; blocks Blk k, Blk k+1, …, Blk 2k-1 lie on track 0, cylinder 1; the layout continues through track 1, cylinder 0 up to track N-1, cylinder M-1.]
Low-level File System Architecture
[Figure: blocks (Block 0; b0, b1, b2, b3, …, bn-1) laid out on a sequential device and on a randomly accessed device]
File Descriptors
• External name
• Current state
• Sharable
• Owner
• User
• Locks
• Protection settings
• Length
• Time of creation
• Time of last modification
• Time of last access
• Reference count
• Storage device details
An open() Operation
• Locate the on-device (external) file descriptor
• Extract info needed to read/write file
• Authenticate that process can access the file
• Create an internal file descriptor in primary memory
• Create an entry in a “per process” open file
status table
• Allocate resources, e.g., buffers, to support
file usage
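The steps of an open() operation can be sketched as below. This is a simplified model, not how any particular OS implements it: the classes, their fields, the dictionary used as a volume directory, the owner/readers access check, and the 512-byte buffer are all assumptions introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ExternalDescriptor:
    """Hypothetical on-device file descriptor."""
    name: str
    owner: str
    length: int
    readers: set


@dataclass
class OpenFileEntry:
    """Internal descriptor plus per-session state and resources."""
    descriptor: ExternalDescriptor
    position: int = 0
    buffer: bytearray = field(default_factory=lambda: bytearray(512))


class Process:
    def __init__(self, user):
        self.user = user
        self.open_files = {}  # the "per process" open file status table
        self.next_fid = 3     # assumes 0-2 are stdin, stdout, stderr


def open_file(process, volume, name):
    ext = volume[name]  # locate the external descriptor (KeyError if absent)
    # Authenticate that the process may access the file.
    if process.user != ext.owner and process.user not in ext.readers:
        raise PermissionError(name)
    # Create the internal descriptor and allocate its buffer.
    entry = OpenFileEntry(descriptor=ext)
    # Record the session in the per-process table and return a reference to it.
    fid = process.next_fid
    process.open_files[fid] = entry
    process.next_fid += 1
    return fid
```

The returned integer plays the role of the file identifier that later read and write calls pass back to the file manager.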
File Manager Data Structures
[Figure: opening a file. (1) Copy info from the external file descriptor to the open file descriptor. (2) Keep the state of the process-file session. (3) Return a reference to the data structure.]
Opening a UNIX File
fid = open("fileA", flags);
…
read(fid, buffer, len);
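[Figure: the returned fid indexes the per-process descriptor table (0 = stdin, 1 = stdout, 2 = stderr, 3, ...), whose entry refers to a file structure in the open file table, which in turn refers to the inode (internal file descriptor) copied from the on-device file descriptor]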
Reading and Writing the Byte Stream
• Two stages
– Reading bytes into or writing bytes out of the
memory copy of the block
– Reading the physical blocks into or writing them
out of memory from/to storage devices
– Unpacking (unmarshalling) converts secondary storage blocks into a byte stream
– Packing (marshalling) converts a byte stream into blocks
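The two conversions between blocks and a byte stream can be sketched as a pair of functions. The names `stream_to_blocks` and `blocks_to_stream`, the 512-byte block size, and the zero padding of the last block are assumptions for this example; the file length that trims the padding would come from the file descriptor.

```python
BLOCK_SIZE = 512  # assumed block size in bytes


def stream_to_blocks(stream, block_size=BLOCK_SIZE):
    """Split a byte stream into fixed-size blocks, zero-padding the last one."""
    blocks = []
    for start in range(0, len(stream), block_size):
        chunk = stream[start:start + block_size]
        blocks.append(chunk.ljust(block_size, b"\x00"))
    return blocks


def blocks_to_stream(blocks, length):
    """Rebuild the byte stream from blocks; `length` trims the padding."""
    return b"".join(blocks)[:length]
```

A 1280-byte stream becomes three 512-byte blocks, and converting those blocks back with the recorded length of 1280 reproduces the original bytes.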
Marshalling the Byte Stream
• Must read at least one buffer ahead on input
• Must write at least one buffer behind on
output
• Seek ⇒ flushing the current buffer and finding the correct one to load into memory
• Inserting/deleting bytes in the interior of the
stream
Full Block Buffering
• Storage devices use block I/O
• Files place an explicit order on the bytes
• Therefore, it is possible to predict what is likely to be
read after a byte
• When file is opened, manager reads as many blocks
ahead as feasible
• After a block is logically written, it is queued for
writing behind, whenever the disk is available
• Buffer pool – usually variably sized, depending on
virtual memory needs
– Interaction with the device manager and memory manager
Supporting Other Storage Abstractions
• Low-level file systems avoid encoding
record-level functionality
– If applications use very large or very small
records, a generic file manager may not be
efficient
– Some operating systems provide a higher-layer
file system to support applications with large or
small files
– Database management systems and multimedia
documents are examples
Structured Files
Record-Oriented Sequential Files
Electronic Mail Example
Indexed Sequential Files
Database Management Systems
• A database is a very highly structured set of
information
– Stored across different files
– Optimized to minimize access time
• DBMS implementation
– Some DBMSs use the normal files provided by
the OS for generic use
– Some use their own storage device block
Disk compaction
Memory-mapped Files
• A file’s contents are mapped directly into the
virtual address space
– Files can be read from or written to by
referencing the corresponding virtual addresses
• Memory-mapped files are very useful when a
file is shared or accessed repeatedly
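As a small illustration of reading and writing a file through memory references, the sketch below uses Python's standard `mmap` module. The helper names and the temporary file are hypothetical and exist only to make the example self-contained.

```python
import mmap
import os
import tempfile


def uppercase_first_word_in_place(path):
    """Modify a file by assigning to its memory mapping instead of calling write()."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as mapped:  # length 0 maps the whole file
            end = mapped.find(b" ")
            if end == -1:
                end = len(mapped)
            # Slice assignment on the mapping changes the file contents.
            mapped[:end] = mapped[:end].upper()
            mapped.flush()


def demo():
    """Create a scratch file, edit it through the mapping, and return its bytes."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"hello mapped file")
        uppercase_first_word_in_place(path)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)
```

No explicit read or write call touches the data in `uppercase_first_word_in_place`; the update happens through ordinary indexing of the mapped region.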
Memory-mapped Files – cont.
Directories
• A directory is a set of logically associated
files and other directories of files
– Directories are the mechanism we use to organize
files
• The file manager provides a set of commands
to manage directories
– Traverse a directory
– Enumerate a list of all files and nested directories
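The two directory commands named above can be sketched with Python's standard `os.scandir`. The function names are hypothetical, and symbolic links are deliberately not followed so that the traversal cannot loop.

```python
import os


def enumerate_directory(path):
    """Return (files, subdirectories) found directly in `path`, sorted by name."""
    files, subdirs = [], []
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                subdirs.append(entry.name)
            else:
                files.append(entry.name)
    return sorted(files), sorted(subdirs)


def traverse(path):
    """Yield the path of every file below `path`, descending into nested directories."""
    files, subdirs = enumerate_directory(path)
    for name in files:
        yield os.path.join(path, name)
    for name in subdirs:
        yield from traverse(os.path.join(path, name))
```

`enumerate_directory` lists one level only, while `traverse` applies it recursively to walk the whole subtree.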
Directory Structures
• How should files be organized within a directory?
– Flat name space
• All files appear in a single directory
– Hierarchical name space
• Directory contains files and subdirectories
• Each file/directory appears as an entry in exactly
one other directory -- a tree
• Popular variant: All directories form a tree, but a
file can have multiple parents.
Directory Structures
Directory Structures – cont.
A directory tree
Directory Implementation
• Device Directory
– A device can contain a collection of files
– Easier to manage if there is a root for every file
on the device -- the device root directory
• File Directory
– Typical implementations have directories
implemented as a file with a special format
– Entries in a file directory are handles for other
files (which can be files or subdirectories)
Directory Implementation
• Linear list of file names with pointer to the
data blocks.
– simple to program
– time-consuming to execute
• Hash Table – linear list with hash data
structure.
– decreases directory search time
– collisions – situations where two file names hash
to the same location
– fixed size
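The trade-off between the two directory layouts can be sketched as follows. Both classes are illustrative only: the byte-sum hash function, the bucket count of 8, and the use of chaining to handle collisions are assumptions, not details from the slides.

```python
class LinearDirectory:
    """Directory as a linear list of (file name, first data block) entries."""

    def __init__(self):
        self.entries = []

    def add(self, name, first_block):
        self.entries.append((name, first_block))

    def lookup(self, name):
        for entry_name, first_block in self.entries:  # scans every entry: O(n)
            if entry_name == name:
                return first_block
        return None


class HashedDirectory:
    """Directory with a fixed-size hash table; colliding names share a bucket."""

    def __init__(self, n_buckets=8):
        self.buckets = [[] for _ in range(n_buckets)]  # the size is fixed up front

    def _bucket(self, name):
        # Toy hash: sum of the name's bytes modulo the table size.
        return self.buckets[sum(name.encode()) % len(self.buckets)]

    def add(self, name, first_block):
        self._bucket(name).append((name, first_block))

    def lookup(self, name):
        for entry_name, first_block in self._bucket(name):  # scans one bucket only
            if entry_name == name:
                return first_block
        return None
```

With this toy hash the names "ab" and "ba" collide, so both land in the same bucket and the lookup falls back to a short linear scan within it.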
Mounting file systems
• Each file system has a root directory
• We can combine file systems by mounting
– that is, link a directory in one file system to the
root directory of another file system
• This allows us to build a single tree out of
several file systems
• This can also be done across a network,
mounting file systems on other machines
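One way to picture mounting is as a table that maps mount points to file systems, consulted during path lookup. The sketch below is a simplified model under that assumption; the class `MountTable`, its methods, and the file system names are hypothetical, and real kernels record mounts in their in-memory directory structures rather than by comparing strings.

```python
def _parts(path):
    """Split an absolute path into its non-empty components."""
    return [p for p in path.split("/") if p]


class MountTable:
    def __init__(self, root_fs):
        self.mounts = {(): root_fs}  # the empty tuple stands for "/"

    def mount(self, fs, mount_point):
        """Link the directory `mount_point` to the root directory of `fs`."""
        self.mounts[tuple(_parts(mount_point))] = fs

    def resolve(self, path):
        """Return (file system, path inside that file system) for `path`."""
        parts = tuple(_parts(path))
        # The longest matching mount point wins, so nested mounts behave as expected.
        for n in range(len(parts), -1, -1):
            fs = self.mounts.get(parts[:n])
            if fs is not None:
                return fs, "/" + "/".join(parts[n:])
```

After `mount("FS", "/foo")`, the path `/foo/abc` resolves to `/abc` inside FS, while `/bin/ls` still resolves inside the root file system, which is how several file systems appear as a single tree.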
UNIX mount Command
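[Figure: before and after "mount FS at foo". Before: one tree rooted at / containing bin, usr, etc, bill, nutt, and foo, and a separate file system FS whose root contains abc, cde, xyz, and blah. After: the root of FS is attached at foo, so abc, cde, xyz, and blah appear beneath foo in a single tree.]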
Mounting a file system
VFS-based File Manager
[Figure: the file system independent part of the file manager exports the OS-specific API and sits above the virtual file system switch, which selects among the file system specific parts of the file manager: MS-DOS, ISO 9660, …, ext2]
NFS Architecture
Summary of File Storage Methods
• Contiguous files
– Interleaved files
• File pointers in the file descriptor
• Contiguous file pointers
• Chained data blocks
– Chained single index blocks
– Double index blocks
– Triple index blocks
– Hybrid solutions