Chapter 12 File Management

• Overview
• File organisation and Access
• File Directories
• File Sharing
• Record Blocking
Files
• Files are the central element of most
applications
– file as an input to applications
– file as an output for long-term storage and for
later access
• Desirable properties of files:
– Long-term existence
– Controlled sharing between processes
– Structure that is convenient for particular
applications
File Structure
Fields and Records
• Fields
– Basic element of data
• e.g., student’s last name
– Contains a single value
– Characterized by its length and data type
• Records
– Collection of related fields
• e.g., a student record
– Treated as a unit
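As a minimal sketch of these definitions (the `StudentRecord` type and its field names are hypothetical, chosen only for illustration), a record can be modeled as a group of typed fields that is handled as one unit:

```python
from dataclasses import dataclass, fields


@dataclass
class StudentRecord:
    # Each attribute is a field: a single value with a data type.
    last_name: str
    student_id: int
    gpa: float


# A record is a collection of related fields treated as a unit.
record = StudentRecord(last_name="Nguyen", student_id=1042, gpa=3.7)
field_names = [f.name for f in fields(record)]
print(field_names)
```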
File Structure
File and Database
• File
– Collection of similar records
– Treated as a single entity and may be
referenced by name
– Access control restrictions usually apply at the
file level
• Database
– Collection of related data
– Explicit relationships exist among elements
– Consists of one or more files
A Big Picture
• How to organize records in a file and access a particular
record in a file?
• How to organize records as a sequence of blocks for I/O?
• Individual block I/O requests must be scheduled for
optimizing performance
• How to identify and locate a selected file?
• How to enforce user access control in shared systems?
File Organization
• The basic operations that a user or
application may perform on a file are
performed at the record level
– The file is viewed as having some structure
that organizes the records
• File organization refers to the logical
structuring of records
– Determined by the way in which files are
accessed (access method)
Criteria for
File Organization
• Important criteria include:
– Short access time
– Ease of update
– Economy of storage
– Simple maintenance
– Reliability
Criteria for
File Organization
• Priority will differ depending on the use
– For batch mode file processing, rapid access
for retrieval of a single record is of minimal
concern
• These criteria may conflict
– Indexes are a primary means of increasing
the speed of access to data, but they conflict
with economy of storage
The Pile
• Data are collected in the
order they arrive
– No structure
• Purpose is to accumulate a
mass of data and save it
• Records may have different
fields
– field should be self-describing (field
name + value)
– field length should be known (delimiters,
subfield or default for a field type)
The Pile
• Record access is by exhaustive search
• Used when data are collected and stored
prior to processing or data are not easy to
organize
• Uses space well when data vary in size
and structure
• Adequate for exhaustive searches
• Easy to update
• Unsuitable for most applications
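The pile can be sketched as follows. This is an illustration under assumptions, not a prescribed layout: records are modeled as Python dictionaries so that each field is self-describing (name plus value), and the helper names `append_record` and `search` are hypothetical.

```python
# A pile: records are appended in arrival order. Each field carries its own
# name, so records need not share the same set of fields.
pile = []


def append_record(record: dict) -> None:
    # Updating is easy: new data simply goes at the end.
    pile.append(record)


def search(field_name: str, value) -> list:
    # Exhaustive search: every record must be examined.
    return [r for r in pile if r.get(field_name) == value]


append_record({"name": "Ada", "dept": "Math"})
append_record({"name": "Lin", "phone": "555-0100"})
append_record({"dept": "Math", "room": 12})
matches = search("dept", "Math")
```

The search cost grows with the size of the pile, which is why this organization suits accumulation rather than frequent lookups.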
The Sequential File
• Fixed format used for records
• Records are of the same length
– same number of fixed-length fields
in a particular order
• Only the values of fields need to
be stored
• Field name and length are
attributes of the file structure
The Sequential File
• Key field
– Uniquely identifies the record
– Records are stored in key sequence
• Optimal for batch applications if they involve
the processing of all the records
• Easily stored on tape and disk
• Poor performance for interactive applications
– considerable processing and delay due to the
sequential search of the file for a key match
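A sequential file with fixed-length records might look like the sketch below. The record layout (a 4-byte integer key followed by a 12-byte name) and the function names are assumptions made for the example; only the field values are stored, and the layout itself is an attribute of the file.

```python
import struct

# Hypothetical layout: 4-byte integer key followed by a 12-byte name.
RECORD = struct.Struct("<i12s")


def build_file(rows):
    # Store records in key sequence; only field values are written.
    return b"".join(RECORD.pack(key, name.encode()) for key, name in sorted(rows))


def sequential_search(data: bytes, key: int):
    # Scan from the start. Returns (name, records_examined); name is None
    # when the key is not present.
    examined = 0
    for offset in range(0, len(data), RECORD.size):
        k, raw = RECORD.unpack_from(data, offset)
        examined += 1
        if k == key:
            return raw.rstrip(b"\x00").decode(), examined
        if k > key:  # keys are sorted, so the key cannot appear later
            break
    return None, examined


data = build_file([(30, "Cruz"), (10, "Abe"), (20, "Bo")])
```

The `examined` count makes the cost of an interactive lookup visible: it grows with the position of the key in the file.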
Indexed Sequential File
• An index is added to support
random access
– An index record contains a key
field and a pointer into the main file
– The index is a sequential file
– For searching
• Search the index to find the highest
key value that is equal to or precedes
the desired key value
• Search continues in the main file at the
location indicated by the pointer
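The two-step search can be sketched as below. The data is made up for illustration: the main file is a list of (key, value) pairs in key order, the index holds every fourth key, and a "pointer" is simply a list position. The index is scanned sequentially here, matching the slide's description of the index as a sequential file.

```python
# Main file: records in key sequence, keys 0, 5, ..., 95.
main_file = [(k, f"rec{k}") for k in range(0, 100, 5)]

# Index: every 4th record's key plus a pointer (position) into the main file.
index = [(main_file[i][0], i) for i in range(0, len(main_file), 4)]


def lookup(key):
    # Step 1: find the highest index key that is equal to or precedes the key.
    pointer = None
    for index_key, pos in index:
        if index_key > key:
            break
        pointer = pos
    if pointer is None:
        return None
    # Step 2: continue sequentially in the main file from the pointer.
    pos = pointer
    while pos < len(main_file) and main_file[pos][0] <= key:
        if main_file[pos][0] == key:
            return main_file[pos][1]
        pos += 1
    return None
```

For example, `lookup(35)` stops in the index at key 20 and then reads main-file records 20, 25, 30, and 35.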
Indexed Sequential File
Example
• Consider searching for a particular key value
in a sequential file with 1 million records
– without index
• requires on average one-half million record
accesses
– with an index containing 1000 entries with the
keys in the index evenly distributed over the
main file
• requires on average 500 accesses to the index file
+ 500 accesses to the main file
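The arithmetic behind these figures can be checked with a short calculation. It assumes, as the example does, that both the index and the main file are searched sequentially and that index keys are evenly spread; the function names are illustrative only.

```python
def avg_accesses_without_index(n_records: int) -> float:
    # On average, half the file is scanned before the key is found.
    return n_records / 2


def avg_accesses_with_index(n_records: int, n_index_entries: int) -> float:
    # Scan half the index on average, then half of one main-file segment.
    segment = n_records / n_index_entries
    return n_index_entries / 2 + segment / 2


without = avg_accesses_without_index(1_000_000)
with_index = avg_accesses_with_index(1_000_000, 1000)
print(without, with_index)
```

Under these assumptions the index cuts the average search from 500,000 accesses to 1,000.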
Indexed Sequential File
• An overflow file is added
• A new record is added to the overflow file and is
located by following a pointer from its predecessor
record
• The indexed sequential file is occasionally merged
with the overflow file in batch mode
• Greatly reduces the time required to access a
single record, without sacrificing the sequential
nature.
Indexed File
• Records are accessed only through
their indexes
– no restriction on the placement of
records
– allows variable-length records
• Uses multiple indexes for different key
fields
– An exhaustive index contains one entry for
every record in the main file
– A partial index contains entries to records
where the field of interest exists
Indexed File
• When a new record is added to the main file, all
of the index files must be updated.
• Used mostly in applications where
– timeliness of information is critical and
– data are rarely processed exhaustively
– examples: airline reservation systems and inventory
control systems
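A rough sketch of an indexed file with two indexes follows. The field names (`id`, `email`) and the use of dictionaries as indexes are assumptions for illustration: `by_id` plays the role of an exhaustive index and `by_email` a partial one.

```python
main_file = []   # records in arbitrary order; variable length is allowed
by_id = {}       # exhaustive index: one entry for every record
by_email = {}    # partial index: only records that have an email field


def add_record(record: dict) -> int:
    pos = len(main_file)
    main_file.append(record)
    # Every index must be updated when a record is added.
    by_id[record["id"]] = pos
    if "email" in record:
        by_email[record["email"]] = pos
    return pos


add_record({"id": 7, "name": "Ada", "email": "ada@example.com"})
add_record({"id": 3, "name": "Lin"})
found = main_file[by_email["ada@example.com"]]
```

Records are reached only through an index, so their placement in `main_file` does not matter.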
File Directory
• Contains information about files
– Attributes
– Location
– Ownership
• Directory itself is a file owned by the
operating system
Directory Elements
• Basic Information
– File name: must be unique within a specific directory
– File type: e.g., text, binary
– File organization
• Address Information
– Volume: device on which file is stored
– Starting address: e.g., cylinder, track on disk
– Size used: in bytes, words or blocks
– Size allocated: maximum size of the file
Directory Elements
• Access Control Information
– Owner: able to grant/deny access to other users and
to change these privileges
– Access information: e.g., user’s name and password
for each authorized user
– Permitted actions: controls reading, writing,
executing, transmitting over a network
• Usage Information
– Date Created, Identity of Creator, Date Last Read
Access, Identity of Last Reader, Date Last Modified
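One way to picture a directory entry is as a record grouping the four kinds of information listed above. The class below is a hypothetical sketch: the attribute names and types are assumptions, and only one usage field is included to keep it short.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class DirectoryEntry:
    # Basic information
    name: str
    file_type: str
    organization: str
    # Address information
    volume: str
    start_address: int
    size_used: int
    size_allocated: int
    # Access control information
    owner: str
    permitted_actions: dict = field(default_factory=dict)  # user -> set of actions
    # Usage information
    created: datetime = field(default_factory=datetime.now)


entry = DirectoryEntry("notes.txt", "text", "sequential", "disk0", 2048, 300, 4096, "alice")
entry.permitted_actions["bob"] = {"read"}
```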
Hierarchical, or
Tree-Structured Directory
• Master directory with user
directories underneath it
• Each user directory may
have subdirectories and
files as entries
• Each directory and
subdirectory can be
organized as a sequential
file
Hierarchical, or
Tree-Structured Directory
• Easily enforce access restriction on
directories.
• Easily organize collections of files.
• Minimize the difficulty in assigning
unique names.
Naming
• The tree structure allows users to find a
file by following a path from the root or
master directory down various branches
until the file is reached
• The series of directory names, culminating
in the file name itself, constitutes a
pathname for the file
• Duplicate filenames are possible if they
have different pathnames
Naming
• Usually an interactive
user or a process is
associated with a
current or working
directory
– Files are referenced
relative to the working
directory unless an
explicit full pathname
is used
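Pathname resolution in a tree-structured directory can be sketched as follows. The tree contents and the `resolve` function are hypothetical: nested dictionaries stand in for directories and strings for file contents.

```python
# Hypothetical tree: nested dicts are directories, strings are file contents.
root = {
    "User_A": {"Draw": {"ABC": "A's drawing"}},
    "User_B": {"Word": {"ABC": "B's document"}},
}


def resolve(path: str, working_dir: str = "/"):
    # Full pathnames start at the root; all others are taken relative to the
    # working directory.
    full = path if path.startswith("/") else working_dir.rstrip("/") + "/" + path
    node = root
    for name in (part for part in full.split("/") if part):
        node = node[name]  # raises KeyError if a component does not exist
    return node
```

Both files are named `ABC`, yet `resolve("/User_A/Draw/ABC")` and `resolve("Word/ABC", working_dir="/User_B")` reach different files because their pathnames differ.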
File Sharing
• In a multiuser system, there is almost
always a requirement for allowing files to
be shared among a number of users
• Two issues
– Access rights
– Management of simultaneous access
Access Rights
• A wide variety of access rights have been
used by various systems
– often as a hierarchy, with each right implying
those that precede it.
• None
– The user may not even know that the file exists,
because the user is not allowed to read the
user directory that includes the file
• Knowledge
– User can only determine that the file exists
and who its owner is
Access Rights cont…
• Execution
– The user can load and execute a program but
cannot copy it, e.g., proprietary programs
• Reading
– The user can read the file for any purpose,
including copying and execution
• Appending
– The user can add data to the file but cannot
modify or delete any of the file’s contents
Access Rights cont…
• Updating
– The user can modify, delete, and add to the
file’s data.
• Changing protection
– User can change access rights granted to
other users
• Deletion
– User can delete the file
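If the rights are treated as a strict hierarchy in the order listed, a check reduces to a comparison. The sketch below makes that simplifying assumption; the `Right` enumeration and `allows` function are illustrative names, and real systems often use independent permission bits instead.

```python
from enum import IntEnum


class Right(IntEnum):
    # Ordered as listed on the slides; a higher value implies all lower ones.
    NONE = 0
    KNOWLEDGE = 1
    EXECUTION = 2
    READING = 3
    APPENDING = 4
    UPDATING = 5
    CHANGING_PROTECTION = 6
    DELETION = 7


def allows(granted: Right, requested: Right) -> bool:
    # A granted right covers itself and every right that precedes it.
    return granted >= requested
```

With this ordering, a user granted `UPDATING` may also read, while a user granted only `EXECUTION` may not.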
User Classes
• Access can be provided to different classes
of users
– Owner: usually the file’s creator, has full rights
and may grant rights to others
– Specific users: individual users who are
designated by user ID
– User groups: a set of users identified as a
group
– All: all users who have access to this system
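The user classes can be combined into a simple lookup. This is a sketch under assumptions: the `FileAcl` structure, the right names, and the policy of taking the union of rights from every matching class are all choices made for the example, not something the slides specify.

```python
from dataclasses import dataclass, field

FULL_RIGHTS = {"read", "append", "update", "change_protection", "delete"}


@dataclass
class FileAcl:
    owner: str
    user_rights: dict = field(default_factory=dict)   # user ID -> set of rights
    group_rights: dict = field(default_factory=dict)  # group name -> set of rights
    all_rights: set = field(default_factory=set)      # rights for every user


def rights_for(acl: FileAcl, user: str, groups: set) -> set:
    if user == acl.owner:
        return set(FULL_RIGHTS)  # the owner has full rights
    granted = set(acl.all_rights)
    granted |= acl.user_rights.get(user, set())
    for group in groups:
        granted |= acl.group_rights.get(group, set())
    return granted


acl = FileAcl(
    owner="alice",
    user_rights={"bob": {"read"}},
    group_rights={"staff": {"append"}},
    all_rights={"knowledge"},
)
```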
Simultaneous Access
• When more than one user is granted access to
append to or update a file, the OS or file
management system must enforce discipline
• User may lock the entire file or individual
records during update
• Mutual exclusion and deadlock are issues
for shared access (see the readers/writers
problem)
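Record-level locking can be illustrated with threads standing in for users. This is only a sketch: the in-memory `records` dictionary is a stand-in for a shared file, and one `threading.Lock` per record enforces mutual exclusion during each update.

```python
import threading

records = {1: 0, 2: 0}  # record key -> counter value
locks = {key: threading.Lock() for key in records}


def update(key: int, times: int) -> None:
    for _ in range(times):
        # Lock only the record being updated, not the whole file.
        with locks[key]:
            records[key] += 1


threads = [threading.Thread(target=update, args=(1, 1000)) for _ in range(4)]
threads += [threading.Thread(target=update, args=(2, 500)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because each update holds a single lock at a time, this sketch cannot deadlock; deadlock becomes a concern once an operation needs several records locked at once.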
Blocks and records
• Records are the logical unit of access of a
structured file
• Blocks are the unit for I/O with secondary storage
• For I/O to be performed, records must be
organized as blocks.
• Three methods of blocking are common
– Fixed-length blocking
– Variable-length spanned blocking
– Variable-length unspanned blocking
Fixed Blocking
• Fixed-length records are used, and an
integral number of records are stored in a
block
• Unused space at the end of a block is
internal fragmentation
• Common for sequential files with fixed-length records
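The numbers involved in fixed blocking follow from integer division. The sketch below uses made-up sizes (512-byte blocks, 120-byte records); the function name is illustrative.

```python
def fixed_blocking(block_size: int, record_size: int, n_records: int):
    per_block = block_size // record_size  # integral number of records per block
    if per_block == 0:
        raise ValueError("record does not fit in a block")
    # Unused space at the end of each full block (internal fragmentation).
    wasted_per_block = block_size - per_block * record_size
    blocks_needed = -(-n_records // per_block)  # ceiling division
    return per_block, wasted_per_block, blocks_needed


result = fixed_blocking(block_size=512, record_size=120, n_records=10)
print(result)
```

Here four records fit in each block, 32 bytes per block go unused, and ten records need three blocks.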
Variable Length
Spanned Blocking
• Variable-length records are used and are
packed into blocks with no unused space
• Some records may span multiple blocks
– Continuation is indicated by a pointer to the
successor block
• Efficient for storage and does not limit
the size of records
Variable Blocking:
Spanned
• Difficult to implement
• Records that span two blocks require
two I/O operations
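The I/O cost of spanning can be seen by laying records end to end and counting the blocks each one touches. This sketch uses made-up record sizes, assumes every size is positive, and ignores the space taken by continuation pointers.

```python
def spanned_blocks(record_sizes, block_size):
    # Records are packed with no gaps. For each record, count how many blocks
    # it touches; each touched block costs one I/O operation to read.
    offset = 0
    touched = []
    for size in record_sizes:
        first = offset // block_size
        last = (offset + size - 1) // block_size
        touched.append(last - first + 1)
        offset += size
    blocks_used = -(-offset // block_size)  # ceiling division
    return blocks_used, touched


blocks_used, touched = spanned_blocks([60, 50, 30, 100], block_size=100)
print(blocks_used, touched)
```

With 100-byte blocks, the second and fourth records each span two blocks and so need two reads.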
Variable-Length
Unspanned Blocking
• Uses variable-length records without
spanning
• Wasted space in most blocks because
of the inability to use the remainder of a
block if the next record is larger than the
remaining unused space
• Limits record size to the size of a block
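Unspanned packing can be sketched as a simple loop that opens a new block whenever the next record does not fit. The record sizes are made up for the example, and the function name is illustrative.

```python
def unspanned_blocks(record_sizes, block_size):
    # Start a new block whenever the next record exceeds the space left.
    blocks = [[]]
    free = block_size
    for size in record_sizes:
        if size > block_size:
            raise ValueError("record larger than a block cannot be stored")
        if size > free:
            blocks.append([])
            free = block_size
        blocks[-1].append(size)
        free -= size
    wasted = [block_size - sum(block) for block in blocks]
    return blocks, wasted


blocks, wasted = unspanned_blocks([60, 50, 30, 100], block_size=100)
print(blocks, wasted)
```

With 100-byte blocks, the first block wastes 40 bytes because the 50-byte record that follows cannot be split across blocks.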
Revisit the Big Picture
• The user views the file as having some structure that
organizes the records; different access methods
reflect different file structures
• Records must be organized as a sequence of blocks
for output and unblocked after input
• Individual block I/O requests must be scheduled for
optimizing performance
• The file directory describes the location of all files
plus their attributes
• Only authorized users are allowed to access particular
files in particular ways