Processing Data in External Storage
Chapter 21
Data Structures and Problem Solving with C++: Walls and Mirrors, Carrano and Henry, © 2013
Contents
• A Look at External Storage
• Sorting Data in an External File
• External Dictionaries
A Look at External Storage
• External storage exists after program
execution
• Generally, there is more external storage
than internal memory
• Direct access files are essential for external
dictionaries
• A file contains records that are organized
into blocks
A Look at External Storage
FIGURE 21-2 A file partitioned into blocks of records
A Look at External Storage
• Direct access input and output involves
blocks instead of records
• A buffer stores data temporarily
A Look at External Storage
• Updating a portion of a record within a block
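Because direct access input and output works on whole blocks, changing one field of one record usually means reading the entire block into a buffer, editing the buffer, and writing the entire block back. The following is a minimal sketch of that idea, not the textbook's code. The `Record` layout, `RECORDS_PER_BLOCK`, and the function names `readBlock`, `writeBlock`, and `updateSalary` are assumptions made for illustration.

```cpp
#include <cassert>
#include <fstream>

// Hypothetical fixed-size record and block layout (illustration only).
struct Record {
    int key;
    int salary;
};
constexpr int RECORDS_PER_BLOCK = 4;
struct Block {
    Record records[RECORDS_PER_BLOCK];
};

// Byte offset of block number blockNum (0-based) within the file.
std::streamoff blockOffset(int blockNum) {
    return static_cast<std::streamoff>(blockNum) *
           static_cast<std::streamoff>(sizeof(Block));
}

// Read one whole block from the file into the buffer.
bool readBlock(std::fstream& file, int blockNum, Block& buffer) {
    file.clear();                       // reset any earlier eof/fail state
    file.seekg(blockOffset(blockNum));
    file.read(reinterpret_cast<char*>(&buffer), sizeof(Block));
    return static_cast<bool>(file);
}

// Write one whole block from the buffer back to the file.
bool writeBlock(std::fstream& file, int blockNum, const Block& buffer) {
    file.clear();
    file.seekp(blockOffset(blockNum));
    file.write(reinterpret_cast<const char*>(&buffer), sizeof(Block));
    file.flush();
    return static_cast<bool>(file);
}

// Update one field of one record: read the block, change the buffer,
// then write the whole block back (two block accesses in total).
bool updateSalary(std::fstream& file, int blockNum, int recordIndex,
                  int newSalary) {
    assert(recordIndex >= 0 && recordIndex < RECORDS_PER_BLOCK);
    Block buffer;
    if (!readBlock(file, blockNum, buffer)) {
        return false;
    }
    buffer.records[recordIndex].salary = newSalary;
    return writeBlock(file, blockNum, buffer);
}
```

The sketch assumes the file was opened with `std::ios::in | std::ios::out | std::ios::binary`. Even a one-field update costs two block accesses here, which is the cost the later slides try to minimize.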
A Look at External Storage
• Time required to read or write a block of
data
  - Longer than the time required to operate on the
block’s data
  - Implication: reduce the number of required
block accesses
• File access time is the dominant factor in an
algorithm’s efficiency
Sorting Data in an External File
Example Problem
• File contains 1,600 employee records
• Records are to be sorted by Social Security number
• Each block contains 100 records
• Program can access only enough internal
memory to manipulate 3 blocks at a time
Sorting Data in an External File
FIGURE 21-3 (a) Sixteen sorted runs, one block each, in
file F1; (b) eight sorted runs, two blocks each, in file F2
Sorting Data in an External File
FIGURE 21-3 (c) Four sorted runs, four blocks each, in
file F1; (d) two sorted runs, eight blocks each, in file F2
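The run lengths in Figure 21-3 double on every merge pass: 1,600 records at 100 records per block give 16 one-block runs, then runs of 2, 4, and 8 blocks, and one more pass would produce a single 16-block sorted run. The small sketch below just computes that progression. The function name `runLengthsPerPass` is an assumption made for illustration.

```cpp
#include <cassert>
#include <vector>

// Returns the run length (in blocks) at the start and after each merge pass,
// assuming every block has already been sorted into a one-block run.
std::vector<int> runLengthsPerPass(int totalBlocks) {
    assert(totalBlocks >= 1);
    std::vector<int> lengths;
    int runLength = 1;                   // sorted runs of one block each
    lengths.push_back(runLength);
    while (runLength < totalBlocks) {
        runLength *= 2;                  // each pass merges pairs of runs
        // The final run cannot be longer than the file itself.
        lengths.push_back(runLength < totalBlocks ? runLength : totalBlocks);
    }
    return lengths;
}
```

For the 16-block example, the result has five entries (1, 2, 4, 8, 16), so four merge passes are needed, and each pass reads and writes every block once.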
Sorting Data in an External File
FIGURE 21-4 (a) Merging single blocks;
(b) merging long runs
Sorting Data in an External File
• Algorithm for merging arbitrary-sized sorted
runs
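One way to picture merging runs of any length with only three blocks of memory is to use two input buffers (one per run) and one output buffer, refilling or flushing a buffer whenever it is exhausted or full. The sketch below simulates this in memory and is not the textbook's algorithm verbatim. The `File` alias (a vector standing in for an external file), `BLOCK_SIZE`, `RunReader`, and `mergeRuns` are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using File = std::vector<int>;          // stand-in for an external file of keys
constexpr std::size_t BLOCK_SIZE = 4;   // records per block (illustrative)

// Input buffer that holds at most one block of a sorted run at a time.
struct RunReader {
    const File& file;
    std::size_t next;                   // next unread record in the file
    std::size_t end;                    // one past the last record of the run
    std::vector<int> buffer;            // the in-memory block
    std::size_t pos = 0;                // current position within the buffer

    RunReader(const File& f, std::size_t first, std::size_t last)
        : file(f), next(first), end(last) {
        refill();
    }
    // Simulates reading the next block of the run into the buffer.
    void refill() {
        buffer.clear();
        pos = 0;
        while (buffer.size() < BLOCK_SIZE && next < end) {
            buffer.push_back(file[next++]);
        }
    }
    bool empty() const { return pos == buffer.size(); }
    int front() const { return buffer[pos]; }
    void pop() {
        if (++pos == buffer.size()) {
            refill();                   // buffer exhausted: fetch next block
        }
    }
};

// Merges the sorted runs src[begin, mid) and src[mid, end) and appends the
// result to dst, flushing the output buffer one block at a time.
void mergeRuns(const File& src, std::size_t begin, std::size_t mid,
               std::size_t end, File& dst) {
    assert(begin <= mid && mid <= end && end <= src.size());
    RunReader in1(src, begin, mid);
    RunReader in2(src, mid, end);
    std::vector<int> out;               // output buffer (one block)
    auto emit = [&](int value) {
        out.push_back(value);
        if (out.size() == BLOCK_SIZE) { // simulates writing a full block
            dst.insert(dst.end(), out.begin(), out.end());
            out.clear();
        }
    };
    while (!in1.empty() && !in2.empty()) {
        if (in1.front() <= in2.front()) {
            emit(in1.front());
            in1.pop();
        } else {
            emit(in2.front());
            in2.pop();
        }
    }
    while (!in1.empty()) { emit(in1.front()); in1.pop(); }
    while (!in2.empty()) { emit(in2.front()); in2.pop(); }
    dst.insert(dst.end(), out.begin(), out.end());  // flush a partial block
}
```

At no point does the sketch hold more than three blocks, which matches the memory limit in the example problem. The runs may have different lengths.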
External Dictionaries
• Records in order by search key
• Algorithm to traverse file in sorted order
• Retrieval could be done with binary search
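When the records are kept in order by search key, a binary search can be run over block numbers: read the middle block, compare the key with the smallest and largest keys in that block, and discard half of the file. The following sketch simulates that in memory and counts block reads. The `Block` and `SortedFile` aliases and the `retrieve` function are assumptions made for illustration, and every block is assumed to be non-empty.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

using Block = std::vector<int>;          // sorted keys stored in one block
using SortedFile = std::vector<Block>;   // blocks in ascending key order

// Returns true if key occurs in the file. blockReads reports how many
// blocks had to be "read" into the buffer to decide.
bool retrieve(const SortedFile& file, int key, int& blockReads) {
    blockReads = 0;
    int low = 0;
    int high = static_cast<int>(file.size()) - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;
        const Block& buffer = file[mid];  // simulated block access
        ++blockReads;
        assert(!buffer.empty());
        if (key < buffer.front()) {
            high = mid - 1;               // key can only be in earlier blocks
        } else if (key > buffer.back()) {
            low = mid + 1;                // key can only be in later blocks
        } else {
            // Key falls within this block's range: search the buffer itself.
            return std::binary_search(buffer.begin(), buffer.end(), key);
        }
    }
    return false;
}
```

Retrieval needs a number of block accesses that grows with the logarithm of the number of blocks. Insertions and removals in such a file are the costly part, since records may have to shift across block boundaries.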
External Dictionaries
FIGURE 21-5 Shifting across block boundaries
Indexing an External File
Benefits of indexing, as with a catalog
1. An index record is much smaller than a
data record
2. The data file need not be kept in any
particular order; new records can be inserted in any
convenient location
3. Several indexes can be maintained simultaneously
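The benefits listed above can be seen in a small in-memory sketch: data records are appended wherever is convenient, while a separate, much smaller index of (key, location) pairs is kept sorted and searched. This is an illustration under stated assumptions, not the textbook's implementation. The types `DataRecord`, `IndexRecord`, and `IndexedFile`, and the use of a vector position in place of a block number, are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct DataRecord {
    int key;
    std::string name;                 // stands in for a large data record
};
struct IndexRecord {
    int key;
    std::size_t location;             // stands in for a block number
};

struct IndexedFile {
    std::vector<DataRecord> data;     // data file: no particular order
    std::vector<IndexRecord> index;   // index file: sorted by key

    // First index entry whose key is not less than the given key.
    std::vector<IndexRecord>::const_iterator position(int key) const {
        return std::lower_bound(
            index.begin(), index.end(), key,
            [](const IndexRecord& entry, int k) { return entry.key < k; });
    }

    // Append the data record anywhere convenient, then add a small
    // index record in sorted position (keys are assumed unique).
    void insert(const DataRecord& record) {
        data.push_back(record);
        IndexRecord entry{record.key, data.size() - 1};
        index.insert(position(record.key), entry);
    }

    // Search only the index; touch the data file once, on a hit.
    const DataRecord* find(int key) const {
        auto it = position(key);
        if (it == index.end() || it->key != key) {
            return nullptr;
        }
        return &data[it->location];
    }
};
```

A second index on another field (for example, name) could be added as another sorted vector pointing into the same `data`, which corresponds to maintaining several indexes at once.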
Indexing an External File
FIGURE 21-6 A data file with an index
Indexing an External File
FIGURE 21-7 A data file with a sorted index file
External Hashing
• Similar to the internal scheme described in
Chapter 18
• Hash the index file instead of the data file
• Hash table: contains a pointer to the beginning
of the chain of items that hash into that location
External Hashing
FIGURE 21-8 A hashed index file
External Hashing
FIGURE 21-9 A single block with a pointer
External Hashing
Insertion under external hashing
1. Insert the data record into the data file
2. Insert a corresponding index record into the
index file
Removal under external hashing
1. Search the index file for the corresponding index
record
2. Remove the data record from the data file
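The insertion and removal steps above can be sketched with a hash table whose entries are chains of index records, each pointing at a slot in the data file. This is an in-memory illustration requiring C++17, not the textbook's code. The class `HashedFile`, the table size of 7, the modulo hash function, and the choice to also erase the index record during removal are assumptions. Keys are assumed to be unique and non-negative.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <list>
#include <optional>
#include <string>
#include <vector>

struct IndexRecord {
    int key;
    std::size_t location;             // stands in for a data-file block number
};

class HashedFile {
public:
    void insert(int key, const std::string& value) {
        assert(key >= 0);
        dataFile.push_back(value);                          // step 1: data file
        table[hash(key)].push_back({key, dataFile.size() - 1});  // step 2: index
    }

    bool remove(int key) {
        std::list<IndexRecord>& chain = table[hash(key)];
        for (auto it = chain.begin(); it != chain.end(); ++it) {
            if (it->key == key) {                 // step 1: found index record
                dataFile[it->location].reset();   // step 2: remove data record
                chain.erase(it);                  // assumption: drop index entry too
                return true;
            }
        }
        return false;
    }

    std::optional<std::string> find(int key) const {
        for (const IndexRecord& entry : table[hash(key)]) {
            if (entry.key == key) {
                return dataFile[entry.location];
            }
        }
        return std::nullopt;
    }

private:
    static constexpr std::size_t TABLE_SIZE = 7;  // illustrative size
    static std::size_t hash(int key) {
        return static_cast<std::size_t>(key) % TABLE_SIZE;
    }
    // An empty optional marks a slot whose data record was removed.
    std::vector<std::optional<std::string>> dataFile;
    std::array<std::list<IndexRecord>, TABLE_SIZE> table;
};
```

In an external setting each chain would be a linked sequence of index blocks on disk, so following a chain costs block accesses. The sketch leaves freed data slots unused, whereas a real file would track them for reuse.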
B-Trees
FIGURE 21-10 (a) Blocks organized into a 2-3 tree
B-Trees
FIGURE 21-10 (b) a single node of the 2-3 tree
B-Trees
FIGURE 21-11 (a) A node with two children; (b) a node
with three children; (c) a node with m children
B-Trees
FIGURE 21-12 (a) A full tree whose internal nodes have
five children; (b) the format of a single node
B-Trees
FIGURE 21-13 A B-tree of degree 5
B-Trees
Insertion into a B-tree
1. Insert the data record into the data file
2. Insert a corresponding index record into the
index file
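Step 2 is where the B-tree does its work: the new index key goes into a leaf, and any node that ends up with too many keys is split, with its middle key moving up to the parent (splitting the root makes the tree one level taller). The sketch below shows this for index keys only, in memory, for a B-tree of degree 5. It is one possible rendering, not the textbook's code. The names `Node`, `splitChild`, `insertInto`, `insert`, and `contains` are assumptions, keys are assumed unique, and child pointers stand in for block numbers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

constexpr std::size_t DEGREE = 5;             // maximum children per node
constexpr std::size_t MAX_KEYS = DEGREE - 1;  // maximum keys per node

struct Node {
    std::vector<int> keys;                        // sorted index keys
    std::vector<std::unique_ptr<Node>> children;  // empty for a leaf
    bool isLeaf() const { return children.empty(); }
};

// Position of the first key in the node that is not less than key.
std::size_t slotFor(const Node& node, int key) {
    return static_cast<std::size_t>(
        std::lower_bound(node.keys.begin(), node.keys.end(), key) -
        node.keys.begin());
}

// Splits the overflowing child at position i; its middle key moves up.
void splitChild(Node& parent, std::size_t i) {
    Node& left = *parent.children[i];
    const std::size_t mid = left.keys.size() / 2;
    const int middleKey = left.keys[mid];
    auto right = std::make_unique<Node>();
    right->keys.assign(left.keys.begin() + mid + 1, left.keys.end());
    left.keys.resize(mid);
    if (!left.isLeaf()) {                     // hand over the upper children
        for (std::size_t c = mid + 1; c < left.children.size(); ++c) {
            right->children.push_back(std::move(left.children[c]));
        }
        left.children.resize(mid + 1);
    }
    parent.keys.insert(parent.keys.begin() + i, middleKey);
    parent.children.insert(parent.children.begin() + i + 1, std::move(right));
}

// Inserts into the subtree; the caller checks this node for overflow.
void insertInto(Node& node, int key) {
    const std::size_t i = slotFor(node, key);
    if (node.isLeaf()) {
        node.keys.insert(node.keys.begin() + i, key);
        return;
    }
    insertInto(*node.children[i], key);
    if (node.children[i]->keys.size() > MAX_KEYS) {
        splitChild(node, i);
    }
}

void insert(std::unique_ptr<Node>& root, int key) {
    if (!root) {
        root = std::make_unique<Node>();
    }
    insertInto(*root, key);
    if (root->keys.size() > MAX_KEYS) {       // root overflow: tree grows taller
        auto newRoot = std::make_unique<Node>();
        newRoot->children.push_back(std::move(root));
        splitChild(*newRoot, 0);
        root = std::move(newRoot);
    }
}

bool contains(const Node* node, int key) {
    while (node != nullptr) {
        const std::size_t i = slotFor(*node, key);
        if (i < node->keys.size() && node->keys[i] == key) return true;
        if (node->isLeaf()) return false;
        node = node->children[i].get();
    }
    return false;
}
```

In an external B-tree each node would occupy one block of the index file, so the number of block accesses for an insertion is proportional to the height of the tree.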
B-Trees
FIGURE 21-14 (a through d) The steps for inserting 55
B-Trees
FIGURE 21-14 (e) splitting the root
B-Trees
Removal from a B-tree
1. Locate the index record in the index file
2. Remove the data record from the data file
B-Trees
FIGURE 21-15 (a through e) The steps for removing 73
B-Trees
FIGURE 21-15 (f) Removing the root
Traversals
• Sorted-order traversal that accesses only the search key of each
record, not the data file
Traversals
• Sorted-order traversal that also accesses the data file
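The two kinds of traversal differ only in what happens at each visit: an in-order walk of the index tree yields the search keys in sorted order, and fetching the data record at each visit adds one data-file access per record. The sketch below is an in-memory illustration, not the textbook's pseudocode. The names `IndexEntry`, `Node`, `traverse`, `keysInOrder`, and `recordsInOrder` are assumptions, and a vector of strings stands in for the data file.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <vector>

struct IndexEntry {
    int key;
    std::size_t location;                 // stands in for a block number
};
struct Node {
    std::vector<IndexEntry> entries;      // sorted by key
    // Empty for a leaf; otherwise entries.size() + 1 children.
    std::vector<std::unique_ptr<Node>> children;
};

// In-order traversal of the index: visits entries in ascending key order.
void traverse(const Node* node,
              const std::function<void(const IndexEntry&)>& visit) {
    if (node == nullptr) {
        return;
    }
    const bool leaf = node->children.empty();
    for (std::size_t i = 0; i < node->entries.size(); ++i) {
        if (!leaf) {
            traverse(node->children[i].get(), visit);
        }
        visit(node->entries[i]);
    }
    if (!leaf) {
        traverse(node->children.back().get(), visit);
    }
}

// Traversal that needs only the search keys: the data file is never touched.
std::vector<int> keysInOrder(const Node* root) {
    std::vector<int> keys;
    traverse(root, [&](const IndexEntry& e) { keys.push_back(e.key); });
    return keys;
}

// Traversal that also fetches each data record; dataReads counts the
// simulated data-file accesses (one per record in this sketch).
std::vector<std::string> recordsInOrder(const Node* root,
                                        const std::vector<std::string>& dataFile,
                                        std::size_t& dataReads) {
    std::vector<std::string> records;
    dataReads = 0;
    traverse(root, [&](const IndexEntry& e) {
        assert(e.location < dataFile.size());
        records.push_back(dataFile[e.location]);
        ++dataReads;
    });
    return records;
}
```

Because the data file is in no particular order, the second traversal may jump to a different data block for nearly every record, which is why it can be far more expensive than the key-only traversal.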
Multiple Indexing
FIGURE 21-16 Multiple index files
End
Chapter 21