Processing Data in External Storage
Chapter 21
Data Structures and Problem Solving with C++: Walls and Mirrors, Carrano and Henry, © 2013

Contents
• A Look at External Storage
• Sorting Data in an External File
• External Dictionaries

A Look at External Storage
• External storage persists after program execution
• Generally, there is more external storage than internal memory
• Direct access files are essential for external dictionaries
• A file contains records that are organized into blocks
FIGURE 21-2 A file partitioned into blocks of records
• Direct access input and output involves blocks instead of records
• A buffer stores data temporarily
• Updating a portion of a record within a block: read the block into a buffer, change the record in the buffer, write the block back
• Time required to read or write a block of data
  Longer than the time required to operate on the block’s data
  Implication: reduce the number of required block accesses
• File access time is the dominant factor in an algorithm’s efficiency

Sorting Data in an External File
Example Problem
• File contains 1,600 employee records
• To be sorted by Social Security number
• Each block contains 100 records
• Program can access only enough internal memory to manipulate 3 blocks at a time
FIGURE 21-3 (a) Sixteen sorted runs, one
block each, in file F1; (b) Eight sorted runs, two blocks each, in file F2; (c) Four sorted runs, four blocks each, in file F1; (d) Two sorted runs, eight blocks each, in file F2
FIGURE 21-4 (a) Merging single blocks; (b) merging long runs
• Algorithm for merging arbitrary-sized sorted runs

External Dictionaries
• Records in order by search key
• Algorithm to traverse file in sorted order
• Retrieval could be done with binary search
FIGURE 21-5 Shifting across block boundaries

Indexing an External File
Benefits of indexing with a catalog:
1. Index record will be much smaller than a data record
2. Do not need to maintain the data file in any particular order; insert new records in any convenient location
3. Can maintain several indexes simultaneously
FIGURE 21-6 A data file with an index
FIGURE 21-7 A data file with a sorted index file

External Hashing
• Similar to the internal scheme described in Chapter 18
• Hash the index file instead of the data file
• Hash table: contains a pointer to the beginning of the chain of items that hash into that location
FIGURE 21-8 A hashed index file
FIGURE 21-9 A single block with a pointer
Insertion under external hashing
1. Insert the data record into the data file
2. Insert a corresponding index record into the index file
Removal under external hashing
1. Search the index file for the corresponding index record
2. Remove the data record from the data file

B-Trees
FIGURE 21-10 (a) Blocks organized into a 2-3 tree; (b) a single node of the 2-3 tree
FIGURE 21-11 (a) A node with two children; (b) a node with three children; (c) a node with m children
FIGURE 21-12 (a) A full tree whose internal nodes have five children; (b) the format of a single node
FIGURE 21-13 A B-tree of degree 5
Insertion into a B-tree
1. Insert the data record into the data file
2. Insert a corresponding index record into the index file
FIGURE 21-14 (a through d) The steps for inserting 55; (e) splitting the root
Removal from a B-tree
1. Locate the index record in the index file
2. Remove the data record from the data file
FIGURE 21-15 (a through e) The steps for removing 73; (f) removing the root

Traversals
• Accessing only the search key of each record, but not the data file
• Accessing the data file also

Multiple Indexing
FIGURE 21-16 Multiple index files

End Chapter 21
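
Supplementary sketch: external mergesort passes
The example problem under Sorting Data in an External File (1,600 records, 100 records per block) can be traced with a small in-memory simulation. This is only a sketch under stated assumptions: a std::vector stands in for each of the files F1 and F2, an int stands in for an employee record's Social Security number, and the names sortBlocks, mergePass, and externalMergesort are hypothetical. It follows the pass structure of Figure 21-3 but does not model the three-block memory limit, because std::merge sees each run as a whole.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

using Record = int;                // assumption: the search key stands in for a whole record
using File = std::vector<Record>;  // assumption: a vector stands in for an external file

// Phase 1: sort each block on its own, leaving sorted runs of one block each.
void sortBlocks(File& file, std::size_t blockSize)
{
    assert(blockSize > 0);
    for (std::size_t start = 0; start < file.size(); start += blockSize)
    {
        std::size_t end = std::min(start + blockSize, file.size());
        std::sort(file.begin() + start, file.begin() + end);
    }
}

// Phase 2, one pass: merge adjacent pairs of runs of runLength records
// from 'in' into runs of twice that length in 'out'.
void mergePass(const File& in, File& out, std::size_t runLength)
{
    out.clear();
    out.reserve(in.size());
    for (std::size_t start = 0; start < in.size(); start += 2 * runLength)
    {
        std::size_t mid = std::min(start + runLength, in.size());
        std::size_t end = std::min(start + 2 * runLength, in.size());
        std::merge(in.begin() + start, in.begin() + mid,
                   in.begin() + mid, in.begin() + end,
                   std::back_inserter(out));
    }
}

// Sorts f1 and returns the number of merge passes that were needed.
int externalMergesort(File& f1, std::size_t blockSize)
{
    sortBlocks(f1, blockSize);
    File f2;
    int passes = 0;
    for (std::size_t runLength = blockSize; runLength < f1.size(); runLength *= 2)
    {
        mergePass(f1, f2, runLength);
        f1.swap(f2);  // the two files trade roles after every pass
        ++passes;
    }
    return passes;
}
```

With 1,600 records and a block size of 100, the run length doubles from 100 to 200, 400, 800, and 1,600 records, so the function reports four merge passes, matching the sequence of 16, 8, 4, and 2 runs in the figure.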
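
Supplementary sketch: merging two runs with three one-block buffers
The bullet "Algorithm for merging arbitrary-sized sorted runs" and the limit of 3 blocks of internal memory can be illustrated as follows. This is a minimal sketch, not the textbook's pseudocode: BlockFile is a hypothetical in-memory stand-in for a direct access file, and the buffer names in1, in2, and out are assumptions. Every transfer moves a whole block, and the counters make the number of block accesses visible.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a direct access file whose I/O unit is a block.
struct BlockFile
{
    std::size_t blockSize;
    std::vector<int> data;
    std::size_t blockReads = 0;
    std::size_t blockWrites = 0;

    explicit BlockFile(std::size_t recordsPerBlock) : blockSize(recordsPerBlock)
    {
        assert(recordsPerBlock > 0);
    }

    // Copies block i into a fresh buffer and counts one block read.
    std::vector<int> readBlock(std::size_t i)
    {
        std::size_t first = std::min(i * blockSize, data.size());
        std::size_t last = std::min(first + blockSize, data.size());
        ++blockReads;
        return std::vector<int>(data.begin() + first, data.begin() + last);
    }

    // Appends the buffer as the next block and counts one block write.
    void appendBlock(const std::vector<int>& buffer)
    {
        data.insert(data.end(), buffer.begin(), buffer.end());
        ++blockWrites;
    }
};

// Merges two sorted runs of runBlocks blocks each, starting at blocks run1
// and run2 of 'from', and appends the merged run to 'to'. Only three
// one-block buffers are held in memory at any time.
void mergeRuns(BlockFile& from, std::size_t run1, std::size_t run2,
               std::size_t runBlocks, BlockFile& to)
{
    assert(from.blockSize == to.blockSize);
    std::vector<int> in1, in2, out;
    std::size_t pos1 = 0, pos2 = 0;
    std::size_t next1 = run1, end1 = run1 + runBlocks;
    std::size_t next2 = run2, end2 = run2 + runBlocks;

    // Places one record in the output buffer and writes the buffer when it is full.
    auto emit = [&](int record) {
        out.push_back(record);
        if (out.size() == to.blockSize) { to.appendBlock(out); out.clear(); }
    };

    for (;;)
    {
        // Refill an exhausted input buffer from the next block of its run.
        if (pos1 == in1.size() && next1 < end1) { in1 = from.readBlock(next1++); pos1 = 0; }
        if (pos2 == in2.size() && next2 < end2) { in2 = from.readBlock(next2++); pos2 = 0; }

        bool has1 = pos1 < in1.size();
        bool has2 = pos2 < in2.size();
        if (!has1 && !has2) break;  // both runs are finished

        if (has1 && (!has2 || in1[pos1] <= in2[pos2])) emit(in1[pos1++]);
        else emit(in2[pos2++]);
    }
    if (!out.empty()) to.appendBlock(out);  // flush a partially filled last block
}
```

Each input block is read exactly once and each output block is written exactly once, so merging two runs of k blocks costs 2k block reads and 2k block writes regardless of how the records interleave.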
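
Supplementary sketch: retrieval and insertion through a sorted index
The points under Indexing an External File (small index records, an unordered data file, retrieval by binary search) can be illustrated as follows. This is a sketch under assumptions: vectors stand in for the data file and the index file, and DataRecord, IndexRecord, buildSortedIndex, retrieve, and insertRecord are hypothetical names, not the textbook's interface.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct DataRecord            // a full record in the data file
{
    int key;
    std::string payload;
};

struct IndexRecord           // much smaller: a key plus the record's location
{
    int key;
    std::size_t recordNumber;
};

// Orders an index record before a search key; used for binary search.
static bool keyLess(const IndexRecord& entry, int key) { return entry.key < key; }

// Builds an index sorted by key over a data file kept in arbitrary order.
std::vector<IndexRecord> buildSortedIndex(const std::vector<DataRecord>& dataFile)
{
    std::vector<IndexRecord> index;
    index.reserve(dataFile.size());
    for (std::size_t i = 0; i < dataFile.size(); ++i)
        index.push_back(IndexRecord{dataFile[i].key, i});
    std::sort(index.begin(), index.end(),
              [](const IndexRecord& a, const IndexRecord& b) { return a.key < b.key; });
    return index;
}

// Binary-searches the index, then touches the data file once.
// Returns nullptr if the key is absent. The pointer is valid only
// until the next insertion into dataFile.
const DataRecord* retrieve(const std::vector<DataRecord>& dataFile,
                           const std::vector<IndexRecord>& index, int searchKey)
{
    auto it = std::lower_bound(index.begin(), index.end(), searchKey, keyLess);
    if (it == index.end() || it->key != searchKey) return nullptr;
    return &dataFile[it->recordNumber];
}

// Insertion: the data record goes wherever is convenient (here, the end);
// only the index record must be placed in sorted position.
void insertRecord(std::vector<DataRecord>& dataFile,
                  std::vector<IndexRecord>& index, const DataRecord& record)
{
    dataFile.push_back(record);
    auto pos = std::lower_bound(index.begin(), index.end(), record.key, keyLess);
    index.insert(pos, IndexRecord{record.key, dataFile.size() - 1});
}
```

Shifting during insertion happens only in the index, whose records are small; the data file itself is never reordered.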