DB.Inc. Memo
ID: Summary
Name of R&D group: KKS
Attended members: Ki-Hwan Kim, Dong-Shin Kim, Dong-Man Shin
Unattended members: No one
Name of the person to prepare the memo: Ki-Hwan Kim, Dong-Shin Kim

Summary

1. Introduction
   a. We are going to build a file management system that consists of pages, records, and index files.

2. Responsibility
   a. Heap file structure presentation: Dong-Shin Kim
   b. Record format presentation: Dong-Man Shin
   c. Page format research: Ki-Hwan Kim, Dong-Shin Kim, Dong-Man Shin
   d. Hash-based indexing presentation: Ki-Hwan Kim

3. Page format
   a. We need to store one extra piece of information in each page, the number of fields, to make searching and managing the page easier.
   b. The page format for fixed-length record fields.
      i. Adv: easy to implement and fast.
      ii. Disadv: expensive in terms of data storage.
   c. The page format for variable-length record fields.
      i. Adv: better performance (saves storage cost).
      ii. Disadv: difficult to implement.
   d. We chose variable-length record fields because it is a worthwhile challenge for our team, and because we believe most DB developers in the enterprise field use this format.
   e. The record format:
      i. Variable-length record format: we are going to implement exactly the format that we discussed in class.
      ii. Block size: 2 KB (we want to see paging at work, so we chose a block size small enough to observe it).
   f. Hashing function:
      i. Integer hashing function: we use modulo 127. A hash function must distribute keys evenly, so we chose an odd modulus (127 is also prime).
      ii. Character hashing function: since a string can be up to 40 bytes long, we cannot get a distinct value for every string (we would need 26^40 distinct values). To avoid this problem we use a folding function, which cuts the string into groups of a few characters.
   g. Overall design
      i. Tool: VC++ (Win32 console application)
      ii. We use classes for better reusability.
   h. Catalog file structure
      i. all_objects.dat

         Object name   Object type   File name
         Student       Relation      Student.dat
         Student       Index         Student.idx

      ii. <object>_cat.dat

         Attribute name   Attribute type   Relation name   Position   Restriction
         SSN              INTEGER          Student         1          Primary key
         NAME             CHAR40           Student         2
         MAJOR            CHAR40           Student         3

      iii. To manage the catalog files, we declare the format_catalog_1/2 classes.
      iv. A toolkit, such as catalog2list, stores and loads the catalog data.
   i. Output file
      i. Sqlload: all_objects.dat, <object>_cat.dat, <object>.dat
      ii. Createindex: <Attrname>.idx
      iii. We follow exactly the approach presented in class and in the lecture notes.
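
Note on the hashing functions (item 3.f): the two functions could be sketched in C++ roughly as follows. This is only an illustration under assumptions, not the code the group wrote. The names hashInt, hashString, kBuckets and kGroup are hypothetical, and the group size of 4 bytes and the choice to sum the groups are assumptions, since the memo does not say how the folded pieces are combined.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Assumption: 127 buckets, taken from the memo's "modulo 127".
constexpr int kBuckets = 127;

// Assumption: fold strings in groups of 4 bytes (the memo gives no group size).
constexpr std::size_t kGroup = 4;

// Integer keys: key mod 127, normalized so negative keys also map to 0..126.
int hashInt(long long key) {
    long long r = key % kBuckets;
    return static_cast<int>(r < 0 ? r + kBuckets : r);
}

// String keys: cut the string into groups of kGroup bytes, pack each group
// into one integer, add the groups together, and reduce the sum modulo 127.
int hashString(const std::string& key) {
    std::uint64_t sum = 0;
    for (std::size_t i = 0; i < key.size(); i += kGroup) {
        std::uint64_t group = 0;
        for (std::size_t j = i; j < i + kGroup && j < key.size(); ++j) {
            group = (group << 8) | static_cast<unsigned char>(key[j]);
        }
        sum += group;
    }
    return static_cast<int>(sum % kBuckets);
}
```

With this sketch an SSN would be hashed with hashInt(ssn) and a NAME or MAJOR value of up to 40 bytes with hashString(name). Both return a bucket number from 0 to 126.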