DB.Inc.
Memo ID: Summary
Name of R&D group: KKS
Attended members: Ki-Hwan Kim, Dong-Shin Kim, Dong-Man Shin
Absent members: None
Names of the persons preparing the memo: Ki-Hwan Kim, Dong-Shin Kim
Summary
1. Introduction
a. We are going to build a file management system consisting of pages,
records, and index files.
2. Responsibilities
a. Heap file structure presentation: Dong-Shin Kim
b. Record format presentation: Dong-Man Shin
c. Page format research: Ki-Hwan Kim, Dong-Shin Kim, Dong-Man Shin
d. Hash-based indexing presentation: Ki-Hwan Kim
3. Page format
a. We need to store one extra piece of information in each page, namely the
number of fields, to make searching and managing the page easier.
b. Page format for fixed-length record fields:
i. Adv: easy to implement and fast.
ii. Disadv: wastes storage space.
c. Page format for variable-length record fields:
i. Adv: better storage efficiency (lower storage cost).
ii. Disadv: harder to implement.
d. We are going to choose the variable-length record format since it is a
worthwhile challenge for our team. We also believe that most DB developers
in enterprise settings use this format.
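The storage trade-off between the two formats can be illustrated with a rough calculation. This is only a sketch under assumptions that the memo does not state: a 4-byte INTEGER, 2-byte field offsets in the variable-length record header, an average of 10 bytes actually used in each CHAR40 field, and no page header overhead. The Student schema (SSN, NAME, MAJOR) and the 2 KB block size are taken from this memo; the function name recordsPerPage is hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// 2 KB block size, as chosen in this memo.
constexpr std::size_t kPageSize = 2048;

// Fixed-length Student record: SSN (assumed 4 bytes) + NAME CHAR40 + MAJOR CHAR40.
constexpr std::size_t kFixedRecordSize = 4 + 40 + 40;

// Variable-length Student record (assumed layout): an 8-byte header holding a
// field count and three 2-byte offsets, a 4-byte SSN, and two strings that
// use 10 bytes each on average (assumed).
constexpr std::size_t kVariableRecordSize = 8 + 4 + 10 + 10;

// Number of whole records of the given size that fit in one page,
// ignoring any page header or slot directory.
std::size_t recordsPerPage(std::size_t pageSize, std::size_t recordSize) {
    assert(recordSize > 0);
    return pageSize / recordSize;
}
```

Under these assumptions, recordsPerPage(kPageSize, kFixedRecordSize) gives 24 records per page, while recordsPerPage(kPageSize, kVariableRecordSize) gives 64. The real gain depends on how long the stored strings actually are.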
e. The record format:
i. Variable-length record format: we are going to implement exactly
the format that we discussed in class.
ii. Block size: 2 KB (we chose a small block size so that we can
easily observe the paging behavior).
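The format discussed in class is not written down in this memo, so the following is only a sketch of one common textbook layout for variable-length records: a field count, then the end offset of every field, then the field bytes. The names encodeRecord and getField and the use of 2-byte little-endian header values are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

constexpr std::size_t kPageSize = 2048;  // 2 KB block, as chosen in this memo

// Encode fields as: [field count][end offset of each field][field bytes...].
// All header values are 16-bit, which is enough to address a 2 KB page.
std::vector<std::uint8_t> encodeRecord(const std::vector<std::string>& fields) {
    std::vector<std::uint8_t> out;
    auto put16 = [&out](std::uint16_t v) {
        out.push_back(static_cast<std::uint8_t>(v & 0xFF));  // low byte
        out.push_back(static_cast<std::uint8_t>(v >> 8));    // high byte
    };
    const std::size_t headerSize = 2 * (fields.size() + 1);
    std::size_t end = headerSize;
    put16(static_cast<std::uint16_t>(fields.size()));
    for (const std::string& f : fields) {
        end += f.size();
        put16(static_cast<std::uint16_t>(end));
    }
    assert(end <= kPageSize);  // the record must fit in one page
    for (const std::string& f : fields) {
        out.insert(out.end(), f.begin(), f.end());
    }
    return out;
}

// Read field i directly from the offset array, without scanning earlier fields.
std::string getField(const std::vector<std::uint8_t>& rec, std::size_t i) {
    auto get16 = [&rec](std::size_t pos) {
        return static_cast<std::size_t>(rec[pos] | (rec[pos + 1] << 8));
    };
    const std::size_t count = get16(0);
    assert(i < count);
    // Field 0 starts right after the header; field i starts where i-1 ends.
    const std::size_t begin = (i == 0) ? 2 * (count + 1) : get16(2 * i);
    const std::size_t end = get16(2 * (i + 1));
    return std::string(rec.begin() + begin, rec.begin() + end);
}
```

For example, encodeRecord({"123456789", "Kim", "CS"}) produces an 8-byte header followed by 14 data bytes, and getField on that record with index 1 returns "Kim". Storing the field count in the record is one way to keep the "number of fields" information mentioned in item a.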
f. Hashing functions:
i. Integer hashing function: we use modulo 127. A hash function
should distribute keys evenly, so we chose an odd modulus.
ii. Character hashing function: since a string can be up to 40 bytes
long, we cannot assign a distinct value to every string (that would
require 26^40 distinct values). To avoid this problem we use a
folding function, which cuts the string into groups of characters.
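The two hashing functions can be sketched as follows. The modulus 127 and the 40-byte string limit come from this memo. The exact folding scheme is not given, so the 4-byte group size and the "sum the groups" step are assumptions, as are the names hashInt and hashString.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Number of hash buckets; this memo specifies modulo 127.
constexpr int kBuckets = 127;

// Integer hash: key modulo 127, adjusted so negative keys also map to 0..126.
int hashInt(long key) {
    const long r = key % kBuckets;
    return static_cast<int>(r < 0 ? r + kBuckets : r);
}

// Folding hash for strings (assumed scheme): cut the string into 4-byte
// groups, pack each group into an integer, sum the groups, then reduce
// the sum modulo 127.
int hashString(const std::string& s) {
    assert(s.size() <= 40);  // this memo limits strings to 40 bytes
    std::uint64_t sum = 0;
    for (std::size_t i = 0; i < s.size(); i += 4) {
        std::uint64_t group = 0;
        for (std::size_t j = i; j < i + 4 && j < s.size(); ++j) {
            group = group * 256 + static_cast<unsigned char>(s[j]);
        }
        sum += group;
    }
    return static_cast<int>(sum % kBuckets);
}
```

Both functions return a bucket number from 0 to 126, so an integer key and a string key can share the same hash file layout.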
g. Overall design
i. Tool: VC++ (Win32 console application)
ii. We use classes for better reusability.
h. Catalog file structure
i. all_objects.dat
Object name    Object type    File name
Student        Relation       Student.dat
Student        Index          Student.idx
ii. <object>_cat.dat
Attribute name    Attribute type    Relation name    Position    Restriction
SSN               INTEGER           Student          1           Primary key
NAME              CHAR40            Student          2
MAJOR             CHAR40            Student          3
iii. To manage the catalog files, we declare the format_catalog_1/2 classes.
iv. There is a toolkit, such as catalog2list, that can store and load
catalog data.
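The catalog rows above could be held in memory and stored or loaded roughly as follows. The interfaces of format_catalog_1/2 and catalog2list are not described in this memo, so the struct names, the function names storeAttributes and loadAttributes, and the '|'-separated text format are all hypothetical.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// One row of all_objects.dat (hypothetical in-memory form).
struct ObjectEntry {
    std::string objectName;  // e.g. "Student"
    std::string objectType;  // "Relation" or "Index"
    std::string fileName;    // e.g. "Student.dat"
};

// One row of <object>_cat.dat (hypothetical in-memory form).
struct AttributeEntry {
    std::string attributeName;  // e.g. "SSN"
    std::string attributeType;  // e.g. "INTEGER", "CHAR40"
    std::string relationName;   // e.g. "Student"
    int position;               // 1-based position within the record
    std::string restriction;    // e.g. "Primary key", or empty
};

// Store attribute rows as '|'-separated text lines (assumed file format).
void storeAttributes(std::ostream& out, const std::vector<AttributeEntry>& rows) {
    for (const AttributeEntry& r : rows) {
        // The separator must not appear inside a name.
        assert(r.attributeName.find('|') == std::string::npos);
        out << r.attributeName << '|' << r.attributeType << '|' << r.relationName
            << '|' << r.position << '|' << r.restriction << '\n';
    }
}

// Load attribute rows previously written by storeAttributes.
std::vector<AttributeEntry> loadAttributes(std::istream& in) {
    std::vector<AttributeEntry> rows;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        AttributeEntry r;
        std::string position;
        std::getline(fields, r.attributeName, '|');
        std::getline(fields, r.attributeType, '|');
        std::getline(fields, r.relationName, '|');
        std::getline(fields, position, '|');
        std::getline(fields, r.restriction);  // stays empty if nothing follows
        r.position = std::stoi(position);
        rows.push_back(r);
    }
    return rows;
}
```

Because both functions take generic streams, the same code works with a std::ofstream or std::ifstream opened on a file such as Student_cat.dat, or with a std::stringstream for testing.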
i. Output file
i. Sqlload all_objects.dat, <object>_cat.dat, <object>.dat
ii. Createindex <Attrname>.idx
iii. We follow exactly the approach presented in class and in the
lecture notes.