Database Management: Revision B4 Final
Hoda Mohamed Waguih
Database Management – Hoda M. Waguih 2009/2010 - SLIDE 1

Basic Concepts
• What is data, metadata, information?
• What is a database?
• What are file-based systems?
• Characteristics of file-based systems.
• Disadvantages of the file-based approach.

Basic Concepts
• What is a DBMS?
• Typical functions of a DBMS.
• Major components of the DBMS environment.
• Advantages & disadvantages of DBMSs.

Basic Concepts
• What is data dependence?
• What problems are caused by data redundancy?
• What is a view? What are the benefits of views?
• What is a concurrency control system?
• What is a constraint in a database?

Basic Concepts
• What is a Model?
• What is a Relation?
• Characteristics of a Relation
• What is an Attribute?
• What is a Domain of an Attribute?
• What is a Key? Types of Keys?
• Integrity Constraints

Basic Concepts
• What is an entity integrity constraint?
• What is a referential integrity constraint?
• General constraints
• Null Values
• Difference between a base table and a view
• Functional Dependency

Basic Concepts
• Conceptual Model
• What is an Entity, Attribute, Relationship
• Strong Entity & Weak Entity
• Entity with physical existence
• Entity with conceptual existence
• Entity Types & Entity Sets
• Relationship Types & Relationship Sets
• Different Types of attributes

Basic Concepts
• Degree of a relationship type
• Uniqueness constraint
• Participation constraint
• Participation constraint & Existence Dependency
• ER Diagrams & Naming Conventions

Basic Concepts
• ER-to-Relational Mapping Algorithm
• Mapping Binary Relationship
  ⁃ Foreign key approach
  ⁃ Merged relationship approach
  ⁃ Cross reference or relationship relation approach

Basic Concepts
Skipped
• Superclass/subclass
• Supertype/subtype
• Class/subclass relationship
• Type inheritance
• Specialization / Generalization
• Attribute-defined specialization
• User-defined specialization

Basic Concepts
Skipped
• Specialization/generalization lattices
• Specialization/generalization hierarchies
• Multiple inheritance / Single inheritance
• Specialization/generalization constraint
  – Disjointness constraint
  – Completeness constraint

Basic Concepts
• Data Anomalies
  – Deletion anomalies
  – Update anomalies
  – Insertion anomalies

Basic Concepts
• First normal form (1NF)
• Second normal form (2NF)
• Third normal form (3NF)

The Relational Algebra Operations
• Operations from Mathematical Set Theory
  – UNION, INTERSECTION, SET DIFFERENCE, and CARTESIAN PRODUCT
• Operations specifically for Relational Database
  – SELECT,
  – PROJECT, and
  – JOIN

Sample Review Questions
• Define the following terms: Data, meta-data, information, database, DBMS, database system, database catalog, Program-data independence, user view, DBA, end user, transaction, deductive database system, Data model, database schema, database state, internal schema, conceptual schema, external schema, data independence, DDL, DML, query language

Sample Review Questions
• Discuss the main characteristics of the database approach and how it differs from traditional file systems.
• Discuss the capabilities that should be provided by a DBMS.

Sample Review Questions
• What is the difference between a database schema and a database state?
• What is the difference between logical data independence and physical data independence?
• What is a candidate key? What is a primary key?
• What is an entity type? What is an entity set?

Sample Review Questions
• Explain the differences among an entity, an entity type, and an entity set.
• What is a relationship type? Explain the differences among a relationship instance, a relationship type, and a relationship set.

Sample MCQ
• One disadvantage of file processing systems is:
  A) reduced data duplication.
  B) program-data independence.
  C) limited data sharing.
  D) enforcement of integrity constraints.
• Which of the following is a characteristic of the database approach?
  A) It stresses the integration of data.
  B) It duplicates data.
  C) It creates physical relationships.
  D) It reduces storage.

Sample MCQ
• A user view is:
  a) what a user sees when he or she looks out the window.
  b) a table or set of tables.
  c) a logical description of some portion of the database.
  d) a procedure stored on the server.
• In a file processing environment, descriptions for data and the logic for accessing the data are built into:
  a) application programs.
  b) database descriptors.
  c) fields.
  d) records.

Sample MCQ
• The logical representation of an organization’s data is called a(n):
  A) database model.
  B) entity-relationship model.
  C) relationship systems design.
  D) database entity diagram.
• A relationship between the instances of a single entity type is called a(n) _____ relationship.
  A) ternary
  B) primary
  C) binary
  D) unary

Sample MCQ
• Which of the following is a generic entity type that has a relationship with one or more subtypes?
  A) Megatype
  B) Supertype
  C) Subgroup
  D) Class
• The property by which subtype entities possess the values of all attributes of a supertype is called:
  A) hierarchy reception.
  B) class management.
  C) attribute inheritance.
  D) generalization.

Sample MCQ
• The ______ rule states that an entity instance can simultaneously be a member of two (or more) subtypes.
  A. disjoint
  B. overlap
  C. partial specialization
  D. total specialization
• In a supertype/subtype hierarchy, each subtype has:
  A. only one supertype.
  B. many supertypes.
  C. at most two supertypes.
  D. at least one subtype.

Sample (True/False)
• With the traditional file processing approach, each application shares data files, thus enabling much data sharing.
• The database approach emphasizes integration and sharing of data throughout the organization.
• Metadata are data that describe the properties of other data.
• Redundancy increases the risk of inconsistent data.

Sample (True/False)
• An entity instance of a subtype represents the same entity instance of the supertype.
• A member of a subtype does NOT necessarily have to be a member of the supertype.
• Specialization is the reverse of generalization.
• A completeness constraint may specify that each entity of the supertype must be a member of some subtype in the relationship.

Sample (True/False)
• The E-R model is used to construct a conceptual model.
• An entity is a person, place, object, event, or concept in the user environment about which the organization wishes to maintain data.
• A single occurrence of an entity is called an entity instance.
• The relationship among the instances of three entity types is called a unary relationship.

Sample: Fill in the space
• Databases are an organized collection of ______ data.
• Facts, text, graphic images, sound, and video segments that have meaning in the user’s new environment are called _____________.
• __________ is data that has been processed in such a way as to increase the knowledge of the person who uses the data.
• __________ is data that describes the properties or characteristics of other data.
• In the beginning of computer-based data processing, _______ were used instead of databases.
• Unplanned and uncontrolled _________________ can often lead to a loss of metadata integrity.

Sample: Fill in the space
• A logical representation of the data for an organization or for a business area is __________.
• A person, place, object, event, or concept about which the organization wants or needs to store data is called a(n) _____.
• An attribute whose value can be calculated from a related attribute is called a(n) _____ attribute.

Sample Exercise
Answer:
(a) Each SECTION record is related to a COURSE record.
(b) Each GRADE_REPORT record is related to one STUDENT record and one SECTION record.
(c) Each PREREQUISITE record relates two COURSE records: one in the role of a course and the other in the role of a prerequisite to that course.
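
The three relationships in answers (a) to (c) can be sketched as foreign keys. The following is a minimal illustration using Python's built-in sqlite3 module, not the schema from Figure 1.2 itself: only Student_number, Course_number and Grade are named in these notes, so Section_identifier, Prerequisite_number and the sample values are assumptions made for the example.

```python
import sqlite3

# Hypothetical columns: Section_identifier and Prerequisite_number are
# assumed names; each table is reduced to the columns needed to show
# the relationships (a), (b) and (c).
SCHEMA = """
CREATE TABLE STUDENT (Student_number INTEGER PRIMARY KEY);
CREATE TABLE COURSE (Course_number TEXT PRIMARY KEY);
-- (a) each SECTION is related to a COURSE
CREATE TABLE SECTION (
    Section_identifier INTEGER PRIMARY KEY,
    Course_number TEXT NOT NULL REFERENCES COURSE(Course_number)
);
-- (b) each GRADE_REPORT is related to one STUDENT and one SECTION
CREATE TABLE GRADE_REPORT (
    Student_number INTEGER NOT NULL REFERENCES STUDENT(Student_number),
    Section_identifier INTEGER NOT NULL REFERENCES SECTION(Section_identifier),
    Grade TEXT
);
-- (c) each PREREQUISITE relates two COURSE records in different roles
CREATE TABLE PREREQUISITE (
    Course_number TEXT NOT NULL REFERENCES COURSE(Course_number),
    Prerequisite_number TEXT NOT NULL REFERENCES COURSE(Course_number)
);
"""


def build_db():
    """Create an in-memory database with foreign key checks switched on."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def try_insert(conn, sql, params):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


if __name__ == "__main__":
    conn = build_db()
    conn.execute("INSERT INTO COURSE VALUES ('C1')")  # made-up course number
    # A section of an existing course is accepted ...
    print(try_insert(conn, "INSERT INTO SECTION VALUES (?, ?)", (1, "C1")))
    # ... a section of a course that does not exist is rejected.
    print(try_insert(conn, "INSERT INTO SECTION VALUES (?, ?)", (2, "C9")))
```

PREREQUISITE carries two foreign keys to the same table, which is how the two roles in answer (c) show up in a relational schema.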

Sample Exercise
1.12 Cite some examples of integrity constraints that you think can apply to the database shown in Figure 1.2.
Answer:
1. The Student_number should be unique for each STUDENT record (key constraint).
2. The Course_number should be unique for each COURSE record (key constraint).
3. A value of Course_number in a SECTION record must also exist in some COURSE record (referential integrity constraint).
4. A value of Student_number in a GRADE_REPORT record must also exist in some STUDENT record (referential integrity constraint).
5. The value of Grade in a GRADE_REPORT record must be one of the values in the set {A, B, C, D, F, I, U, S} (domain constraint).
6. Every record in COURSE must have a value for Course_number (entity integrity constraint).

Sample Exercise
• Specify all the relationships among the records of the database shown in Figure 1.2.

Sample Exercise
Consider the following set of requirements for a UNIVERSITY database that is used to keep track of students’ transcripts:
− The university keeps track of each student’s name, student number, Social Security number, current address and phone number, permanent address and phone number, birth date, sex, class (freshman, sophomore, ..., graduate), major department, minor department (if any), and degree program (B.A., B.S., ..., Ph.D.). Some user applications need to refer to the city, state, and ZIP Code of the student’s permanent address and to the student’s last name. Both Social Security number and student number have unique values for each student.
− Each department is described by a name, department code, office number, office phone number, and college. Both name and code have unique values for each department.

Sample Exercise
− Each course has a course name, description, course number, number of semester hours, level, and offering department. The value of the course number is unique for each course.
− Each section has an instructor, semester, year, course, and section number. The section number distinguishes sections of the same course that are taught during the same semester/year; its values are 1, 2, 3, ..., up to the number of sections taught during each semester.
− A grade report has a student, section, letter grade, and numeric grade (0, 1, 2, 3, or 4).
Design an ER schema for this application, and draw an ER diagram for the schema. Specify key attributes of each entity type, and structural constraints on each relationship type. Note any unspecified requirements, and make appropriate assumptions to make the specification complete.

Sample Exercise
7.19 - Consider the ER diagram of Figure 7.20, which shows a simplified schema for an airline reservations system. Extract from the ER diagram the requirements and constraints that resulted in this schema. Try to be as precise as possible in your requirements and constraints specification.

Answer
(1) The database represents each AIRPORT, keeping its unique AirportCode, the AIRPORT Name, and the City and State in which the AIRPORT is located.
(2) Each airline FLIGHT has a unique number, the Airline for the FLIGHT, and the Weekdays on which the FLIGHT is scheduled (for example, every day of the week except Sunday can be coded as X7).
(3) A FLIGHT is composed of one or more FLIGHT LEGs (for example, flight number CO1223 from New York to Los Angeles may have two FLIGHT LEGs: leg 1 from New York to Houston and leg 2 from Houston to Los Angeles).
Each FLIGHT LEG has a DEPARTURE AIRPORT and Scheduled Departure Time, and an ARRIVAL AIRPORT and Scheduled Arrival Time.

Answer
(4) A LEG INSTANCE is an instance of a FLIGHT LEG on a specific Date (for example, CO1223 leg 1 on July 30, 1989). The actual Departure and Arrival AIRPORTs and Times are recorded for each flight leg after the flight leg has been concluded. The Number of available seats and the AIRPLANE used in the LEG INSTANCE are also kept.
(5) The customer RESERVATIONs on each LEG INSTANCE include the Customer Name, Phone, and Seat Number(s) for each reservation.
(6) Information on AIRPLANEs and AIRPLANE TYPEs is also kept. For each AIRPLANE TYPE (for example, DC-10), the TypeName, manufacturing Company, and Maximum Number of Seats are kept. The AIRPORTs in which planes of this type CAN LAND are kept in the database. For each AIRPLANE, the AirplaneId, Total number of seats, and TYPE are kept.

Sample Exercise
• 7.23 - Consider the ER diagram shown in Figure 7.21 for part of a BANK database. Each bank can have multiple branches, and each branch can have multiple accounts and loans.
• (a) List the strong (nonweak) entity types in the ER diagram.
• (b) Is there a weak entity type? If so, give its name, its partial key, and its identifying relationship.
• (c) What constraints do the partial key and the identifying relationship of the weak entity type specify in this diagram?

Sample Exercise
(d) List the names of all relationship types, and specify the (min, max) constraint on each participation of an entity type in a relationship type. Justify your choices.
(e) List concisely the user requirements that led to this ER schema design.
(f) Suppose that every customer must have at least one account but is restricted to at most two loans at a time, and that a bank branch cannot have more than 1000 loans. How does this show up on the (min, max) constraints?

Answer
(a) Entity types: BANK, ACCOUNT, CUSTOMER, LOAN
(b) Weak entity type: BANK-BRANCH. Partial key: BranchNo. Identifying relationship: BRANCHES.
(c) The partial key BranchNo in BANK-BRANCH specifies that the same BranchNo value may occur under different BANKs. The identifying relationship BRANCHES specifies that BranchNo values are uniquely assigned for those BANK-BRANCH entities that are related to the same BANK entity. Hence, the combination of BANK Code and BranchNo constitutes a full identifier for a BANK-BRANCH.
(d) Relationship Types: BRANCHES, ACCTS, LOANS, A-C, L-C. The (min, max) constraints are shown below.

Answer
(e) The requirements may be stated as follows: Each BANK has a unique Code, as well as a Name and Address. Each BANK is related to one or more BANK-BRANCHes, and the BranchNo is unique among each set of BANK-BRANCHes that are related to the same BANK. Each BANK-BRANCH has an Address. Each BANK-BRANCH has zero or more LOANS and zero or more ACCTS. Each ACCOUNT has an AcctNo (unique), Balance, and Type and is related to exactly one BANK-BRANCH and to at least one CUSTOMER. Each LOAN has a LoanNo (unique), Amount, and Type and is related to exactly one BANK-BRANCH and to at least one CUSTOMER. Each CUSTOMER has an SSN (unique), Name, Phone, and Address, and is related to zero or more ACCOUNTs and to zero or more LOANs.

Answer
• (f) The (min, max) constraints would be changed as follows:
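
One way to read the three restrictions in part (f) is as (min, max) pairs on the participation of CUSTOMER in A-C and L-C and of BANK-BRANCH in LOANS. The short Python sketch below encodes that reading. The pairs are derived only from the wording of the question and of answer (e), which leaves the other bounds at zero or unbounded, so they are an interpretation rather than a copy of the original figure. The names `CONSTRAINTS_F` and `satisfies` are hypothetical.

```python
# (min, max) participation of an entity type in a relationship type.
# A max of None stands for "N", meaning no upper bound.
# These pairs are one reading of part (f), not taken from the figure.
CONSTRAINTS_F = {
    ("CUSTOMER", "A-C"): (1, None),       # at least one account per customer
    ("CUSTOMER", "L-C"): (0, 2),          # at most two loans per customer
    ("BANK-BRANCH", "LOANS"): (0, 1000),  # at most 1000 loans per branch
}


def satisfies(count, bounds):
    """Check whether a participation count lies within a (min, max) pair."""
    lower, upper = bounds
    return count >= lower and (upper is None or count <= upper)


if __name__ == "__main__":
    # A customer holding three loans violates the (0, 2) bound on L-C.
    print(satisfies(3, CONSTRAINTS_F[("CUSTOMER", "L-C")]))
    # A customer holding a single account meets the (1, N) bound on A-C.
    print(satisfies(1, CONSTRAINTS_F[("CUSTOMER", "A-C")]))
```

A min of 1 is what makes participation total: under this reading a CUSTOMER with no account is not allowed, while a CUSTOMER with no loan still is.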