RELATIONAL DATABASE

Three classical data models
1. hierarchical model
2. network model
   o each tuple is a separate record, e.g., (AN 145) --- [Perugini] --- (CPS 430)
   o no separation between logical and physical views
   o used record-at-a-time languages
   o too low-level
3. relational model
   o proposed by E.F. Codd in 1969
   o most popular and successful model
   o de facto standard for databases
   o (relational) databases are one of the most popular success stories of simple theoretical ideas
   o the focus of CPS 430/542

Relational Database
• Definition:
• Data stored in tables that are associated by shared attributes (keys).
• Any data element (or entity) can be found in the database through the name of the table, the attribute name, and the value of the primary key.

Relational Database Definitions
• Entity: an object, concept, or event (subject)
• Attribute: a characteristic of an entity
• Row or Record: the specific characteristics of one entity
• Table: a collection of records
• Database: a collection of tables

The Relational Database model
• Developed by E.F. Codd, C.J. Date (70s)
• Table = Entity = Relation
• Table row = tuple = instance
• Table column = attribute
• Table linkage by values
• Entity-Relationship Model

The Relational Model
• Each attribute has a unique name within an entity
• All entries in a column are values of that attribute
• Each row is unique
• Ordering of rows and columns is unimportant
• Each position (the intersection of a row and a column) is limited to a single entry.

Data Model: What’s a model?
• A data model is a representation of reality
• It’s used to define the storage and manipulation of a database.

Advantages of Relational Database over Flat File
1. Avoids data duplication
2. Avoids inconsistent records
3. Easier to change data
4. Easier to change data format
5. Data can be added and removed easily
6. Easier to maintain security

Database Table Keys
Definition: A key of a relation is a subset of attributes with the following properties:
• Unique identification
• Non-redundancy

Types of Keys
a. PRIMARY KEY
• Serves as the row-level addressing mechanism in the relational database model.
• It can be formed through the combination of several items.
• Indicates uniqueness within records or rows in a table
b. FOREIGN KEY
• A column or set of columns within a table whose values are required to match those of a primary key of a second table.
• It holds the primary key from another table; this is the standard way join relationships are established.
These keys are used to form a RELATIONAL JOIN, thereby connecting row to row across the individual tables.

Keys
• Primary key: a minimal subset of fields that is a unique identifier for a tuple
  • sid is the primary key for Students
  • cid is the primary key for Courses
• Foreign key: connections between tables
  • Courses (cid, instructor, quarter, dept)
  • Students (sid, name, login, age, gpa)
  • How do we express which students take each course?

Constructing Join Relationships
• A one-to-many relationship is built from the Primary Key of the ‘one’ table and a matching Foreign Key (FK) in the ‘many’ table.
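The Students and Courses relations above leave open how to record which students take each course. One common answer is a third table whose rows pair a sid with a cid, each a foreign key to the ‘one’ side. The sketch below uses Python's standard sqlite3 module. The table name Enrolled, the column types, and the sample rows are assumptions made for illustration; they are not part of the original notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE Students (
    sid   INTEGER PRIMARY KEY,
    name  TEXT, login TEXT, age INTEGER, gpa REAL
);
CREATE TABLE Courses (
    cid        TEXT PRIMARY KEY,
    instructor TEXT, quarter TEXT, dept TEXT
);
-- Hypothetical linking table: one row per (student, course) pair.
-- Each column is a foreign key into the table on the 'one' side.
CREATE TABLE Enrolled (
    sid INTEGER NOT NULL REFERENCES Students(sid),
    cid TEXT    NOT NULL REFERENCES Courses(cid),
    PRIMARY KEY (sid, cid)
);
""")

# Made-up sample data, for illustration only.
conn.executemany("INSERT INTO Students VALUES (?, ?, ?, ?, ?)", [
    (1, "Ana", "ana", 19, 3.6),
    (2, "Ben", "ben", 20, 3.1),
])
conn.executemany("INSERT INTO Courses VALUES (?, ?, ?, ?)", [
    ("CPS430", "Perugini", "Fall", "CPS"),
    ("CPS542", "Perugini", "Fall", "CPS"),
])
conn.executemany("INSERT INTO Enrolled VALUES (?, ?)", [
    (1, "CPS430"), (1, "CPS542"), (2, "CPS430"),
])

# Relational join: match foreign-key values to primary-key values.
rows = conn.execute("""
    SELECT s.name, c.cid
    FROM Students AS s
    JOIN Enrolled AS e ON e.sid = s.sid
    JOIN Courses  AS c ON c.cid = e.cid
    ORDER BY s.name, c.cid
""").fetchall()
print(rows)

# A row that points at a non-existent student is rejected.
try:
    conn.execute("INSERT INTO Enrolled VALUES (99, 'CPS430')")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True
print(fk_rejected)
```

Each Students row can match many Enrolled rows, and so can each Courses row, so the linking table turns one many-to-many relationship into two one-to-many relationships.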
• Cardinality: one-to-one, one-to-many, many-to-many relationships
• Optionality: the relationship is either mandatory or optional

Normalization
• Normalization: a process for analyzing the design of a relational database
• Database Design: arrangement of attributes into entities
• It permits the identification of potential problems in your database design
• Concepts related to Normalization:
  • KEYS and FUNCTIONAL DEPENDENCE

1. Database Normalization (1)
• Sample Student Activities DB Table
• Poorly designed
• Non-unique records (John Smith)
• Test the design by developing sample reports and queries

2. Database Normalization (2)
● Created a unique “ID” for each record in the Activities Table
● Required the creation of an “ID” look-up table for reporting (Students Table)
● Converted the “Flat-File” into a Relational Database

3. Database Normalization (3)
• Wasted space
• Redundant data entry
• What about taking a 3rd activity?
• Query difficulties: trying to find all swimmers
• Data inconsistencies: conflicting prices

4. Database Normalization (4)
● Students table is fine
● Elimination of two columns and an Activities Table restructuring simplifies the table
● BUT we still have redundant data (activity fees) and data insertion anomalies.

5. Database Normalization (5)
• Modify the design to ensure that “every non-key field is dependent on the whole key”
• Creation of the Participants Table corrects our problems and forms a link between the 2 tables.

Database Design: Basic Steps
Step 1: Determine the entities involved and create a separate table for each type of entity (thing, concept, event, theme) and name it.
Step 2: Determine the Primary Key for each table.
Step 3: Determine the properties for each entity (the non-key attributes).
Step 4: Determine the relationships among the entities.

Database Views
• A View is an individual’s picture of a database. It can be composed of many tables, unbeknownst to the user.
• It’s a simplification of a complex data model
• It provides a measure of database security
• Views are useful primarily for READ-only users and are not always safe for CREATE, UPDATE, and DELETE.

Table Indexing
• An Index is a means of expediting the retrieval of data.
• Indexes are “built” on one or more columns.
• Indexes occupy disk space; occasionally a lot.
• Indexes aren’t technically necessary for operation and must be maintained by the database administrator.

SUMMATIVE QUIZ
Self-Check Activity 2.1

Problem No. 1 (5 points)
a. A company database needs to store information about employees (identified by ssn, with salary and phone as attributes), departments (identified by dno, with dname and budget as attributes), and children of employees (with name and age as attributes).
b. Employees work in departments; each department is managed by an employee; a child must be identified uniquely by name when the parent (who is an employee; assume that only one parent works for the company) is known. We are not interested in information about a child once the parent leaves the company.
Note: Populate your tables with at least 10 records.

Problem No. 2 (10 points)
a. Although you always wanted to be an artist, you ended up being an expert on databases because you love to cook data and you somehow confused database with data baste. Your old love is still there, however, so you set up a database company, ArtBase, that builds a product for art galleries. The core of this product is a database with a schema that captures all the information that galleries need to maintain.
b. Galleries keep information about artists, their names (which are unique), birthplaces, age, and style of art. For each piece of artwork, the artist, the year it was made, its unique title, its type of art (e.g., painting, lithograph, sculpture, photograph), and its price must be stored.
Pieces of artwork are also classified into groups of various kinds, for example, portraits, still lifes, works by Picasso, or works of the 19th century; a given piece may belong to more than one group.
c. Each group is identified by a name (like those just given) that describes the group. Finally, galleries keep information about customers. For each customer, galleries keep that person’s unique name, address, total amount of dollars spent in the gallery (very important!), and the artists and groups of art that the customer tends to like.
Note: Populate your tables with at least 10 records.

Problem No. 3 (10 points)
Design a Relational Database for Access Registrar’s Office. The Office maintains data about each class, including the instructor, the number of students enrolled, and the time and place of the class meetings. For each student-class pair, a grade is recorded. Underlined attributes indicate the primary key.
Student (StudentID, Name, Program)
Course (CourseNo, Title, Syllabus, Credits)
CourseOffering (CourseNo, SectionNo, Year, Semester, Time, Room)
Instructor (InstructorNo, Name, Department, Title)
Enrols (StudentNo, CourseNo, SectionNo, Semester, Year, Grade)
Teaches (CourseNo, SectionNo, Semester, Year, InstructorNo)
Requires (MainCourse, PreRequisite)
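
As a starting point for Problem No. 3, the sketch below declares four of the listed relations with Python's standard sqlite3 module and derives the number of students enrolled in each class with a join. The underlining that marked the primary keys is not visible in this text, so the key choices here are assumptions, as are the column types and the sample rows. Instructor, Teaches, and Requires are left out to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enable FK enforcement in SQLite

conn.executescript("""
CREATE TABLE Student (
    StudentID INTEGER PRIMARY KEY,
    Name      TEXT NOT NULL,
    Program   TEXT
);
CREATE TABLE Course (
    CourseNo TEXT PRIMARY KEY,
    Title TEXT, Syllabus TEXT, Credits INTEGER
);
-- Assumed composite key: a class is one section of a course in one term.
CREATE TABLE CourseOffering (
    CourseNo  TEXT    NOT NULL REFERENCES Course(CourseNo),
    SectionNo INTEGER NOT NULL,
    Year      INTEGER NOT NULL,
    Semester  TEXT    NOT NULL,
    Time TEXT, Room TEXT,
    PRIMARY KEY (CourseNo, SectionNo, Year, Semester)
);
-- One row per student-class pair; Grade depends on the whole key.
CREATE TABLE Enrols (
    StudentNo INTEGER NOT NULL REFERENCES Student(StudentID),
    CourseNo  TEXT    NOT NULL,
    SectionNo INTEGER NOT NULL,
    Semester  TEXT    NOT NULL,
    Year      INTEGER NOT NULL,
    Grade     TEXT,
    PRIMARY KEY (StudentNo, CourseNo, SectionNo, Semester, Year),
    FOREIGN KEY (CourseNo, SectionNo, Year, Semester)
        REFERENCES CourseOffering (CourseNo, SectionNo, Year, Semester)
);
""")

# Made-up sample data, for illustration only.
conn.executemany("INSERT INTO Student VALUES (?, ?, ?)",
                 [(1, "Ana", "BSCS"), (2, "Ben", "BSIT")])
conn.execute("INSERT INTO Course VALUES ('DB101', 'Databases', NULL, 3)")
conn.executemany("INSERT INTO CourseOffering VALUES (?, ?, ?, ?, ?, ?)", [
    ("DB101", 1, 2024, "Fall", "MWF 9:00", "Room 12"),
    ("DB101", 2, 2024, "Fall", "TTh 13:00", "Room 14"),
])
conn.executemany("INSERT INTO Enrols VALUES (?, ?, ?, ?, ?, ?)", [
    (1, "DB101", 1, "Fall", 2024, "A"),
    (2, "DB101", 1, "Fall", 2024, "B"),
])

# Number of students per class, derived by joining on the composite key.
# LEFT JOIN keeps offerings that have no enrolments yet.
counts = conn.execute("""
    SELECT o.CourseNo, o.SectionNo, COUNT(e.StudentNo)
    FROM CourseOffering AS o
    LEFT JOIN Enrols AS e
           ON e.CourseNo = o.CourseNo AND e.SectionNo = o.SectionNo
          AND e.Year = o.Year AND e.Semester = o.Semester
    GROUP BY o.CourseNo, o.SectionNo, o.Year, o.Semester
    ORDER BY o.CourseNo, o.SectionNo
""").fetchall()
print(counts)
```

Computing the enrolment count from Enrols, rather than storing it as a column of CourseOffering, avoids the kind of redundant data that the normalization slides warn about.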