Chapter 3: Relational Model

Jay M. Lightfoot, Ph.D.

Contents
1 Objectives
2 Logical View of Data
2.1 Table characteristics
2.2 terms
2.2.1 domain
2.2.2 primary key
2.2.3 abstract data type
2.2.4 tuple
2.2.5 attribute
3 Keys
3.1 determinant
3.1.1 functional dependence
3.1.2 multi-functional dependence
3.2 types of keys
3.2.1 super key
3.2.2 candidate key
3.2.3 primary key
3.2.4 alternate key
3.2.5 foreign key
3.2.6 secondary key
3.2.7 composite key
3.3 entity integrity
3.4 referential integrity
4 DB Integrity
4.1 domain integrity
4.2 entity integrity
4.3 referential integrity
4.4 business rules
5 Relational DB Query Languages
5.1 relational algebra
5.1.1 select
5.1.2 project
5.1.3 join
5.1.3.1 equijoin
5.1.3.2 theta-join
5.1.3.3 natural join
5.1.3.4 outer join
5.1.4 intersect
5.1.5 union
5.1.6 difference
5.1.7 Cartesian product
5.1.8 division
5.2 relational calculus
6 Meta Data Components
6.1 data dictionary
6.2 system catalog
6.2.1 synonym
6.2.2 homonym
7 Relationship Types
7.1 1:1
7.2 1:N
7.3 M:N
7.3.1 intersection table
7.3.2 dummy table
8 Data Redundancy in Relational Model
9 Indexes
9.1 pointer
9.2 unique index
9.3 non-unique index
1 Objectives
• The relational database model takes a logical view of data.
• The relational model's basic components are entities, attributes,
and relationships among entities.
• Entities and their attributes are organized into tables.
• Know the relational database operators, the data dictionary, and
the system catalog.
• Understand how data redundancy is handled in the relational model.
• Understand why indexing is important.
2 Logical View of Data
Relational model
• Enables us to view data logically rather than physically
• Reminds us of the simpler file concept of data storage
Table
• Has the advantages of structural and data independence
• Resembles a file from a conceptual point of view
• Easier to understand than its hierarchical and network
database predecessors
2.1 Table characteristics
Table: two-dimensional structure composed of rows and columns
Contains group of related entities (an entity set)
Terms entity set and table are often used interchangeably
Table also called a relation because the relational model’s creator,
Codd, used the term relation as a synonym for table
Think of a table as a persistent relation:
▪ A relation whose contents can be permanently saved for
future use
5 "Rules" of a relational table:
1. tuple and attribute order is immaterial
2. every tuple is unique
3. cells contain single values
4. all values within an attribute come from the same domain
5. relation names within the database and attribute names
within the relation are unique
These 5 rules fully describe relations in the relational database
model.
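Rules 1 and 2 can be illustrated with a minimal Python sketch. The representation below (a row as a frozenset of attribute/value pairs, a relation as a set of rows) and the names make_row and make_relation are assumptions made for this illustration only; they are not part of the relational model itself.

```python
# Hypothetical representation: a tuple (row) is a frozenset of
# (attribute, value) pairs, and a relation is a set of such rows.
def make_row(**attributes):
    return frozenset(attributes.items())


def make_relation(*rows):
    return set(rows)


r1 = make_relation(
    make_row(stu_id=1, name="Ann"),
    make_row(stu_id=2, name="Bob"),
)

# Same content with a different row order, a different attribute order,
# and one row repeated.
r2 = make_relation(
    make_row(name="Bob", stu_id=2),
    make_row(name="Ann", stu_id=1),
    make_row(stu_id=1, name="Ann"),
)

print(r1 == r2)  # True: tuple and attribute order is immaterial (rule 1)
print(len(r2))   # 2: the repeated row collapses, every tuple is unique (rule 2)
```

Because both rows and relations are sets here, ordering and duplicates disappear by construction, which is the point of the two rules.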
2.2 terms
2.2.1 domain
The set of allowable values that an attribute may take on.
2.2.2 primary key
• Primary key (PK) is an attribute (or a combination of attributes)
that uniquely identifies any given entity (row)
• Key’s role is based on determination
  ▪ If you know the value of attribute A, you can look up
  (determine) the value of attribute B
2.2.3 abstract data type
A data type that describes a set of similar objects with shared and
encapsulated data representation and methods. An abstract data
type is generally used to describe complex objects. It is similar to a
class in the object-oriented domain.
2.2.4 tuple
A row in a relation.
2.2.5 attribute
A column in a relation.
3 Keys
Generally speaking, keys consist of one or more attributes that
determine other attributes. There are various types of keys, but all
share this characteristic.
Keys are generally associated with indexes; however, keys and
indexes are not the same thing. An index is a small file that uses key
information to speed up the lookup process into another file. Keys, on
the other hand, are a type of integrity constraint.
3.1 determinant
When one value can be used to "determine" another, it is said to be
its determinant.
3.1.1 functional dependence
"The attribute B is functionally dependent on the attribute A if each
value in column A determines one and only one value of column
B."
For example: if it is true that when you know A, you also know B
and C, it is true that A --> B,C (read A determines B and C).
Likewise, B and C are functionally dependent on A.
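The definition can be tested against sample data with a short Python sketch. The function name determines and the student rows are hypothetical. Note that such a check only shows whether the dependency holds in the rows at hand; a true functional dependence is a statement about every legal state of the table.

```python
def determines(rows, lhs, rhs):
    """Return True if, in these rows, each value of lhs maps to one value of rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for a key and returns it later.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample data.
students = [
    {"stu_id": 1, "name": "Ann", "major": "CIS"},
    {"stu_id": 2, "name": "Bob", "major": "CIS"},
    {"stu_id": 3, "name": "Ann", "major": "MATH"},
]

print(determines(students, ["stu_id"], ["name", "major"]))  # True
print(determines(students, ["name"], ["major"]))            # False: "Ann" has two majors
```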
3.1.2 multi-functional dependence
The attribute B is multi-functionally dependent on the attribute A if
each value in column A determines a well-defined set of values for
column B.
3.2 types of keys
3.2.1 super key
Superkey
▪ Any attribute (or combination of attributes) that uniquely
identifies each entity
3.2.2 candidate key
Candidate key
▪ A minimal superkey (one without redundancies)
3.2.3 primary key
Primary key
▪ A candidate key that is selected as the “prime” key
3.2.4 alternate key
Alternate key
▪ Candidate keys that are not selected to be the primary
key
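The superkey and candidate key definitions above can be sketched as two Python checks. The function names and the enroll rows are hypothetical, and the checks look only at the sample data, so they show what is consistent with these rows rather than what the designer intended.

```python
from itertools import combinations


def is_superkey(rows, attrs):
    """True if the attribute combination is unique across the given rows."""
    keys = [tuple(row[a] for a in attrs) for row in rows]
    return len(set(keys)) == len(keys)


def is_candidate_key(rows, attrs):
    """True if attrs is a superkey and no proper subset of it is one."""
    if not is_superkey(rows, attrs):
        return False
    return not any(
        is_superkey(rows, subset)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )


# Hypothetical sample data.
enroll = [
    {"stu_id": 1, "class_id": "A", "grade": "B"},
    {"stu_id": 1, "class_id": "B", "grade": "A"},
    {"stu_id": 2, "class_id": "A", "grade": "B"},
]

print(is_superkey(enroll, ["stu_id"]))                            # False
print(is_superkey(enroll, ["stu_id", "class_id", "grade"]))       # True
print(is_candidate_key(enroll, ["stu_id", "class_id", "grade"]))  # False: not minimal
print(is_candidate_key(enroll, ["stu_id", "class_id"]))           # True
```

The designer would then pick one candidate key as the primary key; any others become alternate keys.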
3.2.5 foreign key
Foreign key (FK)
▪ An attribute (or combination of attributes) whose values match
primary key values in the related table
▪ Foreign keys underpin referential integrity, which exists
when each foreign key value points to a valid primary key value.
3.2.6 secondary key
Secondary key
▪ A set of attributes that determine other attributes based
upon the values currently held. Values are not required to be
unique.
3.2.7 composite key
A key composed of more than one attribute. Also known as a
"concatenated key".
3.3 entity integrity
The formal type of integrity associated with primary keys.
It states that every primary key value must be unique in the relation
and may not be null. Nor may any part of a composite primary key be null.
3.4 referential integrity
A term related to foreign keys. Referential integrity is said to exist
when a foreign key has a matching primary key in another relation.
4 DB Integrity
4.1 domain integrity
Property that the value of an attribute conforms to the domain
defined for the attribute.
4.2 entity integrity
Primary keys must be unique and no null values are allowed in any
part of a primary key.
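A minimal Python sketch of an entity integrity check follows. The function name, the use of None to stand for null, and the sample rows are assumptions made for illustration; a real DBMS enforces this rule through its primary key constraint.

```python
def entity_integrity_violations(rows, pk_attrs):
    """List (reason, key) pairs for rows that break entity integrity."""
    violations = []
    seen = set()
    for row in rows:
        key = tuple(row.get(a) for a in pk_attrs)
        if any(part is None for part in key):
            # No part of a (composite) primary key may be null.
            violations.append(("null in primary key", key))
        elif key in seen:
            violations.append(("duplicate primary key", key))
        else:
            seen.add(key)
    return violations


# Hypothetical sample data with a composite primary key (stu_id, class_id).
enroll = [
    {"stu_id": 1, "class_id": "A"},
    {"stu_id": 1, "class_id": None},
    {"stu_id": 1, "class_id": "A"},
]

for reason, key in entity_integrity_violations(enroll, ["stu_id", "class_id"]):
    print(reason, key)
```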
4.3 referential integrity
Referential integrity is said to exist when a foreign key has a
matching primary key in another relation.
Said another way, referential integrity means that if the foreign key
contains a value, that value refers to an existing valid tuple in
another relation.
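The rule can be sketched in Python as a search for "orphaned" rows. The function name orphaned_rows, the use of None for null, and the AGENT and CUSTOMER sample rows are hypothetical. A null foreign key is allowed here because the rule applies only when the foreign key contains a value.

```python
def orphaned_rows(child_rows, fk_attr, parent_rows, pk_attr):
    """Return child rows whose non-null foreign key has no matching primary key."""
    valid_keys = {row[pk_attr] for row in parent_rows}
    return [
        row
        for row in child_rows
        if row[fk_attr] is not None and row[fk_attr] not in valid_keys
    ]


# Hypothetical sample data.
agents = [{"agent_code": 501}, {"agent_code": 502}]
customers = [
    {"cus_code": 10, "agent_code": 501},   # valid reference
    {"cus_code": 11, "agent_code": None},  # null foreign key: allowed
    {"cus_code": 12, "agent_code": 999},   # no such agent: violation
]

print(orphaned_rows(customers, "agent_code", agents, "agent_code"))
```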
4.4 business rules
Custom integrity rules that are specific to the business. They can be
anything. Modern DBMSs are capable of accepting and enforcing
these rules.
5 Relational DB Query Languages
5.1 relational algebra
A procedural query language that processes data a "set-at-a-time".
It defines a theoretical way of manipulating table contents using
relational operators:
• SELECT
• PROJECT
• JOIN
• INTERSECT
• UNION
• DIFFERENCE
• PRODUCT
• DIVIDE
Strictly speaking, only select, project, union, difference, and product
are needed, because the rest can be derived from this base set of
operators.
The most commonly used operators are select, project, and join.
Applying relational algebra operators to existing tables (relations)
produces new relations.
5.1.1 select
Select (restrict)
• Yields values for all rows found in a table
• Can be used to list either all row values or it can yield only
those row values that match a specified criterion
• Yields a horizontal subset of a table
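A minimal Python sketch of SELECT, assuming a table is represented as a list of dictionaries (one per row). The function name select and the customer rows are hypothetical.

```python
def select(rows, predicate=lambda row: True):
    """Return the rows that satisfy the predicate (all rows by default)."""
    return [row for row in rows if predicate(row)]


# Hypothetical sample data.
customers = [
    {"cus_code": 10, "cus_name": "Ramas", "agent_code": 501},
    {"cus_code": 11, "cus_name": "Dunne", "agent_code": 502},
    {"cus_code": 12, "cus_name": "Smith", "agent_code": 501},
]

print(len(select(customers)))  # 3: no criterion, so every row is listed
print(select(customers, lambda row: row["agent_code"] == 501))
```

Every returned row keeps all of its columns, which is why the result is called a horizontal subset.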
5.1.2 project
Project
• Yields all values for selected attributes
• Yields a vertical subset of a table
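A minimal Python sketch of PROJECT, assuming a table is a list of dictionaries. The function name project and the sample rows are hypothetical. Duplicate result rows are dropped because a relation cannot contain duplicate tuples.

```python
def project(rows, attrs):
    """Keep only the named attributes, dropping duplicate result rows."""
    result = []
    for row in rows:
        projected = {a: row[a] for a in attrs}
        if projected not in result:
            result.append(projected)
    return result


# Hypothetical sample data.
customers = [
    {"cus_code": 10, "cus_name": "Ramas", "agent_code": 501},
    {"cus_code": 11, "cus_name": "Dunne", "agent_code": 502},
    {"cus_code": 12, "cus_name": "Smith", "agent_code": 501},
]

print(project(customers, ["agent_code"]))
# [{'agent_code': 501}, {'agent_code': 502}]
```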
5.1.3 join
Join
• Allows us to combine information from two or more tables
• Real power behind the relational database, allowing the
use of independent tables linked by common attributes
5.1.3.1 equijoin
Equijoin
• Links tables on the basis of an equality condition that
compares specified columns of each table
• Outcome does not eliminate duplicate columns
• Condition or criterion to join tables must be explicitly
defined
• Takes its name from the equality comparison operator (=)
used in the condition
5.1.3.2 theta-join
Theta join
• A join whose condition uses any comparison operator other
than equality (for example <, >, <=, >=, or <>)
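The two join types differ only in the comparison operator, which a small Python sketch can make explicit. The function name theta_join, the "l." and "r." column prefixes, and the sample rows are assumptions made for this illustration.

```python
import operator


def theta_join(left, right, left_attr, right_attr, compare=operator.eq):
    """Pair rows for which compare(left value, right value) is true.

    With the default operator.eq this behaves as an equijoin.
    """
    result = []
    for l_row in left:
        for r_row in right:
            if compare(l_row[left_attr], r_row[right_attr]):
                # Prefix the column names so both join columns are kept.
                combined = {f"l.{k}": v for k, v in l_row.items()}
                combined.update({f"r.{k}": v for k, v in r_row.items()})
                result.append(combined)
    return result


# Hypothetical sample data.
customers = [
    {"cus_code": 10, "agent_code": 501},
    {"cus_code": 11, "agent_code": 502},
    {"cus_code": 12, "agent_code": 503},
]
agents = [
    {"agent_code": 501, "agent_phone": "555-0101"},
    {"agent_code": 502, "agent_phone": "555-0102"},
]

equi = theta_join(customers, agents, "agent_code", "agent_code")
print(len(equi))        # 2 matched pairs
print(sorted(equi[0]))  # both l.agent_code and r.agent_code appear

less_than = theta_join(customers, agents, "agent_code", "agent_code", operator.lt)
print(len(less_than))   # 1: only 501 < 502
```

Note that the equijoin result keeps both copies of the join column, matching the point that duplicate columns are not eliminated.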
5.1.3.3 natural join
Links tables by selecting only rows with common values in their
common attribute(s)
Result of a three-stage process:
1. PRODUCT of the tables is created
2. SELECT is performed on Step 1 output to yield only the
rows for which the AGENT_CODE values are equal
  ▪ Common column(s) are called join column(s)
3. PROJECT is performed on Step 2 results to yield a
single copy of each attribute, thereby eliminating
duplicate columns
Final outcome yields a table that
  ▪ Does not include unmatched pairs
  ▪ Provides only copies of matches
• If no match is made between the table rows, the new table does
not include the unmatched row
• The column on which we made the JOIN (that is,
AGENT_CODE) occurs only once in the new table
• If the same AGENT_CODE were to occur several times in
the AGENT table, a customer would be listed for each match
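The three stages can be sketched in Python as follows. The function name natural_join and the CUSTOMER and AGENT sample rows are hypothetical; tables are assumed to be lists of dictionaries, and the common attributes are found by comparing column names.

```python
from itertools import product as cartesian_product


def natural_join(left, right):
    """Natural join built from PRODUCT, SELECT, and PROJECT."""
    if not left or not right:
        return []
    common = [a for a in left[0] if a in right[0]]  # join column(s)

    # Stage 1: PRODUCT pairs every left row with every right row.
    pairs = list(cartesian_product(left, right))

    # Stage 2: SELECT keeps the pairs whose join columns are equal.
    matched = [
        (l_row, r_row)
        for l_row, r_row in pairs
        if all(l_row[a] == r_row[a] for a in common)
    ]

    # Stage 3: PROJECT keeps a single copy of each join column.
    return [
        {**l_row, **{k: v for k, v in r_row.items() if k not in common}}
        for l_row, r_row in matched
    ]


# Hypothetical sample data.
customer = [
    {"cus_code": 10, "agent_code": 501},
    {"cus_code": 11, "agent_code": 502},
    {"cus_code": 12, "agent_code": 421},  # no matching agent
]
agent = [
    {"agent_code": 501, "agent_phone": "555-0101"},
    {"agent_code": 502, "agent_phone": "555-0102"},
    {"agent_code": 333, "agent_phone": "555-0103"},  # no matching customer
]

for row in natural_join(customer, agent):
    print(row)
```

Customer 12 and agent 333 are absent from the output because neither has a match, and AGENT_CODE appears once per result row.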
5.1.3.4 outer join
Matched pairs are retained, and any unmatched values in the other
table are left null
In an outer join for tables CUSTOMER and AGENT, two scenarios
are possible:
• Left outer join
  ▪ Yields all rows in CUSTOMER table, including
  those that do not have a matching value in the
  AGENT table
• Right outer join
  ▪ Yields all rows in AGENT table, including those
  that do not have matching values in the
  CUSTOMER table
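A minimal Python sketch of the two scenarios, assuming tables are lists of dictionaries and None stands for null. The function name left_outer_join and the sample rows are hypothetical. The right outer join is obtained here simply by swapping the two tables.

```python
def left_outer_join(left, right, attr):
    """Keep every left row; fill right-hand columns with None when unmatched."""
    right_attrs = [a for a in right[0] if a != attr] if right else []
    result = []
    for l_row in left:
        matches = [r_row for r_row in right if r_row[attr] == l_row[attr]]
        if matches:
            for r_row in matches:
                result.append({**l_row, **{a: r_row[a] for a in right_attrs}})
        else:
            result.append({**l_row, **{a: None for a in right_attrs}})
    return result


# Hypothetical sample data.
customer = [
    {"cus_code": 10, "agent_code": 501},
    {"cus_code": 12, "agent_code": 421},  # no matching agent
]
agent = [
    {"agent_code": 501, "agent_phone": "555-0101"},
    {"agent_code": 333, "agent_phone": "555-0103"},  # no matching customer
]

left_result = left_outer_join(customer, agent, "agent_code")
right_result = left_outer_join(agent, customer, "agent_code")  # tables swapped
print(left_result)
print(right_result)
```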
5.1.4 intersect
Intersect
• Yields only the rows that appear in both tables
5.1.5 union
Union
• Combines all rows from two tables, excluding duplicate
rows
• Tables must have the same attribute characteristics
5.1.6 difference
Difference
• Yields all rows in one table not found in the other table;
that is, it subtracts one table from the other
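Intersect, union, and difference map directly onto Python's built-in set operators when each table is held as a set of tuples with the same column layout. The table names and rows below are hypothetical.

```python
# Two hypothetical tables with identical columns: (first_name, last_name).
table_a = {("George", "Jones"), ("Jane", "Smith"), ("Peter", "Robinson")}
table_b = {("Jane", "Smith"), ("Franklin", "Lopez")}

intersect = table_a & table_b   # rows that appear in both tables
union = table_a | table_b       # all rows, duplicates excluded
difference = table_a - table_b  # rows in table_a not found in table_b

print(sorted(intersect))   # [('Jane', 'Smith')]
print(len(union))          # 4, not 5: ('Jane', 'Smith') is listed once
print(sorted(difference))  # [('George', 'Jones'), ('Peter', 'Robinson')]
```

Difference is not symmetric: table_b - table_a would instead yield only ('Franklin', 'Lopez').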
5.1.7 Cartesian product
Product (Cartesian product)
• Yields all possible pairs of rows from two tables
5.1.8 division
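DIVIDE requires the use of one single-column table and one two-column
table. The result lists the values of the two-column table's other
column that are paired with every value in the single-column table.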
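A minimal Python sketch of DIVIDE under the assumption that tables are lists of dictionaries. The function name divide, its parameter names, and the sample rows are hypothetical.

```python
def divide(dividend, divisor, shared_attr, result_attr):
    """Return result_attr values that are paired with every divisor value."""
    required = {row[shared_attr] for row in divisor}
    paired = {}
    for row in dividend:
        paired.setdefault(row[result_attr], set()).add(row[shared_attr])
    # Keep a value only if the set it is paired with covers every required value.
    return sorted(value for value, shared in paired.items() if required <= shared)


# Hypothetical two-column table (code, loc) and single-column table (code).
dividend = [
    {"code": "A", "loc": 5},
    {"code": "A", "loc": 9},
    {"code": "B", "loc": 5},
    {"code": "B", "loc": 3},
    {"code": "C", "loc": 6},
]
divisor = [{"code": "A"}, {"code": "B"}]

print(divide(dividend, divisor, "code", "loc"))  # [5]
```

Only loc 5 is paired with both code A and code B, so it is the only value in the result.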
5.2 relational calculus
A non-procedural query language that processes data a "set-at-a-time".
Superior to relational algebra because it is non-procedural;
however, it does not have any more "expressive capability" than
relational algebra. That is, it cannot do anything that relational
algebra cannot; it is just easier to use.
4 Forms
• Tuple calculus
• Domain calculus
• Transform-oriented languages (SQL)
• QBE (Query By Example)
6 Meta Data Components
The structural component of the database is called 'metadata.' This
component is normally stored in the data dictionary and the system
catalog.
6.1 data dictionary
Data dictionary
• Used to provide detailed accounting of all tables found
within the user/designer-created database
• Contains (at least) all the attribute names and
characteristics for each table in the system
• Contains metadata (data about data)
• Sometimes described as “the database designer’s
database” because it records the design decisions about
tables and their structures
6.2 system catalog
System catalog
• Contains metadata
• Detailed system data dictionary that describes all objects
within the database
• Terms “system catalog” and “data dictionary” are often used
interchangeably
• Can be queried just like any user/designer-created table
6.2.1 synonym
Different names to describe the same attribute. For example,
teacher and instructor are likely synonyms in a college setting.
6.2.2 homonym
Same attribute name with different meanings. For example,
'phone' in one relation might mean home phone while 'phone' in
another could mean work phone (or cell, or fax, ...). Can cause a
good deal of confusion.
7 Relationship Types
1:M relationship
• Relational modeling ideal
• Should be the norm in any relational database design
M:N relationships
• Must be avoided because they lead to data redundancies
1:1 relationship
• Should be rare in any relational database design
7.1 1:1
• Found in some database environments
• One entity can be related to only one other entity, and vice
versa
• Often means that entity components were not defined
properly
• Could indicate that two entities actually belong in the same
table
• Sometimes 1:1 relationships are appropriate (e.g., a true 1:1
such as dept to supervisor)
• Reasons you may want a 1:1 relationship: 1) it is a true 1:1 (see
above), 2) a single relation with all attributes would be too big to
store, 3) performance is better if you split it up, 4) it implements
a subtype-supertype structure.
7.2 1:N
The most common relationship type in practice. Most database models
are designed to represent it by default.
7.3 M:N
Most data models cannot handle this directly due to the complex
nature of the relationship.
Can be implemented by breaking it up to produce a set of 1:M
relationships
Can avoid problems inherent to M:N relationship by creating a
composite entity or bridge entity (also known as an intersection
table)
7.3.1 intersection table
Intersection Table
• Implementation of a composite entity
• Yields required M:N to 1:M conversion
• Composite entity table must contain at least the primary keys
of original tables
• Linking table contains multiple occurrences of the foreign
key values
• Additional attributes may be assigned as needed
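The conversion can be illustrated with a short Python sketch of a student/class M:N relationship. All table names, attribute names, and data values below are hypothetical.

```python
# Two original tables, each keyed by its primary key (hypothetical data).
students = {321452: "Bowser", 324257: "Smithson"}
classes = {10014: "ACCT-211", 10018: "CIS-220"}

# Intersection table: holds both primary keys as foreign keys, plus an
# additional attribute (grade). Each foreign key value may occur many times.
enroll = [
    {"stu_num": 321452, "class_code": 10014, "grade": "C"},
    {"stu_num": 321452, "class_code": 10018, "grade": "A"},
    {"stu_num": 324257, "class_code": 10014, "grade": "B"},
]


def classes_for(stu_num):
    """Follow the STUDENT-to-ENROLL 1:M link, then look up each class."""
    return sorted(classes[e["class_code"]] for e in enroll if e["stu_num"] == stu_num)


def students_in(class_code):
    """Follow the CLASS-to-ENROLL 1:M link, then look up each student."""
    return sorted(students[e["stu_num"]] for e in enroll if e["class_code"] == class_code)


print(classes_for(321452))  # ['ACCT-211', 'CIS-220']
print(students_in(10014))   # ['Bowser', 'Smithson']
```

The pair (stu_num, class_code) acts as the composite primary key of the intersection table, so each student/class combination appears only once.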
7.3.2 dummy table
Same as an intersection table, but containing only key
information, with no additional attributes.
8 Data Redundancy in Relational Model
Data redundancy leads to data anomalies
• Such anomalies can destroy database effectiveness
Foreign keys - controlled redundancy
• Control data redundancies by using common attributes shared
by tables
• Crucial to exercising data redundancy control in the relational
model
Sometimes, data redundancy is necessary (foreign keys and
efficiency concerns are the two main reasons)
9 Indexes
Data structure used to speed up access to rows in a table
Index key
• Index’s reference point (e.g., the primary key)
• Points to data location identified by the key
Unique index
• Index in which the index key can only have one pointer value
(row) associated with it
Each index is associated with only one table, and a single table can
have multiple indexes.
Be aware that while indexes speed up access, they slow down
insert, update, and delete operations and take up space on the disk
drive.
9.1 pointer
The address (or indirect link) to the tuple pointed to by the index.
Can be absolute (physical address), relative (an offset from a known
starting point), or logical (computed using an algorithm).
9.2 unique index
An index guaranteed to point to a unique tuple. No duplicates
allowed. Required for primary keys.
9.3 non-unique index
An index that allows duplicate values. Associated most often with
secondary keys.
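The difference between the two index types can be sketched in Python with dictionaries. The function names and sample rows are hypothetical, and the "pointer" is assumed to be a relative one: the row's position in the list.

```python
from collections import defaultdict


def build_unique_index(rows, attr):
    """Map each key value to exactly one row position; reject duplicates."""
    index = {}
    for position, row in enumerate(rows):
        key = row[attr]
        if key in index:
            raise ValueError(f"duplicate key {key!r} in unique index")
        index[key] = position
    return index


def build_non_unique_index(rows, attr):
    """Map each key value to the list of row positions that hold it."""
    index = defaultdict(list)
    for position, row in enumerate(rows):
        index[row[attr]].append(position)
    return dict(index)


# Hypothetical sample data.
customers = [
    {"cus_code": 10, "agent_code": 501},
    {"cus_code": 11, "agent_code": 502},
    {"cus_code": 12, "agent_code": 501},
]

by_cus_code = build_unique_index(customers, "cus_code")        # primary key
by_agent_code = build_non_unique_index(customers, "agent_code")  # secondary key

print(customers[by_cus_code[11]])  # direct lookup, no scan of the table
print(by_agent_code[501])          # [0, 2]: two rows share this key value
```

The sketch also hints at the cost noted earlier: every insert, update, or delete on customers would require both dictionaries to be updated as well.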