Uploaded by Mr; A. Rahman Elshazly

Chapter 4 – Data Modeling
A model is an abstraction that hides unneeded details. Data modeling is used to represent the
entities of interest and their relationships in a database. A data model is a collection of
concepts that describe the structure of a database and provide the necessary means to achieve
this abstraction.
Conceptual data model
Conceptual, logical and physical models are three different ways of modeling data in a domain.
While they all contain entities and relationships, they differ in the purposes they are created for
and the audiences they target. As a general rule, a business analyst uses the conceptual and
logical models to describe, from a business angle, the data required and produced by a system,
while a database designer refines that early design into a physical model that presents the
physical database structure, ready for database construction.
The user-level data model is the high-level, or conceptual, model. It provides concepts that are
close to the way many users perceive data. The purpose of a conceptual model is simply to
establish the entities, their attributes and their high-level relationships.
Figure 1 Example of Conceptual Data Model
Features of a conceptual data model include:
• Includes the important entities and the relationships among them.
• No attribute is specified.
• No primary key is specified.
Logical Data Model
A logical data model describes the data in as much detail as possible, without regard to how
it will be physically implemented in the database.
Features of a logical data model include:
• Includes all entities and relationships among them.
• All attributes for each entity are specified.
• The primary key for each entity is specified.
• Foreign keys (keys identifying the relationship between different entities) are specified.
• Normalization occurs at this level.
The steps for designing the logical data model are as follows:
1. Specify primary keys for all entities.
2. Find the relationships between different entities.
3. Find all attributes for each entity.
4. Resolve many-to-many relationships.
5. Normalization.
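Step 4 can be illustrated with a minimal Python sketch. The Entity class, the Student and Course entities, their attribute names and the Enrollment associative entity are all hypothetical, chosen only for illustration. The sketch assumes the two parent entities use different primary key attribute names.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    name: str
    primary_key: tuple[str, ...]
    attributes: list[str] = field(default_factory=list)
    # maps a foreign key attribute to the entity it references
    foreign_keys: dict[str, str] = field(default_factory=dict)


def resolve_many_to_many(left: Entity, right: Entity, name: str) -> Entity:
    """Replace a many-to-many relationship with an associative entity.

    The new entity's primary key combines both parents' primary keys,
    and each part is also a foreign key back to its parent.
    """
    key = left.primary_key + right.primary_key
    fks = {attr: left.name for attr in left.primary_key}
    fks.update({attr: right.name for attr in right.primary_key})
    return Entity(name=name, primary_key=key, attributes=list(key), foreign_keys=fks)


# Hypothetical entities: a student takes many courses, a course has many students.
student = Entity("Student", ("student_id",), ["student_id", "name"])
course = Entity("Course", ("course_id",), ["course_id", "title"])

enrollment = resolve_many_to_many(student, course, "Enrollment")
print(enrollment.primary_key)   # ('student_id', 'course_id')
print(enrollment.foreign_keys)  # {'student_id': 'Student', 'course_id': 'Course'}
```

After this step the model contains two one-to-many relationships (Student to Enrollment, Course to Enrollment) instead of one many-to-many relationship.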
Comparing the logical data model with the conceptual data model diagram, we see the main
differences between the two:
• In a logical data model, primary keys are present, whereas in a conceptual data model,
no primary key is present.
• In a logical data model, all attributes are specified within an entity. No attributes are
specified in a conceptual data model.
• Relationships between entities are specified using primary keys and foreign keys in a
logical data model. In a conceptual data model, the relationships are simply stated, not
specified, so we simply know that two entities are related, but we do not specify what
attributes are used for this relationship.
Physical Data Model
A physical data model represents how the model will be built in the database. It shows all
table structures, including column names, column data types, column constraints, primary keys,
foreign keys, and relationships between tables.
Features of a physical data model include:
• Specification of all tables and columns.
• Foreign keys are used to identify relationships between tables.
• De-normalization may occur based on user requirements.
• Physical considerations may cause the physical data model to be quite different from
the logical data model.
• The physical data model will be different for different RDBMSs. For example, the data type
for a column may be different between MySQL and SQL Server.
The steps for physical data model design are as follows:
• Convert entities into tables.
• Convert relationships into foreign keys.
• Convert attributes into columns.
• Modify the physical data model based on physical constraints / requirements.
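The steps above can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the table and column names are hypothetical, and the data types are SQLite's, so the same model would be written differently for MySQL or SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
-- Entities become tables, attributes become typed columns.
CREATE TABLE department (
    dept_id   INTEGER PRIMARY KEY,
    dept_name TEXT NOT NULL
);
-- The "works in" relationship becomes a foreign key column.
CREATE TABLE employee (
    emp_id   INTEGER PRIMARY KEY,
    emp_name TEXT NOT NULL,
    dept_id  INTEGER NOT NULL REFERENCES department(dept_id)
);
""")

conn.execute("INSERT INTO department VALUES (1, 'Pharmacy')")
conn.execute("INSERT INTO employee VALUES (10, 'Miller', 1)")

# An employee pointing at a department that does not exist is rejected.
try:
    conn.execute("INSERT INTO employee VALUES (11, 'Smith', 99)")
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True

print(fk_enforced)  # True
```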
Comparison among Conceptual, Logical and Physical Data Models
Feature              | Conceptual | Logical | Physical
Entity Names         | ✓          | ✓       |
Entity Relationships | ✓          | ✓       |
Attributes           |            | ✓       |
Primary Keys         |            | ✓       | ✓
Foreign Keys         |            | ✓       | ✓
Table Names          |            |         | ✓
Column Names         |            |         | ✓
Column Data Types    |            |         | ✓
Figures: Conceptual Model Design, Logical Model Design, Physical Model Design
We can see that complexity increases from conceptual to logical to physical. This is why we
always start with the conceptual data model (so we understand, at a high level, what the
different entities in our data are and how they relate to one another), then move on to the
logical data model (so we understand the details of our data without worrying about how they
will actually be implemented), and finally the physical data model (so we know exactly how to
implement our data model in the database of choice). In a data warehousing project, the
conceptual data model and the logical data model are sometimes considered a single
deliverable.
Data modeling using Entity-Relationship Diagram (ERD)
A database schema in the ER model can be represented pictorially using an Entity-Relationship
diagram. An Entity-Relationship diagram (ER diagram) is a graph with nodes representing entity
sets, attributes and relationship sets.
Entity: real-world object or thing with an independent existence and which is distinguishable
from other objects. Examples are a person, car, customer, product, gene, book etc.
Attributes: an entity is represented by a set of attributes (its descriptive properties), e.g.,
name, age, salary, price etc. Attribute values that describe each entity become a major part of
the data eventually stored in a database. With each attribute a domain is associated, i.e., a set
of permitted values for an attribute. Possible domains are integer, string, date, etc.
Entity Type: Collection of entities that all have the same attributes, e.g., persons, cars,
customers etc.
Entity Set: Collection of entities of a particular entity type at any point in time; entity set is
typically referred to using the same name as entity type.
Entities of an entity type need to be distinguishable. A super key of an entity type is a set of
one or more attributes whose values uniquely determine each entity in an entity set. A
candidate key of an entity type is a minimal super key, that is, a super key from which no
attribute can be removed without losing uniqueness. For an entity type, several candidate keys
may exist. During conceptual design, one of the candidate keys is selected to be the primary
key of the entity type.
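The difference between a super key and a candidate key can be checked mechanically on sample data, as in the following minimal Python sketch. The function names and the employee rows are hypothetical. Note that uniqueness in a sample only shows which attribute sets could serve as keys; which ones really are keys is a design decision about all possible entities.

```python
from itertools import combinations


def is_super_key(rows, attrs):
    """True if the combined values of `attrs` are unique across all rows."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True


def candidate_keys(rows, attributes):
    """Return every minimal super key of this entity set, smallest first."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            # A set that contains an already-found key is not minimal.
            if any(set(k) <= set(attrs) for k in keys):
                continue
            if is_super_key(rows, attrs):
                keys.append(attrs)
    return keys


# Hypothetical entity set of employees.
employees = [
    {"ssn": "111", "email": "a@x.org", "name": "Miller", "dept": "Pharmacy"},
    {"ssn": "222", "email": "b@x.org", "name": "Smith", "dept": "Pharmacy"},
    {"ssn": "333", "email": "c@x.org", "name": "Smith", "dept": "Pharmacy"},
]

print(is_super_key(employees, ("ssn", "name")))  # True, but not minimal
print(candidate_keys(employees, ["ssn", "email", "name", "dept"]))
# [('ssn',), ('email',)]
```

Here {ssn, name} is a super key but not a candidate key, because name can be removed and ssn alone is still unique.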
Relationship (instance): an association among two or more entities.
E.g.
• Customer 'Smith' orders product 'PC42'.
• Miller works in the Pharmacy department.
Degree of a relationship: refers to the number of entity types that participate in the
relationship type (binary, ternary . . .).
Roles: The same entity type can participate more than once in a relationship type.
Role labels clarify semantics of a relationship, i.e., the way in which an entity participates in a
relationship.
Multivalued Attribute: An attribute that can hold multiple values is known as a multivalued
attribute. It is represented by a double ellipse in an E-R diagram. E.g. a person can have more
than one phone number, so the phone number attribute is multivalued.
Derived Attribute: A derived attribute is one whose value is dynamic and is derived from
another attribute. It is represented by a dashed ellipse in an E-R diagram. E.g. a person's age
is a derived attribute, as it changes over time and can be derived from another attribute (date
of birth).
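Both kinds of attribute can be illustrated with a minimal Python sketch. The Person class and its field names are hypothetical: the phone numbers are held as a list (multivalued), while age is computed from the date of birth on request rather than stored (derived).

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Person:
    name: str
    date_of_birth: date  # stored attribute
    phone_numbers: list[str] = field(default_factory=list)  # multivalued attribute

    def age(self, today: date) -> int:
        """Derived attribute: computed from date_of_birth, never stored."""
        years = today.year - self.date_of_birth.year
        # Subtract one if this year's birthday has not happened yet.
        if (today.month, today.day) < (self.date_of_birth.month, self.date_of_birth.day):
            years -= 1
        return years


p = Person("Miller", date(1990, 6, 15), ["555-0100", "555-0199"])
print(len(p.phone_numbers))      # 2
print(p.age(date(2024, 6, 14)))  # 33
print(p.age(date(2024, 6, 15)))  # 34
```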
ERD Diagrammatic Representations
• Rectangles represent entity types
• Ellipses represent attributes
• Diamonds represent relationship types
• Lines link attributes to entity types and entity types to relationship types
• Primary key attributes are underlined
• An empty circle at the end of a line linking an attribute to an entity type represents an
optional (null) attribute
• Double ellipses represent multi-valued attributes
Constraints on Relationship Types
Constraints limit the number of possible combinations of entities that may participate in a
relationship set. There are two types of constraints:
• cardinality ratio
• participation constraints
For binary relationships, the cardinality ratio must be one of the following types:
• Many-To-Many (default)
Meaning: An employee can work in many departments, and a department can have several
employees.
• Many-To-One
Meaning: An employee can work in at most one department, and a department can have
several employees.
• One-To-Many
Meaning: An employee can work in many departments, but a department can have at most one
employee.
• One-To-One
Meaning: An employee can work in at most one department, and a department can have at
most one employee.
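The four cases can be told apart by counting, as in the following minimal Python sketch. The function name and the sample pairs are hypothetical. A cardinality ratio is a constraint on the relationship type, so the function only reports the tightest ratio that a given set of (employee, department) pairs happens to satisfy.

```python
from collections import Counter


def cardinality_ratio(pairs):
    """Classify a set of (employee, department) pairs, following the meanings above."""
    pairs = set(pairs)  # ignore duplicate pairs
    per_employee = Counter(e for e, _ in pairs)
    per_department = Counter(d for _, d in pairs)
    employee_has_many = any(n > 1 for n in per_employee.values())
    department_has_many = any(n > 1 for n in per_department.values())

    if employee_has_many and department_has_many:
        return "Many-To-Many"
    if department_has_many:
        return "Many-To-One"   # each employee is in at most one department
    if employee_has_many:
        return "One-To-Many"   # each department has at most one employee
    return "One-To-One"


print(cardinality_ratio([("Miller", "Pharmacy"), ("Smith", "Pharmacy")]))
# Many-To-One
print(cardinality_ratio([("Miller", "Pharmacy"), ("Miller", "Sales"), ("Smith", "Sales")]))
# Many-To-Many
```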
A Sample Database Application
Example 1: COMPANY
• Employees, departments, and projects
• The company is organized into departments
• A department controls a number of projects
• Employee: store each employee’s name, Social Security number, address, salary, sex
(gender), and birth date
• Keep track of the dependents of each employee
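One possible way to turn these requirements into tables is sketched below with Python's built-in sqlite3 module. It covers only what the list above states. The key columns (dept_number, project_number), the column names, the composite key of the dependent table and the SQLite data types are assumptions made for illustration, not part of the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
CREATE TABLE department (
    dept_number INTEGER PRIMARY KEY,      -- assumed key
    dept_name   TEXT NOT NULL
);
-- "Department controls a number of projects": one-to-many via a foreign key.
CREATE TABLE project (
    project_number INTEGER PRIMARY KEY,   -- assumed key
    project_name   TEXT NOT NULL,
    dept_number    INTEGER NOT NULL REFERENCES department(dept_number)
);
CREATE TABLE employee (
    ssn        TEXT PRIMARY KEY,          -- Social Security number
    name       TEXT NOT NULL,
    address    TEXT,
    salary     REAL,
    sex        TEXT,
    birth_date TEXT                       -- ISO date string; SQLite has no DATE type
);
-- Dependents are tracked per employee; the composite key is an assumption.
CREATE TABLE dependent (
    employee_ssn   TEXT NOT NULL REFERENCES employee(ssn),
    dependent_name TEXT NOT NULL,
    PRIMARY KEY (employee_ssn, dependent_name)
);
""")

tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)  # ['department', 'dependent', 'employee', 'project']
```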
Example 2: Example of Student-Course High-Level Data Model