Uploaded by vignesh kannadasan

Data Modeling and Data Models(chapter 2-2014-2015)

Modern Database Management Systems
Chapter 2
Data Modeling And Data Models
Unit-1
Data Modeling and Data Models
Data Modeling:
 Database design focuses on how the database structure will be used to
store and manage end-user data.
 Data modeling, the first step in designing a database, refers to the process
of creating a specific data model for a determined problem domain.
 A problem domain is a clearly defined area within the real-world
environment, with well-defined scope and boundaries that is to be
systematically addressed.
Data Model:
 A data model is a relatively simple representation, usually graphical, of more
complex real-world data structures.
 In general terms, a model is an abstraction of a more complex real-world
object or event.
 Within the database environment, a data model represents data structures
and their characteristics, relations, constraints, transformations, and
other constructs with the purpose of supporting a specific problem domain.
 The basic building blocks of all data models are entities, attributes,
relationships and constraints.
 Data modeling is an iterative, progressive process.
 It starts with a simple understanding of the problem domain; as that
understanding grows, so does the level of detail of the data model.
 The final data model is a “blueprint” containing all the instructions to build a
database that will meet all end-user requirements.
 This blueprint is narrative and graphical in nature, meaning that it contains
both text descriptions in plain, unambiguous language and clear, useful
diagrams depicting the main data elements.
Sree Vidyanikethan Degree College
K.Muni Sankar
An implementation-ready data model should contain at least the following
components:
 A description of the data structure that will store the end-user data.
 A set of enforceable rules to guarantee the integrity of the data.
 A data manipulation methodology to support the real-world data
transformations.
Data Models and Basic Building Blocks:
The basic building blocks of all data models are entities, attributes,
relationships, and constraints. An entity is anything (a person, a place, a thing,
or an event) about which data are to be collected and stored.
An Entity represents a particular type of object in the real world. Because
an entity represents a particular type of object, entities are “distinguishable”;
that is, each entity occurrence is unique and distinct. For example, a
CUSTOMER entity would have many distinguishable customer occurrences,
such as John, Smith, Tom, and Pop. Entities may be physical objects, such as
customers or products.
An Attribute is a characteristic of an entity. For example, a CUSTOMER
entity would be described by attributes such as customer last name, customer
first name, customer phone, customer address, and customer credit limit.
Attributes are the equivalent of fields in file systems.
A Relationship describes an association among entities. For example, a
relationship exists between customers and agents that can be described as
follows: An agent can serve many customers, and each customer may be served
by one agent.
A Constraint is a restriction placed on the data. Constraints are important
because they help to ensure data integrity. Constraints are normally expressed in
the form of rules.
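The four building blocks can be sketched in a few lines of Python. This is only an illustration: the class names, the field names, and the rule that a credit limit cannot be negative are assumptions made for the example and do not come from any particular DBMS.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Agent:
    """Entity: each Agent object is one distinguishable occurrence."""
    agent_id: int
    name: str
    # Relationship: one agent can serve many customers.
    customers: List["Customer"] = field(
        default_factory=list, repr=False, compare=False
    )


@dataclass
class Customer:
    """Entity described by attributes (the equivalent of fields in a file)."""
    customer_id: int
    last_name: str
    first_name: str
    phone: str
    credit_limit: float
    # Relationship: each customer may be served by one agent.
    agent: Optional[Agent] = None

    def __post_init__(self) -> None:
        # Constraint (hypothetical rule): a credit limit cannot be negative.
        if self.credit_limit < 0:
            raise ValueError("credit_limit must not be negative")


def assign(agent: Agent, customer: Customer) -> None:
    """Record the agent-customer relationship on both sides."""
    customer.agent = agent
    agent.customers.append(customer)


agent = Agent(1, "Alex")
for cid, last in [(10, "Smith"), (11, "Jones")]:
    assign(agent, Customer(cid, last, "Pat", "555-0100", 500.0))
```

Creating a Customer with a negative credit limit raises ValueError, which is how the constraint protects data integrity in this sketch.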
Data models use three types of relationships:
1. One-to-Many Relationship (1:M or 1..*)
2. Many-to-Many Relationship (M:N or *..*)
3. One-to-One Relationship (1:1 or 1..1)
Database designers usually use the shorthand notations 1:M or 1..*, M:N or
*..*, and 1:1 or 1..1, respectively. (Although the M:N notation is a standard label
for the many-to-many relationship, the label M:M may also be used.)
The following examples illustrate the distinctions among the three.
1. One-to-Many (1:M or 1..*) relationship :
 A painter paints many different paintings, but each one of them is painted
by only one painter. Thus, the painter (the “one”) is related to the paintings
(the “many”). Therefore, database designers label the relationship
“PAINTER paints PAINTING” as 1:M. (Note that entity names are often
capitalized as a convention, so they are easily identified.)
 Similarly, a customer (the “one”) may generate many invoices, but each
invoice (the “many”) is generated by only a single customer. The
“CUSTOMER generates INVOICE” relationship would also be labeled
1:M.
2. Many-to-Many (M:N or *..*) Relationship:
 An employee may learn many job skills, and each job skill may be learned
by many employees. Database designers label the relationship
“EMPLOYEE learns SKILL” as M:N.
 Similarly, a student can take many classes and each class can be taken
by many students, thus yielding the M:N relationship label for the
relationship expressed by “STUDENT takes CLASS.”
3. One-to-One (1:1 or 1..1) Relationship:
 A retail company’s management structure may require that each of its stores
be managed by a single employee. In turn, each store manager, who is an
employee, manages only a single store. Therefore, the relationship
“EMPLOYEE manages STORE” is labeled 1:1.
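One common way to implement the three relationship types in a relational database is sketched below using Python's built-in sqlite3 module. The table and column names are assumptions chosen to match the examples above; the bridge table ENROLL is a hypothetical name for the table that resolves the M:N relationship.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
-- 1:M  PAINTER paints PAINTING: the "many" side carries the foreign key.
CREATE TABLE painter  (painter_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE painting (
    painting_id INTEGER PRIMARY KEY,
    title       TEXT NOT NULL,
    painter_id  INTEGER NOT NULL REFERENCES painter(painter_id)
);

-- M:N  STUDENT takes CLASS: a bridge table holds one row per pairing.
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE class   (class_id   INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE enroll (
    student_id INTEGER NOT NULL REFERENCES student(student_id),
    class_id   INTEGER NOT NULL REFERENCES class(class_id),
    PRIMARY KEY (student_id, class_id)
);

-- 1:1  EMPLOYEE manages STORE: UNIQUE stops one employee managing two stores.
CREATE TABLE employee (employee_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE store (
    store_id   INTEGER PRIMARY KEY,
    manager_id INTEGER NOT NULL UNIQUE REFERENCES employee(employee_id)
);
""")

conn.execute("INSERT INTO employee VALUES (1, 'Dana')")
conn.execute("INSERT INTO store VALUES (100, 1)")
try:
    # Same manager, second store: violates the 1:1 rule.
    conn.execute("INSERT INTO store VALUES (101, 1)")
    second_store_allowed = True
except sqlite3.IntegrityError:
    second_store_allowed = False
```

In this sketch the 1:M and 1:1 cases differ only by the UNIQUE keyword on the foreign key column, while the M:N case needs a third table.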
The following are examples of constraints expressed as rules:
 An employee’s salary must have values that are between 6,000 and
350,000.
 A student’s GPA must be between 0.00 and 4.00.
 Each class must have one and only one teacher.
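The three example rules can be declared so that the DBMS itself enforces them. The sketch below uses Python's built-in sqlite3 module; the table and column names and the helper function accepted() are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (
    emp_id INTEGER PRIMARY KEY,
    salary NUMERIC NOT NULL CHECK (salary BETWEEN 6000 AND 350000)
);
CREATE TABLE student (
    stu_id INTEGER PRIMARY KEY,
    gpa    REAL NOT NULL CHECK (gpa BETWEEN 0.00 AND 4.00)
);
CREATE TABLE class (
    class_id   INTEGER PRIMARY KEY,
    teacher_id INTEGER NOT NULL  -- one and only one teacher: a single mandatory column
);
""")


def accepted(sql: str, params: tuple) -> bool:
    """Return True if the DBMS accepts the row, False if a constraint rejects it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


ok_salary = accepted("INSERT INTO employee VALUES (?, ?)", (1, 50000))
low_salary = accepted("INSERT INTO employee VALUES (?, ?)", (2, 5000))
high_gpa = accepted("INSERT INTO student VALUES (?, ?)", (1, 4.5))
no_teacher = accepted("INSERT INTO class VALUES (?, ?)", (1, None))
```

Only the first insert is accepted; the other three are rejected because each breaks one of the rules.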
Business Rules of the Organization:
 A business rule is a brief, precise, and unambiguous description of a policy,
procedure, or principle within a specific organization.
 A business rule can apply to any organization, large or small—a business, a
government unit, a religious group, or a research laboratory—that stores and
uses data to generate information.
 Properly written business rules are used to define entities, attributes,
relationships, and constraints.
 Business rules are derived from policies, procedures, events, functions, and
other business objects, and state constraints on the organization.
 Business rules are important in data modeling because they govern how data
are handled and stored.
Overview of Business Rules:
A business rule is a statement that defines or constrains some aspect of
the business. It is intended to assert business structure or to control or influence
the behavior of the business. For example:
“A student may register for a section of a course only if he or she has
successfully completed the prerequisites for that course.”
“A preferred customer qualifies for a 10% discount, unless he or she has an
overdue account balance.”
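The preferred-customer rule can be written as a small declarative check. This is a minimal sketch: the Customer fields and the function names are hypothetical, and only the 10% rate comes from the rule itself.

```python
from dataclasses import dataclass

DISCOUNT_RATE = 0.10  # from the rule text: a 10% discount


@dataclass
class Customer:
    preferred: bool
    overdue_balance: float  # hypothetical field: amount currently overdue


def qualifies_for_discount(customer: Customer) -> bool:
    """A preferred customer qualifies unless an overdue balance exists."""
    return customer.preferred and customer.overdue_balance <= 0


def discounted_total(customer: Customer, total: float) -> float:
    """Apply the discount only when the rule is satisfied."""
    if qualifies_for_discount(customer):
        return total * (1 - DISCOUNT_RATE)
    return total
```

The rule lives in one function that states what must hold, separate from the code that applies the discount.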
The Business Rules Paradigm:
The concept of business rules has been used in information systems for
some time. However, it has been more common to use the related term
“integrity constraint” when referring to such rules.
Scope of Business Rule:
The scope here is limited to business rules that impact an organization’s
databases. Most organizations have a host of rules and policies that fall outside
this scope. Some business rules cannot be represented in common data modeling notation.
Good Business Rules:
The following are the characteristics of a good business rule:
Declarative: A business rule is a statement of policy. The rule does not describe
a process or implementation, but describes what a process validates.
Precise: Within the related organization, the rule must have only one
interpretation among all interested people, and its meaning must be clear.
Atomic: A business rule makes one statement, not several; no part of the rule
can stand on its own as a rule.
Consistent: A business rule must be internally consistent and must be
consistent with other rules.
Expressible: A business rule must be able to be stated in natural language,
in a structured form so that there is no misinterpretation.
Distinct: Business rules are not redundant, but a business rule may refer to
other rules.
Business oriented: A business rule is stated in terms business people can
understand, and since it is a statement of business policy, only business
people can modify or invalidate it.
Gathering Business Rules:
Business rules appear in descriptions of business functions, events,
policies, units, stakeholders, and other objects. These descriptions can be found
in interview notes from individual and group information systems requirements
collection sessions, organizational documents, and other sources. Rules are
identified by asking questions about the who, what, where, why, and how of the
organization.
Data Names and Definitions:
Naming and defining data objects is fundamental to understanding and
modeling data. Data objects must be named and defined before they can be
used unambiguously in a model of organizational data.
Data Names:
A data name is a name given to a data object such as an entity, relationship,
or attribute. The following are general guidelines for naming any data
object.
1. They should relate to business, not technical, characteristics.
2. They should be meaningful.
3. They should be unique.
4. They should be readable.
5. They should be taken from the approved list of words.
6. They should be repeatable in the sense they should be consistent.
Data Definitions: A definition is an explanation of a term or a fact. A term is a
word or phrase that has a specific meaning for the business. A fact is an
association between two or more terms.
The Evolution of Data Model:
The evolution of the major data models
Types of Data Models:
The following are major data models in roughly chronological order:
1. The Hierarchical model
2. The Network model
3. The Relational model
4. The Entity Relationship model
5. The Object Oriented model
1. Hierarchical Model:
1. The hierarchical model was developed in the 1960s to manage large
amounts of data for complex manufacturing projects such as the Apollo
rocket that landed on the moon in 1969.
2. Its basic logical structure is represented by an upside-down tree.
3. The hierarchical structure contains levels, or segments.
4. A segment is the equivalent of a file system’s record type.
5. Within the hierarchy, a higher layer is perceived as the parent of the
segment directly beneath it, which is called the child.
6. The hierarchical model depicts a set of one-to-many (1:M) relationships
between a parent and its children segments.
7. Each parent can have many children, but each child has only one
parent.
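The parent and child structure described above can be sketched as a simple tree in Python. The Segment class and the manufacturing segment names are hypothetical and only illustrate the one-parent rule; they are not taken from any real hierarchical DBMS.

```python
from typing import List, Optional


class Segment:
    """A node in the upside-down tree; every child has exactly one parent."""

    def __init__(self, name: str, parent: Optional["Segment"] = None) -> None:
        self.name = name
        self.parent = parent
        self.children: List["Segment"] = []
        if parent is not None:
            parent.children.append(self)  # 1:M, a parent may have many children

    def path(self) -> List[str]:
        """Names from the root down to this segment (the single access path)."""
        node: Optional["Segment"] = self
        names: List[str] = []
        while node is not None:
            names.append(node.name)
            node = node.parent
        return list(reversed(names))


# Hypothetical manufacturing hierarchy for illustration.
car = Segment("FINAL ASSEMBLY")
engine = Segment("COMPONENT: engine", car)
body = Segment("COMPONENT: body", car)
piston = Segment("PART: piston", engine)
```

Because each segment stores a single parent, there is exactly one path from the root to any segment, which is why a child cannot be shared between two parents in this model.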
Advantages of Hierarchical Model:
 Conceptual simplicity
 Database security and integrity
 Data independence
 Efficiency
Disadvantages of Hierarchical Model:
 Complex implementation
 Difficult to manage and lack of standards
 Lacks structural independence
 Applications programming and use complexity
 Implementation limitations
Network Model:
1. The network model was created to represent complex data relationships
more effectively than the hierarchical model, to improve database
performance, and to impose a database standard.
2. In the network model, the user perceives the network database as a
collection of records in 1:M relationships.
3. However, unlike the hierarchical model, the network model allows a record
to have more than one parent.
4. In network database terminology, a relationship is called a set.
5. Each set is composed of at least two record types, i.e., an owner record and a
member record.
6. A set represents a 1:M relationship between the owner and the member.
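The idea of sets with owner and member records can be sketched in Python as follows. The set names, record names, and helper functions are hypothetical; the sketch only shows how one record can have more than one parent by being a member of several sets.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Each set type names an owner record type and a member record type (1:M).
# The record and set names below are hypothetical.
SET_TYPES: Dict[str, Tuple[str, str]] = {
    "SALES":   ("CUSTOMER", "INVOICE"),  # a customer owns many invoices
    "COMMISS": ("SALESREP", "INVOICE"),  # a sales rep owns many invoices
}

# set name -> owner occurrence -> member occurrences
occurrences: Dict[str, Dict[str, List[str]]] = {
    name: defaultdict(list) for name in SET_TYPES
}


def connect(set_name: str, owner: str, member: str) -> None:
    """Attach a member to an owner; within one set a member has one owner."""
    for existing_owner, members in occurrences[set_name].items():
        if member in members and existing_owner != owner:
            raise ValueError(f"{member} already has an owner in {set_name}")
    if member not in occurrences[set_name][owner]:
        occurrences[set_name][owner].append(member)


def owners_of(member: str) -> List[str]:
    """All parents of a record, one per set it participates in."""
    return [
        owner
        for sets in occurrences.values()
        for owner, members in sets.items()
        if member in members
    ]


connect("SALES", "customer-1", "invoice-7")
connect("COMMISS", "rep-3", "invoice-7")
```

Here invoice-7 has two parents, a customer and a sales rep, which a strict hierarchy would not allow; within any single set it still has only one owner.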
Advantages of Network Model:
 Conceptual simplicity
 Handles more relationship types
 Data access flexibility
 Promotes database integrity
 Data independence
 Conformance to standards
Disadvantages of Network Model:
 System Complexity: Data are accessed by navigating through sets one
record at a time, so the model is difficult to build into a user-friendly
database management system.
 Absence of Structural Independence: If any changes are made to the
database structure then all the application programs need to be modified
before they can access data. Even though the network database model
succeeds in achieving data independence, it still fails to achieve structural
independence.
Relational Model:
1. The relational data model was first introduced in 1970 by E. F. Codd.
2. The relational data model represents data in the form of tables (relations).
3. The relational model is based on mathematical theory and therefore has a
solid theoretical foundation.
4. Each row in a relation is called a tuple. Each column represents an attribute.
5. The relational model uses tables to organize data elements. Each table
corresponds to an application entity and each row represents an instance of
that entity.
6. The relational model is implemented through sophisticated relational
database management system (RDBMS) software such as Oracle, DB2,
Microsoft SQL Server, MySQL, and other mainframe relational software.
7. The most important advantage of the RDBMS is its ability to hide the
complexities of the relational model from the user.
8. The relational data model consists of three components: data structure,
data manipulation, and data integrity.
RDBMS Terminology:
Formal Relational Terms     Informal Equivalents
Relation                    Table
Tuple                       Row, record
Cardinality                 Number of rows
Attribute                   Column, field
Degree                      Number of columns
Primary key                 Unique identifier
Domain                      Set of legal values
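The terms in the table can be seen on a small relation. The sketch below uses Python's built-in sqlite3 module; the CUSTOMER table, its columns, and its rows are made up for illustration, and the CHECK clause only approximates the idea of a domain.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# A relation (table) with a primary key; CHECK approximates a domain.
conn.execute("""
    CREATE TABLE customer (
        cus_code     INTEGER PRIMARY KEY,
        cus_lname    TEXT NOT NULL,
        credit_limit REAL CHECK (credit_limit >= 0)
    )
""")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(1, "Smith", 500.0), (2, "Jones", 750.0), (3, "Patel", 0.0)],
)

cursor = conn.execute("SELECT * FROM customer")
attributes = [column[0] for column in cursor.description]  # attribute names
tuples = cursor.fetchall()                                 # one tuple per row

degree = len(attributes)   # number of columns
cardinality = len(tuples)  # number of rows
```

For this relation the degree is 3 and the cardinality is 3; adding a row changes the cardinality but not the degree.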
Advantages of Relational model:
 Structural Independence: Changes in the database structure do not
affect the data access.
 Conceptual Simplicity: The Relational data model frees the designer from
the physical data storage details, the designers can concentrate on the
logical view of the database.
 Design, implementation, maintenance and usage ease.
 Ad hoc query capability: The presence of a very powerful, flexible, and
easy-to-use query capability is one of the main reasons for the popularity of the
relational database model.
Disadvantages of relational model:
 Hardware overheads: To make things easier for users, relational
database systems need powerful computers and data storage devices.
 Ease of design can lead to bad design.
The Entity Relationship Model:
1. An E-R model is a detailed logical representation of the data for an
organization.
2. The basic constructs of the entity-relationship model are entities, attributes
and relationships.
3. The E-R model is a high-level conceptual data model expressed in terms
of entities in the business environment, the relationships among those
entities, and the attributes of both the entities and their relationships.
4. An E-R Model is normally expressed as an Entity-Relationship diagram,
which is a graphical representation of an E-R Model.
5. Peter Chen first introduced the ER data model in 1976; it was the graphical
representation of entities and their relationships in a database.
6. ER models are normally represented in an entity relationship diagram (ERD),
which uses graphical representations to model database components.
7. Two common ERD notations are the original Chen notation and the more
current Crow’s Foot notation.
Advantages of Entity Relationship model:
 Exceptional conceptual simplicity
 Visual representation
 Effective communication tool
 Integrated with the relational database model
Disadvantages of Entity Relationship model:
 Limited constraint representation
 Limited relationship representation
 No data manipulation language
 Loss of information content
Object-Oriented Data Model (OODM):
Relational database technology did not fully meet the needs of more complex
information systems, which led to the object-oriented data model.
A data model consists of
1. Static properties such as object and attributes and relationships.
2. Integrity rules over objects and operations.
3. Dynamic properties.
1. In the object-oriented data model (OODM), both data and their relationships
are contained in a single structure known as an object.
2. In turn, the OODM is the basis for the object-oriented database
management system (OODBMS).
3. Object oriented model represents an entity as a class.
4. Unlike an entity, an object includes information about relationships between the facts
within the object, as well as information about its relationships with other
objects.
5. Object-oriented data models are typically depicted using Unified Modeling
Language (UML) class diagrams.
6. The OO data model is based on the following components:
 An object is an abstraction of a real-world entity. In general terms, an
object may be considered equivalent to an ER model’s entity. An
object represents only one occurrence of an entity.
 Attributes describe the properties of an object.
 Objects that share similar characteristics are grouped in classes. A
class is a collection of similar objects with shared structure (attributes)
and behavior (methods).
 Classes are organized in a class hierarchy. The class hierarchy
resembles an upside-down tree in which each class has only one
parent.
 Inheritance is the ability of an object within the class hierarchy to inherit
the attributes and methods of the classes above it.
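These components map directly onto an object-oriented language. The following Python sketch is an illustration only; the Person, Employee, and Manager classes and their attributes are hypothetical.

```python
class Person:
    """Class: shared structure (attributes) and behavior (methods)."""

    def __init__(self, name: str) -> None:
        self.name = name  # attribute

    def describe(self) -> str:  # method
        return f"{self.name} ({type(self).__name__})"


class Employee(Person):
    """Subclass: inherits name and describe() from its one parent, Person."""

    def __init__(self, name: str, salary: float) -> None:
        super().__init__(name)
        self.salary = salary


class Manager(Employee):
    """One level further down the class hierarchy."""

    def __init__(self, name: str, salary: float, store: str) -> None:
        super().__init__(name, salary)
        self.store = store


# Each object is one occurrence of its class.
dana = Manager("Dana", 52000.0, "Store 12")
```

The object dana has the attributes name, salary, and store and can call describe(), even though Manager defines only store itself; the rest is inherited from the classes above it.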
Advantages of Object-Oriented Data Model:
 Capability to handle a large number of different data types.
 Combination of object oriented programming and database
technology.
 Object oriented features improve productivity.
 Data access.
Disadvantages of Object-Oriented Data Model
 Difficult to maintain.
 Not suited for all applications.
Extended relational Data Model:
The ERDM adds many of the OO model’s features within the inherently
simpler relational database structure. The ERDM gave birth to a new generation
of relational databases supporting OO features such as objects (encapsulated
data and methods), extensible data types based on classes, and inheritance.
That’s why a DBMS based on the ERDM is often described as an
object/relational database management system (O/R DBMS).
Evolution of Data Models:
Degree of Abstraction:
In the early 1970s, the American National Standards Institute (ANSI)
Standards Planning and Requirements Committee (SPARC) defined a
framework for data modeling based on degrees of data abstraction. The
ANSI/SPARC architecture (as it is often referred to) defines three levels of data
abstraction:
1. External
2. Conceptual
3. Internal
External model:
The external model is the end user’s view of the data environment. The
term end users refers to people who use the application programs to manipulate
the data and generate information. A specific representation of an external view is
known as an External Schema. An external schema includes the appropriate
entities, relationships, procedures, and constraints imposed by the business
unit.
The Conceptual Model:
The conceptual model represents a global view of the entire database. It is a
representation of data as viewed by the entire organization. That is, the
conceptual model integrates all external views into a single global view of the
data in the enterprise, known as a Conceptual schema. The conceptual
schema is the basis for the identification and high-level description of the main
data objects.
 The conceptual model is independent of both software and hardware.
 “Software independence” means that the model does not depend on the
DBMS software used to implement the model.
 “Hardware independence” means that the model does not depend on the
hardware used in the implementation of the model.
The Internal model:
It is the representation of the database as seen by the DBMS. It requires
the designer to match the conceptual model’s characteristics to those of the
selected implementation model. An internal schema depicts a specific
representation of an internal model, using the database constructs supported by
the chosen database. The internal model is software dependent and hardware
independent.
The Physical model:
It operates at the lowest level of abstraction describing the way data are
saved on storage media. It requires the definition of both physical storage
devices and the access methods required to reach the data within these storage
devices, making it both software and hardware dependent.
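In a relational DBMS, external views are often implemented as views defined over the tables that realize the conceptual model. The sketch below uses Python's built-in sqlite3 module; the EMPLOYEE table and the two view names are hypothetical, and physical storage is left entirely to SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Conceptual model as implemented: one table serving the whole organization.
CREATE TABLE employee (
    emp_id INTEGER PRIMARY KEY,
    name   TEXT NOT NULL,
    dept   TEXT NOT NULL,
    salary REAL NOT NULL
);
INSERT INTO employee VALUES (1, 'Dana', 'Sales', 52000), (2, 'Lee', 'IT', 61000);

-- External level: each business unit sees only its own subset of the data.
CREATE VIEW directory_view AS SELECT name, dept FROM employee;
CREATE VIEW payroll_view   AS SELECT emp_id, salary FROM employee;
""")

directory = conn.execute(
    "SELECT * FROM directory_view ORDER BY name"
).fetchall()
payroll_columns = [
    column[0]
    for column in conn.execute("SELECT * FROM payroll_view").description
]
```

The directory users never see salaries and the payroll users never see departments, yet both views draw on the same underlying table.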
To illustrate the meaning of data abstraction, consider the example of
automotive design.
1. A car designer begins by drawing the concept of the car that is to be
produced.
2. Next, engineers design the details that help transfer the basic concept into
a structure that can be produced.
3. Finally, the engineering drawings are translated into production
specifications to be used on the factory floor.
4. As you can see, the process of producing the car begins at a high level of
abstraction and proceeds to an ever-increasing level of detail.
5. The factory floor process cannot proceed unless the engineering details are
properly specified, and the engineering details cannot exist without the
basic conceptual framework created by the designer.
6. Designing a usable database follows the same basic process.
7. That is, a database designer starts with an abstract view of the overall data
environment and adds details as the design comes closer to implementation.
8. Using levels of abstraction can also be very helpful in integrating multiple (and
sometimes conflicting) views of data as seen at different levels of an
organization.