Information Modeling: 1.Entity Relation diagrams (ERD) 2. The Iso

Information Engineering
Dr B. Mills
INFORMATION MODELING:
1.ENTITY RELATION DIAGRAMS (ERD)
2. THE OSI LAYERING MODEL
3. NORMALIZING DATA
ERD – Entity Relation Diagramming
 In a well-designed relational database, each
table represents an entity. In the figure below
there are 4 entities (tables): Customers, Line
Items, Invoices, and Products.
First we must understand entities.
 Entities
A database contains one or more related tables.
Each table holds all of the information about an
object, person or thing.
Some examples of database tables might be:
 - a customer table
 - an appointments table
 - an exam sessions table
 - a teachers' names table
 - a concert venue table
Tables are entities
 Each table is about an object, person, or thing.
Customers
Appointments
Books
Students
Products
Entities have attributes
 Entity = Customers.
CustomerID
FirstName
LastName
Date of Birth
Address
Entities have attributes
 Entity = Products.
ProductID
ProductName
Weight
Manufacturer
Warehouse
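The two attribute lists above can be sketched as table definitions. This is a minimal illustration using Python's built-in sqlite3 module; the column types and the primary-key choices are assumptions, since the slides give attribute names only, and the `attributes` helper is hypothetical.

```python
import sqlite3

# In-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Each entity becomes a table; each attribute becomes a column.
# Column types are assumptions: the slides list attribute names only.
conn.executescript("""
CREATE TABLE Customers (
    CustomerID  INTEGER PRIMARY KEY,
    FirstName   TEXT,
    LastName    TEXT,
    DateOfBirth TEXT,
    Address     TEXT
);
CREATE TABLE Products (
    ProductID    INTEGER PRIMARY KEY,
    ProductName  TEXT,
    Weight       REAL,
    Manufacturer TEXT,
    Warehouse    TEXT
);
""")


def attributes(table):
    """Return the column (attribute) names of a table (hypothetical helper)."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


customer_attributes = attributes("Customers")
product_attributes = attributes("Products")
print(customer_attributes)
print(product_attributes)
```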
Entities = Customers, Products, Orders
Customers
Products
Orders
Entity Relationship Diagrams
 These relationships can be shown in the form of a diagram.
 This diagram is known as an 'entity relationship diagram',
E-R diagram or ERD
 As part of your exam, you will have to draw or interpret an
E-R diagram. Before you can do this, you need to be able to
interpret the relationships between the entities.





These relationships take the form of:
- one-to-one
- one-to-many
- many-to-many
One-to-One
 A husband can only have one wife
 A wife can only have one husband
 this would be known as a 'one-to-one
relationship'
 This relationship in a diagram would look like
this:
One-to-Many
 A mother can have many children
 A child can have only one mother
 this would be known as a 'one-to-many
relationship'
 This diagram looks like this:
Many-to-Many
 Think about a library
 A book can be read by many people
 People can read many books
 this would be known as a 'many-to-many
relationship'
 This relationship looks like this:
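In a relational database, a many-to-many relationship is commonly stored with a third, linking table that holds one row per pairing. The sketch below illustrates that for the library example using Python's sqlite3 module; the table names (Books, Readers, Loans), the sample rows, and the helper functions are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Books   (BookID   INTEGER PRIMARY KEY, Title TEXT);
CREATE TABLE Readers (ReaderID INTEGER PRIMARY KEY, Name  TEXT);
-- Linking table: one row per (book, reader) pair.
CREATE TABLE Loans (
    BookID   INTEGER REFERENCES Books(BookID),
    ReaderID INTEGER REFERENCES Readers(ReaderID),
    PRIMARY KEY (BookID, ReaderID)
);
INSERT INTO Books   VALUES (1, 'Book A'), (2, 'Book B');
INSERT INTO Readers VALUES (1, 'Reader X'), (2, 'Reader Y');
INSERT INTO Loans   VALUES (1, 1), (1, 2), (2, 1);
""")


def readers_of(book_id):
    """Names of everyone who has read the given book."""
    rows = conn.execute(
        "SELECT r.Name FROM Readers r "
        "JOIN Loans l ON l.ReaderID = r.ReaderID "
        "WHERE l.BookID = ? ORDER BY r.Name",
        (book_id,),
    )
    return [name for (name,) in rows]


def books_of(reader_id):
    """Titles of every book the given reader has read."""
    rows = conn.execute(
        "SELECT b.Title FROM Books b "
        "JOIN Loans l ON l.BookID = b.BookID "
        "WHERE l.ReaderID = ? ORDER BY b.Title",
        (reader_id,),
    )
    return [title for (title,) in rows]


print(readers_of(1))  # one book, many readers
print(books_of(1))    # one reader, many books
```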
Modeling Your Data
When designing a data model you should first determine the following:
 > The ‘Many’ side usually contains the foreign key
 > The ‘One’ side usually contains the primary key
Before you design or set up a database, you should work out:
 - the entities
 - the attributes
 - the entity relationships
This process is called 'data modelling'
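The primary-key and foreign-key rule can be illustrated with a one-to-many pair of tables. This is a sketch under assumptions: the Customers and Orders tables, their columns, and the sample rows are hypothetical, and it uses Python's sqlite3 module, where foreign-key enforcement has to be switched on per connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite does not enforce foreign keys unless this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
-- 'One' side: holds the primary key.
CREATE TABLE Customers (
    CustomerID INTEGER PRIMARY KEY,
    LastName   TEXT
);
-- 'Many' side: holds the foreign key pointing back to the 'One' side.
CREATE TABLE Orders (
    OrderID    INTEGER PRIMARY KEY,
    CustomerID INTEGER NOT NULL REFERENCES Customers(CustomerID)
);
INSERT INTO Customers VALUES (1, 'Smith');
INSERT INTO Orders VALUES (100, 1), (101, 1);
""")

# One customer, many orders.
orders_for_customer_1 = conn.execute(
    "SELECT COUNT(*) FROM Orders WHERE CustomerID = 1"
).fetchone()[0]

# An order that points at a customer who does not exist is rejected.
try:
    conn.execute("INSERT INTO Orders VALUES (102, 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(orders_for_customer_1, orphan_rejected)
```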
What the 7-Layer OSI Model Is:
 - Defines the necessary elements for data communication between devices
 - Defines a communication architecture for digital communication systems
 - Visually and conceptually separates communication, network, and software functions
OSI Model Definition – 7 Layers
Layer 7 – Application (Away)
Layer 6 – Presentation (Pizza)
Layer 5 – Session (Sausage)
Layer 4 – Transport (Throw)
Layer 3 – Network (Not)
Layer 2 – Data Link (Do)
Layer 1 – Physical (Please)
Mnemonic, read from Layer 1 up: Please Do Not Throw Sausage Pizza Away
How data moves through the layers
Layer 7 - Application
The Application layer provides services to the software through which the user requests network services.
Examples:
 - Internet Explorer, Safari, and other browsers
 - FTP
 - Mail
Many applications that run on your computer are NOT part of the Application layer. The following are not part of Layer 7 because they do not request network services:
 - Microsoft Word or Excel
 - Adobe Photoshop
Layer 6 - Presentation
Manages data-format information for networked communications (the network’s translator).
 - For outgoing messages, it converts data into a generic format for network transmission; for incoming messages, it converts data from the generic network format to a format that the receiving application can understand
 - This layer is also responsible for certain protocol conversions, data encryption/decryption, and data compression/decompression
Examples:
 - MIDI
 - JPG, GIF, TIF
 - MPEG
Layer 5 - Session
The Session layer establishes, maintains, and manages the communication session between computers.
 - Responsible for initiating, maintaining and terminating sessions
 - Responsible for security and access control to session information (via session participant identification)
 - Responsible for synchronization services and for checkpoint services
Examples:
 - NFS
 - SQL
 - RPC
Layer 4 - Transport
The functions defined in this layer provide for the reliable transmission of data segments, as well as the disassembly and assembly of the data before and after transmission.
 - Manages the transmission of data across a network
 - Manages the flow of data between parties (flow control) by segmenting long data streams into smaller chunks, based on the allowed “packet” size for a given transmission medium (packet sequencing)
 - Provides acknowledgements of successful transmissions and requests retransmission of packets that arrive with errors (error detection and recovery)
Examples:
 - TCP
 - UDP (connectionless; it does not itself provide the acknowledgements or retransmission described above)
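The segmenting and sequencing idea described for this layer can be sketched in a few lines. This is a toy illustration only, not how TCP is implemented: the `segment` and `reassemble` functions, the 8-byte chunk size, and the sample message are all assumptions made for the example.

```python
import random


def segment(data: bytes, max_size: int):
    """Split data into (sequence_number, chunk) pairs of at most max_size bytes."""
    return [
        (seq, data[start:start + max_size])
        for seq, start in enumerate(range(0, len(data), max_size))
    ]


def reassemble(segments):
    """Rebuild the original data, whatever order the segments arrived in."""
    # Sorting by sequence number restores the original order.
    return b"".join(chunk for _, chunk in sorted(segments))


message = b"The quick brown fox jumps over the lazy dog"
segments = segment(message, max_size=8)

random.shuffle(segments)          # simulate out-of-order arrival
restored = reassemble(segments)
print(restored == message)
```

The sequence numbers are what let the receiver put chunks back in order; a real protocol would also use them to notice missing segments and request retransmission.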
Layer 3 - Network
The Network layer defines the processes used to route data across the network and the structure and use of logical addressing.
 - Handles addressing messages for delivery, as well as translating logical network addresses and names into their physical counterparts (logical addresses are managed by local network admins)
 - Responsible for deciding how to route transmissions between computers
 - Handles the decisions needed to get data from one point to the next point along a network path
 - Handles packet switching and network congestion control
Examples:
 - IP
 - Network routers
Layer 2 – Data Link
Concerned with the linkages and mechanisms used to move data about the network, and deals with the ways in which data is reliably transmitted.
 - Handles special data frames (packets) between the Network layer and the Physical layer
 - At the sending end, this layer converts data into raw formats that can be handled by the Physical layer. At the receiving end, it packages raw data from the Physical layer into data frames for delivery to the Network layer
 - The Data Link layer is often conceptually divided into two sub-layers: logical link control (LLC) and media access control (MAC)
Examples:
 - Network bridges
 - Ethernet
 - Wi-Fi
Layer 1 - Physical
This layer defines the electrical and physical specifications for the networking media that carry the data bits across a network.
 - Converts bits into electronic signals for outgoing messages
 - Converts electronic signals into bits for incoming messages
 - Manages the interface between the computer and the network medium (coax, twisted pair, etc.)
 - Tells the driver software for the MAU (media attachment unit), e.g. network interface cards (NICs) and modems, what needs to be sent across the medium
Examples:
 - Network hubs and repeaters
 - LAN and WAN topology
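Taken together, the seven layer descriptions suggest a simple pattern: on the sending side each layer wraps what it receives from the layer above, and on the receiving side each layer unwraps it in the opposite order. The sketch below is a deliberately simplified model of that pattern. The bracketed text "headers" and the `send` and `receive` functions are assumptions for illustration; real protocols use binary headers, and the Physical layer transmits bits rather than adding a header.

```python
# Layers in the order an outgoing message passes through them (7 down to 1).
LAYERS = [
    "Application", "Presentation", "Session",
    "Transport", "Network", "Data Link", "Physical",
]


def send(payload: str) -> str:
    """Wrap the payload in one marker per layer, from Layer 7 down to Layer 1."""
    for layer in LAYERS:
        payload = f"[{layer}]{payload}"
    return payload


def receive(frame: str) -> str:
    """Strip the markers in the opposite order, from Layer 1 up to Layer 7."""
    for layer in reversed(LAYERS):
        header = f"[{layer}]"
        if not frame.startswith(header):
            raise ValueError(f"expected {layer} header")
        frame = frame[len(header):]
    return frame


wire = send("hello")
print(wire)
print(receive(wire))
```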
Advanced Topic
NORMALIZATION
Normalization
 In the field of relational database design,
normalization is a way of ensuring that a
database structure is suitable for general-purpose
querying and free of certain undesirable
characteristics that could lead to a loss of
data integrity.
Data Integrity
 Refers to the validity of data
 The assurance that data is accurate, correct
and valid
What is Normalization?
 Database normalization is the practice of
optimizing table structures. Optimization is
done by a complete investigation of the
various pieces of data that will be stored
within the database
An Introduction to Database
Normalization - Preliminary
Definitions
Terminology in Normalization:
 Entity: The word ‘entity’ as it relates to databases can
simply be defined as the general name for the information
that is to be stored within a single table.
 Example: for storing information about the school’s
students, then ‘student’ would be the entity.
 The student entity would likely be composed of several
pieces of information, for example:
 student identification number, name, and email address.
These pieces of information are better known as
attributes.
Relationship
 Understanding the relationships between the
data items forming the various entities and
between the entities themselves forms the
foundation of database normalization.
 Remember, there are three types of data
relationships that you should be aware of:
 - One-to-One
 - One-to-Many
 - Many-to-Many
Foreign Key and ERD
 Foreign key: A foreign key forms the basis of a
One-to-Many relationship between two tables.
The foreign key can be found in the Many table,
and points to the primary key found in the One
table
Entity-relationship diagram (ERD): An ERD is a
graphical representation of the database
structure. An ERD can be created using
sophisticated software or drawn on a piece of
paper from your pocket.
We want to eliminate data redundancy
 Redundancy happens when the same data values are
stored more than once in a table, or when the same
values are stored in more than one table.
 Normalization is done to prevent redundancy, which can
also improve performance when performing CRUD (create,
read, update, delete) operations, especially searching
for information.
 One of the biggest disadvantages of data
redundancy is that it increases the size of the
database unnecessarily. Data redundancy might also
cause the same result to be returned several times
when searching the database, causing confusion and
clutter in the results.
Avoiding Redundancy
Analysis
 This table maps (points to) various students to
the classes found within their schedule.
 Issues:
 - Assuming that the only intention of this table is
to create student-class mappings, there really is
no need to repeatedly store the class time and
professor ID.
 - If there are 30 students in a class, the class
information is repeated 30 times over.
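The repetition described above can be counted directly. The rows below are hypothetical stand-ins for the schedule table, which is not reproduced in this text; the column layout and values are assumptions made for the example.

```python
from collections import Counter

# Hypothetical rows: (studentID, classID, className, classTime, professorID)
schedule = [
    (1, 101, "Math 148", "MWF 9:00", 7),
    (2, 101, "Math 148", "MWF 9:00", 7),
    (3, 101, "Math 148", "MWF 9:00", 7),
    (1, 205, "Physics 101", "TTh 10:30", 9),
]

# Everything after studentID is class information; count how often each
# identical block of class information is stored.
copies = Counter(row[1:] for row in schedule)

# Every copy beyond the first is redundant.
redundant_copies = sum(count - 1 for count in copies.values())
print(redundant_copies)
```

With three students in the same class the class details are stored three times, two of them redundantly; with 30 students, 29 of the 30 copies would be redundant.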
Why avoid Redundancy?
 Redundancy introduces the possibility for error.
 Consider the name of the class found in the final row
of the table (Matj 148).
 Given the name of the class found in the first row,
chances are that Matj 148 should actually be Math
148!
 While this error is easily identifiable when just four
rows are present in the table, imagine finding this
error within the rows representing the 60,000
enrolled students.
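One way to hunt for this kind of error in a redundant table is to group the stored names by class and flag any class that appears under more than one name. This is a sketch under assumptions: the rows, and the presence of a classID column to group by, are hypothetical, since the original table is not reproduced here.

```python
from collections import defaultdict

# Hypothetical rows: (studentID, classID, className)
rows = [
    (1, 101, "Math 148"),
    (2, 101, "Math 148"),
    (3, 101, "Math 148"),
    (4, 101, "Matj 148"),  # a typo of the kind described above
]

# Collect every distinct name stored for each class.
names_by_class = defaultdict(set)
for _student_id, class_id, class_name in rows:
    names_by_class[class_id].add(class_name)

# A class stored under more than one name points to an inconsistency.
conflicts = {
    class_id: sorted(names)
    for class_id, names in names_by_class.items()
    if len(names) > 1
}
print(conflicts)
```

The check works, but it only exists because the name is stored many times; storing each class name once removes the need for it.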
Database Normalization - The Three Normal
Forms
The process of database normalization
progresses through a series of steps,
typically known as Normal Forms.
First Normal Form (1NF)
 Converting a database to the first normal
form is rather simple.
 The first rule calls for the elimination of
repeating groups of data through the
creation of separate tables of related data.

This means breaking bigger tables down into several smaller tables.
The first table contains solely student information (Student):
The second table contains solely class information (Class):
The third table contains solely professor information (Professor):
Second Normal Form
 Once you have separated the data into their
respective tables, you can begin
concentrating upon the rule of Second
Normal Form: the elimination of redundant data
(formally, every non-key attribute must depend on
the whole primary key, not just part of it). Referring
back to the Class table, typical data stored
within might look like:
Second Normal Form (2NF)
 While this table structure is certainly
improved over the original, notice that there
is still room for improvement.
 In this case, the className attribute is being
repeated. With 60,000 students stored in this
table, performing an update to reflect a
recent change in a course name could be
somewhat of a problem. Therefore:
 create a separate table that contains classID
to className mappings (ClassIdentity):
Class Identity
The updated Class table would
then be simply:
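The ClassIdentity split described above can be sketched with Python's sqlite3 module. The table and column names classID, className, ClassIdentity and Class come from the slides; the studentID column, the key choices, the sample rows, and the new course name are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Each class name is stored exactly once.
CREATE TABLE ClassIdentity (
    classID   INTEGER PRIMARY KEY,
    className TEXT
);
-- The Class table now only maps students to class IDs.
CREATE TABLE Class (
    studentID INTEGER,
    classID   INTEGER REFERENCES ClassIdentity(classID),
    PRIMARY KEY (studentID, classID)
);
INSERT INTO ClassIdentity VALUES (101, 'Math 148');
INSERT INTO Class VALUES (1, 101), (2, 101), (3, 101);
""")

# Renaming the course is a single-row update, however many students enrol.
cursor = conn.execute(
    "UPDATE ClassIdentity SET className = ? WHERE classID = ?",
    ("Calculus I", 101),
)
rows_updated = cursor.rowcount

# Every enrolment sees the new name through the join.
names_seen = conn.execute("""
    SELECT DISTINCT ci.className
    FROM Class c JOIN ClassIdentity ci ON ci.classID = c.classID
""").fetchall()
print(rows_updated, names_seen)
```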
Third Normal Form (3NF)
 For complete normalization of the school system
database, the next step in the process is to
satisfy the rule of the Third Normal Form.
 This rule seeks to eliminate all attributes from a
table that are not directly dependent upon the
primary key. In the case of the Student table,
the college and collegeLocation attributes are
less dependent upon the student ID than they
are on the major attribute. Therefore, we’ll
create a new table that relates the major, college
and collegeLocation information:
Third Normal Form
The revised Student table would
then look like:
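The Third Normal Form step can be sketched the same way. The attribute names major, college and collegeLocation follow the slides; the Major table name, the studentName column, the key choices, and the sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- College details depend on the major, so they live with the major.
CREATE TABLE Major (
    major           TEXT PRIMARY KEY,
    college         TEXT,
    collegeLocation TEXT
);
-- The Student table keeps only attributes that depend on the student.
CREATE TABLE Student (
    studentID   INTEGER PRIMARY KEY,
    studentName TEXT,
    major       TEXT REFERENCES Major(major)
);
INSERT INTO Major VALUES ('Mathematics', 'College of Science', 'North Campus');
INSERT INTO Student VALUES
    (1, 'Student A', 'Mathematics'),
    (2, 'Student B', 'Mathematics');
""")

# PRAGMA table_info yields one row per column; index 1 is the column name.
student_columns = [row[1] for row in conn.execute("PRAGMA table_info(Student)")]

# College details are still reachable for every student through a join.
details = conn.execute("""
    SELECT s.studentName, m.college, m.collegeLocation
    FROM Student s JOIN Major m ON m.major = s.major
    ORDER BY s.studentID
""").fetchall()
print(student_columns)
print(details)
```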
Some other Database Terms…
 Data Mining
 Data Matching
 Distributed Databases
 Boolean Operators
 SQL Servers
Summary
 Normalization is a systematic way of
ensuring that a database structure is suitable
for general-purpose querying and free of
certain undesirable characteristics—insertion,
update, and deletion anomalies—that could
lead to a loss of data integrity.