Data Modelling
Prepared by Mohamed Osama Mortada
Copyright © Qeema 2023. All Rights Reserved
www.qeema.net

Agenda
1. Data Modeling Definition
2. Data Modeling Activities
3. Data Modeling Schemes

Definition
• Data Modeling: The process of discovering, analyzing, and scoping data requirements, and then representing and communicating these requirements in the form of a data model. The data model forms are logical and physical.
• Goals:
  1. Formalization
  2. Scope definition
  3. Knowledge retention/documentation

Activities & Lifecycle
Inputs, Activities, Key Players, Tools & Techniques, Output

Data Modeling Schemes
• Relational
• Dimensional
• Time-Based
• NoSQL

Relational Models – ERD
Components
• Entity
• Relationships
• Attributes

Relational Models – Relationship Type
The number of entities in a relationship is the ‘arity’ of the relationship. The most common are unary, binary, and ternary relationships.

Relational Models – Relationship Cardinality
Cardinality captures how many instances of one entity participate in the relationship with how many instances of the other entity.

Relational Models – Participation Constraint
The participation constraint captures the minimum existence of an entity in the relationship (Partial/Optional or Total/Must).

ERD Components – Example

Test yourself
UPS prides itself on having up-to-date information on the processing and current location of each shipped item. To do this, UPS relies on a company-wide information system. Shipped items are the heart of the UPS product tracking information system. Shipped items can be characterized by item number (unique), weight, dimensions, insurance amount, destination, and final delivery date. Shipped items are received into the UPS system at a single retail center. Retail centers are characterized by their type, uniqueID, and address.
Shipped items make their way to their destination via one or more standard UPS transportation events (i.e., flights, truck deliveries). These transportation events are characterized by a unique scheduleNumber, a type (e.g., flight, truck), and a deliveryRoute.
Please create an Entity Relationship diagram that captures this information about the UPS system. Be certain to indicate identifiers and cardinality constraints.

Normalization vs Denormalization
Normalization is the method of arranging the data in the database efficiently. It involves constructing tables and setting up relationships between those tables according to certain rules. Applying these rules removes redundancy and inconsistent dependencies, which makes the database more flexible. Normalization has the following forms: 1NF, 2NF, 3NF, …
The purpose of normalization is to eliminate redundant (duplicated) data and ensure data is stored logically.
Denormalization is the inverse process of normalization: redundancy is added to the data intentionally to improve the performance of a specific application. The redundant copies must then be kept consistent to preserve data integrity. The reason for performing denormalization is the overhead produced in the query processor by an over-normalized structure. Denormalization reduces the number of tables and the number of complicated joins, because a higher number of joins can slow down query processing.

Normalization – Deep dive
Why do we normalize?
• Save space
• Eliminate update anomalies
  • Insertion anomalies
    • Cannot insert a project unless an employee is assigned to it
  • Delete anomalies
    • When a project is deleted, all employees working on that project are deleted as well
  • Modification anomalies
    • Changing the name of a project from “Billing” to “Customer Invoicing” requires updating 100 rows in the dataset

Normalization – Deep dive
Normal forms are a series of guidelines that help to ensure that the design of a database is efficient, organized, and free from data anomalies.
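The modification anomaly described above (renaming the "Billing" project) can be sketched with an in-memory SQLite database. This is a minimal illustration, not the schema from the slides: the table names `assignment_flat`, `project`, and `employee`, and the 100 generated employees, are assumptions chosen to mirror the "100 rows" figure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Denormalized (hypothetical): the project name is repeated on every employee row.
cur.execute(
    "CREATE TABLE assignment_flat (emp_id INTEGER, emp_name TEXT, project_name TEXT)"
)
cur.executemany(
    "INSERT INTO assignment_flat VALUES (?, ?, ?)",
    [(i, f"Employee {i}", "Billing") for i in range(1, 101)],
)

# Normalized (hypothetical): the project name is stored once and referenced by key.
cur.execute("CREATE TABLE project (project_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
cur.execute(
    "CREATE TABLE employee ("
    " emp_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " project_id INTEGER REFERENCES project(project_id))"
)
cur.execute("INSERT INTO project VALUES (1, 'Billing')")
cur.executemany(
    "INSERT INTO employee VALUES (?, ?, 1)",
    [(i, f"Employee {i}") for i in range(1, 101)],
)

# Rename the project in both designs and count the rows each UPDATE touches.
flat_updated = cur.execute(
    "UPDATE assignment_flat SET project_name = 'Customer Invoicing' "
    "WHERE project_name = 'Billing'"
).rowcount
norm_updated = cur.execute(
    "UPDATE project SET name = 'Customer Invoicing' WHERE name = 'Billing'"
).rowcount

print(flat_updated, norm_updated)  # 100 1

# The normalized design also avoids the insertion anomaly:
# a project can exist before any employee is assigned to it.
cur.execute("INSERT INTO project VALUES (2, 'Migration')")
```

In the flat table the rename touches every assignment row, while in the normalized design it touches a single `project` row.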
There are several levels of normalization, each with its own set of guidelines, known as normal forms. The guidelines for each normal form are as follows:
• 1NF: Eliminate repeating groups and multivalued (MV) attributes
• 2NF: Eliminate partial functional dependencies (FD)
• 3NF: Eliminate transitive dependencies

Normalization – 1NF

Normalization – 2NF

Normalization – 3NF

Mapping ER diagram to Relational Schema
Methodology:
• Step 1: Mapping of Regular Entity Types
• Step 2: Mapping of Weak Entity Types
• Step 3: Mapping of 1:1 Relationship Types
• Step 4: Mapping of 1:N Relationship Types
• Step 5: Mapping of M:N Relationship Types
• Step 6: Mapping of Multivalued Attributes

Mapping ER diagram to Relational Schema – Example

Questions?
Email: mohamed.mortada@qeema.net
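The six mapping steps listed above can be sketched as SQL table definitions, run here through Python's built-in `sqlite3` module. The company schema (`employee`, `department`, `project`, `dependent`, `works_on`, `department_location`) is a hypothetical example chosen for illustration and is not taken from the slides; each comment marks the step it is meant to show.

```python
import sqlite3

DDL = """
-- Step 1: regular entity types become tables; the identifier becomes the primary key.
CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    -- Step 4: 1:N relationship (a department has many employees):
    -- the foreign key goes on the N side.
    dept_id INTEGER REFERENCES department(dept_id)
);
CREATE TABLE department (
    dept_id    INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    -- Step 3: 1:1 relationship (an employee manages a department):
    -- a foreign key on one side, made UNIQUE.
    manager_id INTEGER UNIQUE REFERENCES employee(emp_id)
);
CREATE TABLE project (
    project_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
-- Step 2: weak entity type: the owner's key is part of the primary key.
CREATE TABLE dependent (
    emp_id INTEGER NOT NULL REFERENCES employee(emp_id) ON DELETE CASCADE,
    name   TEXT NOT NULL,
    PRIMARY KEY (emp_id, name)
);
-- Step 5: M:N relationship becomes its own table keyed by both participants.
CREATE TABLE works_on (
    emp_id     INTEGER NOT NULL REFERENCES employee(emp_id),
    project_id INTEGER NOT NULL REFERENCES project(project_id),
    hours      REAL,
    PRIMARY KEY (emp_id, project_id)
);
-- Step 6: multivalued attribute becomes a table of (owner key, value) pairs.
CREATE TABLE department_location (
    dept_id  INTEGER NOT NULL REFERENCES department(dept_id),
    location TEXT NOT NULL,
    PRIMARY KEY (dept_id, location)
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript(DDL)

tables = sorted(
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
print(tables)
```

Placing the 1:1 foreign key on `department` rather than `employee` is one reasonable choice among several; it avoids a mostly-NULL column on `employee`, since few employees manage a department.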