Entity Relationship Approach
Top-down approach to data modeling
Uses diagrams
Normalization - confirms technical soundness
Entity Relationship - reflects business requirements

Using Diagrams
Define a procedure for representing existing models
Develop diagrams directly from business requirements
Business-oriented terminology based on...
  – Entities (things of interest to business)
  – Relationships between entities

Hospital Example
Graphic 2-13
Format does not encourage a quick appreciation of the main concepts and rules
  – e.g. Each operation can be performed by only one surgeon, but this is not immediately apparent
This is a very simple model; as we progress to more complex models, the problem of presentation becomes more serious

Data Model Diagram
Sometimes called data structure diagrams, these are based on two symbols
  – A "box" represents a table (a rounded rectangle is usually used)
  – A line drawn between two boxes represents a foreign key pointing back to the table that has the primary key

Existing Tables
Simply draw a box for every table in the model, with its name inside
Graphic 3-2

Existing Relationships
Draw a line between the two boxes and indicate the direction of the link by putting a "crow's foot" at the foreign key end
  – Think of the crow's foot as an arrow pointing back to the table with the primary key
Graphic 3-3

Interpreting the Diagram
The model specifies a Surgeon table (data about surgeons)
The model specifies an Operation table (data about operations)
Each Operation can be associated with only one Surgeon
Each Surgeon can be associated with many Operations

Summarizing Relationships
This represents the relationship between tables (implied by the primary and foreign keys) without having to list any column names at all

Asking Questions
We could now ask the business specialist
  – Is it true that each operation is performed by one surgeon only?
  – Can we count on this in the future?
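As a minimal sketch of what the box-and-line diagram stands for, the Surgeon and Operation tables can be written out with Python's built-in sqlite3 module. The table and column names (surgeon_number, operation_number, and so on) and the sample rows are assumptions made for illustration; they are not taken from the graphics. The foreign key sits at the crow's-foot end (Operation) and points back to the primary key of Surgeon.

```python
import sqlite3

# In-memory database; SQLite only enforces foreign keys when asked to.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical column names, chosen for this sketch only.
conn.executescript("""
    CREATE TABLE surgeon (
        surgeon_number INTEGER PRIMARY KEY,
        surgeon_name   TEXT NOT NULL
    );
    CREATE TABLE operation (
        operation_number INTEGER PRIMARY KEY,
        operation_name   TEXT NOT NULL,
        -- the "line" in the diagram: one surgeon per operation
        surgeon_number   INTEGER NOT NULL
            REFERENCES surgeon (surgeon_number)
    );
""")

conn.execute("INSERT INTO surgeon VALUES (?, ?)", (1, "Surgeon A"))
conn.executemany(
    "INSERT INTO operation VALUES (?, ?, ?)",
    [(100, "Appendectomy", 1), (101, "Gall bladder removal", 1)],
)

# Each Surgeon can be associated with many Operations.
operations_per_surgeon = conn.execute(
    "SELECT COUNT(*) FROM operation WHERE surgeon_number = ?", (1,)
).fetchone()[0]

# Each Operation must point back to an existing Surgeon row.
try:
    conn.execute("INSERT INTO operation VALUES (?, ?, ?)", (102, "Biopsy", 99))
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

Because the operation row has a single surgeon_number column, the structure itself encodes the rule "each operation is performed by one surgeon only", which is why the question is worth confirming with the business specialist.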
We can make changes now (in the model) while the cost is still low
Question: If more than one Surgeon is allowed, how would we handle it?

Another Solution
We could choose to track only the surgeon who managed the operation
Record decisions like this in the diagram
  – Avoids the question being revisited
  – Specifies precisely what data will be held
Cannot answer the question "In how many operations did surgeon number 12 at hospital number 18 participate?"

Annotated Relationship
Graphic 3-4
As well as annotating the diagram, we should change the column name from Surgeon Number to Managing Surgeon Number

Optionality
What if there are no surgeons required for an operation?
  – e.g. a small cut with antibiotics given
Operation table may have a NULL value for Surgeon Number
To show the difference between optional and mandatory relationships, use the convention shown in Graphic 3-5

Exercise
Draw the relationships for the rest of the model

First Pass
Graphic 3-6
Checking the diagram will often reveal unsound assumptions and misunderstandings
Conversely, it may increase confidence in the model for both user and designer

Operation and Operation Type
Are we sure that each operation can be of only one type?
How would we represent a gall bladder removal and appendectomy operation?
  – 1. Allow only "simple" operations - need a new table for the repeating group
  – 2. Allow complex operation types such as "Combined Gall Bladder removal and Appendectomy"

Both will Work
Option 2 will be easier to implement
Option 1 will be more elegant
  – Can ask "list all operations that involved appendectomies"

Redundant Lines
Note the lines connecting the Hospital, Operation, and Surgeon tables
  – line from Hospital to Surgeon
  – line from Surgeon to Operation
  – line from Operation to Hospital
    » Does this line add anything to our knowledge of the business rules?
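One way to probe the question about the Operation to Hospital line is to check whether the hospital for an operation can already be reached through Surgeon. The sketch below uses Python's sqlite3 module and rests on assumptions that are not confirmed by the slides: each surgeon belongs to exactly one hospital, each operation takes place at its surgeon's hospital, and all table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Only two lines are modeled: Hospital-Surgeon and Surgeon-Operation.
# The operation table deliberately has no hospital_number column.
conn.executescript("""
    CREATE TABLE hospital (
        hospital_number INTEGER PRIMARY KEY,
        hospital_name   TEXT NOT NULL
    );
    CREATE TABLE surgeon (
        surgeon_number  INTEGER PRIMARY KEY,
        hospital_number INTEGER NOT NULL
            REFERENCES hospital (hospital_number)
    );
    CREATE TABLE operation (
        operation_number INTEGER PRIMARY KEY,
        surgeon_number   INTEGER NOT NULL
            REFERENCES surgeon (surgeon_number)
    );
""")

# Illustrative sample rows.
conn.execute("INSERT INTO hospital VALUES (?, ?)", (18, "St. John's Hospital"))
conn.execute("INSERT INTO surgeon VALUES (?, ?)", (12, 18))
conn.execute("INSERT INTO operation VALUES (?, ?)", (100, 12))


def hospital_for_operation(connection, operation_number):
    """Follow Operation -> Surgeon -> Hospital and return the hospital name."""
    row = connection.execute(
        """
        SELECT h.hospital_name
        FROM operation AS o
        JOIN surgeon  AS s ON s.surgeon_number  = o.surgeon_number
        JOIN hospital AS h ON h.hospital_number = s.hospital_number
        WHERE o.operation_number = ?
        """,
        (operation_number,),
    ).fetchone()
    return row[0] if row else None


derived_hospital = hospital_for_operation(conn, 100)
```

If the two assumptions hold in the business, a direct Operation to Hospital line tells us nothing that the join does not. If a surgeon could operate at a hospital other than their own, the direct line would carry different information and would have to stay.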

Rules for Removing Lines
Graphic 3-7
If A derives B and B derives C, then an A derives C connecting line is not needed
  – If B is optional, then we cannot remove the A derives C line
  – If A derives C provides different information than A derives B, we cannot remove A derives C

Exercise
Remove the remaining redundant lines from Graphic 3-7

Solution
Graphic 3-8

Top-Down Approach
Why Normalize at all? Why not just use this Top-down E-R approach?
In practice this is what is most often done, with Normalization being a final check
Using E-R first...
  – we can start with "What data do we need to keep information about?"
  – no need to start with a single overly complex table

Terminology
The Relational Models (used with Normalization) were built on three basic concepts: tables, columns, and keys
The E-R Models: Entities, Attributes, and Relationships
Easier to say "the relationship between a hospital and surgeon" than "the existence of the primary key of Hospital as a foreign key in the Surgeon table"

Entities
An entity is the "real world" class of things that a table represents
  – entity: St. John's Hospital
  – entity type: Hospital
In practice: "Entity" is used to mean entity type, and "Entity Instance" (row) to mean a single entity

Entities - continued
Some things will need to be represented by more than one entity
  – e.g. Invoices would be represented by two entities: Invoice Header and Invoice Item
  – e.g. Quarterly Profit would be derived from sales and expense figures from other entities

Multiple "Things" as One Entity
Example...
  – Preferred Customer
  – Corporate Customer
  – Becomes a Customer Type with a Customers Table

Entity Naming
The name of an entity must be in the singular. (Exactly the opposite of what it says in the SQL book!)
  – e.g. Account instead of Accounts
Three reasons:
  – Consistency - standards
  – Communication - an "entity" is something we need to keep information about
  – Compatibility with Relationship Name - (we'll look at this later)

Relationships
The lines between the boxes are the relationships between the entities
We name relationships in both directions
  – "Each company may issue one or more shares"
  – "Each share must be issued by one company"
Graphic 3-9 shows the notation we're using
Graphic 3-10 shows alternative diagramming notations

Exercise
Draw the E-R model that would represent the relationship between a Manager and Departments within his company. A Manager may be in charge of more than one department

Solution and More Examples
Graphic 3-11
Graphic 3-12

Suggested Diagramming Tips
Orient your diagram so that the crow's feet are nearer to the bottom of the page
Place crow's feet on the right
Eliminate crossing lines where possible (but clarity is most important)
Duplicate entities to avoid long relationship lines
  – Can use a dotted line to connect the same entity

Many-to-Many Relationships
Many-to-Many Relationships are frequently modeled
Graphic 3-13
How would we implement the relationship using foreign keys?

We Can't
We can't hold the key to Qualification in the Employee table because an Employee could have several qualifications
Likewise, the Qualification table would need to support multiple Employee keys
A Normalized model cannot represent a many-to-many relationship directly with a foreign key in either table, yet such relationships certainly exist

Representing Many-to-Many
Can't use foreign keys
Can use a table

Normalized Many-to-Many
Graphic 3-16
Whenever we encounter a many-to-many relationship between two entities, we represent it with a new entity linked to the two original entities.
The primary keys of the two original tables together become the primary key of the new entity ("the resolution entity")

Choice of Representation
Conceptual vs. Physical Model
Conversion is not totally mechanical
  – New non-key attributes in the resolution entity?
  – Different name for the resolution entity?
    » e.g. instead of an Employee-Qualification table we might use a "Certifications" table

One-to-One Relationships
Should not automatically combine the entities into a single entity
When to split
  – Distinct real-world concepts (e.g. Person and Passport)
  – Separating attribute groups (e.g. Detail vs. default)
  – Transferables (e.g. Part Type - stored in - Bin)

Self-Referencing Relationships
Graphic 3-18
Each employee may manage one or more employees
Each employee may be managed by one employee
  – carry the foreign key in the same table as the primary key
  – e.g. Manager Id -> Employee Id

Attributes
Sometimes show a few attributes to clarify meaning (e.g. primary and foreign keys)
Don't show all the attributes, because we want the big picture
Keep the attributes in a separate list for each entity

More Normalization
If, in the process of listing attributes, we find repeating groups or lookup tables
  – Normalize the design
  – Update the E-R model

Summary
Data models can be presented diagrammatically by using a box to represent each table and a line for each foreign key relationship
This provides a language for "top-down" data models, used prior to developing attributes
Many-to-Many relationships are resolved with a "resolution entity"

Last Slide - Entity Relationship
Assignment #10 due next week
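The resolution entity from the summary, together with the self-referencing Manager Id example, can be sketched with Python's sqlite3 module. This is an illustration under stated assumptions rather than the course's own schema: the table names, column names, and sample rows below are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
    CREATE TABLE employee (
        employee_id   INTEGER PRIMARY KEY,
        employee_name TEXT NOT NULL,
        -- self-referencing, optional: the foreign key lives in the same table
        manager_id    INTEGER REFERENCES employee (employee_id)
    );
    CREATE TABLE qualification (
        qualification_id   INTEGER PRIMARY KEY,
        qualification_name TEXT NOT NULL
    );
    -- resolution entity: its primary key combines both original keys
    CREATE TABLE employee_qualification (
        employee_id      INTEGER NOT NULL
            REFERENCES employee (employee_id),
        qualification_id INTEGER NOT NULL
            REFERENCES qualification (qualification_id),
        PRIMARY KEY (employee_id, qualification_id)
    );
""")

# Illustrative sample rows.
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Ana", None), (2, "Ben", 1)],
)
conn.executemany(
    "INSERT INTO qualification VALUES (?, ?)",
    [(10, "BSc"), (11, "MBA")],
)
conn.executemany(
    "INSERT INTO employee_qualification VALUES (?, ?)",
    [(1, 10), (1, 11), (2, 10)],
)


def qualifications_of(employee_id):
    """One employee, many qualifications."""
    rows = conn.execute(
        """
        SELECT q.qualification_name
        FROM employee_qualification AS eq
        JOIN qualification AS q ON q.qualification_id = eq.qualification_id
        WHERE eq.employee_id = ?
        ORDER BY q.qualification_name
        """,
        (employee_id,),
    ).fetchall()
    return [name for (name,) in rows]


def holders_of(qualification_id):
    """One qualification, many employees."""
    rows = conn.execute(
        """
        SELECT e.employee_name
        FROM employee_qualification AS eq
        JOIN employee AS e ON e.employee_id = eq.employee_id
        WHERE eq.qualification_id = ?
        ORDER BY e.employee_name
        """,
        (qualification_id,),
    ).fetchall()
    return [name for (name,) in rows]


def direct_reports(manager_id):
    """Employees whose manager_id points at the given employee."""
    rows = conn.execute(
        "SELECT employee_name FROM employee WHERE manager_id = ? "
        "ORDER BY employee_name",
        (manager_id,),
    ).fetchall()
    return [name for (name,) in rows]
```

Neither original table carries the other's key; the many-to-many relationship lives entirely in employee_qualification. Non-key attributes such as a date awarded could be added to that table, which is one reason a name like "Certifications" may suit it better.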