ZEIT2301 – Database Design
Entity-Relationship Diagrams
School of Engineering and Information Technology, UNSW@ADFA
Dr Kathryn Merrick
Bldg 16, Rm 212 (Thursdays and Fridays only)
k.merrick@adfa.edu.au
Session 2, 2010

Topic 08: Database Design
  Objectives:
    To review Entity Relationship Diagrams for modelling data and its relationships
    To review the Relational model for database management systems

Data Storage
  The Class Diagram identifies the classes and attributes of interest from the problem domain.
  We now consider how this data can best be stored in order to support the specified requirements of the system.
  The most popular storage format today is the relational database.

Database Design
  To reap the potential benefits offered by database technology, databases must be properly designed.
  Effort spent in design is always rewarded in data quality.
  There are different (though complementary) approaches to achieving a good design.

Approaches to Database Design
  Entity-Relationship (ER) data modelling: a graphical technique for understanding and organizing the data independently of the eventual database implementation.
  Normalization (next week's lecture): an approach for evaluating the quality of a database design, most applicable to relational database designs.
  ER modelling and normalization are the core techniques for good database design.

Entity-Relationship Diagrams
  ER models are based on the following concepts:
    Entities (or, more correctly, entity types)
    Relationships (between entities)
    Attributes (of entities and relationships)
  Similar to the discussion of Class Diagrams: Entities are similar to Classes in OO analysis.
  But:
    Entities do not have any methods (data only)
    Entities have a Primary Key (PK)

Entity-Relationship Diagrams
  There are various diagrammatic styles for ERDs.
  We will use the UML style notation.
  But note that we are talking about:
    Entities, which store attributes (i.e. data) only
    Not Classes, which store attributes (data) and have processes (operations or methods)

An Entity and its Attributes
  An entity is represented on an entity-relationship diagram as a named rectangle with two parts (the slide's example is the entity student).
  An entity is conventionally named in the singular (because it is a type of thing).

Attribute Domain
  Domain: the set of values that may be assigned to an attribute.
  Not shown on an ER diagram; recorded in a data dictionary.
  e.g. for attribute 'gender' the possible values are 'Male' and 'Female', so
    domain(gender) = {'Male', 'Female'}
  e.g. for attribute 'quantityHeld' the possible values range from 0 onwards, so
    domain(quantityHeld) = {all natural numbers, including 0}

Composite Attribute
  student
    studentID
    name
    address
      streetAddress
      suburb
      state
      postcode
    DOB
    gender
  Composite Attribute: component parts indented slightly.
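
The attribute domain and composite attribute ideas can be sketched in code. The following is a minimal Python illustration, not part of the lecture: the class names, the snake_case field names and the validation hook are assumptions chosen for the example, and the sample values are hypothetical.

```python
from dataclasses import dataclass
from datetime import date

# Domains as they might be recorded in a data dictionary (illustrative only).
GENDER_DOMAIN = {"Male", "Female"}


def in_quantity_domain(value: object) -> bool:
    """domain(quantityHeld): whole numbers from 0 onwards."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool) and value >= 0


@dataclass(frozen=True)
class Address:
    """Composite attribute: the component parts grouped under one name."""
    street_address: str
    suburb: str
    state: str
    postcode: str


@dataclass(frozen=True)
class Student:
    """Entity sketch holding data only (plus a domain check on creation)."""
    student_id: str
    name: str
    address: Address  # composite attribute
    dob: date
    gender: str

    def __post_init__(self) -> None:
        # Reject values that fall outside the attribute's domain.
        if self.gender not in GENDER_DOMAIN:
            raise ValueError(f"gender must be one of {sorted(GENDER_DOMAIN)}")


# Hypothetical sample data.
home = Address("1 Example St", "Campbell", "ACT", "2612")
student = Student("s001", "Alex Example", home, date(1990, 1, 1), "Female")
```

The composite attribute becomes a nested record, so `student.address.suburb` reads one component part. In a relational database the domain would more usually be enforced by a column type or constraint; the check here only mirrors the data dictionary entry.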

Multi-Valued Attribute
  student
    studentID
    name
    DOB
    phoneNo [1..3]
    gender
  Multi-valued Attribute: phoneNo [1..3] means 1 to 3 occurrences (for a particular student); phoneNo [1..*] means 1 or more occurrences.

Derived Attribute
  student
    studentID
    name
    DOB
    /age
    gender
  Derived Attribute: use "/" in front of the attribute name.

Entity Uniqueness
  Each entity instance should be distinguishable from all other instances of the same entity type by inspection of the values of its attributes, e.g. to distinguish one student from another student.
  That distinguishing attribute (or group of attributes) is called the Primary Key.
  This is a significant difference between a Class Diagram and an Entity-Relationship Diagram: a Class does not have a PK.

Identify Primary Key
  textbook
    ISBN {PK}   (the Primary Key)
    title
      mainTitle
      subTitle
    edition
    author [1..*]
    publisher
    price
    quantityHeld
    /valueOfStock

Relationships between Entities
  A relationship is a set of meaningful associations among entities.
  Three types of relationships are:
    Unary: one entity involved
    Binary: two entities involved (the most common)
    Ternary: three entities involved

Multiplicity Constraints
  Multiplicities indicate how many instances of each entity participate in the relationship.
  Generally these are zero, one or many, e.g.:
    one-to-one (1..1)
    one-to-many (1..*)
    many-to-many (*..*)
    zero-to-one (0..1)
    zero-to-many (0..*), or simply *

Multiplicity Constraints
  "A student enrols in up to 4 courses and must enrol in at least one course."
    student (studentID {PK}) enrolsIn course (courseCode {PK}), with 1..4 at the course end
  "A course may have zero or many students."
    student (studentID {PK}) enrolsIn course (courseCode {PK}), with 0..* at the student end

Unary (recursive) Relationship
  course
    courseCode {PK}
    courseName
    creditPoints
  Relationship "is prerequisite for" from course back to course, with multiplicities 1..* and 0..*.
  A course "is a prerequisite for" another course.

Binary Relationship
  student
    studentID {PK}
    name
    DOB
    address
    gender
  textbook
    ISBN {PK}
    title
      mainTitle
      subTitle
    edition
    author [1..*]
    publisher
    price
    quantityHeld
    /valueOfStock
  student 1..* buys 0..* textbook

Ternary Relationship
  "buys" connects three entities:
    student (studentID {PK})
    textbook (ISBN {PK})
    bookshop (name {PK})

Attributes of a Relationship
  student (studentID {PK}) 1..* buys 0..* textbook (ISBN {PK})
  Attribute of buys: datePurchased
  Relationships may also have attributes. These attribute(s) are connected to the relationship via a dashed line.

ER Modelling
  Context is important: an attribute in one context may be an entity in another.
    Is author an attribute of a Book entity or an entity in its own right?
  The model is about what is possible, not what is a fact at a particular point in time.
    A candidate for employment potentially has many qualifications (or possibly none?)
    A candidate fills several positions (i.e. over a period of time)

Steps to create an ER Diagram:
  1. Identify Entities
       Look for nouns, major objects that we want to store data about.
  2. Identify Relationships
       Look for associations (verbs between nouns), then add constraints.
  3. Identify Attributes
       Look for nouns, noun phrases that are properties of things; decide if multi-valued, derived, etc.; record description in data dictionary.
  4. Choose Primary Key

Case Study
  Temps for Hire (TFH) has a file of candidates that are willing to work at short notice. The file lists the id, name, address and contact number for each candidate.
  All candidates at TFH have a number of qualifications and TFH uses a unique code and general description to specify the qualification.
  TFH also has a list of companies (name and address) that use their services.
  When a company has a position to be filled, they specify the start date, end date and hourly rate of the position. The company also specifies the essential qualifications required for the position.
  TFH matches the position qualifications against the candidates' qualifications and selects a candidate to fill the position.

1. Identify Entities
  Look for nouns, major objects:
    candidate
    qualification
    company
    position
  Note: TFH itself is not an entity in the database.

2. Identify Relationships
  All candidates at TFH have a number of qualifications …
  When a company has a position to be filled, …
  The company also specifies the essential qualifications required for the position.
  … and selects a candidate to fill the position.
  Look for associations (verbs between nouns), then add constraints.

2. Identify Relationships
  candidate has qualification
  candidate fills position
  qualification requiredFor position
  company specifies position

3. Identify Attributes
  … the id, name, address and contact number for each candidate …
  … a unique code and general description to specify the qualification.
  … a list of companies (name and address) …
  … the start date, end date and hourly rate of the position.
  Look for nouns, noun phrases that are properties of things; decide if multi-valued, derived, etc.; record description in data dictionary.

3. Identify Attributes
  candidate: candidateID, name, address, contactNo
  qualification: qualificationCode, description
  company: name, address
  position: startDate, endDate, hourlyRate
  Relationships: has, fills, requiredFor, specifies

4. Determine Keys
  … the id, name, address and contact number for each candidate …
  … a unique code and general description to specify the qualification.
  … a list of companies (name and address) …
  … the start date, end date and hourly rate of the position. (no obvious key here?)
  Choose primary key … create artificial key if necessary

4. Determine Keys
  candidate: candidateID {PK}, name, address, contactNo
  qualification: qualificationCode {PK}, description
  company: name {PK}, address
  position: positionNo {PK}, startDate, endDate, hourlyRate
  Relationships: has, fills, requiredFor, specifies

Multiplicity
  candidate: candidateID {PK}, name, address, contactNo
  qualification: qualificationCode {PK}, description
  company: name {PK}, address
  position: positionNo {PK}, startDate, endDate, hourlyRate
  has (candidate, qualification): 0..* at both ends
  Multiplicity labels shown on the remaining relationships (requiredFor, fills, specifies): 1..*, 0..1, 0..*, 0..*, 1..1, 1..*
  Some assumptions on multiplicities made here; would need to clarify with stakeholders.

Summary
  After today's lecture you should be able to construct an ER diagram:
    Identify entities
    Identify relationships
    Determine attributes
    Select keys
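
The four case-study entities and their keys can be sketched as plain data records. This is a minimal Python illustration under stated assumptions, not the lecture's own implementation: the snake_case names, the choice to store relationships as key references, and the sample data are all hypothetical. It also assumes one reading of the multiplicities (each position has exactly one specifying company and at most one candidate filling it) and treats "matching" as a candidate holding every essential qualification; both would need confirming with stakeholders.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class Qualification:
    qualification_code: str  # {PK}
    description: str


@dataclass
class Candidate:
    candidate_id: str  # {PK}
    name: str
    address: str
    contact_no: str
    # "has": codes of the qualifications held (may be empty).
    qualification_codes: set[str] = field(default_factory=set)


@dataclass
class Company:
    name: str  # {PK}
    address: str


@dataclass
class Position:
    position_no: int  # {PK}: artificial key, the case study gives no natural one
    start_date: date
    end_date: date
    hourly_rate: float
    company_name: str  # "specifies": refers to Company.name (assumed exactly one)
    # "requiredFor": codes of the essential qualifications.
    required_codes: set[str] = field(default_factory=set)
    # "fills": assumed at most one candidate; None while the position is open.
    filled_by: Optional[str] = None


def eligible(position: Position, candidates: list[Candidate]) -> list[str]:
    """Return ids of candidates who hold every required qualification."""
    return [
        c.candidate_id
        for c in candidates
        if position.required_codes <= c.qualification_codes
    ]


# Hypothetical sample data.
candidates = [
    Candidate("C1", "Ann", "1 First St", "555-0101", {"FORK", "FIRSTAID"}),
    Candidate("C2", "Bob", "2 Second St", "555-0102", {"FIRSTAID"}),
]
position = Position(1, date(2010, 8, 2), date(2010, 8, 6), 25.0, "Acme", {"FORK"})
print(eligible(position, candidates))  # ['C1']
```

Storing relationships as key references (a company name, a set of qualification codes) mirrors how the primary keys chosen in step 4 would later be used to link records in a relational design.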