PHASE 2 : SYSTEM ANALYSIS

LESSON 4
DATA MODELING

INTRODUCTION

During the requirements modeling process described in the previous lesson, we used fact-finding techniques to investigate the current system and identify user requirements for the proposed system. The next activity after determining and gathering requirements for the proposed system is data and process modeling. This lesson continues with data modeling. A logical data model shows what the system should do, regardless of how it will be implemented physically. A data model is a representation of organizational data.

This lesson consists of four sections:
- an overview of data modeling
- modeling entities
- modeling attributes
- modeling relationships

LEARNING OUTCOMES

At the end of this lesson, students should be able to:
- explain the importance of data modeling
- identify several types of entities and how to represent them
- identify several types of attributes and how to represent them
- identify three types of relationship degree

TERMINOLOGY

No. Word: Definition
1. Attribute: Data that represents characteristics of interest about an entity.
2. Data modeling: A data-centered technique used for organizing and documenting a system’s data.
3. Entity: A class of persons, places, objects, events or concepts about which we need to capture and store data.
4. Entity relationship diagram: A data model that uses several notations to depict data in terms of entities, attributes and relationships.
5. Relationship: A natural business association between one or more entities.

4.1 OVERVIEW OF DATA MODELING

[Figure 4-1: Data Modeling Activities in the System Analysis Phase. Phases shown: System Planning, System Analysis, System Design, System Implementation, System Maintenance. Activities shown: Requirements Gathering, Data Modeling (marked √), Process Modeling.]

Data and process are important elements that need to be clearly defined when developing a system.
This is because data and process modeling techniques are used to develop a logical model of the proposed system and to document the entire system requirements. Figure 4-1 shows that the data modeling activity should be carried out after determining and gathering requirements.

Data modeling is a technique for defining business requirements for a database. It is sometimes called database modeling because a data model is eventually implemented as a database. There are several notations for data modeling. The most popular is the Entity-Relationship Diagram (ERD). An ER diagram is a data model that uses several notations to depict data in terms of entities, their attributes, and the relationships described by that data. There are several notations for ERDs, for example the Chen Model, Martin Model, and Bachman Model. Figure 4-2 shows the notation used in drawing an ERD.

[Figure 4-2: ER-Diagram Notation. Symbols shown: Entity, Relationship, Identifier Attribute, Attribute, Multivalued Attribute.]

4.2 MODELING ENTITIES

All systems must have data. Data describe “things”. A university’s system has data that describe things such as STUDENTS, LECTURERS, COURSES, and ROOMS. For each of these things, consider what data describe any given instance of it. For example, the data that describe a student might include NAME, MATRIC NUMBER, ADDRESS, TELEPHONE NUMBER, GENDER, PROGRAMME, GRADE and many more.

An entity is a class of persons, places, objects, events or concepts about which we need to capture and store data. An entity is represented by a rectangle, as in Figure 4-2 above.

4.2.1 Entity Type versus Entity Instance

It is important to differentiate between an entity type and an entity instance. An entity type is a collection of entities that share the same properties, while an entity instance is a single occurrence of an entity type. An entity represents all instances of the named entity. STUDENT is an entity, and STUDENT A and STUDENT B are instances of the entity STUDENT.
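The distinction between an entity type and its instances can be sketched in code. The example below is only an illustration in Python, not part of the ER notation: the class plays the role of the entity type STUDENT, and each object created from it plays the role of an entity instance. The field names and the sample values (such as "A001") are assumptions made up for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Student:
    """Entity type: describes the properties every student instance shares."""
    matric_number: str  # hypothetical sample attribute
    name: str


# Entity instances: single occurrences of the STUDENT entity type.
student_a = Student(matric_number="A001", name="Student A")
student_b = Student(matric_number="B002", name="Student B")

# Both instances belong to the same entity type but hold different data.
same_type = type(student_a) is type(student_b)
different_instances = student_a != student_b
```

Note that the class is named in the singular (Student), in the same way that an entity is labelled STUDENT rather than STUDENTS.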
The STUDENT entity drawn as a shape represents all students in the system. However, the entity name written in the shape should be singular, for example STUDENT and not STUDENTS. Figure 4-3 shows an example of the entities STUDENT and LECTURER.

[Figure 4-3: An Entity. Entities shown: STUDENT, LECTURER.]

There are several types of entities, such as:
- PERSONS: STUDENT, LECTURER, ADMINISTRATOR, SUPERVISOR, DEPARTMENT, DIVISION. A PERSON entity represents individuals, groups or organizations.
- PLACES: ROOM, CLASSROOM, BUILDING, REGION, BRANCH, CAMPUS.
- OBJECTS: BOOK, PRODUCT, VEHICLE, LOCKER, COUPON.
- EVENTS: APPLICATION, REGISTRATION, INVOICE, FLIGHT SCHEDULE, ORDER.
- CONCEPTS: ACCOUNT, BOND, COURSE, FUND, QUALIFICATION, STOCK.

4.2.2 Strong Entity versus Weak Entity Types

A strong entity type is an entity that exists independently of other entity types. Examples of strong entity types are STUDENT, LECTURER, and COURSE. A weak entity type is an entity type whose existence depends on some other entity. An example of a weak entity is PARENT: STUDENT is a strong entity and PARENT is a weak entity, because a PARENT only exists when the STUDENT exists. In Figure 4-4, the double-lined shape indicates that PARENT is a weak entity, and a double line is also used to represent the relationship.

[Figure 4-4: Strong Entity vs. Weak Entity. Shown: STUDENT Has PARENT.]

4.2.3 Supertypes and Subtypes

It is common to have two or more entity types that are similar but differ in a few respects. They share some properties, but each has one or more distinct attributes or relationships. The E-R diagram models this situation with supertype and subtype entities. A supertype is a generic entity type that has a relationship with one or more subtypes. A subtype is a subgrouping of the entities in an entity type that is meaningful to the organization and that shares common attributes or relationships distinct from other subgroupings (Hoffer et al., 2005).
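One rough way to picture a supertype and its subtypes is class inheritance. The sketch below is an illustration in Python under stated assumptions, not the ER notation itself: the subtype names follow the STUDENT example of this lesson, while the attribute placement (majoring on the undergraduate subtype) and the research_topic attribute are hypothetical and chosen only to show that each subtype carries something the other does not.

```python
from dataclasses import dataclass


@dataclass
class Student:
    """Supertype: attributes common to every student."""
    student_id: str
    name: str


@dataclass
class Undergraduate(Student):
    """Subtype: inherits the common attributes and adds its own (assumed)."""
    majoring: str


@dataclass
class Postgraduate(Student):
    """Subtype with a different distinguishing attribute (hypothetical)."""
    research_topic: str


u = Undergraduate(student_id="U100", name="Student A", majoring="Computer Science")
p = Postgraduate(student_id="P200", name="Student B", research_topic="Data Modeling")
```

Both objects are students and share student_id and name, but only the undergraduate has majoring and only the postgraduate has research_topic.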
[Figure 4-5: Supertype Entity and Subtype Entity. Labels shown: STUDENT, Categorized by, POSTGRADUATE, UNDERGRADUATE, Program_type, Majoring.]

4.2.4 Exercises

Answer TRUE or FALSE for each of the questions below.

1. An entity is something about which the business needs to store data, such as persons, places, objects, events or concepts.
Answer: TRUE
2. An entity existence is a single occurrence of an entity.
Answer: FALSE
3. Student Identification Number, Student Name and Student Address are entity types.
Answer: FALSE
4. A weak entity type is an entity type whose existence depends on some other entity.
Answer: TRUE
5. A supertype is a generic entity type that has a relationship with one or more subtypes.
Answer: TRUE

4.3 MODELING ATTRIBUTES

Every entity has its own pieces of data. We need to identify the specific pieces of data we want to store about each instance of a given entity. These are called attributes. An attribute is a descriptive property or characteristic of an entity. In the previous section, data about a student such as STUDENT_IDENTIFICATION_NUMBER, NAME, ADDRESS, TELEPHONE NUMBER, GENDER, PROGRAMME and GRADE are attributes. An attribute is represented by an oval containing its name, as in Figure 4-6.

[Figure 4-6: Representing Attributes. Entity STUDENT with attributes Student ID, Name, Address, Telephone Number, Gender, Program, Majoring.]

4.3.1 Required versus optional attribute

An attribute that MUST have a value is called a required attribute. It must contain a value and cannot be left empty. An example is the attribute STUDENT_IDENTIFICATION_NUMBER. Every student has an identification number and it is compulsory to have one, so this value cannot be NULL. An optional attribute, in contrast, is an attribute that may not have a value. An example of an optional attribute is MAJORING, because not all students have chosen their major yet.
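The difference between a required and an optional attribute can be sketched in code. The following is a minimal Python illustration, assuming that None stands in for NULL; the validation rule and the sample values are assumptions for this sketch and are not prescribed by the ER model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Student:
    student_identification_number: str  # required: must always hold a value
    majoring: Optional[str] = None      # optional: may be left empty (NULL)

    def __post_init__(self) -> None:
        # Reject a missing or blank value for the required attribute.
        if not self.student_identification_number:
            raise ValueError("STUDENT_IDENTIFICATION_NUMBER is required")


# A new student who has not chosen a major yet: majoring stays empty.
new_student = Student(student_identification_number="S001")

# A student who has already declared a major.
declared = Student(student_identification_number="S002",
                   majoring="Software Engineering")
```

Creating a Student with an empty identification number raises ValueError, while leaving majoring out is accepted.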
In an ER diagram, specific symbols are used to distinguish required and optional attributes.

4.3.2 Simple versus composite attribute

An attribute can be either a simple attribute or a composite attribute. A simple attribute is an attribute that cannot be broken down into more detailed parts. It is also called an atomic attribute. Examples of simple attributes are STUDENT_IDENTIFICATION_NUMBER, BOOKING_ID and COURSE_CODE. Some attributes, on the other hand, can be broken down into several components. These are called composite attributes. An example of a composite attribute is ADDRESS, which can be divided into HOUSE_NUMBER, STREET_NUMBER, CITY, STATE and POSTCODE. Figure 4-7 shows how to represent the composite attribute ADDRESS.

[Figure 4-7: Representing Simple vs. Composite Attribute. Entity STUDENT with attributes Student Number, Name, Program, Majoring, Gender, Telephone Number, and the composite attribute Address (House Number, Street Number, City, State, Postcode).]

4.3.3 Single valued vs. multivalued attribute

So far, we have assumed that each attribute has only one value. For example, one STUDENT must be in one program. An attribute that has only one value is called a single-valued attribute. Sometimes, however, an attribute may have more than one value. A multivalued attribute is an attribute that can have more than one value. For example, one STUDENT may have more than one telephone number, so the attribute TEL_NUM is a multivalued attribute. Figure 4-8 shows how to represent a multivalued attribute.

[Figure 4-8: Representing Single-valued vs. Multivalued Attribute. Entity STUDENT with attributes Student Number, Name, Program, Majoring, Gender, Telephone Number, and Address (House Number, Street Number, City, State, Postcode).]

4.3.4 Exercises

Answer TRUE or FALSE for each of the questions below.

1. An attribute is a descriptive property or characteristic of an entity.
Answer: TRUE
2. DEPART_ID, DEPART_NAME and DEPART_HEAD can be classified as attributes of the entity DEPARTMENT.
Answer: TRUE
3. A required attribute can be NULL.
Answer: FALSE
4. STUDENT NAME is a composite attribute that can be broken down into components such as FIRST_NAME, LAST_NAME and SURNAME.
Answer: TRUE
5. A multivalued attribute is an attribute that may have more than one value for each entity instance.
Answer: TRUE

4.4 MODELING RELATIONSHIP

Conceptually, entities and attributes do not exist in isolation. A relationship is a natural business association between one or more entities. The relationship represents an event that links the entities, or a logical affinity that exists between them. A relationship is represented by a diamond shape labelled with a verb phrase. For example, Figure 4-9 shows a relationship between the entities STUDENT and COURSE.

[Figure 4-9: Representing Relationship between Entity STUDENT and COURSE. Shown: STUDENT registered COURSE.]

We can make the following business assertions that link STUDENT and COURSE:
- A STUDENT is registered in one or more COURSES.
- A COURSE is being registered by zero, one or more STUDENTS.

In the above example, the phrases IS REGISTERED and IS BEING REGISTERED define the business relationships that exist between the two entities.

4.4.1 Degree of a relationship

Another measure of the complexity of a data relationship is its degree. The degree is the number of entities that participate in a relationship. The example in Figure 4-9 has degree 2, because two entities are involved in the relationship. There are several types of relationship degree, such as:
- unary relationship: also known as a recursive relationship. It exists when an entity has a relationship with itself.
- binary relationship: exists when two different entities participate in the relationship.
- ternary relationship: a relationship that involves three different entities.
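The idea of degree can be sketched as a small calculation: count the distinct entity types that take part in a relationship. The Python example below is an illustration under that assumption; the Relationship class and the DEGREE_NAMES table are hypothetical helpers invented for this sketch, and the sample relationships reuse the entities STUDENT, COURSE and LECTURER from this lesson.

```python
from dataclasses import dataclass
from typing import Tuple

# Names for the three degrees covered in this lesson.
DEGREE_NAMES = {1: "unary", 2: "binary", 3: "ternary"}


@dataclass(frozen=True)
class Relationship:
    name: str
    participants: Tuple[str, ...]  # one entity type name per participating role

    def degree(self) -> int:
        # Degree = number of distinct entity types in the relationship.
        return len(set(self.participants))

    def degree_name(self) -> str:
        return DEGREE_NAMES.get(self.degree(), f"degree {self.degree()}")


# A STUDENT related to another STUDENT: only one entity type is involved.
handle = Relationship("handle", ("STUDENT", "STUDENT"))
register = Relationship("register", ("STUDENT", "COURSE"))
involved = Relationship("involved", ("STUDENT", "COURSE", "LECTURER"))

for rel in (handle, register, involved):
    print(rel.name, "->", rel.degree_name())
```

The unary case lists STUDENT twice because the same entity type plays both roles, which is why the distinct entity types are counted rather than the roles.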
[Figure 4-10: Degree of Relationship. Unary relationship: STUDENT handle STUDENT. Binary relationship: STUDENT register COURSE. Ternary relationship: STUDENT, COURSE and LECTURER linked by involved.]

4.4.2 Cardinalities in Relationship

Cardinality is the minimum and maximum number of occurrences of one entity that may be related to a single occurrence of the other entity. It is a measure of the complexity of each relationship. The minimum cardinality of a relationship is the minimum number of instances of one entity that may be associated with each instance of another entity, while the maximum cardinality of a relationship is the maximum number of instances of one entity that may be associated with each instance of another entity. Using the previous example in Figure 4-9, we can answer the following questions:

- Must there exist an instance of STUDENT for each instance of COURSE? NO
- Must there exist an instance of COURSE for each instance of STUDENT? YES
- How many instances of COURSE can exist for each instance of STUDENT? MANY
- How many instances of STUDENT can exist for each instance of COURSE? MANY

4.4.3 Exercises

Answer TRUE or FALSE for each of the questions below.

1. A relationship is a natural business association that exists between one or more entities.
Answer: TRUE
2. A relationship is represented by a diamond shape labelled with a verb phrase.
Answer: TRUE
3. A unary relationship is also referred to as a recursive relationship.
Answer: TRUE
4. A binary relationship exists when three different entities participate in the relationship.
Answer: FALSE
5. The minimum cardinality of a relationship is the minimum number of instances of entity B that may be associated with each instance of entity A.
Answer: TRUE

SUMMARY

This is the end of Lesson Four. In this lesson, we have learned:
- an overview of data modeling concepts
- modeling entities
- modeling attributes
- modeling relationships

In the next lesson, we will discuss the second type of modeling: process modeling.
Process modeling illustrates the processes or activities that are performed and how data moves among them in the system.

SELF ASSESSMENT

Fill in the blanks with the correct answer.

1. ____________________________ is a technique for defining business requirements for a database.
Answer: Data modeling
2. Data modeling is sometimes called ____________________________ because a data model is eventually implemented as a database.
Answer: database modeling
3. A(n) _________________________ is something about which the business needs to store data.
Answer: entity
4. A(n) ____________________________ shape is used to represent an entity.
Answer: rectangle
5. A(n) ________________________ is a piece of data that we want to store about each instance of a given entity.
Answer: attribute
6. The ___________________________ defines what values an attribute can legitimately take.
Answer: domain
7. A(n) _____________________________ is a natural business association that exists between one or more entities.
Answer: relationship
8. A(n) ____________________________ is also known as a recursive relationship. It exists when an entity has a relationship with itself.
Answer: unary relationship
9. ________________________________ is the minimum and maximum number of occurrences of one entity that may be related to a single occurrence of the other entity.
Answer: Cardinality
10. The ___________________________ of a relationship is the number of entity classes that participate in the relationship.
Answer: degree