Modern Database Management Systems
Chapter 2: Data Modeling and Data Models (Unit-1)
Sree Vidyanikethan Degree College K.Muni Sankar

Data Modeling:
Database design focuses on how the database structure will be used to store and manage end-user data. Data modeling, the first step in designing a database, refers to the process of creating a specific data model for a determined problem domain. A problem domain is a clearly defined area within the real-world environment, with well-defined scope and boundaries, that is to be systematically addressed.

Data Model:
A data model is a relatively simple representation, usually graphical, of more complex real-world data structures. In general terms, a model is an abstraction of a more complex real-world object or event. Within the database environment, a data model represents data structures and their characteristics, relations, constraints, transformations, and other constructs with the purpose of supporting a specific problem domain. The basic building blocks of all data models are entities, attributes, relationships, and constraints.

Data modeling is an iterative, progressive process. It starts with a simple understanding of the problem domain; as that understanding grows, so does the level of detail of the data model. The final data model is a “blueprint” containing all the instructions to build a database that will meet all end-user requirements. This blueprint is narrative and graphical in nature, meaning that it contains both text descriptions in plain, unambiguous language and clear, useful diagrams depicting the main data elements.

An implementation-ready data model should contain at least the following components:
- A description of the data structure that will store the end-user data.
- A set of enforceable rules to guarantee the integrity of the data.
- A data manipulation methodology to support the real-world data transformations.
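
The three components listed above can be illustrated with a small sketch. It assumes Python's built-in sqlite3 module and an invented customer table; the table name, column names, and the non-negative credit limit rule are hypothetical and chosen only for illustration.

```python
import sqlite3

# In-memory database, used only for illustration.
conn = sqlite3.connect(":memory:")

# Component 1: a description of the data structure (hypothetical table).
# Component 2: enforceable integrity rules (NOT NULL and CHECK).
conn.execute("""
    CREATE TABLE customer (
        cus_id           INTEGER PRIMARY KEY,
        cus_lname        TEXT NOT NULL,
        cus_credit_limit REAL NOT NULL CHECK (cus_credit_limit >= 0)
    )
""")

# Component 3: data manipulation that mirrors real-world changes.
conn.execute("INSERT INTO customer VALUES (1, 'Customer A', 500.0)")
conn.execute("UPDATE customer SET cus_credit_limit = 750.0 WHERE cus_id = 1")

# The integrity rule rejects data that violates it.
try:
    conn.execute("INSERT INTO customer VALUES (2, 'Customer B', -10.0)")
    rule_enforced = False
except sqlite3.IntegrityError:
    rule_enforced = True

rows = conn.execute(
    "SELECT cus_id, cus_lname, cus_credit_limit FROM customer"
).fetchall()
print(rule_enforced, rows)
```

Only the valid row is stored; the row with a negative credit limit is refused by the database itself rather than by application code.
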
Data Models and Basic Building Blocks:
The basic building blocks of all data models are entities, attributes, relationships, and constraints.

An Entity is anything (a person, a place, a thing, or an event) about which data are to be collected and stored. An entity represents a particular type of object in the real world. Because an entity represents a particular type of object, entities are “distinguishable”; that is, each entity occurrence is unique and distinct. For example, a CUSTOMER entity would have many distinguishable customer occurrences, such as John, Smith, Tom, and Pop. Entities may be physical objects, such as customers or products.

An Attribute is a characteristic of an entity. For example, a CUSTOMER entity would be described by attributes such as customer last name, customer first name, customer phone, customer address, and customer credit limit. Attributes are the equivalent of fields in file systems.

A Relationship describes an association among entities. For example, a relationship exists between customers and agents that can be described as follows: an agent can serve many customers, and each customer may be served by one agent.

A Constraint is a restriction placed on the data. Constraints are important because they help to ensure data integrity. Constraints are normally expressed in the form of rules.

Data models use three types of relationships:
1. One-to-Many Relationship (1:M or 1..*)
2. Many-to-Many Relationship (M:N or *..*)
3. One-to-One Relationship (1:1 or 1..1)

Database designers usually use the shorthand notations 1:M or 1..*, M:N or *..*, and 1:1 or 1..1, respectively. (Although the M:N notation is a standard label for the many-to-many relationship, the label M:M may also be used.) The following examples illustrate the distinctions among the three.

1. One-to-Many (1:M or 1..*) Relationship: A painter paints many different paintings, but each one of them is painted by only one painter. Thus, the painter (the “one”) is related to the paintings (the “many”). Therefore, database designers label the relationship “PAINTER paints PAINTING” as 1:M. (Note that entity names are often capitalized as a convention, so they are easily identified.) Similarly, a customer (the “one”) may generate many invoices, but each invoice (the “many”) is generated by only a single customer. The “CUSTOMER generates INVOICE” relationship would also be labeled 1:M.
2. Many-to-Many (M:N or *..*) Relationship: An employee may learn many job skills, and each job skill may be learned by many employees. Database designers label the relationship “EMPLOYEE learns SKILL” as M:N. Similarly, a student can take many classes and each class can be taken by many students, thus yielding the M:N relationship label for the relationship expressed by “STUDENT takes CLASS.”
3. One-to-One (1:1 or 1..1) Relationship: A retail company’s management structure may require that each of its stores be managed by a single employee. In turn, each store manager, who is an employee, manages only a single store. Therefore, the relationship “EMPLOYEE manages STORE” is labeled 1:1.

As noted above, a constraint is a restriction placed on the data, normally expressed in the form of a rule. For example:
- An employee’s salary must have values that are between 6,000 and 350,000.
- A student’s GPA must be between 0.00 and 4.00.
- Each class must have one and only one teacher.

Business Rules of the Organization:
A business rule is a brief, precise, and unambiguous description of a policy, procedure, or principle within a specific organization.
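
Before going further into business rules, the three relationship types and the salary constraint described above can be made concrete. The following is a minimal sketch, assuming Python's built-in sqlite3 module; the table and column names are hypothetical, and mapping M:N to a bridge table and 1:1 to a UNIQUE foreign key is one common relational approach rather than the only one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
    -- 1:M  PAINTER paints PAINTING: each painting points to one painter.
    CREATE TABLE painter (
        painter_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL
    );
    CREATE TABLE painting (
        painting_id INTEGER PRIMARY KEY,
        title       TEXT NOT NULL,
        painter_id  INTEGER NOT NULL REFERENCES painter(painter_id)
    );

    -- Constraint from the text: salary between 6,000 and 350,000.
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        salary REAL NOT NULL CHECK (salary BETWEEN 6000 AND 350000)
    );

    -- M:N  EMPLOYEE learns SKILL: a bridge table pairs employees and skills.
    CREATE TABLE skill (
        skill_id INTEGER PRIMARY KEY,
        name     TEXT NOT NULL
    );
    CREATE TABLE employee_skill (
        emp_id   INTEGER NOT NULL REFERENCES employee(emp_id),
        skill_id INTEGER NOT NULL REFERENCES skill(skill_id),
        PRIMARY KEY (emp_id, skill_id)
    );

    -- 1:1  EMPLOYEE manages STORE: UNIQUE stops one employee managing two stores.
    CREATE TABLE store (
        store_id   INTEGER PRIMARY KEY,
        manager_id INTEGER NOT NULL UNIQUE REFERENCES employee(emp_id)
    );
""")

conn.execute("INSERT INTO painter VALUES (1, 'Painter A')")
conn.executemany(
    "INSERT INTO painting VALUES (?, ?, ?)",
    [(1, "Study 1", 1), (2, "Study 2", 1)],
)
conn.execute("INSERT INTO employee VALUES (1, 'Employee A', 40000)")
conn.execute("INSERT INTO store VALUES (1, 1)")


def is_rejected(sql):
    """Return True if SQLite refuses the statement with an integrity error."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# A painting must belong to an existing painter (there is no painter 99).
orphan_painting = is_rejected("INSERT INTO painting VALUES (3, 'Study 3', 99)")
# Employee 1 already manages a store, so a second store is refused.
second_store = is_rejected("INSERT INTO store VALUES (2, 1)")
# A salary below 6,000 violates the CHECK constraint.
low_salary = is_rejected("INSERT INTO employee VALUES (2, 'Employee B', 100)")

paintings_by_painter_1 = conn.execute(
    "SELECT COUNT(*) FROM painting WHERE painter_id = 1"
).fetchone()[0]
print(orphan_painting, second_store, low_salary, paintings_by_painter_1)
```

Each rule is declared once in the table definitions, so every program that uses the database is held to the same constraints.
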
A business rule can apply to any organization, large or small (a business, a government unit, a religious group, or a research laboratory), that stores and uses data to generate information. Properly written business rules are used to define entities, attributes, relationships, and constraints. Business rules are derived from policies, procedures, events, functions, and other business objects, and they state constraints on the organization. Business rules are important in data modeling because they govern how data are handled and stored.

Overview of Business Rules:
A business rule is a statement that defines or constrains some aspect of the business. It is intended to assert business structure or to control or influence the behavior of the business. For example:
- “A student may register for a section of a course only if he or she has successfully completed the prerequisites for that course.”
- “A preferred customer qualifies for a 10% discount, unless he has an overdue account balance.”

The Business Rules Paradigm:
The concept of business rules has been used in information systems for some time. However, it has been more common to use the related term “integrity constraint” when referring to such rules.

Scope of Business Rules:
The business rules of interest here are those that impact an organization’s databases. Most organizations have a host of rules and/or policies that fall outside this definition. Some business rules cannot be represented in common data modeling notation.

Good Business Rules:
The following are the characteristics of a good business rule:
- Declarative: A business rule is a statement of policy. The rule does not describe a process or implementation, but describes what a process validates.
- Precise: Within the related organization, the rule must have only one interpretation among all interested people, and its meaning must be clear.
- Atomic: A business rule marks one statement, not several; no part of the rule can stand on its own as a rule.
- Consistent: A business rule must be internally consistent and must be consistent with other rules.
- Expressible: A business rule must be able to be stated in natural language, in a structured form that leaves no room for misinterpretation.
- Distinct: Business rules are not redundant, but a business rule may refer to other rules.
- Business oriented: A business rule is a statement of business policy, so it is stated in terms business people can understand.

Gathering Business Rules:
Business rules appear in descriptions of business functions, events, policies, units, stakeholders, and other objects. These descriptions can be found in interview notes from individual and group information systems requirements collection sessions, organizational documents, and other sources. Rules are identified by asking questions about the who, what, where, why, and how of the organization.

Data Names and Definitions:
Naming and defining data objects is fundamental to understanding and modeling data: data objects must be named and defined before they can be used unambiguously in a model of organizational data.

Data Names:
A data name is a name given to a data object such as an entity, a relationship, or an attribute. The following are general guidelines for naming any data object:
1. Data names should relate to business characteristics, not technical characteristics.
2. They should be meaningful.
3. They should be unique.
4. They should be readable.
5. They should be taken from an approved list of words.
6. They should be repeatable, in the sense that they are applied consistently.

Data Definitions:
A definition is an explanation of a term or a fact. A term is a word or phrase that has a specific meaning for the business. A fact is an association between two or more terms.

The Evolution of Data Models:

Types of Data Models:
The following are the major data models, in roughly chronological order:
1. The Hierarchical model
2. The Network model
3. The Relational model
4. The Entity Relationship model
5. The Object-Oriented model

Hierarchical Model:
1. The hierarchical model was developed in the 1960s to manage large amounts of data for complex manufacturing projects such as the Apollo rocket that landed on the moon in 1969.
2. Its basic logical structure is represented by an upside-down tree.
3. The hierarchical structure contains levels, or segments.
4. A segment is the equivalent of a file system’s record type.
5. Within the hierarchy, a higher layer is perceived as the parent of the segment directly beneath it, which is called the child.
6. The hierarchical model depicts a set of one-to-many (1:M) relationships between a parent and its children segments.
7. Each parent can have many children, but each child has only one parent.

Advantages of Hierarchical Model:
- Conceptual simplicity
- Database security and integrity
- Data independence
- Efficiency

Disadvantages of Hierarchical Model:
- Complex implementation
- Difficult to manage and lack of standards
- Lacks structural independence
- Applications programming and use complexity
- Implementation limitations

Network Model:
1. The network model was created to represent complex data relationships more effectively than the hierarchical model, to improve database performance, and to impose a database standard.
2. In the network model, the user perceives the network database as a collection of records in 1:M relationships.
3. However, unlike the hierarchical model, the network model allows a record to have more than one parent.
4. In network database terminology, a relationship is called a set.
5. Each set is composed of at least two record types, i.e., an owner record and a member record.
6. A set represents a 1:M relationship between the owner and the member.

Advantages of Network Model:
- Conceptual simplicity
- Handles more relationship types
- Data access flexibility
- Promotes database integrity
- Data independence
- Conformance to standards

Disadvantages of Network Model:
- System complexity: The network database model is too complex to serve as the basis for a user-friendly database management system.
- Absence of structural independence: If any changes are made to the database structure, then all the application programs need to be modified before they can access data. Even though the network database model succeeds in achieving data independence, it still fails to achieve structural independence.

Relational Model:
1. The relational data model was first introduced in 1970 by E. F. Codd.
2. The relational data model represents data in the form of tables (relations).
3. The relational model is based on mathematical theory and therefore has a solid theoretical foundation.
4. Each row in a relation is called a tuple. Each column represents an attribute.
5. The relational model uses tables to organize data elements. Each table corresponds to an application entity and each row represents an instance of that entity.
6. The relational model is implemented through sophisticated relational database software such as Oracle, DB2, Microsoft SQL Server, and MySQL, as well as other mainframe relational software.
7. The most important advantage of the RDBMS is its ability to hide the complexities of the relational model from the user.
8. The relational data model consists of three components: data structure, data manipulation, and data integrity.
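
The relational ideas in the list above (a relation as a table, tuples as rows, attributes as columns) can be seen in a short sketch. It assumes Python's built-in sqlite3 module; the agent table, its columns, and its sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A relation (table) for a hypothetical AGENT entity.
conn.execute("""
    CREATE TABLE agent (
        agent_code  INTEGER PRIMARY KEY,
        agent_lname TEXT NOT NULL,
        agent_phone TEXT
    )
""")
conn.executemany(
    "INSERT INTO agent VALUES (?, ?, ?)",
    [
        (501, "Alby", "555-0101"),
        (502, "Hahn", "555-0102"),
        (503, "Okon", "555-0103"),
    ],
)

cursor = conn.execute("SELECT * FROM agent ORDER BY agent_code")
# Each column of the relation is an attribute.
attributes = [col[0] for col in cursor.description]
# Each row is a tuple: one instance of the entity.
tuples = cursor.fetchall()

# The query states WHAT is wanted; the RDBMS decides HOW to retrieve it.
matches = conn.execute(
    "SELECT agent_lname FROM agent WHERE agent_code > ? ORDER BY agent_code",
    (501,),
).fetchall()
print(attributes, tuples, matches)
```

The last query is an example of the RDBMS hiding complexity from the user: nothing in it says how the rows are stored or located.
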

RDBMS Terminology:
Formal relational term    Informal equivalent
Relation                  Table
Tuple                     Row, record
Cardinality               Number of rows
Attribute                 Column, field
Degree                    Number of columns
Primary key               Unique identifier
Domain                    Set of legal values

Advantages of Relational Model:
- Structural independence: Changes in the database structure do not affect data access.
- Conceptual simplicity: The relational data model frees the designer from the physical data storage details, so designers can concentrate on the logical view of the database.
- Ease of design, implementation, maintenance, and usage.
- Ad hoc query capability: The presence of a very powerful, flexible, and easy-to-use query capability is one of the main reasons for the popularity of the relational database model.

Disadvantages of Relational Model:
- Hardware overheads: To make things easier for users, relational database systems need more powerful computers and data storage devices.
- Ease of design can lead to bad design.

The Entity Relationship Model:
1. An E-R model is a detailed logical representation of the data for an organization.
2. The basic constructs of the entity-relationship model are entities, attributes, and relationships.
3. The E-R model is a high-level conceptual data model expressed in terms of entities in the business environment, the relationships among those entities, and the attributes of both the entities and their relationships.
4. An E-R model is normally expressed as an entity-relationship diagram, which is a graphical representation of an E-R model.
5. Peter Chen first introduced the ER data model in 1976; it was the graphical representation of entities and their relationships in a database.
6. ER models are normally represented in an entity relationship diagram (ERD), which uses graphical representations to model database components.
7. ERD notations include the original Chen notation and the more current Crow’s Foot notation.

Advantages of E-R Model:
- Exceptional conceptual simplicity
- Visual representation
- Effective communication tool
- Integrated with the relational database model

Disadvantages of E-R Model:
- Limited constraint representation
- Limited relationship representation
- No data manipulation language
- Loss of information content

Object-Oriented Data Model (OODM):
Relational database technology failed to handle the needs of complex information systems. A data model consists of:
1. Static properties such as objects, attributes, and relationships.
2. Integrity rules over objects and operations.
3. Dynamic properties.

The main features of the OODM are:
1. In the object-oriented data model (OODM), both data and their relationships are contained in a single structure known as an object.
2. In turn, the OODM is the basis for the object-oriented database management system (OODBMS).
3. The object-oriented model represents an entity as a class.
4. Unlike an entity, an object includes information about relationships between the facts within the object, as well as information about its relationships with other objects.
5. Object-oriented data models are typically depicted using Unified Modeling Language (UML) class diagrams.
6. The OO data model is based on the following components:
   - An object is an abstraction of a real-world entity. In general terms, an object may be considered equivalent to an ER model’s entity. More precisely, an object represents only one occurrence of an entity.
   - Attributes describe the properties of an object.
   - Objects that share similar characteristics are grouped in classes. A class is a collection of similar objects with shared structure (attributes) and behavior (methods).
   - Classes are organized in a class hierarchy. The class hierarchy resembles an upside-down tree in which each class has only one parent.
   - Inheritance is the ability of an object within the class hierarchy to inherit the attributes and methods of the classes above it.

Advantages of Object-Oriented Data Model:
- Capability to handle a large number of different data types.
- Combination of object-oriented programming and database technology.
- Object-oriented features improve productivity.
- Improved data access.

Disadvantages of Object-Oriented Data Model:
- Difficult to maintain.
- Not suited for all applications.

Extended Relational Data Model:
The extended relational data model (ERDM) adds many of the OO model’s features within the inherently simpler relational database structure. The ERDM gave birth to a new generation of relational databases supporting OO features such as objects (encapsulated data and methods), extensible data types based on classes, and inheritance. That is why a DBMS based on the ERDM is often described as an object/relational database management system (O/R DBMS).

[Figure: Evolution of Data Models]

Degree of Abstraction:
In the early 1970s, the American National Standards Institute (ANSI) Standards Planning and Requirements Committee (SPARC) defined a framework for data modeling based on degrees of data abstraction. The ANSI/SPARC architecture (as it is often referred to) defines three levels of data abstraction:
1. External
2. Conceptual
3. Internal

External Model:
The external model is the end user’s view of the data environment. The term end users refers to people who use the application programs to manipulate the data and generate information. A specific representation of an external view is known as an External Schema.
An external schema includes the appropriate entities, relationships, procedures, and constraints imposed by the business unit.

The Conceptual Model:
The conceptual model represents a global view of the entire database. It is a representation of data as viewed by the entire organization. That is, the conceptual model integrates all external views into a single global view of the entire data in the enterprise, known as a Conceptual Schema. The conceptual schema is the basis for the identification and high-level description of the main data objects. The conceptual model is independent of both software and hardware. “Software independence” means that the model does not depend on the DBMS software used to implement the model. “Hardware independence” means that the model does not depend on the hardware used in the implementation of the model.

The Internal Model:
The internal model is the representation of the database as seen by the DBMS. It requires the designer to match the conceptual model’s characteristics to those of the selected implementation model. An internal schema depicts a specific representation of an internal model, using the database constructs supported by the chosen database. The internal model is software dependent and hardware independent.

The Physical Model:
The physical model operates at the lowest level of abstraction, describing the way data are saved on storage media. It requires the definition of both the physical storage devices and the access methods required to reach the data within those storage devices, making it both software and hardware dependent.

To illustrate the meaning of data abstraction, consider the example of automotive design.
1. A car designer begins by drawing the concept of the car that is to be produced.
2. Next, engineers design the details that help transfer the basic concept into a structure that can be produced.
3. Finally, the engineering drawings are translated into production specifications to be used on the factory floor.
4. As you can see, the process of producing the car begins at a high level of abstraction and proceeds to an ever-increasing level of detail.
5. The factory floor process cannot proceed unless the engineering details are properly specified, and the engineering details cannot exist without the basic conceptual framework created by the designer.
6. Designing a usable database follows the same basic process.
7. That is, a database designer starts with an abstract view of the overall data environment and adds details as the design comes closer to implementation.
8. Using levels of abstraction can also be very helpful in integrating multiple (and sometimes conflicting) views of data as seen at different levels of an organization.
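
The idea of several external views over one shared schema, described in the External Model and Conceptual Model sections, can be sketched with SQL views. This is a simplified illustration, assuming Python's built-in sqlite3 module: the employee table stands in for the organization-wide schema, and the two views, the business units they serve, and all names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One shared table standing in for the organization-wide schema.
    CREATE TABLE employee (
        emp_id   INTEGER PRIMARY KEY,
        emp_name TEXT NOT NULL,
        dept     TEXT NOT NULL,
        salary   REAL NOT NULL
    );

    -- External view for a staff directory: no salary data is exposed.
    CREATE VIEW directory_view AS
        SELECT emp_name, dept FROM employee;

    -- External view for a payroll unit: only identifiers and salaries.
    CREATE VIEW payroll_view AS
        SELECT emp_id, salary FROM employee;

    INSERT INTO employee VALUES (1, 'Employee A', 'Sales', 40000);
    INSERT INTO employee VALUES (2, 'Employee B', 'Stores', 52000);
""")


def columns(relation):
    """Return the column names a user of the given table or view can see."""
    cursor = conn.execute(f"SELECT * FROM {relation}")
    return [col[0] for col in cursor.description]


directory_cols = columns("directory_view")
payroll_cols = columns("payroll_view")
directory_rows = conn.execute(
    "SELECT * FROM directory_view ORDER BY emp_name"
).fetchall()
print(directory_cols, payroll_cols, directory_rows)
```

Both views draw on the same stored data, so each business unit sees only what it needs while the underlying definition is maintained in one place.
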