Enhanced Entity Relationship model (EER) Origin of ERD – Chen 76. 1. Motivation Develop a data-model that is close to reality. Has a real world ontology (compare with the relational model). Used mainly for database design. Recent motivation: data integration. 1 2. Intuitive presentation: Example – an EERD for a medical clinic. name ID Employee phone (1,1) (0,1) Department dep-name manages x start-date license-no specialization (1,1) works-for (0,N) address ID name supervises (0,1) degree supervision supervised-by Physician (0,N) (1,N) name Patient Nurse (1,1) job-description (1,1) address phone (1,1) School city patient visits (1,2) treating date study physician schedules (0,N) (2,3) (1,20) date Visit diagnosis Profession code (0,N) description name Schedule day 2 (0,N) from-time med-code no-of-days (0,N) to-time Medicine prescriptions qty-per-day Ontology – the world view: The real world portion captured by the knowledge model (EER schema in our case). Named finite sets of entities – the entity types (classes). An entity type is identified by its name (a constraint). Named relations on the sets of entities – the relationship types. Each relation has an arity. Each entity set participating in a relation has a role name (not always given). A relationship type is identified by its name (a constraint). The entities and the related entities (tuples) might have named attributes. An attribute assigns values to the entities/tuples in an entity/relationship set. An attribute of an entity/relationship type is identified by its name (a constraint). Attributes are named functions (mappings) from entity/relationship types to value domains. The names are unique for a type. Attribute classification: 1. Single / multi valued. The values of a multivalued attribute are sets (the value domain is a set of sets). 2. Simple / composite. A composite attribute is associated with a sequence of attributes. Its values are tuples of the values of the attributes in the sequence. Restriction: The association is acyclic (a constraint). 3 An entity type might contain other entity types – the generalization/sub-typing relations. Restrictions: 1. The sub-typing structure is acyclic (a constraint). 2. An entity type is directly contained in at most a single entity type. Questions that are left open: 1. Are the entity types that are not related by subtyping (directly or indirectly) disjoint? 2. Can two named entity types be “a single set of entities with two names”? 3. Can two named relationship types be “a single relation with two names”? Other questions …? 4. 4 1. Constraints: Key constraints: One. Most entity types have an attribute (or attributes) whose values identify the entities. This attribute(s) is called a key attribute. The key is a 1:1 function from the entity set to the value domain of the key attributes (either the key attribute is a 1:1 function, or the key attributes combination forms a 1:1 tuple function). Such entity types are called strong. Entity types that are not strong are called weak. Two. Weak entity types: The entities in entity types that are not strong are identified by their relationships with entities of other entity types. Each weak entity type E is associated with binary identifying relationship types, that relate E with its owner entity types. The key constraint for E is: 1. Every identifying relationship type relates an E entity to a single owner entity. 2. Some attributes of a weak entity type may be marked as partial key attributes. 3. Every E entity is uniquely identified by its owner entities and its partial key values. The combination of the owner entity mapping and the partial key attributes combination forms a 1:1 function. Questions that are left open – are there restrictions on ownership structures: 1. Can a single entity type function as a multiple owner type? 2. Are cyclic ownership structures legal? 3. Are ownership structures necessarily directed trees? Three. Relationship types might also have key attributes. 5 2. Cardinality constraints – impose number restrictions on relationship types. Some versions: 1. For every physician, the number of different visits related through treating is limited between 1 to 20. Entity look-across cardinality constraint. 2. a. Every profession must be represented in the clinic. Two. Every physician must have studied some profession in some school. Entity participation cardinality constraint. a. Every physician that studied a profession, must have studied it in at most 2 schools. b. Every physician that studied in some school, must have studied at least 2 professions and no more than 3 professions. Relationship look-across cardinality constraint. 3. a. If a school is represented in the clinic, then it must be related to at least 2 physician-profession entity tuples. b. If a medicine is prescripted, it must be prescripted at least 5 times and no more than 10 times. Relationship participation cardinality constraint. 4. 6 Is that enough? Is an ontological specification of a data model sufficient? YES – for human (manual) manipulation: Conceptual database design. Manual software specification. Maintenance aid. Client-modeler communication. Requirements specification. Needed – specification of visual language: Labeled rectangles. Labeled diamonds – connected by lines to rectangles. role name labels. cardinality labels. Labeled circles – each connected by a line to a rectangle or to a diamond or to another circle. Labeled dashed rectangles -connected by directed lines to dashed diamonds. Lines between rectangles. + the constraints. 7 For automatic checking of the syntactical correctness of an EER diagram: Formal visual grammar – Spatial Relations Specification (SRS): Symbols: rectangles, diamonds, circles, lines, texts, arrows, dashed rectangles, dashed diamonds. Sentences: R-E, E-E, A-A, E-A, CardC, NamedR, … Definitions of sentences: R-E := line & at-least 4 touching & exactly-one touching CardC & exactly-one touching Rectangle & exactly-one touching Diamond & exactly-one touching NamedR. Formal specification of the constraints. 8 Is that enough? Is an ontological specification of a data model sufficient? NO – for automatic processing: Software synthesis (e.g., Rapsody). Automatic conceptual database design. Schema reasoning: Satisfiability, correction, optimization, equivalence and transformations. Data integration, reverse engineering. Model extensions: Concrete domains – e.g., Time. Hierarchy. Dynamic enhancements: Consistency enforcement. Add active elements like triggers. 9 Needed: Abstract (logical) Syntax Specification (ASS) – specifies syntactically correct schemas. Declarative semantics – the ontological denotation (e.g., a database instance) of the syntactical schema. Consistency specification – constrains the legal denotations (which database instances are legal). A schema is consistent (satisfiable) if it has a consistent denotation. 10 Knowledge -- Visual Specification A Diagramatic visual language: Collection of diagrams. A diagram: Graphical objects with well-defined meaning. USER: Physical layout of a diagram + meaning of the diagram. Implementation: Needs connection between the 2 representations: Spatial relations specification (SRS) – describes the diagram. Abstract syntax specification (ASS). PRODUCTION APPLICATIONS create create represents Spatial Relations Specification Abstract Syntax Specification Represented by Graphical scanning Constraint solving interpretation Physical layout Physical layout: lines, circles, colors, exact positions, line kinds, text fragments, … Spatial relations specification (SRS): replace individual properties of objects by spatial relations between objects, like: touches, contains, labels, left-of. Defines the pictorial structure of the diagram. Forms a constraint system that specifies the correctness of the physical layout. Abstract syntax specification (ASS): Diagram content is abstracted from the visual representation (like the EER specification). Specifies logical constraints. Each construct in the SRS represents a construct(s) in the ASS, and each construct in the ASS is represented by a construct(s) in the SRS. Meaning: The ontology described by the diagram. Reference: K. Marriott and B. Meyer: Visual Language Theory, Springer-Verlag, 1998. 11 Ontology