Database Design Foundations 8/09) FoundationsofofInformation InformationManagement Management(WS (WS200 2008/09) – Chapter 4 – ? © 2008 Prof. Dr. Rainer Manthey Principles Principlesof ofDatabase DatabaseDesign Design Foundations of IM 1 Design of DB applications Relative costs for compensating the consequences of errors: 11 Database development Requirements Requirementsanalysis analysis 10 10 Design Design 100 100 Implementation Implementation >> >>100 100 Usage © 2008 Prof. Dr. Rainer Manthey Foundations of IM Topic of this chapter 2 DB-Design: Phases Application area in the "real world" Conceptual design Design documentation Conceptual level Logical design Logical structure „on paper" Logical level Physical design Physical structure in storage © 2008 Prof. Dr. Rainer Manthey Physical level Foundations of IM 3 Conceptual design Application area in the "real world" Formally Formallydefine definewhat whatyou youknow know about the world “ without about„„the world“ withoutthinking thinking ofofcomputers yet)! computersand anddatabases databases((yet)! Conceptual design Design documentation Conceptual level Logical design Logical structure „on paper" Logical level Physical design Physical structure in storage © 2008 Prof. Dr. Rainer Manthey Physical level Foundations of IM 4 The Entity-Relationship model • Entity-Relationship data model (ER model): • Proposed in 1976 in a paper by Peter Chen • Graphical notation for application modeling (ER diagrams) • Independent semantic data model (aiming at the meaning of concepts in real world) • Predecessor of today‘s object-oriented data models • Extremely successful as a means of „pre-design“ of relational DBs • The ER model offers few very simple and basic concepts: • Entities (objects), characterized by attributes (properties) • Binary or n-ary relationships between entities, possibly characterized by attributes as well • Often not mentioned explicitly, but important and basic: Values: printable symbols as values of attributes; play a subordinate role (characterizing objects) • Roles: Names for the special meaning an entity has within a relationship © 2008 Prof. Dr. Rainer Manthey Foundations of IM 5 An entity Each entity is completely characterized by the values of all its attributes. an anentity entity its itsattributes attributes 765 persNr firstName name age "Rainer" "Manthey" 53 topic rank "Informatik" "C3" their theirvalues values © 2008 Prof. Dr. Rainer Manthey Foundations of IM 6 Two similar entities persNr firstName name age 765 "Rainer" "Manthey" topic 53 rank "Informatik" "C3" another anotherentity entity persNr firstName name age ?? „Armin B." „Cremers" 60 topic attributes identicalattributes rank identical "Informatik" "C4" partially partiallydifferent differentvalues values © 2008 Prof. Dr. Rainer Manthey Foundations of IM 7 Entity types • Similar entities can be combined into entity types. • „Similarity" requires at least identical attribute structure. (Attribute names and corresponding value domains are identical.) • Entity types are graphically represented by rectangles. Attributes label the line connecting an entity type and a value domain (often symbolized by an oval): professor persNr firstName name age Int String © 2008 Prof. Dr. Rainer Manthey String Name of the type topic Int Foundations of IM rank String String 8 Entity types: Alternative representations • Domain names are often omitted (in case they are obvious or irrelevant). Instead attributes are placed inside ovals: professor persNr firstName name age topic rank • In larger diagrams, the attribute structure is often entirely omitted in order to save space (or is written down in abbreviated form only): professor persNr, firstName, name, . . . © 2008 Prof. Dr. Rainer Manthey Foundations of IM 9 Instances and population of an entity type • Each entity „belongs to" at least one entity type: It is called an instance of this type. an instance of professor • The set of all current instances of an entity type is calles its current population. a population of © 2008 Prof. Dr. Rainer Manthey Foundations of IM professor 10 Multiple classification • One and the same entity can be an instance of various entity types. professor instance of instance of fan of Werder Bremen • In such a case, the attributes of the different types of this entity may well be quite different. © 2008 Prof. Dr. Rainer Manthey Foundations of IM 11 Classification in presence of identical attributes • Entities with the same attributes do not at all have to be instances of the same type! instance of Ass name, firstName, age instance of Fan of football club Werder Bremen name, firstName, age • Almost always additional classification criteria are required, usually not derivable from the attribute structure. © 2008 Prof. Dr. Rainer Manthey Foundations of IM 12 Key attributes • As in the relational model (Access, SQL), there are usually one or more attributes per entity type the values of which are sufficient for uniquely identifying each instance: Key Keyattributes attributes professor persNr firstName name age topic rank • (Primary) key attributes are usually underlined in an ER-diagram. • Keys ought to be "minimal" (no attribute can be omitted). • A distinction between primary key and other candidate keys is not made in the ER-model, even though it would be useful to do so. © 2008 Prof. Dr. Rainer Manthey Foundations of IM 13 Relationships • 2nd main concept of the ER-model: elementary relationships between two or more entities (possibly with their own attribute values) Relationship Relationship PersNr 765 year of assignment 1992 name Universität Bonn • Each relationship is uniquely characterized by the key values of the participating entities and by the values of all relationship attributes. • However, there are no separate key attributes for relationships, as the keys of the participating entities always suffice for unique identification. © 2008 Prof. Dr. Rainer Manthey Foundations of IM 14 Multiple relationships Entitites may participate in various relationships (also similar ones). year of assignment 1992 year of assignment 1999 © 2008 Prof. Dr. Rainer Manthey Foundations of IM 15 Relationship types • Similar relationships may be grouped into relationship types. • „Similarity" requires at least identical attribute structure and identical types (and number) of participating entities. • Relationship types are graphically denoted in the ER-model by a diamond. Attributes are written as for entity types. assigned_to university professor persNr, . . . name, . . . year of assignment Int © 2008 Prof. Dr. Rainer Manthey Foundations of IM 16 Instances of a relationship type assigned_to university professor name, . . . persNr, . . . year of assignment Int Relationship types have instances, too: Individual relationships between individual entities are analogously considered instances of the corresponding R-type. assigned_to year of assignment persNr 765 name ... © 2008 Prof. Dr. Rainer Manthey 1992 Foundations of IM ... Universität Bonn 17 Type vs. instance assigned_to university professor persNr, . . . year of assignment name, . . . relationship relationshiptype type entity entitytype type Int The distinction between types and instances is difficult even for specialists: Try to be precise from the very beginning! assigned_to year of assignment persNr 765 name ... entity entity © 2008 Prof. Dr. Rainer Manthey 1992 ... relationship relationship Universität Bonn Foundations of IM 18 Roles • One and the same entity type may participate in a relationship type more than once, e.g.: mother descendent_of person daughter • For a (syntactical) distinction between the different „forms of participation“, special designators are used, called roles. Roles are used as labels of the lines connecting the resp. entity and relationship type. © 2008 Prof. Dr. Rainer Manthey Foundations of IM 19 Extended Geo-DB: ER-diagram neighbour_country capital country 1 capital_of_country country city country 2 city_in_country stadt_an_fluss region_ in_country river_ through_ country region river source river relationship type entity type source_of © 2008 Prof. Dr. Rainer Manthey role Foundations of IM 20 Relationship types and mathematical relations On the level of instances, a relation over the populations of the entity types involved is associated with each relationship type. E1 R E2 entities not involved in R © 2008 Prof. Dr. Rainer Manthey Foundations of IM 21 Functional relationships In many cases, at most one entity on one „side" of a relationship type may be associated with a particular entity on the other „side“ of the relationship, e.g.: country city city_in_country •• •• Each . Eachcity cityisisininexactly exactlyone onecountry country. In , however , arbitrarily . Ineach eachcountry country, however, arbitrarilymany manycities citiesmay maybe besituated situated. © 2008 Prof. Dr. Rainer Manthey Foundations of IM 22 Functional relationships (2) • Mathematically, city_in_country is a function: city_in_country : city city_in_country: city → → country country • More exactly: . . . a function from the population of city into the population of country. population(city) © 2008 Prof. Dr. Rainer Manthey population(country) Foundations of IM 23 Functional relationships (3) • In the ER-model, such restrictions of the admissible combinations can be expressed by means of so-called functionalities, annotations attached to the edges connecting entity and relationship types. • There are four different kinds of binary relationships expressible by means of functionalities: 11:: 11 11::NN NN:: 11 NN::M M • In this context, N resp. M stands for arbitrary integer values ≥ 0. • In the ‚city_in_country'-example, an N:1-relationship is appropriate: There is exactly one country per city, but arbitrarily many citites per country (N ≥ 0). city N 1 country city_in_country © 2008 Prof. Dr. Rainer Manthey Foundations of IM 24 Functional relationships (4) • Functionalities of type 1 : 1, 1 : N or N : 1 define partial functions where some of the instances of the types involved possibly are not related at all. city N 1 country city_in_country City without country! • In the „normal“ ER-model, total functions cannot be distinguished from partial ones – in extensions of the model there are additional graphical means for explicitly stating whether a function is partial or total. © 2008 Prof. Dr. Rainer Manthey Foundations of IM 25 Functional relationships (5) • In a 1 : 1-relationship each instance of one of the entity types involved is related to none or exactly one of the instances of the other entity type. • An N : M-relationship can be considered the „normal case" without restrictions on the number of participating entities. • If no functionalities have been stated for a relationship type, then an implicit N : M functionality is assumed. • Functionalities can be defined for relationships with more than two entities involved, too: E1 N M E2 R: R: EE11 ×× EE22 → → EE33 R 1 E3 © 2008 Prof. Dr. Rainer Manthey Foundations of IM 26 Functional relationships (6) • If in an n-ary relationship several edges are marked by '1', then the resp. relationship type represents several partial functions: E1 N 1 E2 RR(1)(1):: EE1 ×× EE2 → E 1 2 → E33 R RR(2)(2):: EE1 ×× EE3 → E 1 3 → E22 1 E3 • . . . and analogously: E1 1 1 E2 RR(1)(1):: EE1 ×× EE2 → E 1 2 → E33 R RR(2)(2):: EE1 ×× EE3 → E 1 3 → E22 1 RR(3)(3):: EE2 ×× EE3 → E 2 3 → E11 E3 © 2008 Prof. Dr. Rainer Manthey Foundations of IM 27 Extended ER model: Generalization • The concepts introduced so far have been contained in Chen‘s original proposal throughout. Since then, however, various extensions have been proposed: Extended Entity-Relationship Model (EER-model) • Most important extensions (as in object-oriented models): Generalization Generalization • This means: • formation of subtypes of entity types • sub-/supertype relationships (type hierarchy) • inheritance of attributes and of „participations" in R-types • Special graphical notation for generalization of E-types: is_a ... © 2008 Prof. Dr. Rainer Manthey Foundations of IM 28 Generalization (2) supertype capital_of city region name, area, inhabitants is_a subtypes country state prime minister governor „Inheritance" in this example means: • Both subtypes inherit all attributes of the supertype, i.e., they „own" these attributes without explicit definition. • Both subtypes participate in the relationship type capital_of', which has been explicitly defined for the supertype only. © 2008 Prof. Dr. Rainer Manthey Foundations of IM 29 Generalization (3) region Generalization always means that the populations of the subtypes are subsets of the population of the supertype. is_a country state population(region) population(country) This circumstance motivates the notion 'is_a'-relationship: "Every country is a region." population(state) Quantifier over instances! © 2008 Prof. Dr. Rainer Manthey Foundations of IM 30 Generalization (4) In the example: special case disjoint generalization (Empty intersection of the populations) In general: This form of the 'is_a'-notation just means some form of subset formation, i.e. overlapping, incomplete subdivision. © 2008 Prof. Dr. Rainer Manthey Foundations of IM 31 Logical design: ER ⇒ Relations Conceptual design ER ERmodel model Mapping Design documentation Logical design Relational Relationalmodel model Logical structure „on paper" Improvement Physical design © 2008 Prof. Dr. Rainer Manthey Foundations of IM 11stststep step Map MapER ERtypes typessystema systematically ticallytotorelations: relations: From Fromsemantic semanticconcepts concepts totosyntactic syntacticstructures structures!! 32 ER ⇒ Relations: Mapping Entity types • Mapping from ER model to relational model: • in principle very easy: per type one relation (table) • in detail and for extensions: quite difficult • Mapping of entity types: rather obvious ⇒ table name • attribute ⇒ column name • type E A1 A2 . . . An E A1 A2 ... An • Key attributes are mapped to primary key columns. © 2008 Prof. Dr. Rainer Manthey Foundations of IM 33 ER ⇒ Relations: Mapping Relationship types • For relationship types: • Use the same basic idea: one relation per type. • but: How are participating entity types represented? city_in_state city name inhabitants ... state since name area ... • Obvious solution as well: The participating entities are represented by means of the values of their primary key attributes (possibly after renaming). city_in_state city.name state.name since Foreign key © 2008 Prof. Dr. Rainer Manthey Foundations of IM 34 ER ⇒ Relations: Mapping Relationship types (2) In presence of special functionalities (1 : N, N : 1, 1 : 1 resp.), a separate relationship table is not necessary, as the relationship information can be embedded into the table of the entity type on the N-side: city city_in_state N name inhabitants ... city city.name © 2008 Prof. Dr. Rainer Manthey 1 since state name area ... inhabitants . . . state.name Foundations of IM Foreign key since 35 Relational representation of generalization hierarchies How Howtotorealize realizeinheritance inheritanceand and sub -/super type sub-/super typerelationships relationshipsrelationally relationally?? E1 A1, A2, . . . • Relational model: does not know any inheritance! inheritance • Inheritance thus has to be "simulated" . is_a E2 B1, B2, . . . Relational representation of a super type E1 is obvious: E1 © 2008 Prof. Dr. Rainer Manthey A1 A2 . . . Foundations of IM 36 Relational representation of generalization hierarchies (2) E1 E1 A1, A2, . . . A1 a11 a21 A2 a12 a22 . . . . . . . . . is_a inherited Inheritance E2 E2 B1, B2, . . . „native" attributes A1 A2 . . . B1 B2 . . .a11 a12 . . . b11 b12 . . . • Obvious relational realization of a subtype: subtype relation E2 owns „native" and inherited attributes. • But: Values of the inherited attributes of all E2-instances have to be (redundantly) repeated in the E1-relation in this case! • Reason: Each E2 -instance is an E1-instance, too ! © 2008 Prof. Dr. Rainer Manthey Foundations of IM 37 Relational representation of generalization hierarchies (3) Avoiding duplication: Store only „native" attributes (+ key for joining) in the subtype relation! E1 A1, A2, . . . inherited attributes is_a inheritance E1 E2 A1 A2 a11 a12 a21 a22 . . . . . . . . . B1, B2, . . . joint key E2 But does Butininthis thiscase caserelation relationEE22does no nolonger longercontain containall allattributes attributesof of EE1--entities entities !! 1 © 2008 Prof. Dr. Rainer Manthey Foundations of IM „native" attributes A1 B1 B2 . . .a11 b11 b12 . . . 38 Relational representation of generalization hierarchies (4) way out: E2-population is completely realized by means of a view joining the inherited and the native attributes. inherited attributes E1 E2-global A1 A2 . . . B1 B2 . . . a11 a12 . . . b11 b12 . . . A1 A2 a11 a12 a21 a22 . . . . . . . . . „native" attributes E2-local CREATE -global AS CREATEVIEW VIEWE2 E2-global AS (SELECT * (SELECT * FROM -local FROM E1 E1 INNER INNERJOIN JOIN E2 E2-local ON -local.A1) ON E1.A1 E1.A1==E2 E2-local.A1) © 2008 Prof. Dr. Rainer Manthey Foundations of IM A1 B1 B2 . . .a11 b11 b12 . . . 39 Relational representation of generalization hierarchies (5) 3rd alternative (also free of redundancies and using a view): Distribute values of the inherited attributes to different relations – super types are reconstructed via views. E1-local E1-global A1 A2 . . . a11 a12 . . . a21 a22 . . . A1 A2 a21 a22 ∪ π CREATE -global AS CREATEVIEW VIEWE1 E1-global AS ( (TABLE -local ) ) TABLE E1 E1-local UNION UNION (SELECT (SELECT AA1,1,AA2,2, . . . . . . FROM FROM E2 E2) ) © 2008 Prof. Dr. Rainer Manthey E2 inherited . . . . . . „native" attributes A1 A2 . . . B1 B2 . . .a11 a12 . . . b11 b12 . . . Foundations of IM 40 Relational representation of generalization hierarchies (6) Which the best" Whichof ofthe thethree threealternatives alternativesisis„„the best"?? 1. Relations E1 and E2, no views: + − short access time (without any join of tables) high requirements for space (due to redundant storage) 2. Relations E1 and E2-local, view E2-global (JOIN): + − Only key attribute values are stored redundantly. Access to E2-attributes is slower (due to join). 3. Relations E2 and E1-local, view E1-global (PROJECT-UNION): + − © 2008 Prof. Dr. Rainer Manthey No duplication of any attribute values. Access to E1-attributes is slower (due to projection and union) . Foundations of IM 41 Generalization and updates What entity isisdeleted deleted?? Whathappens happensififan anEE22--entity E1 A1, A2, . . . is_a E2 B1, B2, . . . • relational variant 1 (inherited attributes duplicated): Deletion from both relations is necessary. • relational variant 2 (inherited and native A. separated): Deletion from both relations is necessary. • relational variant 3 (E2-attributes only in one relation): no propagation of deletions required ⇒ ⇒ In Invariants variants11and and2:2: referential referentialintegrity integrityconstraints constraints with . withdelete deletecascade cascadeisisrequired required. • For insertions and modifications: Changes in several relations may be necessary, too (depending on the chosen strategy). • Deletion of instances of the super type E1: Cascading deletion if the resp. instance is an E2-instance, too (again referential integrity). © 2008 Prof. Dr. Rainer Manthey Foundations of IM 42 Relationships and referential integrity Each relationship type induces FOREIGN KEY-constraints as well: • With N : M-functionality: river_through_state river M N state CREATE CREATETABLE TABLE river_through_state river_through_state ( (River String REFERENCES , River String REFERENCES river river, State State String String REFERENCES REFERENCES state state) ) • With N : 1-functionality: city_in_state city N 1 state CREATE CREATETABLE TABLE city city ( (. .. .. ., , State State String String REFERENCES REFERENCES state state) ) © 2008 Prof. Dr. Rainer Manthey Foundations of IM 43 Functionalities as constraints • But: Uniqueness of the state in the city_in_state-relationship has not yet been expressed ! • An additional CHECK-constraint is required contraining the number of state instances: city_in_state city N 1 state CREATE CREATETABLE TABLE city city ( (. .. .. ., , CHECK CHECK COUNT COUNT(SELECT (SELECT State State FROM city FROM citySS WHERE WHERE S.Name S.Name==Name) Name)=< =< 11 Implicit universal quantifier ranging over each city row ! © 2008 Prof. Dr. Rainer Manthey Foundations of IM 44 ER ⇒ Relations (2) Conceptual design ER ERmodel model Mapping Design documentation Logical design Relational Relationalmodel model Logical structure „on paper" Improvement Physical design 22ndndstep step (after (afterreaching reachingrelations): relations): Improve Improvedesign designby by „„fine fine tuning “ the tuning“ the structure structureof oftables tables © 2008 Prof. Dr. Rainer Manthey Foundations of IM 45 Redundancies due to bad design • Example of „bad design“ (remaining after mapping from ER level): ‚City_in_state‘ and ‚capital_of‘ have been placed into a single table. City Bonn Köln Essen ... Mainz Trier ... State Capital NW NW NW Düsseldorf Düsseldorf Düsseldorf RP RP Mainz Mainz Redundantly : Redundantlystored storedinformation information: DDüsseldorf üsseldorf isisthe thecapital capital of -Westphalia. ofNorth NorthRhine Rhine-Westphalia. • Obviously one topic (Which city is the capital of . . . ?) has been combined with another topic (In which state is a certain city situated ?) in such an „unlucky“ manner that considerable redundancies occur, resulting in waste of space. What a topic “ mean Whatdoes does„„a topic“ mean?? © 2008 Prof. Dr. Rainer Manthey Foundations of IM 46 Anomalies resulting from redundancies • An immediate consequence of such cases of storing multiple topics in one table is the occurrence of so-called anomalies when updating such tables: Assume Köln (as largest city in NW) replaces Düsseldorf as capital: One fact changes, but multiple updates have to be made. City State Capital Bonn Köln Essen ... Mainz Trier ... NW NW NW Düsseldorf Düsseldorf Düsseldorf RP RP Mainz Mainz • Analoguos anomalies may happen due to insertions and deletions: • An instance of topic 1 disappears, as soon as it is no longer associated with any instance of topic 2. • A new instance of topic 1 can only be inserted, if it is combined with an instance of topic 2 (or null values are used). © 2008 Prof. Dr. Rainer Manthey Foundations of IM 47 Decompositions avoid redundancies and anomalies How defects“ ((anomalies, anomalies, redundancies ) ?? Howtotoprevent preventsuch such„„defects“ redundancies) In the example, there is a simple remedy: Separate the two topics into different relations! City State Capital Bonn Köln Essen ... Mainz Trier ... NW NW NW Düsseldorf Düsseldorf Düsseldorf Decomposition RP RP Mainz Mainz No redundancy, no anomaly ! © 2008 Prof. Dr. Rainer Manthey Foundations of IM City Bonn Köln Neuss ... State NW NW NW State Capital NW RP ... Düsseldorf Mainz 48 Functional dependencies and decompositions • Already discovered by Codd before 1970: Functional relationships between attributes are of help for finding meaningful decompositions and for avoiding reduncancies! • Resulting from this observation, Codd developed an elaborate theory of relational normal forms. • Prerequisite: Designers identify such functional relationships during schema design (quite similar to identifying functionalities in the ER model) and express them as special integrity constraints. Functional Functionaldependency dependency • (short: FD) Principle of functional dependency: • Let A and B be attributes of a relation R. • B depends functionally on A, if in each state of R each A-value always occurs in combination with the same, uniquely determined B-value. • symbolic notation: A → B © 2008 Prof. Dr. Rainer Manthey Foundations of IM 49 FDs in the example • • • • Each city lies in exactly one state: City → State Each state has exactly one capital: State → Capital But also: If a city is a capital then it is the capital of exactly one state: Capital → State Each city is associated with exactly one capital (namely the capital of its state): City → Capital • Decomposition separates FDs: City State Capital Bonn Köln Essen ... Mainz Trier ... NW NW NW Düsseldorf Düsseldorf Düsseldorf City State RP RP Mainz Mainz State Capital • One FD seems to be lost, however: City → Capital © 2008 Prof. Dr. Rainer Manthey Foundations of IM 50 "Codd's thesis" • Thesis (claim) on which the normalization theory of Codd is based: Attributes Attributesconnected connectedvia viaan anFD FDrepresent representsemantically semanticallysignificant significanttopics topics of . ofthe theapplication applicationdomain domain. • That is: Every FD identifies a topic – thus separating topics means separating FDs. • But: Not every such topic is necessarily represented by an FD. • Moreover: FD-connection is a sufficient, but not a necessary criterion for the existence of a ‚topic‘. • Basic idea of Codd‘s approach to normalization of relations: Decompose relations in such a way that „normally" each FD has a component relation of its own. But try to identify exceptions where several FDs may „coexist“ in one and the same relation ! © 2008 Prof. Dr. Rainer Manthey Foundations of IM 51 Transitive FDs and Armstrong‘s axioms • There are FDs which are derivable from other FDs, already known. An important example are so-called transitive FDs: α → β is a transitive FD if there is an attribute set γ, such that α → γ and γ → β are both FDs, but not γ → α. • Every such transitive case leads to a (new) functional dependency. • There are two other such inference rules for FDs, called the „Armstrong axioms“ (as they have been discovered by the Canadian scientist W. Armstrong). ββ⊆⊆αα ⇒ ⇒ αα→ →ββ Every subset depends on its superset. αα→ →ββ ⇒ ⇒ ααγγ → → ββγγ Augmentation on both sides. • Not to be found in the literature, but quite useful: Special notion for FDs which are not transitive FDs: α → β is a direct FD if there is no attribute set γ, such that α → γ and γ → β are both FDs, but not γ → α. © 2008 Prof. Dr. Rainer Manthey Foundations of IM 52 Normal forms • Properly determining FDs and investigating their properties is the basis for each meaningful decomposition of relations into components free of redundancies. „„Codd's Codd's recepy “ (in ): recepy“ (inshort shortfrom from): Redundancies Redundanciescan canbe besafely safelyavoided avoidedififFDs FDsalways alwaysoriginate originatefrom from candidate . candidatekeys keysof ofaarelation relation. • Codd defined various degrees of FD separation, called normal forms – the process of transforming a given schema into relations all of which exhibit a given normal form is called normalization. • The most important normal form is the third normal form (short: 3NF) defined as follows: rd AArelation relationisisinin33rdnormal normalform form ⇔ ⇔ Each -key attribute Eachnon non-key attributefunctionally functionallydepends dependsdirectly directly on oneach eachcandidate candidatekey keyof ofthe therelation relation. . • There are various other normal forms (1NF, 2NF, 4NF and others). © 2008 Prof. Dr. Rainer Manthey Foundations of IM 53 Normalization in the example schema • Our example schema from the geographic domain originally was not in 3NF, as it still contains a transitive dependency pointing from a candidate key (City) to a non-key attribute (Capital): (ℜ , ) City Bonn Köln Essen ... 3NF State NW NW NW Capital Düsseldorf Düsseldorf Düsseldorf (ℜ , 3NF ) City (ℜ , State ) 3NF State Capital • After decomposition, however, each of the resulting component schemas is in 3NF! Such a decomposition will always be possible. • The „lost“ dependency City → Capital can be reconstructed by means of the transitivity axiom of Armstrong. © 2008 Prof. Dr. Rainer Manthey Foundations of IM 54