A Methodology for Creating User Views in Database Design

VEDA C. STOREY
University of Rochester
and
ROBERT C. GOLDSTEIN
University of British Columbia
The View Creation System (VCS) is an expert system that engages a user in a dialogue about the information requirements for some application, develops an Entity-Relationship model for the user’s database view, and then converts the E-R model to a set of Fourth Normal Form relations. This paper describes the knowledge base of VCS. That is, it presents a formal methodology, capable of mechanization as a computer program, for accepting requirements from a user, identifying and resolving inconsistencies, redundancies, and ambiguities, and ultimately producing a normalized relational representation. Key aspects of the methodology are illustrated by applying VCS’s knowledge base to an actual database design task.
Categories and Subject Descriptors: H.2.1 [Database Management]: Logical Design; H.2.7 [Database Management]: Database Administration; I.2.1 [Artificial Intelligence]: Applications and Expert Systems
General Terms: Design
Additional Key Words and Phrases: View Creation System
This research was supported by grants from The Imperial Order of the Daughters of the Empire, The Natural Sciences and Engineering Council of Canada, Suncor, Inc., The University of British Columbia, the IBM Program of Support for Education and Research in the Management of Information Systems, and the William E. Simon Graduate School of Business Administration, University of Rochester.
Portions of this paper are adapted from V. Storey’s View Creation: An Expert System for Database Design, published by ICIT Press in 1988. © by International Center for Information Technologies, 1988. All rights reserved.
Authors’ addresses: V. C. Storey, William E. Simon Graduate School of Business Administration, University of Rochester, Rochester, NY 14627; R. C. Goldstein, Faculty of Commerce and Business Administration, University of British Columbia, 2053 Main Mall, Vancouver, BC, Canada V6T 1Y8.
Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.
© 1988 ACM 0362-5915/88/0900-0305 $01.50
ACM Transactions on Database Systems, Vol. 13, No. 3, September 1988, Pages 305-338.
1. INTRODUCTION
Logical database design is concerned with determining the contents of a database independent of implementation considerations. The design process usually takes as its starting point a statement of requirements in the form of a set of user views. Each view describes the database content and structure that are appropriate for a particular function that a user (or a group of users) performs.
The process of designing user views relies heavily on judgment rather than
mechanistic algorithms. Traditionally,
it has involved an experienced database
designer collecting information from users or systems analysts and then producing a view specification that is refined through an iterative process. Good database
designers are both scarce and costly. Consequently, the number of users consulted
and the number of design iterations are both usually less than what ideally would
be desired.
This paper discusses a methodology that formalizes the view specification task.
The objective of the formalism is to provide a procedure for developing user
views that minimizes the need for scarce human expertise. The methodology has
been implemented as the knowledge base of an expert system, called the View
Creation System (VCS).
The following section defines user views and view modeling. Section 3 outlines
the view creation methodology. The View Creation System is then discussed in
Section 4. Section 5 contains concluding remarks and a discussion of future work.
A partial transcript of a design session using VCS is included in the
Appendix.
2. VIEW MODELING
A user view can be defined as “the perception of users about what a proposed database (or an ideal database) should contain” [11]. In essence, a user view is a representation of reality relevant to a particular user or group of users for a specific purpose. The set of all views used in an organization can be taken as a specification of the required contents of that organization’s database. Currently, most methodologies for database design assume the existence of a set of view definitions and are concerned primarily with integrating these into a unified whole (e.g., [10, 15]).
The process of eliciting a user’s view of the database is called view modeling
and is defined formally by Navathe and Schkolnick as “the modeling of the usage
and information
structure of the real world from the point of view of different
users and/or applications”
[13]. Navathe and Schkolnick describe the two major
components of view modeling as
(1) extracting from the user or from a person in charge of application development the relevant parts of real-world information,
and
(2) abstracting this information into a form that completely represents the user
view so that it can be subsequently used in the design.
View representation has been addressed mainly as a by-product of data model development [2, 13]. According to Navathe and Schkolnick [13], the most pertinent work done in this area has been the Entity-Relationship (E-R) data model of Chen [4] and the Data Abstraction methodology of Smith and Smith [16]. Navathe and Schkolnick also propose their own data model, the Navathe and Schkolnick (N-S) model, as a vehicle for modeling user views.
In addition, there are two methodologies that have been developed explicitly
for constructing user views. These are Bubble Charting [10] and the Interactive
Specification methodology of Baldissera et al. [1]. A detailed summary of these
approaches may be found in [17]. The methodology described in this paper
employs ideas from the E-R model, the Data Abstraction methodology, and the
Interactive Specification methodology.
2.1 View Modeling in Logical Database Design
Database design is a complex and lengthy process beginning with the determination of users’ information requirements and concluding with a physical design.
User requirements are usually specified in the form of a set of views, each of
which is relevant to a particular task or group of tasks. In an actual application,
it is possible that more than one user might be asked to provide a view for a
single task. If, as is likely, these views do not exactly coincide, the differences
among them must be examined and reconciled. This is followed by a view
integration process that is concerned with producing an overall database design
compatible with the complete set of user views. At that point, the logical database
design task is complete.
The methodology described in this paper provides a formal approach to the
elicitation of user views and their representation as a set of Fourth Normal Form
relations. At the end of the paper, we discuss prospects for extending the
methodology to cover both the view reconciliation
and view integration tasks.
3. VIEW CREATION METHODOLOGY
3.1 E-R Model
This methodology for generating user views is based on the E-R model [4], which
is widely accepted as an effective approach to database design. The model employs
two basic constructs: entities and relationships. An entity is a “thing” of interest
in a database, for example, student. A relationship
is an association among
entities; for example, students take courses is an association between the entities
student and course. Attributes
are properties or characteristics
that can be
identified for both entities and relationships. For example, student-number could
be an attribute of the entity student, and grade an attribute of the relationship
students take courses.
3.2 Rule Set
The view creation methodology is represented as a set of rules that forms the
knowledge base of the View Creation System. There are 130 major rules found
in the knowledge base, many of which contain a number of subrules. Altogether,
there are approximately
500 rules, with the exact number changing slowly but
continuously as the methodology is used and refined. The rule set is a mixture
of both procedural and production rules.
3.2.1 Procedural Rules. Procedural rules dictate the order in which various
tasks are performed. The first such rule controls the overall procedure for the
creation of a user view:
First: Identify entities, their attributes, and candidate keys.
Then: Determine relationships, relationship attributes, and mapping ratios.
Then: Detect and resolve ambiguities, redundancies, and inconsistencies.
Then: Select primary keys for entities.
Then: Represent entities and relationships as relations.
Then: Identify and resolve partial and transitive functional dependencies.
Other procedural rules are used to sequence functions within each of these
major sections. For example, when an entity is obtained, the procedural rule
governing how it should be treated is as follows:
First: Verify that the entity name is unique.
Then: Elicit entity attributes.
Then: Convert repeating attributes to entities.
Then: Convert multivalued attributes to entities.
Then: Obtain candidate keys.
Then: Add the entity to the database specification.
3.2.2 Production Rules. Production rules are of the form IF-THEN. They are
interpreted as IF a certain condition holds, THEN carry out a particular action.
These rules indicate what should be done for each condition that could arise in
attempting to achieve the subgoals specified in the procedural rules. As an
example, the following production rule deals with how a certain type of binary
relationship should be represented:
IF: a relationship is of the form A is-a B
THEN: represent the relationship by adding the key of entity B as a foreign key of A.
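As an illustration only, the production rule above might be encoded along the following lines. This is a minimal Python sketch and not the VCS implementation; the Entity class, the function apply_is_a_rule, and the sample keys are hypothetical names introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """Hypothetical container for an entity and its key attributes."""
    name: str
    key: list[str]                                   # primary-key attribute names
    foreign_keys: list[str] = field(default_factory=list)


def apply_is_a_rule(relationship, entities):
    """IF the relationship is 'A is-a B' THEN add the key of B to A as a foreign key.

    Returns True when the rule fires, False when its condition does not hold.
    """
    a_name, verb_phrase, b_name = relationship
    if verb_phrase not in ("is-a", "is-an"):
        return False
    a, b = entities[a_name], entities[b_name]
    for attr in b.key:
        if attr not in a.foreign_keys:               # avoid adding the same key twice
            a.foreign_keys.append(attr)
    return True


# Assumed sample data: the keys chosen here are for illustration only.
entities = {
    "person": Entity("person", ["person-name", "address"]),
    "librarian": Entity("librarian", ["job-title", "branch"]),
}
fired = apply_is_a_rule(("librarian", "is-a", "person"), entities)
```

The condition and the action are kept in one function so that each production rule remains a self-contained IF-THEN unit, which mirrors the form in which the rules are stated.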
3.3 Sources of Knowledge
The knowledge incorporated in the methodology was obtained from various
sources. These are listed below along with examples of the types of knowledge
they provided:
(1) Database design theory:
-procedures for converting an E-R model into a relational one,
-properties of is-a relationships,
-alternative ways of obtaining mapping ratios and their function in a design, and
-candidate keys and their use as primary or foreign keys in a database design.
(2) E-R model:
-a set of constructs (entities, relationships, and attributes) for modeling a user’s view, and
-a top-down approach to view modeling.
(3) Normalization theory:
-a means of determining whether or not a set of relations avoids certain anomalies, and
-rules for identifying and resolving violations of normalization principles.
(4) Expert database designers:
-heuristics, and
-suggestions for improvements to the user interface.
(5) Colleagues and other people knowledgeable in database design:
-suggestions for improvements to the user interface,
-rules for distinguishing entities and attributes, and
-rules for ascertaining whether an entity is a subset or superset entity in a relationship.
(6) General knowledge:
-attribute names that are often used to identify entities.
(7) Experience using and testing the system:
-rules for identifying missing information,
-rules for detecting inconsistencies,
-system default values,
-rules for allowing the key of one entity to be used in the identification of another, and
-improvements to the user interface.
The remainder of this section describes the step-by-step procedure of the view
creation methodology. Refer to the Appendix for a partial transcript illustrating
this methodology as implemented in VCS.
3.4 Entities
Since entities are the fundamental units in the E-R model, the first step in the
procedure is to obtain a list of entities. The example used throughout this paper
is a library circulation function where the objective is to keep track of where
books are at any given point in time. For this example an initial set of entities
might be borrower, book, volume,¹ etc.
3.4.1 Entity Attributes. The attributes appropriate for each entity are identified as each entity is obtained. Although it would be possible to postpone the
determination
of attributes until a later point, there are advantages to doing it
as soon as each entity is known. First, it forces one to think carefully about the
application. Knowledge of the attributes may also aid in the detection of consistency problems. The occurrence of particular attributes might imply the need for
certain relationships. Examples are given below:
(1) Multivalued attributes. An attribute, Att, that can have more than one value for a given instance of an entity, E1, indicates the existence of a relationship between E1 and the entity E2 identified by Att.
E.g.: book: [catalog-no, title, volume, . . .]
¹ In all the examples in this paper, volume is considered to be not one part of a multipart work, as a volume of an encyclopedia, but rather a physical instance of a book. Libraries often have many copies of popular books, and it is essential in this application to distinguish between the conceptual “book” and the physical “volume.”
Since each book can have more than one volume, volume is a multivalued attribute.
This indicates the existence of a relationship between book and volume. If the
volume entity is not already known, the need for it is hereby established.
(2) Attribute name is entity name. Any attribute that is the name of another entity deserves special attention. In this situation, the name of one entity, E2, appears as an attribute of another entity, E1. If E2 is needed as (part of) a unique identifier of E1, then the attribute should be retained; otherwise, this attribute implies the existence of a relationship between E1 and E2.
E.g.: branch: [branch-name, library, address]
If library is needed to uniquely identify branch, then the attribute library should later be replaced by its primary key. If library is not needed in the identification of branch, a relationship between the two entities is implied.
(3) Repeating attributes. If an entity, E, has attributes of the form Att1, Att2, Att3, . . . , Attn, there is a presumption that these attributes represent instances of some entity, Att, rather than properties or characteristics of E. A relationship between Att and E is also implied.
E.g.: borrower: [name, address, book1, book2, book3]
Having book1, book2, and book3 as attributes of borrower suggests the need for a book entity and a relationship between book and borrower.
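Two of the attribute checks described above, repeating attributes and attributes that name another entity, lend themselves to a simple mechanical test. The following Python sketch is offered under stated assumptions rather than as the VCS rules themselves: the function names are hypothetical, and requiring at least two numbered attributes before a stem is reported is a threshold chosen here for illustration.

```python
import re
from collections import defaultdict


def repeating_attribute_stems(attributes):
    """Group attributes of the form X1, X2, ... by their stem X.

    Only stems with two or more numbered attributes are returned
    (an assumed threshold, not one stated in the methodology).
    """
    groups = defaultdict(list)
    for attr in attributes:
        match = re.fullmatch(r"(.*\D)(\d+)", attr)   # stem followed by trailing digits
        if match:
            groups[match.group(1)].append(attr)
    return {stem: names for stem, names in groups.items() if len(names) >= 2}


def attributes_naming_entities(attributes, entity_names):
    """Return the attributes whose name is also the name of a known entity."""
    return [attr for attr in attributes if attr in entity_names]


borrower_attrs = ["name", "address", "book1", "book2", "book3"]
branch_attrs = ["branch-name", "library", "address"]

stems = repeating_attribute_stems(borrower_attrs)
named = attributes_naming_entities(branch_attrs, {"library", "branch"})
```

For the borrower example, the stem book is reported, which suggests a book entity; for the branch example, the attribute library is flagged for the follow-up question of whether it is needed to identify a branch.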
3.4.2 Candidate Keys. Each entity occurrence in a database must be uniquely
identifiable. A candidate key is an attribute or a combination of attributes that
uniquely identifies instances of an entity. As each entity is identified, a set of
candidate keys should be obtained. Eventually, one of these will be selected as a
primary key for the entity.
E.g. (Key attributes are in UPPER CASE):
borrower: [NUMBER, name, address, phone]
borrower: [NAME, ADDRESS, number, phone]
Borrower has two candidate keys: (1) [NUMBER] and (2) [NAME, ADDRESS].
Key indicator attributes. Certain attributes are commonly used in the identification of entities. These are attributes such as name, number, id, and code.
Whenever such attributes occur, they should be considered as possible candidate
keys.
Generated identifiers. If an entity, E, is identified that does not have any
attributes, then a unique identifier must be generated. This is done by concatenating the entity name, E, to the suffix, id, to obtain a key, E-id.
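The two heuristics just described can be sketched in a few lines of Python. The list of key indicators is taken from the text; the function name possible_candidate_keys and the choice to match an indicator against the last hyphen-separated part of an attribute name are assumptions made for this illustration.

```python
# Key indicator attributes named in the text.
KEY_INDICATORS = ("name", "number", "id", "code")


def possible_candidate_keys(entity_name, attributes):
    """Suggest single-attribute candidate keys for an entity.

    An entity with no attributes receives a generated identifier E-id;
    otherwise every attribute ending in a key indicator is suggested.
    """
    if not attributes:
        return [f"{entity_name}-id"]
    suggestions = []
    for attr in attributes:
        last_part = attr.lower().split("-")[-1]      # assumed matching convention
        if last_part in KEY_INDICATORS:
            suggestions.append(attr)
    return suggestions


borrower_keys = possible_candidate_keys("borrower", ["number", "name", "address", "phone"])
shelf_keys = possible_candidate_keys("shelf", [])
```

The suggestions are only prompts for the user: a name, for instance, is flagged as a possible key even though, as in the borrower example, it may identify an instance only in combination with address.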
3.4.3 Missing Entities. Once an initial set of entities and attributes has been
identified, the attributes should be scanned for indications of “missing entities.”
For example, suppose the entity book has the following attributes:
book: [CATALOG-NO, title, author-id, . . .]
The attribute author-id is of the form X-key indicator attribute. This suggests that author might also be an entity in the database. If so, a relationship is needed between book and author.
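The scan for missing entities can be illustrated as follows. This Python sketch assumes that attribute names are hyphenated as in the examples and uses only the key indicators listed in Section 3.4.2; the function name suggest_missing_entities is hypothetical.

```python
# Key indicator attributes listed in Section 3.4.2.
KEY_INDICATOR_SUFFIXES = ("name", "number", "id", "code")


def suggest_missing_entities(entity_name, attributes, known_entities):
    """Return stems X of attributes 'X-<key indicator>' that are not yet entities."""
    suggestions = []
    for attr in attributes:
        stem, separator, suffix = attr.lower().rpartition("-")
        if not separator or suffix not in KEY_INDICATOR_SUFFIXES:
            continue                                  # not of the form X-key indicator
        if stem == entity_name or stem in known_entities:
            continue                                  # refers to an entity already known
        if stem not in suggestions:
            suggestions.append(stem)
    return suggestions


missing = suggest_missing_entities(
    "book", ["catalog-no", "title", "author-id"], {"book", "borrower"}
)
```

Each suggested stem would still have to be confirmed with the user, since an attribute such as author-id might be retained as a plain attribute if authors are of no further interest in the application.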
3.5 Relationships
Following Baldissera et al. [1], relationships are restricted to binary ones of the form A VP B, where VP stands for verb phrase.² Examples are
-borrowers borrow volumes,
-libraries have branches, and
-library-director is-a person.
The procedural rule for dealing with each relationship
A VP B requires
that (1) A and B values be appropriately
identified as entities or attributes;
(2) mapping ratios be determined; and (3) if appropriate, corresponding relationship attributes be obtained. Before discussing each of these steps, it is important
to note that the use of certain verb phrases enables one to make inferences about
the semantics of the application.
3.5.1 Semantics. Two special verb phrases that imply specific semantic relationships are is-a and have/has. The verb phrase have/has is subject to multiple interpretations. Two of these interpretations, instance-of and component-of, as well as the is-a verb phrase, are important for determining primary keys, “inheriting” attributes, and detecting inconsistencies, as will be discussed later.
Is-a relationships. The is-a verb phrase corresponds directly to Smith and Smith’s [16] concept of generalization. A relationship A is-a B implies that one should be able to attribute to A all the properties of B, but not vice versa. For each occurrence of B, there may or may not be a corresponding occurrence of A; for each occurrence of A, there is precisely one occurrence of B.
E.g.: Relationship: librarian is-a person
A person may or may not be a librarian, but every librarian is a person. All the attributes of person should be attributable to librarian.
Instance-of verb phrases. The instance-of verb phrase is similar to the is-a
verb phrase in the sense that, for A instance-of B, one should be able to attribute
properties of B to A (but not vice versa). The instance-of verb phrase differs
from the is-a verb phrase, however, in that it allows for many occurrences of A
for each B. For each occurrence of A, there is precisely one occurrence of B.
E.g.: Relationship: volume instance-of book
For each book there can be one to many volumes; for each volume there is one
and only one book. All the attributes of book should be attributable to volume.
² With some dexterity on the part of the user, the methodology also permits the modeling of nonbinary relationships. For example, the relationship students take courses in a given semester and receive a corresponding grade could be modeled as the entity: student-grade: [STUDENT, COURSE, SEMESTER, grade], where the entities student, course, and semester would eventually be replaced by their key attributes.
3.5.2 Unidentified As and Bs. The A and B in a relationship, A VP B, are
normally assumed to be entities. For certain verb phrases, however, it is possible
that either or both can be attributes. If a relationship is specified for which A
and/or B is unknown, it is necessary to classify them appropriately. A series of
rules is provided for dealing with such situations based on
-the semantics of the verb phrase, and
-existing information about A and B.
For example, suppose a relationship A is-a B occurs in which B is known to be
an entity but A is unidentified.
Since B is a generalization of A (by definition of
is-a [3]), A must also be an entity.
3.5.3 Mapping Ratios. Mapping ratios describe the minimum and maximum
number of A values that can occur for each B value in a relationship, A VP B,
and vice versa. Tsichritzis and Lochovsky [18] refer to this type of mapping ratio
as the minimum and maximum cardinality of the mapping. For example, if each
value of A can have from 0 to many corresponding values of B, the min/max
cardinalities of A are (0, N). Similarly, if each value of B has one and only one
corresponding value of A, the min/max cardinalities of B are (1, 1).
3.5.4 Infer Min/Max Cardinalities. In some cases, it might be possible to infer some or all of the min/max cardinalities by examining (1) the verb phrase and (2) the form of the entities (singular or plural) as they appear in a relationship.
(1) Is-a verb phrases. A relationship A is-a B is interpreted as an association
between a specific A and a generic B; that is, A is a subset of the superset B [3].
Each value of A, therefore, can have one and only one corresponding value of B,
so the min/max cardinalities of A are (1, 1). For each value of B, there may or
may not be a corresponding value of A. Thus, the min/max cardinalities of B are
(0, 1).
E.g.: Relationship: librarian is-a person
The min/max cardinalities for librarian are (1, 1) because each librarian corresponds to one and only one person. The min/max cardinalities for person are
(0, 1) because a person may or may not be a librarian.
(2) Entities in singular or plural form. Inferences about mapping ratios can
also be made by examining the form (singular or plural) of the entities appearing
in a relationship. For example, if, in the relationship A VP B, A and B are both
singular, then there is one and only one B for each A. The min/max cardinalities
of A, therefore, must be (1, 1). The inverse, however, is not necessarily true.
E.g.: Relationship: book has publisher
Using the singular form for both book and publisher implies that a book has one
and only one publisher, so the min/max cardinalities for book are (1, 1). As can
be seen from this example, the inverse is not implied because, obviously, a
publisher is not restricted to publishing only one book.
As another example, in the relationship book has authors, the singular book
and plural authors imply there are multiple authors for a single book. Therefore,
the min/max cardinalities for book are (1, N).
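The inferences of this section can be summarized in a short Python sketch. It covers only the cases stated above (is-a, singular/singular, and singular/plural) and returns None wherever nothing can be inferred. How singular and plural forms are recognized is not specified in the text, so the set plural_nouns passed by the caller is an assumption of this sketch, as is the function name.

```python
N = "N"  # symbolic "many"


def infer_cardinalities(a, verb_phrase, b, plural_nouns=frozenset()):
    """Return (min/max cardinalities of A, min/max cardinalities of B).

    A component is None when the rules of this section infer nothing for it.
    """
    if verb_phrase in ("is-a", "is-an"):
        return (1, 1), (0, 1)          # each A is exactly one B; a B may or may not be an A
    a_plural = a in plural_nouns
    b_plural = b in plural_nouns
    if not a_plural and not b_plural:
        return (1, 1), None            # one and only one B for each A; inverse unknown
    if not a_plural and b_plural:
        return (1, N), None            # multiple Bs for a single A; inverse unknown
    return None, None                  # remaining forms are not covered by these rules


is_a_case = infer_cardinalities("librarian", "is-a", "person")
singular_case = infer_cardinalities("book", "has", "publisher")
plural_case = infer_cardinalities("book", "has", "authors", plural_nouns={"authors"})
```

Cardinalities that remain None would have to be obtained from the user directly.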
3.5.5 Relationship Attributes. As is the case with entities, relationships can have attributes: properties or characteristics of the relationship as a whole that are of interest to the user. Unlike entities, which usually have corresponding attributes, however, only some types of relationships have attributes. These relationships are identified by examining the min/max cardinalities of A and B. It will be shown that relationship attributes only exist when the min/max cardinalities of A are (0, N)³ or (1, N), and the min/max cardinalities of B are not (1, 1), or vice versa.⁴
Consider a relationship A VP B where the min/max cardinalities of A are (0, N) or (1, N) and the min/max cardinalities of B are (1, 1). It must be shown that a relationship attribute cannot exist in such a situation. Suppose such a relationship attribute, $R_{att}$, does exist. $R_{att}$ is a function of the relationship A VP B and hence a function, $f$, of the entities A and B. Formally this is represented as
$$R_{att} = f(A, B).$$
A is a function of B, however, since there is one and only one value of A for each value of B. The above equation can therefore be rewritten as
$$R_{att} = f(f_1(B), B)$$
or
$$R_{att} = g(B).$$
Thus, if a relationship attribute did exist, for A(0, N) or A(1, N) and B(1, 1), it would be a function of the entity B only and, hence, appear as an attribute of B.
Analogously, it can be shown that, for A(1, 1) and B(0, N) or B(1, N), a relationship attribute would be a function of the entity A and would thus appear as an attribute of A.
Now consider a relationship A VP B where A(1, 1) and B(1, 1). This case is easily shown to be an extension of the above. An apparent relationship attribute can, in this case, be expressed as an attribute of either the entity A or the entity B.
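The condition established by this argument can be expressed as a simple predicate. The Python sketch below follows the literal statement of the condition (one side has a maximum of N and the other side is not (1, 1), or vice versa); the function name and the sample cardinalities are assumptions introduced for illustration.

```python
N = "N"  # symbolic "many"


def may_have_relationship_attributes(card_a, card_b):
    """Return True when a relationship with these min/max cardinalities
    may carry attributes of its own, following the condition of Section 3.5.5."""
    def is_many(card):
        return card[1] == N

    def is_exactly_one(card):
        return card == (1, 1)

    return (is_many(card_a) and not is_exactly_one(card_b)) or (
        is_many(card_b) and not is_exactly_one(card_a)
    )


# Assumed cardinalities for "borrowers borrow volumes".
borrow_ok = may_have_relationship_attributes((0, N), (0, N))
# A many, B exactly one: an apparent relationship attribute belongs to B.
one_sided = may_have_relationship_attributes((1, N), (1, 1))
# Both exactly one: the attribute can be placed on either entity.
one_to_one = may_have_relationship_attributes((1, 1), (1, 1))
```

When the predicate is False, an attribute that the user offers for the relationship would instead be recorded as an attribute of the entity it functionally depends on, as the derivation above indicates.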
³ N can mean one or many, depending on the situation. Therefore, (0, 1) may sometimes be represented as (0, N).
⁴ The following proof was suggested by Yair Wand.
3.5.6 Missing Relationships. There are various ways to identify missing relationships; among them are the following:
(1) It would be unlikely to have an entity that does not participate in any relationship. The appearance of such an entity suggests a missing relationship.
(2) If an entity, E1, had a multivalued attribute that was converted to another entity, E2, then a relationship should exist between E1 and E2.
(3) If the name of one entity appears as an attribute of another, the existence of a relationship between the two entities is implied.
(4) If an entity, E, originally had an attribute of the form X-suffix and X became a new entity, then a relationship should exist between E and X.
(5) If an entity, E, originally had repeating attributes of the form X1, X2, X3, . . . , Xn and X became a new entity, then a relationship should exist between E and X.
The last four cases involve the appearance of an attribute of one entity that refers to some other entity. Such an attribute implicitly indicates the existence of a relationship between the two entities.
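The first of these checks, an entity that participates in no relationship, is straightforward to mechanize. The following Python sketch assumes relationships are held as (A, verb phrase, B) triples; the function name and the sample data are hypothetical.

```python
def entities_without_relationships(entities, relationships):
    """Return, in their original order, the entities that appear in no relationship."""
    participating = set()
    for a, _verb_phrase, b in relationships:
        participating.update((a, b))
    return [entity for entity in entities if entity not in participating]


entities = ["borrower", "volume", "book", "publisher"]
relationships = [
    ("borrower", "borrow", "volume"),
    ("volume", "instance-of", "book"),
]
isolated = entities_without_relationships(entities, relationships)
```

Cases (2) through (5) could be handled in a similar spirit by recording, at the time an attribute is converted to an entity, the pair of entities for which a relationship is expected and then checking that such a relationship was eventually supplied.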
3.6 Ambiguities, Redundancies, and Inconsistencies
The previous steps concentrate on eliciting an application’s information requirements and modeling them using the E-R formalism. The model must now be
examined for undesirable properties. The following sections on have/has relationships, inherited attributes, and synonyms indicate how ambiguities, inconsistencies, and redundancies can be detected.
3.6.1 Have/Has Relationships. Relationships employing a verb phrase that is
some form of have/has are inherently ambiguous. At least four interpretations
are possible:
(1) A possesses B; for example, library has books;
(2) B component-of A; for example, book has chapters;
(3) B instance-of/example-of A; for example, book has volumes; and
(4) B associated-with A in some other way; for example, books have authors.
Have/has relationships
with the instance-of interpretation
are of particular
interest because they assist in selecting primary keys and detecting inconsistencies in the user’s input (see Sections 3.6.2 and 3.6.3). Component-of is useful in
ensuring that primary keys are complete (see Section 3.7). The other two
interpretations
(possession and association) are employed simply to reflect more
of the semantics of the application than the verbs have and has.
3.6.2 Hierarchical Relationships. Both is-a and instance-of verb phrases indicate the existence of hierarchical relationships. Other relationships involving
entities that appear in a hierarchical relationship must be examined to ensure
they are specified at the most appropriate hierarchical level. To illustrate, given
the hierarchical relationship volume instance-of book, it is necessary to examine
other relationships
in which either volume or book appears. For example, the
relationship borrowers borrow books would need to be changed to borrowers borrow
volumes because it is physical “volumes” that can be borrowed, not conceptual
“books” (refer to Footnote 1). On the other hand, the relationship authors write
books, when examined, would be determined to be at the correct level.
3.6.3 Inherited Attributes. The analysis described in the previous section is
concerned with ensuring that relationships involving entities that appear in
hierarchical relationships are specified at the correct level. This section discusses
a similar analysis for attributes of such entities. In a relationship A is-a/instance-of B, all attributes of the superset entity B should be attributable to the subset entity A. An attribute of the superset entity that cannot validly be applied to the subset entity indicates an inconsistency: Either
-the hierarchical relationship is incorrectly specified, or
-the attribute in question should not appear in the definition of the superset entity.
To illustrate, consider the hierarchical relationship librarian is-an employee, and suppose that union is an attribute of employee, but that librarians do not belong to unions. An inconsistency exists that can be corrected by (1) creating two new entities, manager and worker; (2) making union an attribute of worker; and (3) replacing the original relationship, librarian is-an employee, with librarian is-a manager, manager is-an employee, and worker is-an employee. The effect is to interpose an additional hierarchical level to distinguish the two categories of employee.
Note that any attribute that appears in both the subset and superset entities
should be deleted from the subset entity to avoid redundancy.
E.g.:
Relationship: employee is-a person
Person: [PERSON-NAME, ADDRESS, person-birthday, . . .]
Employee: [EMPLOYEE-NUMBER, employee-birthday, person-name, address, . . .]
Becomes: (“employee-birthday” and “address” are deleted from “employee”)
Person: [PERSON-NAME, ADDRESS, person-birthday, . . .]
Employee: [EMPLOYEE-NUMBER, person-name]
3.6.4 Synonyms. Synonyms in either entities or relationships represent redundant information that should be removed from the design. One way to detect synonyms is to examine the format of the relationships. Consider, for example, relationships of the form A1 VP B, A2 VP B, . . . , An VP B. A1, A2, . . . , An are candidates to be either synonyms or related in some way that is not already known.
The Ai are considered pairwise to determine whether they are synonyms or if one is a subset of the other. If synonyms are found, one term is selected for further use. If one is a subset of the other, an is-a relationship is implied.
E.g.:
Relationships: student borrows volumes
borrower borrows volumes
There are four possibilities for the appropriate relationship between student and borrower:
(1) They are synonyms, in which case one of the terms is selected to replace the other throughout the design and any resulting redundancies are eliminated.
(2) Students are a subset of borrowers, implying that the relationship student is-a borrower should be added and the relationship student borrows volumes deleted.
(3) Borrowers are a subset of students, implying that the relationship borrower is-a student should be added and the relationship borrower borrows volumes deleted.
(4) None of the above is correct, so no modification is necessary.
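The detection step that precedes these four possibilities, finding relationships that share a verb phrase and a second entity, can be sketched as follows. This Python fragment only proposes candidate pairs; deciding among the four possibilities still requires the user. The function name and the triple representation of relationships are assumptions of the sketch.

```python
from collections import defaultdict
from itertools import combinations


def synonym_candidates(relationships):
    """Return pairs (Ai, Aj) that occur in relationships Ai VP B and Aj VP B."""
    subjects_by_vp_and_b = defaultdict(list)
    for a, verb_phrase, b in relationships:
        subjects = subjects_by_vp_and_b[(verb_phrase, b)]
        if a not in subjects:                 # ignore a relationship stated twice
            subjects.append(a)
    pairs = []
    for subjects in subjects_by_vp_and_b.values():
        pairs.extend(combinations(subjects, 2))
    return pairs


relationships = [
    ("student", "borrows", "volumes"),
    ("borrower", "borrows", "volumes"),
    ("author", "writes", "books"),
]
candidates = synonym_candidates(relationships)
```

Each returned pair corresponds to one pairwise question to the user: whether the two terms are synonyms, whether one is a subset of the other, or neither.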
3.7 Primary Keys
Each entity occurrence in a database must be uniquely identifiable. Any of the
candidate keys is, by definition, an acceptable identifier. If there is more than
one candidate key, one of them must be selected as the primary key. The rules
for selecting entity primary keys take into account the semantics of relationships
in which the entity appears and attempt to maximize the efficiency of join
operations.
3.7.1 Unique Attribute Names. Before primary keys are chosen, any attribute
names that either (1) are “key indicator attributes” (e.g., name, number, code, and
id) or (2) exist for more than one entity are prefixed by their entity names in
order to produce a set of unique attribute names. This ensures that the resulting
primary keys will all be unique, which is especially important for entities involved
in hierarchical relationships.
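A minimal Python sketch of this renaming step follows. The dictionary representation of entities, the function name make_unique, and the hyphen used to join the prefix are assumptions for illustration; only the four key indicator words come from the text.

```python
from collections import Counter

# Key indicator words given as examples in the text.
KEY_INDICATORS = {"name", "number", "code", "id"}


def make_unique(entities):
    """entities: dict mapping entity name -> list of attribute names.

    Returns a copy in which key indicator attributes and attributes that
    occur in more than one entity are prefixed by their entity name.
    """
    counts = Counter(attr for attrs in entities.values() for attr in set(attrs))
    return {
        entity: [
            f"{entity}-{attr}" if attr in KEY_INDICATORS or counts[attr] > 1 else attr
            for attr in attrs
        ]
        for entity, attrs in entities.items()
    }
```

For instance, if borrower and library both have a name and an address, those attributes become borrower-name, borrower-address, library-name, and library-address, while an attribute such as phone that occurs only once is left unchanged.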
3.7.2 Rules for the Selection of Primary Keys. The rules for choosing primary
keys are heuristic. They concentrate first on obtaining the simplest possible
primary key, that is, the alternative that consists of the smallest number of
attributes. When this criterion does not result in a unique choice, the candidate
key that appears most often as a candidate or primary key for other entities is
selected. The latter criterion aims at enhancing retrieval performance by increasing the efficiency of join operations that might be required during use of the
database. Finally, if neither criterion yields a unique choice (that is, a tie remains), the
candidate key that was provided first is chosen, as it is probably the most natural
one for the user.
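The three criteria can be read as successive filters. The Python sketch below is one possible rendering under that reading, not the VCS rule base itself; the function name select_primary_key and the representation of keys as tuples of attribute names are assumptions.

```python
def select_primary_key(candidates, keys_of_other_entities):
    """Pick a primary key from candidate keys listed in the order supplied.

    candidates: list of keys, each a tuple of attribute names.
    keys_of_other_entities: candidate or primary keys of other entities.
    """
    # Criterion 1: the smallest number of attributes.
    fewest = min(len(key) for key in candidates)
    shortlist = [key for key in candidates if len(key) == fewest]
    if len(shortlist) == 1:
        return shortlist[0]

    # Criterion 2: the key used most often as a key of other entities.
    def uses(key):
        return sum(1 for other in keys_of_other_entities if set(other) == set(key))

    best = max(uses(key) for key in shortlist)
    shortlist = [key for key in shortlist if uses(key) == best]

    # Criterion 3: the key that was provided first (input order is preserved).
    return shortlist[0]
```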
Three classes of entities must be considered: (1) those that occur in is-a
hierarchies, (2) those that occur in instance-of hierarchies, and (3) all others.
Is-a relationships. An is-a hierarchy occurs when there are relationships of
the form . . . A is-a B, B is-a C, and so forth. Since C is the generic term for B,
the key of C must be a suitable candidate key for the entity B (and, for that
matter, for the entity A as well). (E.g., if manager is-an employee and employee
is-a person, the key of person can serve as an identifier of both employee and
manager.) Therefore, the primary key of the highest entity in the hierarchy is
chosen first. This key is then “inherited” as a candidate key by the entities at
the next lower level in the hierarchy. This process is applied recursively until
primary keys have been selected for all entities in the hierarchy.
E.g.:
1) employee is-a person
2) librarian is-an employee.
Original set of candidate keys:
Person: [PERSON-NAME, ADDRESS, . . . ]
Employee: [EMPLOYEE-NUMBER, . . . ]
Librarian: [JOB-TITLE, BRANCH, librarian-name, . . . ]
The primary key [PERSON-NAME, ADDRESS] is adopted for the entity ‘person’
since it is the only alternative. It is then added as a candidate key for ‘employee’:
Person: [PERSON-NAME, ADDRESS, . . . ]
Employee: [EMPLOYEE-NUMBER, person-name, address, . . . ]
Employee: [PERSON-NAME, ADDRESS, employee-number, . . . ]
Librarian: [JOB-TITLE, BRANCH, librarian-name, . . . ]
The primary key [EMPLOYEE-NUMBER] is chosen for ‘employee’, and added as a
candidate key for ‘librarian’:
Person: [PERSON-NAME, ADDRESS, . . . ]
Employee: [EMPLOYEE-NUMBER, person-name, address, . . . ]
Librarian: [JOB-TITLE, BRANCH, librarian-name, employee-number, . . . ]
Librarian: [EMPLOYEE-NUMBER, librarian-name, branch, job-title, . . . ]
The primary key for ‘librarian’ is determined:
Person: [PERSON-NAME, ADDRESS, . . . ]
Employee: [EMPLOYEE-NUMBER, person-name, address, . . . ]
Librarian: [EMPLOYEE-NUMBER, librarian-name, branch, job-title, . . . ]
At this point, for a subset entity that adopts the primary key of its superset
entity, the subset key is prefixed by its entity name. This is done to preserve
primary key uniqueness, which facilitates the representation of relationships
between subset and superset entities.
E.g.: Relationship: librarian is-an employee
Employee: [EMPLOYEE-NUMBER, person-name, address]
Librarian: [EMPLOYEE-NUMBER, branch, job-title, . . .]
Becomes:
Employee: [EMPLOYEE-NUMBER, person-name, address]
Librarian: [LIBRARIAN-EMPLOYEE-NUMBER, branch, job-title, . . .]
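The top-down processing of an is-a hierarchy can be sketched in Python as follows. This is an illustration under stated assumptions rather than the VCS implementation: the names assign_isa_keys, prefix_adopted_keys, and fewest_attributes are hypothetical, each entity is assumed to have at most one superset, and the stand-in chooser applies only the fewest-attributes criterion.

```python
def fewest_attributes(candidates):
    """Stand-in for the full selection rules: shortest key, first one on ties."""
    return min(candidates, key=len)


def assign_isa_keys(is_a, candidates, choose=fewest_attributes):
    """is_a: dict subset entity -> superset entity.
    candidates: dict entity -> list of candidate keys (tuples of attributes).
    Returns dict entity -> primary key, resolving supersets before subsets.
    """
    primary = {}

    def key_for(entity):
        if entity not in primary:
            options = list(candidates[entity])
            parent = is_a.get(entity)
            if parent is not None:
                inherited = key_for(parent)  # the superset key is chosen first
                if inherited not in options:
                    options.append(inherited)  # inherited as a candidate key
            primary[entity] = choose(options)
        return primary[entity]

    for entity in candidates:
        key_for(entity)
    return primary


def prefix_adopted_keys(is_a, primary):
    """Prefix a subset key by its entity name when it equals the superset key."""
    renamed = dict(primary)
    for sub, sup in is_a.items():
        if primary[sub] == primary[sup]:
            renamed[sub] = tuple(f"{sub.upper()}-{attr}" for attr in primary[sub])
    return renamed
```

With the person, employee, and librarian candidate keys of the example, employee keeps EMPLOYEE-NUMBER, librarian adopts it, and the prefixing step then renames the librarian key.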
Instance-of relationships. Instance-of hierarchies are similar to is-a hierarchies, for example, . . . A instance-of B, B instance-of C, and so forth. In this
case, however, the key of the superset entity does not uniquely identify occurrences of the subset entity because there can be many subset entity occurrences
for each superset entity occurrence, for example, volume instance-of book. Rather,
the key of the subset entity (e.g., volume) must include the key of the superset
entity (e.g., book). This is because the key of the subset entity might only be
unique within a particular occurrence of the superset entity. Therefore, the
primary key of the superset entity is concatenated to each candidate key of the
subset entity, if it is not already there, before the latter’s primary key is selected.5
As in the case of is-a hierarchies, instance-of hierarchies are processed from
the entity at the highest level downwards.
E.g.:
Relationship: volume instance-of book
Candidate Keys:
book: [CATALOG-NO, book-title, author, publisher]
volume: [COPY-NO, volume-title]
There is only one candidate key for “book” so it becomes the primary key.
book: [CATALOG-NO, book-title, author, publisher]
Note that “COPY-NO” only uniquely identifies a volume for a particular book. Therefore, the candidate key, and hence the primary key, for “volume” becomes:
volume: [CATALOG-NO, COPY-NO, volume-title]
The attribute “volume-title” is deleted from “volume” because it can be inherited from “book”.
volume: [CATALOG-NO, COPY-NO]
5 One could conceive of a situation where the subset entity is given a key that is unique in itself
without reference to the superset entity. In such a case, however, either the subset entity key must
include the superset entity key or something that is functionally related to it. Such a disguised
representation of the relationship seems likely to give rise to normalization-related difficulties.
3.7.3 Entities Requiring Other Entities for Identification. In some cases, the
key of one entity must include the key of another entity in order to guarantee
uniqueness. For example, if branches of a library are allowed to assign card
numbers independently of each other, then the key of library-card must include
the key of the branch that issued it.
3.7.4 Component-of Relationships. In a relationship A component-of B, the
key of B might be needed in order to identify uniquely an instance of A. For
example, if branch names are unique only within a library, then the key of branch
must include the key of library.
E.g.
Relationship: Branch component-of Library
Primary Keys:
Library: [LIBRARY-ID, library-name, library-address]
Branch: [BRANCH-NAME, branch-address]
The key of LIBRARY is concatenated to the key of branch:
Branch: [LIBRARY-ID, BRANCH-NAME, branch-address]
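The same concatenation step serves instance-of hierarchies, entities that require other entities for identification, and component-of relationships. A minimal Python sketch, assuming keys are tuples of attribute names and using the hypothetical function name concatenate_key, is:

```python
def concatenate_key(identifying_key, dependent_candidates):
    """Prepend the identifying entity's primary key to each candidate key of
    the dependent entity, skipping attributes that are already present."""
    result = []
    for key in dependent_candidates:
        missing = tuple(attr for attr in identifying_key if attr not in key)
        result.append(missing + tuple(key))
    return result
```

For the examples in this section, concatenating [CATALOG-NO] to the volume key [COPY-NO] gives [CATALOG-NO, COPY-NO], and concatenating [LIBRARY-ID] to the branch key [BRANCH-NAME] gives [LIBRARY-ID, BRANCH-NAME]. Selection of the primary key for the dependent entity then proceeds on the extended candidate keys.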
3.8 Entity Representation
Each entity is represented by a separate entity relation with the key and nonkey
attributes of the relation corresponding directly to those of the entity. The
relation thus constructed may not be in its final form. Modifications discussed
below might be required to ensure adherence to normalization principles.
3.9 Relationship Representation
There are two alternative representations for each relationship A VP B:
(1) A relation can be constructed with relation name A-VP-B, and relation key
equal to the concatenation of the keys of the A and B entities.
(2) The key attributes of one entity can be added as nonkey attributes of the
other (the foreign key approach).
The choice of representation depends on the mapping ratios and possibly the
anticipated usage.
As will become apparent in the discussion that follows, the only cardinalities
that are relevant to this decision are 0, 1, and N. The distinguishing factor in
determining how a relationship should be represented is whether or not one (or
both) of the involved entities has min/max cardinalities of (1, 1).
3.9.1 Relationships Involving (1, 1) Cardinalities. There are two cases to consider:
(1) Only one of the involved entities has (1, 1) cardinalities.
(2) Both entities have (1, 1) cardinalities.
Case 1. In a relationship A VP B, suppose the min/max cardinalities are
(1, 1) for A, but not for B. Then there is precisely one occurrence of the
relationship for each occurrence of entity A.
If we adopt the foreign key approach to representing such a relationship, the
storage used would be that required to add a foreign key attribute (i.e., the key
of B) to each occurrence of entity A. Database processing that requires locating
the B corresponding to a given A (i.e., the B of A type of query) would be
efficiently processed through the A relation. The inverse query (i.e., the A of B)
would also be efficiently handled if the A relation has an index on the B foreign
key field.
The alternative is to construct a new relation to represent this relationship. The length of the new relation would be equal to that of the A relation
because of the (1, 1) cardinality of A. The width of the new relation would be
equal to the sum of the sizes of the A and B keys. Compared to the foreign key
approach, this solution requires additional storage equal to the size of the A
key multiplied by the length of the A relation. Retrievals of either type (the
B of A or the A of B) should be equally efficient because both entity keys occur
in the key of the new relation. Thus, the indexes needed to avoid exhaustive
searching should exist.
The two alternatives are equally appealing in terms of their retrieval performance, but the foreign key option is preferred because it requires significantly less storage.
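The storage argument amounts to simple arithmetic, sketched below in Python. The function names and the sample figures (10,000 occurrences of A, an 8-byte A key, a 4-byte B key) are hypothetical and chosen only to make the comparison concrete; index overhead is ignored.

```python
def storage_foreign_key(rows_a, size_b_key):
    """Extra storage when the key of B is added to every occurrence of A."""
    return rows_a * size_b_key


def storage_new_relation(rows_a, size_a_key, size_b_key):
    """Storage for a separate relation with one tuple per occurrence of A."""
    return rows_a * (size_a_key + size_b_key)


rows_a, size_a_key, size_b_key = 10_000, 8, 4  # hypothetical figures
extra = storage_new_relation(rows_a, size_a_key, size_b_key) - storage_foreign_key(
    rows_a, size_b_key
)
# The difference is the size of the A key times the length of the A relation.
assert extra == rows_a * size_a_key
```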
Case 2. When both A and B have (1, 1) cardinalities, the above analysis
suggests that the relationship could be represented by adding the key of either
entity as a foreign key of the other. Because of the (1, 1) cardinalities of both
entities, the lengths of the A and B relations must be equal. Assuming the
existence of the necessary indexes, retrieval performance will also be equivalent
for the two alternatives. Therefore, the only basis for selection is the size and
complexity of the two keys. If one of the keys involves fewer attributes than the
other, then it should become the foreign key in the other relation. If both keys
have the same number of attributes, then there is some saving of storage by using
the shorter key as the foreign key.
3.9.2 Relationships Not Involving (1, 1) Cardinalities. A relationship may have
attributes only when neither of the entity cardinalities is (1, 1), as previously
discussed. Any relationship that does have attributes must be represented by a
separate relation since use of the foreign key approach would unavoidably result
in normalization violations.
For relationships that do not have attributes, there are two cases to consider:
(1) relationships in which the cardinalities of both entities are (0, 1), and
(2) all others.
Case 1. (0, 1)/(0, 1) and no relationship attributes. Consider a relationship
A VP B where the min/max values of both A and B are (0, 1). The first thing to
examine is participation rates (i.e., the percentage of occurrences of each entity
that participates in the relationship). If the participation rate of one entity is
significantly higher than for the other, the relationship should be represented
using the foreign key approach in the entity relation with the higher participation
rate.
E.g.: Relationship: applicant fills position
Initially assume that the number of applicants greatly exceeds the number of
positions. Then, the participation rate for positions is much higher than that for
applicants. Therefore, the relationship should be represented by making the key
of applicant a foreign key in the position relation. This choice would require less
storage for the relationship and less use of null foreign key attribute values than
the alternative of making position a foreign key in the applicant relation.
If both entities have high participation rates, then the choice of which representation to use should be based on anticipated query frequencies. By query
frequencies we mean whether the user is most often interested in the A of B or
the B of A type of queries. We will use the term input entity for the one about
which the user knows something and output entity for the one about which
information is required. In such cases, the key of the input relation should appear
as a foreign key in the output relation.
If the user knows the key of the input entity, this solution allows such queries
to be processed with only an indexed access to the output relation. If the user
does not know the key of the input entity, the input relation must be searched,
and in this situation the performance is not affected by which entity key is used
as the foreign key.
Finally, if neither entity has a high participation rate, the most efficient
representation would be a new relation.
Case 2. Not (0, 1)/(0, 1) and no relationship attributes. The only remaining
cases are those in which the entities have either (0, N) or (1, N) cardinalities.
There is no foreign key approach for these cases that would not violate normalization principles. Therefore, there is no alternative but to represent these
relationships by relations.
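The rules of Sections 3.9.1 and 3.9.2 can be summarized as a single decision procedure. The Python sketch below is one reading of those rules, not the VCS code. The function name, the returned labels, and in particular the numeric thresholds high and gap are assumptions: the text speaks only of a “high” participation rate and of one rate being “significantly higher” than the other.

```python
ONE_ONE = (1, 1)
ZERO_ONE = (0, 1)


def choose_representation(card_a, card_b, has_attributes=False,
                          rate_a=0.0, rate_b=0.0, high=0.5, gap=0.25):
    """card_a, card_b: (min, max) cardinalities of A and B, with max 1 or "N".
    rate_a, rate_b: participation rates in [0, 1], used only for (0,1)/(0,1).
    high, gap: hypothetical thresholds for "high" and "significantly higher".
    """
    # Section 3.9.1: at least one entity has (1, 1) cardinalities.
    if card_a == ONE_ONE and card_b == ONE_ONE:
        return "foreign key in either relation (use the smaller key)"
    if card_a == ONE_ONE:
        return "foreign key in A"
    if card_b == ONE_ONE:
        return "foreign key in B"
    # Section 3.9.2: relationships with attributes need their own relation.
    if has_attributes:
        return "separate relation"
    if card_a == ZERO_ONE and card_b == ZERO_ONE:
        if abs(rate_a - rate_b) >= gap:
            # The foreign key goes in the relation with higher participation.
            return "foreign key in A" if rate_a > rate_b else "foreign key in B"
        if rate_a >= high and rate_b >= high:
            # Decided by anticipated query frequencies.
            return "foreign key in the output relation"
        return "separate relation"
    # Remaining cases involve (0, N) or (1, N) cardinalities.
    return "separate relation"
```

In the applicant fills position example, with A as applicant and B as position, a low rate for applicants and a high rate for positions yields a foreign key in the position relation.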
3.10 Normalization
The last step deals with two types of undesirable functional dependencies: partial
and transitive. In normalization theory, partial functional dependencies violate
Second Normal Form, while transitive functional dependencies violate Third
Normal Form.
A partial functional dependency exists when a nonkey attribute in a relation
depends on only part, as opposed to the complete, relation key. A transitive
functional dependency exists when a nonkey attribute depends on other nonkey
attributes instead of directly on the key. Following normal database design
practice, these normalization violations are removed by splitting the original
relation into two or more relations.
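The two definitions can be made concrete with a small Python sketch that classifies a given list of functional dependencies for one relation. The function name classify_dependencies and the encoding of a dependency as a (determinant set, dependent attribute) pair are assumptions made here; the sketch detects violations only and does not perform the splitting.

```python
def classify_dependencies(key, dependencies):
    """key: set of key attributes of the relation.
    dependencies: list of (determinant set, dependent attribute) pairs.
    Returns (partial, transitive) lists of offending dependencies.
    """
    key = frozenset(key)
    partial, transitive = [], []
    for determinant, dependent in dependencies:
        determinant = frozenset(determinant)
        if not determinant or dependent in key:
            continue  # only dependencies of nonkey attributes are of interest
        if determinant < key:
            # Depends on a proper part of the key: violates Second Normal Form.
            partial.append((sorted(determinant), dependent))
        elif determinant.isdisjoint(key):
            # Depends on nonkey attributes only: violates Third Normal Form.
            transitive.append((sorted(determinant), dependent))
    return partial, transitive
```

As an illustration, for a relation with key [CATALOG-NO, COPY-NO] in which volume-title depends on CATALOG-NO alone, the dependency is reported as partial; for a relation with key [CATALOG-NO] in which publisher-city is assumed to depend on publisher, it is reported as transitive.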
The procedure outlined here cannot produce relations that violate 4NF.6
Therefore, the result will be a set of 4NF relations that represent a user’s
database view.
6 This result is demonstrated in Storey [17].
4. VIEW CREATION SYSTEM
The methodology presented in this paper has been implemented as the knowledge
base of an expert system, called the View Creation System (VCS). The implementation serves as a precise specification of the methodology as well as providing
an extremely useful tool for evaluating and refining it. A partial transcript of a
VCS session is included in the Appendix in order to illustrate the view creation
methodology. The system engages the user in a dialogue designed to elicit the
entities, relationships, and attributes of a view. At appropriate points during a
session, VCS explains the concepts of entities, attributes, and relationships using
brief tutorials. Thus, the user is not expected to know anything about database
design techniques or terminology. The user is led to describe his or her application
using the constructs of the E-R model, while the system attempts to detect and
resolve inconsistencies, ambiguities, and redundancies.
4.1 System Development
The development of the knowledge base was a three-step process. A prototype
was built using general knowledge of the database design process (i.e., from
standard textbooks). The knowledge base was then expanded to include expertise
from database design experts and further refined through testing the system in
a number of different organizations.
4.1.1 Expertise. Consultation sessions were arranged with a number of expert
database designers. During these sessions each designer was asked to create a
database design for a hypothetical problem with one of the researchers playing
the role of the user. At the conclusion of each session, the designer was asked to
describe how and why certain decisions were made. Finally, the designer was
invited to experiment with and critique the then-current version of the system.
The rules and heuristics obtained from each of these sessions were added to the
formalization and, hence, to the knowledge base of the expert system.
4.1.2 Testing Sessions. The system was tested for seven different database
design problems using real users in real organizations. These sessions identified
some missing rules and also produced suggestions for improvement of VCS’s user
interface. Use of the system in a number of organizations subsequent to the end
of the formal testing phase continues to identify occasional refinements. This is
discussed further in Section 4.3.
4.2 VCS Implementation
4.2.1 Use of PROLOG. The system was implemented in PROLOG. This language was chosen for several reasons:
-Many researchers (e.g., [5, 6, 8, 14]) have characterized PROLOG as an
appropriate language for defining and implementing expert systems. In particular, the constructs used in E-R modeling can be easily represented in
PROLOG [14].
-PROLOG easily accommodates incremental additions or deletions [6], which
facilitated development of the system in an iterative manner.
-Updating of acquired knowledge is easily and quickly accomplished [S].
-It is easy to implement the capability for the system to explain its reasoning.
-PROLOG is well suited for processing input in a restricted form of natural
language. For example, it is easy to perform string searches needed to identify
related terms, as well as to properly recognize singular and plural forms of
words.
On the negative side, the version of PROLOG used in the development of the
system did not include facilities for menu-oriented input or for graphical input
or output. Menu selection capabilities, which were considered to be absolutely
essential at certain points in the procedure, had to be specifically written into
the program. At other points, less satisfactory dialogue sequences were used
because of the amount of programming required to support a menu interface.
The lack of graphical input/output facilities was also unfortunate. Human
database designers routinely use diagrams for communicating with users, and
there can be little doubt that such a capability would enhance the effectiveness
of VCS as well. Altogether, encoding of the knowledge base and the dialogue
management facilities required approximately 21,000 lines of PROLOG.
4.2.2 Original System. The original version of the View Creation System was
implemented on a 48-Mbyte Amdahl 5850 running the Michigan Terminal
System. The interpreter was C-PROLOG. Under normal system loads, the
performance was quite satisfactory with the system usually waiting for response
from the user, rather than vice versa. A typical design session for a user view
consisting of about six entities and the same number of relationships took
approximately 1.5 hours to complete.
4.2.3 Current System. The View Creation System has since been transferred
to a microcomputer environment using Arity PROLOG. Among the reasons for
this change were a desire to make the system as portable as possible and a wish
to take advantage of a number of additional capabilities that are present in Arity
PROLOG. The Arity PROLOG system includes a compiler that permits a
significant performance improvement over the interpreted version. Currently it
is only possible to compile about 60 percent of the code because of memory
limitations. This restriction should disappear, however, when Version 5 of Arity
PROLOG becomes available. Performance of the compiled portion of the system
running on an IBM PC/AT-class microcomputer compares favorably with that
of the interpreted version on the mainframe.
Arity PROLOG also contains a collection of screen management predicates
that facilitate implementation of a “Macintosh-like” menu-oriented user interface
in place of the current one, which relies primarily on a question-and-answer
dialogue. Finally, Arity PROLOG has facilities for interfacing to other languages
and systems that will make it possible to eventually add graphical input and
output capabilities.
4.3 System Testing
In the testing phase of the View Creation System, the system was used by real
users in real organizations to create views for real tasks. The resulting output (sets of relations representing database views) was examined by the researchers
and by database designers from cooperating organizations. When the system
failed to perform as expected, the responsible error or omission in the knowledge
base was identified and corrected.
Some of the users who participated in the testing had no prior exposure to
database concepts, whereas others had varying amounts of training and/or
experience. The test sessions are summarized in Table I.

Table I. Test Sessions
(Each entry lists the user and application, the evaluation, and the system modification.)

User 1
Type: Systems analyst
Application: Training database
Evaluation: The view contains one redundant relation and one relation where the information requirements are not represented at the correct level of detail
System modification: Two rules were added: (1) A new relation should not be constructed to represent instance-of relationships; and (2) for relationships A instance-of B, determine whether any other entity appearing in a relationship with B should be associated with A instead

User 2
Type: Some familiarity with database concepts
Application: Student-advisory database
Evaluation: The output is a normalized set of relations, but does not totally represent the user’s application because of the difficulty the user had when identifying entities and attributes
System modification: The system’s instructions were modified to highlight some of the more subtle points

User 3
Type: Knowledgeable in data modeling
Evaluation: The output is correct and free of any undesirable properties
System modification: Two minor modifications to the user interface were made based on User 3’s suggestions

User 4
Type: Naive user
Application: Origin-destination database for movement of traffic
Evaluation: The view produced is small, but correct and free of any undesirable properties
System modification: No modification

User 5
Type: Naive user
Application: Equipment database
Evaluation: The view is correct and free of undesirable properties, but does not reflect all the user’s requirements because the user failed to model one dimension of the application
System modification: A rule was added that allows one to distinguish what role subset and superset entities of is-a relationships play in other relationships when they both have the same primary keys

Users 6 and 7
Type: Naive users who designed a single view
Application: Database for insurance claims
Evaluation: The view does not represent the users’ information requirements because of the difficulty the users had in identifying entities in their applications
System modification: No modification

User 8
Type: Learning database design
Application: Database for software maintenance control
Evaluation: The view appropriately reflects User 8’s information requirements and is free of undesirable properties
System modification: One rule was added: When a relationship A have/has B (with attributes) is converted to an entity A-have/has-B and a shorter name is not provided, the entity name should later be modified to reflect the appropriate interpretation of have/has

4.3.1 Results. The system performed best for users who knew something about
database design. For all these users, the design produced either was an accurate
representation of the user’s information requirements, or highlighted a flaw in
the system’s knowledge base, which was subsequently corrected. The system
modifications were done immediately after each testing session so that the version
given to the succeeding user included that knowledge.
One potential criticism of this approach to testing is that it cannot be proved
that the system, and hence the methodology, has reached a correct “steady state”
where no further improvements are necessary. It may be noted, however, that no
major modifications were required after the testing session with User 5. All the
changes made after that point involved minor refinements to the knowledge base.
In general, these later changes were associated with capturing more of the
semantics of the application rather than correcting errors. The system, therefore,
did reach a reasonable degree of stability.
5. CONCLUSION
5.1 Summary
A methodology for generating user views has been formalized and expressed as a
set of rules that comprise the knowledge base of an expert View Creation System.
The methodology is based on the E-R model. Using this approach, a user’s
information requirements for a database are initially expressed in terms of
entities, attributes, and relationships, and later transformed into a set of normalized relations that represents the user’s database view.
The significance of this research lies in the insight it provides into the process
of database design through formalization of part of the logical database design
task. In addition to providing a means for precisely expressing this formalization,
implementation of the expert system has made it possible to experimentally
validate its adequacy and completeness. The primary contribution of this research, however, lies in the rules and procedures that comprise the system’s
knowledge base.
5.2 Future Work
The original objective of this research was to develop a methodology for formalizing the creation of the user views that are essential input to most database
design procedures. This methodology, and the expert system implementing it,
could, in principle, be used to design a complete database if a single user, or
group of users working together, could supply all the necessary information. This
is not, however, the way database design is done in real organizations. Rather,
different individuals or groups specify requirements in their own areas. Conflicting or inconsistent requirements must be identified and resolved, and then a
comprehensive design produced. This problem is usually referred to as view
integration. It appears that the appropriate point to introduce view integration
into the view creation methodology is after views have been expressed in E-R
form, but before they have been converted to sets of normalized relations. A
considerable amount of information about the meaning of each view is acquired
during the view creation process that is lost in the conversion to relational form.
The extension of VCS to include view integration is currently under way.
An interesting, related problem concerns the reconciliation of alternative
versions of the same user view. The availability of an expert database designer
in economical software form makes it feasible to obtain database views from
several individuals concerned with a single application. Ideally, these views should
be identical, but as a practical matter, each individual is likely to have a slightly
different understanding of the task from those of his or her colleagues. Thus,
obtaining several versions of a view might result in a more accurate or complete
representation of the task than any one of them individually. Detecting and
resolving differences among alternative versions of a view are special cases of the
view integration problem, so a solution to that problem will also solve this one.
Finally, it is believed that the effectiveness of VCS could be enhanced by
improving its user interface. One such improvement would involve the addition
of graphical input and output capabilities. Human database designers employ
various types of diagrams, both for communication with users and to record the
evolving status of the design. A computerized expert system has its own techniques for the second of these tasks, but there is probably no effective substitute
for diagrams for aiding the user in understanding what is happening. In particular,
it seems likely that allowing the user to see an evolving E-R diagram would assist
him or her greatly in identifying missing entities and relationships.
We also propose to examine the feasibility of increasing the system’s “commonsense” knowledge. Although this is not, strictly speaking, a user interface
issue, it would appear to the user primarily as an enhancement in that area. The
current VCS has a very limited amount of general knowledge that permits it to
recognize that certain attributes are potential keys or that certain verbs imply
particular forms of relationships. Much more could be done in this area to
produce a system that is less demanding of the user. This is a topic of current
research that should be applicable to the view creation problem.
Appendix A. Design Session
This section contains excerpts from a transcript of a design session. To make
the transcript more readable, it has been divided into sections as indicated by
the headlines that have been inserted. (These headlines do not actually appear
during a session.) Comments have also been added and are identified as text
between the symbols /* and */.
A.1 Initialization
User? Please enter your name followed by a dot (.) and return.
I have to ask you to always respond in this manner,
i.e. give your answer followed by a dot and return.
Now enter your name please.
I: Veda.
Hello veda.
For what do you wish to design a database?
(If you need more than one word to respond, connect the words with underscore
characters.)
I: library-circulation.
A.2 Entities
Tell me what things are of interest to you in a “library-circulation”.
I will refer to these “things” as “entities.”
I also want you to give me the “attributes” of these “entities,”
i.e. the properties or characteristics that are of interest to you.
E.g. AIRLINE-RESERVATION
ENTITIES     ATTRIBUTES
--------     ----------
flight       number city-from city-to no-of-passengers
passenger    name frequent-flyer-status
aircraft     id type max-passengers
person       name address phone
Note that you may use only one “word” for each attribute but the “word”
may consist of several words connected by underscore characters.
If you enter something and then decide that it’s wrong, you can change it
later. Now please enter the name of an entity in its SINGULAR7 form.
I: borrower.
A.2.1 Entity Attributes
What are the attributes for “borrower”?
Enter them separated only by a blank and with a period after the last one. Else enter
“end.” if you really do not know.
I: number name address phone book1 book2 book3.
A.2.2 Repeating Attributes
Are “book1” “book2” “book3” different occurrences of “book”?
“yes.” / “no.” / “why?”
I: why?
It sounds like “book1” “book2” “book3” all refer to the same “thing”, i.e. “book”.
A “thing” should be an entity.
Now, are “book1” “book2” “book3” all occurrences of “book”?
“yes.” / “no.” / “why?”
I: yes.
Then we shall make “book” an entity when we have finished with this one.
/* The system “remembers” that a relationship will probably be required between “borrower” and
“book”. */
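/* The detection of repeating attributes in this exchange can be approximated by grouping attribute names that differ only in a trailing number. The Python sketch below is an illustration, not the PROLOG rule used by VCS; the function name repeating_groups and the regular expression are assumptions. */

```python
import re
from collections import defaultdict


def repeating_groups(attributes):
    """Group attribute names that share a stem and end in digits."""
    groups = defaultdict(list)
    for attr in attributes:
        # A stem ending in a letter, underscore, or hyphen, then digits.
        match = re.fullmatch(r"(.*?[A-Za-z_-])(\d+)", attr)
        if match:
            stem = match.group(1).rstrip("_-")
            groups[stem].append(attr)
    # A single numbered attribute is not treated as a repeating group.
    return {stem: names for stem, names in groups.items() if len(names) > 1}
```

/* For the attributes entered for “borrower”, the only group found is “book”, which is the question the system then puts to the user. */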
A.2.3 Candidate Keys
/* “Borrower” has two “key indicator attributes”, namely “number” and “name”. The system
suggests that the user consider these when identifying candidate keys. */
Here are the attributes for:
borrower: [number, name, address, phone]
Enter an attribute or combination of attributes which could be used to identify UNIQUELY
an instance of the entity “borrower”.
I will call this attribute (or combination of attributes) a “key”. (If you really do not
know, enter “end.”)
In particular, you may wish to consider:
1) number 2) name.
I: number.
7 Entity names are requested in their singular form because the system will later use the form of an
entity (singular or plural) as it appears in a relationship to infer mapping ratios.
Is there any other way that an instance of the entity “borrower” could be identified?
“see.”, to see how I can identify it so far / “no.” / “yes.” / identifying attribute(s)
I: name address.
A.2.4 Multivalued Attributes
/* Since “book” was a repeating attribute, the system converts it to an entity. */
Is the term “book” in its SINGULAR form?
(“yes.” / “no.”)
I: yes.
What are the attributes for “book”? / “end.”
Enter them separated only by a blank and with a period after the last one. Else enter
“end.” if you really do not know.
I: catalog-no title author-id volume publisher publisher-city.
Any more attributes for “book”?
“see.” / “yes.” / “no.” / new attributes
I: n.
Good.
Here are the attributes for:
book: [catalog-no, title, author-id, volume, publisher, publisher-city]
Can a single “book” have more than one value for any of these attribute(s)?
“yes.” / “no.” / “why?” / attribute(s) which can have more than one value / “end.”
if you do not know.
I: why?
If an attribute can have more than one value, then these values may be instances of
the same “thing”, i.e. an entity. Now consider
book: [catalog-no, title, author-id, volume, publisher, publisher-city]
Can any of these attribute(s) have more than one value for each occurrence of “book”?
“yes.” / “no.” / attributes which can have more than one value / “end.” if you do
not know.
I: volume.
I will delete the attribute “volume” and we will make it an entity later. . . .
/* The user identifies two candidate keys for “book”: 1) [CATALOG-NO] and 2) [TITLE,
AUTHOR-ID]. The system “remembers” that a relationship will be required between “book” and
“volume”. */
Now let’s deal with “volume”. . . .
/* The entity “volume” is added with attributes: [COPY-NO, title, author-name]. */
A.2.5 Attribute Name Is Entity Name
/* In the following example, one entity is NOT needed in the identification of another. The existence
of a relationship is implied. */
Enter next entity (in its SINGULAR form) / “see.” / “end.”
I: library.
Enter the attributes for “library”. / “end.”
I: name address books.
I already know that “book” is an entity. Does this mean that you will need to know
“book” before you will be able to identify UNIQUELY an instance of “library”?
I: no.
Then you are really telling me that a relationship exists between “library” and “book”.
We’ll get to that in a minute. . . .
/* One entity IS needed to identify another. */
Enter next entity (in its SINGULAR form) / “see.” / “end.”
I: branch.
Enter the attributes for “branch” / “end.”
I: name address library.
I already know that “library” is an entity. Does this mean that you will need to know
“library” before you will be able to identify UNIQUELY an instance of “branch”?
I: yes.
Fine. . . .
/* The entity “branch” is added with attributes: [NAME, LIBRARY, address]. */
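The two cases in this subsection can be summarized in a short sketch. It assumes that a “variation” of an entity name means the name itself or a simple plural ending in “s”; both function names are hypothetical.

```python
def entity_named_by(attribute, entities):
    """Return the known entity that an attribute name is a variation of, if any."""
    candidates = [attribute]
    if attribute.endswith("s"):
        candidates.append(attribute[:-1])  # naive plural: "books" -> "book"
    for candidate in candidates:
        if candidate in entities:
            return candidate
    return None

def place_entity_reference(attribute, entities, needed_to_identify):
    """Decide what to do with an attribute that names another entity."""
    entity = entity_named_by(attribute, entities)
    if entity is None:
        return ("attribute", attribute)       # an ordinary attribute
    if needed_to_identify:
        return ("key attribute", entity)      # e.g. branch: [NAME, LIBRARY, address]
    return ("pending relationship", entity)   # e.g. "library" and "book"

known = {"borrower", "book", "library"}
```

With these definitions, “books” entered for “library” (answer “no.”) is recorded as a pending relationship, whereas “library” entered for “branch” (answer “yes.”) becomes part of the key.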
A.2.6 Generated Identifiers
Enter next entity / “see.” / “end.”
I: librarian.
Enter the attributes for “librarian.” / “end.”
I: end.
Then should the entity “librarian” exist?
(“yes.” / “no.”)
I: yes.
Then I shall make an attribute “librarian-id.”
Enter next entity (in its SINGULAR form) / “see.” / “end.”
I: employee . . .
A.2.7 Entity Modification
Here are the entities with their attributes and keys.
borrower: [NUMBER, name, address, phone]
borrower: [NAME, ADDRESS, number, phone]
book: [CATALOG-NO, author-id, title, publisher, publisher-city]
book: [TITLE, AUTHOR-ID, catalog-no, publisher, publisher-city]
Do you wish to add, delete or change (rename, modify attributes / keys) any of the
entities?
(“no.” / “add.” / “delete.” / “change.”)
I: add.
Enter the name of the entity to be added:
I: person.
Enter the attributes for “person” / “end.”
Any more modifications?
(“see.” / “no.” / “add.” / “delete.” / “change.”)
I: see.
borrower: [NUMBER, name, address, phone]
borrower: [NAME, ADDRESS, number, phone]
book: [CATALOG-NO, title, author-id, publisher, publisher-city]
book: [TITLE, AUTHOR-ID, catalog-no, publisher, publisher-city]
volume: [COPY-NO, title, author-name]
library: [NAME, address]
branch: [NAME, LIBRARY, address]
librarian: [LIBRARIAN-ID]
employee: [NUMBER, name, address, years-of-service]
employee: [NAME, ADDRESS, number, years-of-service]
person: [NAME, ADDRESS, phone, date-of-birth]
Do you wish to add, delete or change (rename, modify attributes / keys) any of the
entities?
(“yes.” / “no.” / “delete.” / “change.”)
I: no.
Fine.
A.2.8 Missing Entities
For the entity “book” I have an attribute “author-id”.
It sounds like “author” should also be an entity in your database.
Should I make it an entity?
(“yes.” / “no.” / “why?”)
I: why?
An attribute that ends in “id” is often used to identify something. If that is the case
here, then that something has not been explicitly defined. Should I make “author”
an entity?
(“yes.” / “no.”)
I: y.
Enter the attributes for “author”. / “end.”
I: id.
Any more attributes for “author”?
(“see.” / “yes.” / “no.” / new attributes)
I: n.
Would you like to include the attribute “author-name”?
(“yes.” / “no.” / “why?”)
I: w.
I know “author-name” is an attribute of at least one other entity, e.g. “volume”. Since
it starts with “author”, I thought that it could be an attribute of “author”.
Do you wish to add “author-name” as an attribute of “author”?
(“yes.” / “no.”)
I: yes. . . .
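The “ends in id” heuristic explained in the dialogue above might be sketched as follows. The function names and the dictionary layout are assumptions made for illustration; the sketch flags candidates only, because the user must still confirm each one.

```python
def implied_entity(attribute):
    """Return the entity name implied by an attribute ending in "id", else None."""
    normalized = attribute.lower().replace("_", "-")
    if normalized.endswith("-id"):
        stem = normalized[:-3]
    elif normalized.endswith("id"):
        stem = normalized[:-2]
    else:
        return None
    return stem or None  # a bare "id" implies nothing

def missing_entities(entities):
    """entities maps an entity name to its list of attribute names."""
    found = []
    for owner, attributes in entities.items():
        for attribute in attributes:
            candidate = implied_entity(attribute)
            if candidate and candidate not in entities and (owner, candidate) not in found:
                found.append((owner, candidate))
    return found

entities = {
    "book": ["catalog-no", "title", "author-id", "publisher", "publisher-city"],
    "borrower": ["number", "name", "address", "phone"],
}
candidates = missing_entities(entities)
```

A purely textual test like this one also matches words that merely happen to end in “id”, which is one reason the system asks rather than acts on its own.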
A.3 Relationships
You have a “library-circulation” with the things you told me about. I need to know
what happens with all of these things and how they interact with each other.
flights have aircrafts
passenger is-a person
passengers reserve-for flights
Please enter your information with three words (may use underscores) on each line
followed by a dot and return. From now on I will refer to the information that you
are going to enter as “relationships”.
Note that you should use both SINGULAR and PLURAL forms of the entity names.
This is important, so think carefully about how many instances of one entity can
occur in relationship to another. If you make a mistake, you can change it later.
First, would you like to see your entities again?
(“yes.” / “no.”)
I: n.
Fine. Enter a relationship of the form “A verb B”.
. . .
A.3.1 Unidentified A and B Values
/* In the first of the following two examples the verb phrase is “has” and “B” is the name of an
attribute. In the second example, “A” is unknown to the system. */
Enter next relationship / “see.” / “end.” / “see-ent” (to see entities)
I: book has title.
Should “title” be an entity?
(i.e. a “thing” of interest in your database.)
(“yes.” / “no.”)
I: n.
Fine, then I do not need the relationship because I know “title” is an attribute of
“book”.
Enter next relationship / “see.” / “end.” / “see-ent” (to see entities)
Is “students” a new entity? You haven’t mentioned it before.
(“yes.” / “no.”)
I: y.
Please give me the singular form for “students” / “end.”
I: student.
Enter the attributes for “student” / “end.” . . .
A.3.2 Mapping Ratios
/* 1. System infers all mapping ratios for relationships of the form “A is-a B”. Mapping ratios are
A (1,1) and B (0,1). */
Enter next relationship / “see.” / “end.” / “see-ent.” (to see entities)
I: borrower is-a person.
Enter next relationship
. . .
/* 2. System makes an inference about the mapping ratios when the relationship is of the form “A
(singular) verb phrase B (singular)”. */
Enter next relationship / “see.” / “end.”
I: librarian directs library.
“librarian directs library”
Is there at least one “librarian” for every “library”?
(“yes.” / “no.”)
I: y.
Is there at most one “librarian” for every “library”?
(“yes.” / “no.”)
I: y.
/* Since both “librarian” and “library” are in their singular forms, this implies that a librarian directs
one and only one library. Therefore, the system inferred that the min/max values for librarian are
(1,1). It then queried the user for the min/max values of library. Based on this information, the
relationship will be represented by adding the key of “librarian” as a foreign key of “library”. */
/* 3. System queries user for all the mapping ratios. */
Enter next relationship / “see.” / “end.” / “see-ent.” (to see entities)
“libraries have branches”
Is there at least one “branch” for every “library”?
(“yes.” / “no.”)
I: y.
Is there at most one “branch” for every “library”?
(“yes.” / “no.”)
I: n.
Is there at least one “library” for every “branch”?
(“yes.” / “no.”)
I: y.
Is there at most one “library” for every “branch”?
(“yes.” / “no.”)
I: y.
/* Here, the min/max values are A (1,N) and B (1,1), i.e. each library can have one to many
branches; each branch has one and only one library. This relationship will be represented by adding
the key of library as a foreign key of branch. */
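The three cases above can be condensed into a small sketch. It assumes that the ratio for A counts the instances of B per instance of A (as the comments in this subsection imply), and that any combination of singular and plural forms not shown in the session is left for the user to answer. The function names are illustrative, not those of VCS.

```python
def ratio_from_answers(at_least_one, at_most_one):
    """Turn the two yes/no answers into a (min, max) mapping ratio."""
    return (1 if at_least_one else 0, 1 if at_most_one else "N")

def infer_ratios(verb, a_is_singular, b_is_singular):
    """Return (ratio_a, ratio_b); None marks a ratio the user must supply."""
    if verb in ("is-a", "is-an"):
        return (1, 1), (0, 1)   # case 1: both ratios are inferred
    if a_is_singular and b_is_singular:
        return (1, 1), None     # case 2: only B's ratio is asked for
    return None, None           # case 3: the user is asked for both

# "libraries have branches": at least one branch per library, but not at most one.
library_ratio = ratio_from_answers(at_least_one=True, at_most_one=False)
# ... and exactly one library per branch.
branch_ratio = ratio_from_answers(at_least_one=True, at_most_one=True)
```

Here `library_ratio` and `branch_ratio` correspond to the values A (1,N) and B (1,1) reported in the comment above.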
A.3.3 Relationship Attributes
I: authors write books.
Is there at least one “book” for every “author”?
(“yes.” / “no.”)
I: y.
Is there at most one “book” for every “author”?
(“yes.” / “no.”)
I: n.
Is there at least one “author” for every “book”?
(“yes.” / “no.”)
I: y.
Is there at most one “author” for every “book”?
(“yes.” / “no.”)
I: n.
/* Here, the min/max values are: A (1,N) and B (1,N). */
An attribute of a relationship is a property or characteristic of the relationship which
is of interest to you.
E.g. Relationship: “passengers reserve-for flights”
Acceptable response: “class”.
Note that “class” is not an attribute of “passenger” or “flight” but is specifically an
attribute of the relationship: “passengers reserve-for flights”.
When I add certain relationships I will ask for corresponding attributes. Relationship
attributes may not always exist.
Enter attributes for “authors write books” / “end.”
I: title date-completed.
/* The system checks that a relationship attribute has not previously been identified as an attribute of one of the entities. If this situation occurs, the system informs the user of the inconsistency
and asks whether the attribute should belong to the entity or the relationship. */
I already know that “title” is an attribute of “book”.
Think carefully. Is “title” an attribute (i.e. a property or a characteristic) of:
a) the entity “book”
b) the relationship “authors write books”?
(“a.” / “b.” / “no.” if you do not know.)
I: a.
Fine.
Relationship attribute “date-completed” added.
/* System checks for multivalued relationship attributes. */
Can a single occurrence of “authors write books” have more than one value for the
following attribute(s)?
[date-completed]
I: n.
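The consistency check on relationship attributes in this subsection can be sketched as below. Only the outcome for answer “a” is shown in the session, so the sketch handles that answer alone; the data layout and function names are assumptions.

```python
def conflicting_attributes(relationship_attributes, participants, entities):
    """Return (attribute, entity) pairs that the user must be asked about."""
    conflicts = []
    for attribute in relationship_attributes:
        for entity in participants:
            if attribute in entities.get(entity, []):
                conflicts.append((attribute, entity))
    return conflicts

def resolve(relationship_attributes, conflicts, answers):
    """Drop attributes that the user assigns to the entity (answer "a")."""
    keep_on_entity = {attr for attr, _ in conflicts if answers.get(attr) == "a"}
    return [attr for attr in relationship_attributes if attr not in keep_on_entity]

entities = {
    "author": ["id", "author-name"],
    "book": ["catalog-no", "title", "publisher", "publisher-city"],
}
proposed = ["title", "date-completed"]
conflicts = conflicting_attributes(proposed, ["author", "book"], entities)
accepted = resolve(proposed, conflicts, {"title": "a"})
```

In this run “title” stays with the entity “book” and only “date-completed” is added to the relationship, matching the dialogue.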
A.3.4 Prompts for Missing Relationships
/* The system first checks for “solo entities”, i.e. entities that do not appear in any relationship. */
I have the entity “employee” but it does not occur in any relationship. Would you like
to add a relationship(s) which includes the entity “employee”?
(“yes.” / “no.” / new relationship of the form “A Verb B” / “end.”)
I: librarian is-an employee.
/* The user is prompted for a relationship between “book” and “volume” because “volume” was
originally a multivalued attribute of “book” that was converted to an entity. */
Is there a relationship between “book” and “volume”?
(“yes.” / “no.” / “why?” / new relationship of the form “A Verb B” / “end.”)
I: why?
When you gave me the entity “book” you said that “volume” was one of its attributes.
That tells me that “book” and “volume” should be related. Is there a relationship
between “book” and “volume”? . . .
I: books have volumes. . . .
/* The user originally tried to add “book” as an attribute of “library” which implies that there is a
relationship between “library” and “book”. */
Is there a relationship between “library” and “book”?
(“yes.” / “no.” / “why?” / new relationship of the form “A Verb B” / “end.”)
I: why?
Earlier you wanted to add a variation of “book” as an attribute of “library”. Since
“book” was also an entity, you were really implying the existence of a relationship
between “library” and “book”. Now, is there a relationship between “library” and
“book”? . . .
I: libraries have books. . . .
/* The user has provided a relationship between “book” and “author” so the system does not have
to prompt for it. “Author-id” was originally an attribute of “book”. Since “author” became an entity,
the original association between “book” and “author” through this attribute suggests that there is a
relationship between the two. The system informs the user that “author-id” is no longer needed as
an attribute of “book” and deletes it. Candidate keys for “book” are reconsidered. */
Since I know that “book” and “author” are somehow related, I no longer need “author-id”
as an attribute of “book” so I shall delete it.
Earlier you said that “author-id” was (part of) an identifier for “book” so we will now
have to go back and reconsider the possible identifiers for “book”. . .
/* The entity “book” ends up with only one candidate key: [CATALOG-NO]. */
/* “Author-name” is an attribute of the entity “volume” but there is no relationship between “author”
and “volume”. The system prompts the user for one but, since relationships exist between “book”
and “author” and “book” and “volume”, a new relationship is not necessary. */
Do you wish to provide a relationship between “volume” and “author”?
(“yes.” / “no.” / “why?”)
I: why?
You have an attribute of the entity “volume” which contains some form of the term
“author”. Since “author” is also an entity, this suggests there could be at least one
relationship between “volume” and “author”. Do you wish to provide a relationship
between “volume” and “author”? . . .
I: no.
Then should “author-name” be an attribute of “volume”?
(“yes.” / “no.” / “why?”)
I: why?
Having “author-name” as an attribute of “volume” implies that a relationship could
exist between “volume” and “author”. Since you did not provide a relationship I
was wondering if “author-name” should really be an attribute of “volume”. Now,
should “author-name” be an attribute of “volume”?
(“yes.” / “no.”)
I: no.
Then I shall delete “author-name” as an attribute of “volume”.
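Two of the checks in A.3.4 lend themselves to a brief sketch: finding “solo entities”, and finding entity pairs that the system “remembered” earlier but that no relationship yet connects. The triple format for relationships and the `singular_of` lookup table are assumptions introduced for the example.

```python
def solo_entities(entities, relationships, singular_of):
    """Return the entities that take part in no relationship.

    relationships is a list of (A, verb, B) triples; singular_of maps a
    plural form used in a relationship back to the entity name.
    """
    used = set()
    for a, _verb, b in relationships:
        used.add(singular_of.get(a, a))
        used.add(singular_of.get(b, b))
    return [entity for entity in entities if entity not in used]

def unrelated_pairs(pending_pairs, relationships, singular_of):
    """Return remembered entity pairs that no relationship connects directly."""
    related = set()
    for a, _verb, b in relationships:
        related.add(frozenset((singular_of.get(a, a), singular_of.get(b, b))))
    return [pair for pair in pending_pairs if frozenset(pair) not in related]

singular_of = {"libraries": "library", "branches": "branch",
               "authors": "author", "books": "book", "volumes": "volume"}
relationships = [("librarian", "directs", "library"),
                 ("libraries", "have", "branches"),
                 ("authors", "write", "books")]
entities = ["librarian", "library", "branch", "author", "book", "employee"]
solo = solo_entities(entities, relationships, singular_of)
pending = unrelated_pairs([("book", "volume"), ("library", "book")],
                          relationships, singular_of)
```

Each pair or entity returned becomes a prompt to the user; the sketch does not model the indirect-path reasoning mentioned in the last comment above.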
A.3.5 Relationship Modification
RELATIONSHIPS
borrower is-a person
employees are people
students borrow books [date-due]
students are people
librarian directs library
libraries have branches
authors write books [date-completed]
librarian is-an employee
books have volumes
borrowers borrow books [date-due]
libraries have books [date-acquired]
End relationships.
Do you wish to add or delete any of the relationships?
I: no.
. . .
A.3.6 Have/Has Relationships
/* “Have/has” verb phrases are modified to reflect the appropriate interpretation of “have/has”, i.e.
possession, component-of, instance-of or associated-with. */
“books have volumes.”
Which of the following is true?
a) “book” owns / possesses “volume”
b) “volume” is a component of “book”
c) “volume” is an instance / example of “book”
d) “volume” associated with “book” (in some other way)
(“a.” / “b.” / “c.” / “no.” if you do not know.)
I: c.
/* The relationship “books have volumes” is changed to “volume instance-of book”. */
/* The system checks that entities involved in “instance-of” relationships are represented at the correct level of detail. */
You have previously indicated that “volume instance-of book” and “borrowers borrow
books”.
Which relationship(s) should I have:
a) borrowers borrow volume
b) borrowers borrow books
c) both
“a.” / “b.” / “c.” / “no.” if you do not know.
I: a.
Fine, then I shall delete “borrowers borrow books” and add “borrowers borrow volume.” . . .
/* A similar procedure is carried out for the relationship “students borrow books”. The relationship
becomes “students borrow volume”. */
/* “Libraries have books” becomes “book component-of library”. */
A.4 Modifications: First
/* The user is given an opportunity to modify the current set of entities, attributes and relationships. */
A.5 Synonyms
/* Relationships: 1) students borrow volume and 2) borrowers borrow volume. */
Which of the following is true about a “student” and a “borrower”?
a) They refer to exactly the same thing.
b) “student” is a subset of “borrower”
(i.e. every “student” is a “borrower” but not every “borrower” is a “student”)
c) “borrower” is a subset of “student”
(i.e. every “borrower” is a “student” but not every “student” is a “borrower”)
d) None of the above or you do not know.
“a.” / “b.” / “c.” / “d.”
/* If the response was “a”, the user would have been asked which entity should be used; if the response
was “b”, the relationship “student is-a borrower” would have been added; if the response was “c”,
the relationship “borrower is-a student” would have been added. */
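The synonym step can be sketched in two parts: spotting candidate pairs and acting on the user's answer. The trigger used here (two entities appearing as “A” with the same verb and the same “B”) is an assumption read off the example above, not a rule stated in the session; the function names are hypothetical.

```python
def synonym_candidates(relationships):
    """Pair up entities that appear as A with the same verb and the same B."""
    by_verb_and_b = {}
    for a, verb, b in relationships:
        by_verb_and_b.setdefault((verb, b), []).append(a)
    pairs = []
    for subjects in by_verb_and_b.values():
        for i, first in enumerate(subjects):
            for second in subjects[i + 1:]:
                pairs.append((first, second))
    return pairs

def apply_answer(first, second, answer):
    """Map the user's choice (a/b/c/d) to the action described in the comment."""
    if answer == "a":
        return ("ask which entity to keep", first, second)
    if answer == "b":
        return ("add relationship", f"{first} is-a {second}")
    if answer == "c":
        return ("add relationship", f"{second} is-a {first}")
    return ("no change",)

pairs = synonym_candidates([("student", "borrow", "volume"),
                            ("borrower", "borrow", "volume"),
                            ("author", "write", "book")])
```

For the pair found here, answer “b” yields “student is-a borrower” and answer “c” yields “borrower is-a student”, as in the comment above.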
A.6 Inherited Attributes
Since “librarian is-an employee”, we should be able to attribute to “librarian” all of
the attributes of “employee” (whether or not you are interested in them). Would
it be wrong to attribute any of the following attribute(s) to “librarian”?
1) number 2) name 3) address 4) years-of-service
(“yes.” / “no.”)
I: no.
Fine. . . .
A.7 Modifications: Last
/* The user is given a final opportunity to make modifications to the entities, attributes and relationships. */
A.8 Primary Keys
/* Before the selection of primary keys is made, some attribute names are prefixed by their entity
names in order to make the attribute names unique. */
I have adjusted some attribute names to make them unique. Here is your new set of
entities.
borrower: [BORROWER-NUMBER, borrower-name, borrower-address, borrower-phone]
borrower: [BORROWER-NAME, BORROWER-ADDRESS, borrower-number, borrower-phone]
A.8.1 Component-of Relationships
/* The relationship “libraries have books” became “book component-of library”. */
Here is what I have for the entity “book”
book: [CATALOG-NO, publisher, publisher-city, book-title]
I was wondering if:
[CATALOG-NO]
is sufficient to identify UNIQUELY an instance of “book” or do I need to add:
[LIBRARY-NAME]
If you respond with “yes.” I will add the attribute(s).
(“yes.” / “no.” / “why?”)
I: w.
Since “book component-of library” I thought that I might need to know “library”
before I could UNIQUELY identify “book”.
Now, consider the following for “book”
book: [CATALOG-NO, publisher, publisher-city, book-title]
Does:
[CATALOG-NO]
UNIQUELY identify “book” or do I need to add:
[LIBRARY-NAME]
If you respond with “yes.” I will add the attribute(s).
(“yes.” / “no.”)
I: no.
Fine. . . .
/* The system has selected primary keys without interaction with the user. Note: 1) The key of
“volume” has been augmented by the key of “book” as a result of the “instance-of” relationship
that exists between the two entities. The attribute “title” has been deleted from “volume” because
“volume” can inherit this attribute from “book”. 2) The primary key of “branch” now contains the
key of “library” because the attribute “library” has been replaced by its primary key. 3) The attribute
“phone” of “borrower” has been deleted because it can be inherited from “person” through the “is-a”
relationship “borrower is-a person”. 4) The key of “person” appears as a foreign key of “borrower”
due to the way the system considers “is-a” relationships when selecting primary keys. */
borrower: [BORROWER-NUMBER, person-address, person-name]
book: [CATALOG-NO, publisher, publisher-city, book-title]
volume: [CATALOG-NO, COPY-NO]
library: [LIBRARY-NAME, library-address]
branch: [BRANCH-NAME, LIBRARY-NAME, branch-address]
author: [AUTHOR-ID, author-name]
A.9 Relations
/* The system constructs relations. Each entity is represented by constructing a relation where the
key attributes of the entity serve as the relation’s key and the non-key attributes of the entity as the
relation’s non-key attributes. A relationship is represented by either: 1) adding the key attributes
of one entity as non-key attributes of another or 2) constructing a new relation whose key is the
concatenation of the keys of the entities involved. */
authors-write-books: [AUTHOR-ID, CATALOG-NO, date-completed]
students-borrow-volume: [STUDENT-NUMBER, CATALOG-NO, COPY-NO, date-due]
borrower: [BORROWER-NUMBER, person-name, person-address]
book: [CATALOG-NO, publisher, publisher-city, book-title]
volume: [CATALOG-NO, COPY-NO]
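The two ways of representing a relationship described in the comment above can be sketched as follows. The rule used to choose between them (a foreign key whenever one side has a maximum of one, otherwise a new relation) is an assumption that agrees with the three relationships worked through in A.3.2 and A.3.3; the session does not state it in general form.

```python
def represent_relationship(name, key_a, key_b, ratio_a, ratio_b, attributes=()):
    """Describe how a relationship "A verb B" would be stored.

    ratio_a is the (min, max) number of B per A; ratio_b the number of A per B.
    """
    if ratio_b[1] == 1:
        # Each B has at most one A: carry A's key inside B.
        return ("foreign key in B", list(key_a))
    if ratio_a[1] == 1:
        # Each A has at most one B: carry B's key inside A.
        return ("foreign key in A", list(key_b))
    # Many-to-many: a new relation keyed by both entities' keys.
    return ("new relation", {"name": name,
                             "key": list(key_a) + list(key_b),
                             "non_key": list(attributes)})

branches = represent_relationship("libraries-have-branches",
                                  ["LIBRARY-NAME"], ["BRANCH-NAME", "LIBRARY-NAME"],
                                  (1, "N"), (1, 1))
writes = represent_relationship("authors-write-books",
                                ["AUTHOR-ID"], ["CATALOG-NO"],
                                (1, "N"), (1, "N"), ["date-completed"])
```

The second call reproduces the relation authors-write-books: [AUTHOR-ID, CATALOG-NO, date-completed] listed above, while the first places LIBRARY-NAME in “branch”.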
A.10 Functional Dependencies
A.10.1 Partial Functional Dependencies
branch: [BRANCH-NAME, LIBRARY-NAME, branch-address]
Which of the following do I need before I know the “branch-address”?
a) branch-name
b) library-name
c) branch-name & library-name
“a.” / “b.” / “c.” / “no.” if you do not know.
I: c. . . .
A.10.2 Transitive Functional Dependencies
For each of the following, identify any non-key attributes (written in lower case) which
depend directly upon other non-key attributes rather than the key attributes.
Respond with: “non-key(s) depends-on non-key(s)”.
Separate the non-key names by only a blank.
E.g. flight: [FLIGHT-NO, city-from, city-to, departure-airport]
Acceptable response: “departure-airport depends-on city-from city-to.”
Note that this means that knowing the values for “city-from” and “city-to” implies
that one will know what the departure-airport is.
Consider the following:
Book: [CATALOG-NO, publisher, publisher-city, book-title]
From this list enter any non-key attribute(s) which depend on other non-key
attribute(s) / “end.”
I: end.
What about “publisher” and “publisher-city”?
Enter any non-key attribute(s) which depend-on other non-key attribute(s) / “end.”
Respond with “non-key(s) depends-on non-key(s).”
I: publisher-city depends-on publisher.
/* Relations produced are:
Book: [CATALOG-NO, publisher, book-title]
Publisher: [PUBLISHER, publisher-city].
*/
A.11 End Session
That’s all I have to ask you about.
Here is what I have:
student: [STUDENT-NUMBER, department, person-name, person-address]
author: [AUTHOR-ID, author-name]
person: [PERSON-NAME, PERSON-ADDRESS, person-phone, date-of-birth]
employee: [EMPLOYEE-NUMBER, years-of-service, person-name, person-address]
librarian: [LIBRARIAN-EMPLOYEE-NUMBER]
branch: [BRANCH-NAME, LIBRARY-NAME, branch-address]
library: [LIBRARY-NAME, library-address, librarian-employee-number]
volume: [CATALOG-NO, COPY-NO]
book: [CATALOG-NO, publisher, book-title]
borrower: [BORROWER-NUMBER, person-name, person-address]
authors-write-books: [AUTHOR-ID, CATALOG-NO, date-completed]
students-borrow-volume: [STUDENT-NUMBER, CATALOG-NO, COPY-NO, date-due]
borrowers-borrow-volume: [BORROWER-NUMBER, CATALOG-NO, COPY-NO, date-due]
publisher: [PUBLISHER, publisher-city]
End Relations
/* “Librarian” no longer needs its generated key attribute since it has adopted the primary key of its
superset entity, “employee”. */
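The split of “book” into the relations “book” and “publisher” in A.10.2 can be reproduced with a short sketch. It parses a response of the form “non-key(s) depends-on non-key(s).” and moves the dependent attributes into a new relation keyed by the determinant. Naming the new relation after the first determinant attribute is an assumption made for the example, as are the function names.

```python
def parse_dependency(response):
    """Split "x depends-on y z." into (["x"], ["y", "z"])."""
    left, right = response.rstrip(".").split(" depends-on ")
    return left.split(), right.split()

def split_transitive(relation, key, non_key, dependent, determinant):
    """Move `dependent` into a new relation keyed by `determinant`."""
    remaining = [a for a in non_key if a not in dependent]
    original = {"name": relation, "key": list(key), "non_key": remaining}
    extracted = {"name": determinant[0], "key": list(determinant),
                 "non_key": list(dependent)}
    return original, extracted

dependent, determinant = parse_dependency("publisher-city depends-on publisher.")
book, publisher = split_transitive(
    "book", ["catalog-no"], ["publisher", "publisher-city", "book-title"],
    dependent, determinant)
```

The determinant stays in the original relation as a non-key attribute, so the two relations can still be joined on it.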
ACKNOWLEDGMENTS
We would like to thank our anonymous referees for their helpful comments on
an earlier version of this paper.
BIBLIOGRAPHY
1. BALDISSERA, C., CERI, S., PALEGATTI, G., AND BRACCHI, G. Interactive and formal specification
of user’s views in data base design. In Proceedings of the International Conference on Very Large
Data Bases (Rio de Janeiro, 1979), pp. 262-272.
2. BRACCHI, G. Methodologies and tools for logical database design. In Database Management:
Theory and Applications, C. W. Holsapple and A. B. Whinston, Eds. Reidel, Hingham, Mass.,
1981, pp. 59-86.
3. BRACHMAN, R. J. What is-a is and isn’t: An analysis of taxonomic links in semantic networks.
Computer (Oct. 1983), 30-36.
4. CHEN, P. P.-S. The Entity-Relationship model: Toward a unified view of data. ACM Trans.
Database Syst. 1, 1 (Mar. 1976), 9-36.
5. CLARK, K., AND MCCABE, F. PROLOG: A language for implementing expert systems. Tech.
Rep. Doc 80/21, Imperial College, Univ. of London, 1980.
6. COELHO, H. The art of knowledge engineering with PROLOG. INFOLOG PROJ, Fac. Ciencias,
Univ. Lisboa, Portugal, 1983.
7. DATE, C. J. An Introduction to Database Systems. Vol. 1, 4th ed. Addison-Wesley, Reading,
Mass., 1986.
8. HAMMOND, P. Logic programming for expert systems. Tech. Rep. Doc 82/4, Dept. of Computing,
Imperial College of Science and Technology, Univ. of London, 1982.
9. HOWE, D. R. Data Analysis for Data Base Design. Arnold, London, 1983.
10. MARTIN, J. An End User’s Guide to Data Bases. Prentice-Hall, Englewood Cliffs, N.J., 1981.
11. NAVATHE, S. B., AND ELMASRI, R. Integrating user views in database design. Computer (Jan.
1986), 50-62.
12. NAVATHE, S. B., AND GADGIL, S. G. A methodology for view integration in logical database
design. In Proceedings of the 8th International Conference on Very Large Data Bases (Mexico
City). 1982, pp. 142-164.
13. NAVATHE, S. B., AND SCHKOLNICK, M. View representation in logical database design. In
Proceedings of the ACM-SIGMOD International Conference (Austin, Tex., May 31-June 2, 1978).
ACM, New York, 1978, pp. 144-156.
14. PARSAYE, K. Database management, knowledge base management and expert systems development in Prolog. ACM-SIGMOD Database Week for Business and Office Applications (San Jose,
Calif., May). ACM, New York, 1983, pp. 159-178.
15. RAVER, N., AND HUBBARD, G. U. Automated logical data base design: Concepts and applications.
IBM Syst. J. 16, 3 (1977).
16. SMITH, J. M., AND SMITH, D. C. P. Database abstractions: Aggregation and generalization.
ACM Trans. Database Syst. 2, 2 (June 1977), 105-133.
17. STOREY, V. C. View Creation: An Expert System for Database Design. Ph.D. dissertation, Faculty
of Commerce and Business Administration, Univ. of British Columbia, Vancouver, B.C., Canada,
Oct. 1986, ICIT Press, 1988.
18. TSICHRITZIS, D., AND LOCHOVSKY, F. Data Models. Prentice-Hall, Englewood Cliffs, N.J., 1982.
Received December 1986; revised October 1987; accepted November 1987