Given a database, we define a datum/proposition to be a structure consisting of an
n-ary relation from the schema and n objects from the domain. If r is an n-ary relation
and a1, ..., an are entities in a domain, we write r(a1, ..., an) to denote the proposition that r
“holds” of the objects a1, ..., an.
The Herbrand base for a database is the set of all propositions that can be formed
from the relations in the schema and the entities in the domain. For example, for domain
D = {a, b} and schema {p, q}, where p has arity 1 and q has arity 2, the Herbrand base is
{p(a), p(b), q(a,a), q(a,b), q(b,a), q(b,b)}.
An instance of a database is a finite subset of its Herbrand base.
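The definition of the Herbrand base can be checked mechanically. The following is a minimal Python sketch, not part of the original notes; the function name herbrand_base and the encoding of a proposition as a (relation, arguments) pair are assumptions made for illustration.

```python
from itertools import product

def herbrand_base(domain, schema):
    """Return every proposition formable from the schema and the domain.

    `schema` maps each relation name to its arity (an assumed encoding).
    A proposition is encoded as a pair (relation name, tuple of objects).
    """
    return {
        (relation, args)
        for relation, arity in schema.items()
        for args in product(sorted(domain), repeat=arity)
    }

domain = {"a", "b"}
schema = {"p": 1, "q": 2}
base = herbrand_base(domain, schema)

# Render each proposition in the r(a1,...,an) notation used in the text.
rendered = sorted(f"{r}({','.join(args)})" for r, args in base)
print(rendered)
# ['p(a)', 'p(b)', 'q(a,a)', 'q(a,b)', 'q(b,a)', 'q(b,b)']

# An instance is any finite subset of the Herbrand base.
instance = {("p", ("a",)), ("q", ("a", "b"))}
print(instance <= base)  # True
```

The output matches the six propositions listed in the example above: 2 from the unary relation p and 4 from the binary relation q.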
In the standard notation for relational logic, variables begin with upper case
letters, and constants begin with lower case letters or digits; so all of the arguments here
are variables.
A rule is an expression of the following form.
p(t1,…,tn) :- p1(t11,…,t1n),…, pk(tk1,…,tkn)
Here, p and p1, …, pk are names for relations, and the various tij are variables or
constants. p(t1,…,tn) is called the head of the rule, and the expressions p1(t11,…,t1n),…,
pk(tk1,…,tkn’) are called subgoals.
§3.3 Proposition Databases
It is possible to develop the full theory of relational logic based on just objects
and relations. In fact, this is typically the way it is done in courses on databases.
However, the theory is somewhat simpler if we consider instead a slight variation on the
relational model.
The essence of this variation is the notion of a proposition. Intuitively, a
proposition is an n-ary relation together with n objects. It is a tuple together with a
relation.
Clearly, this version and the relational model are isomorphic. For every …
The reason this version is desirable is that it allows us to think of a database as a
set of propositions rather than a collection of relations, each of which is a set of tuples.
Although they are equivalent, “flattening” the notion of a database makes it easier to
think about and define and analyze many of the concepts of deductive databases. For this
reason, we use the propositional model in what follows. It should, however, be
understood that everything applies equally well to the relational model.
As an example, consider the teaches relation. We can think of the extension
illustrated above as a set of data, as shown below.
{teaches(donna_daring,arch101),
teaches(bill_boring,comp101),
teaches(cathy_careful,comp225),
teaches(gary_grump,comp235),
teaches(oren_overbearing,comp257),
teaches(helen_heavenly,comp310)}
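The correspondence between the relational view and the propositional view can be sketched in a few lines of Python. This is an illustration under assumed encodings (a dictionary from relation names to sets of tuples on one side, a flat set of (relation, tuple) pairs on the other); the names flatten and unflatten are hypothetical.

```python
def flatten(relations):
    """Turn {relation: set of tuples} into one flat set of propositions."""
    return {(name, row) for name, rows in relations.items() for row in rows}

def unflatten(propositions):
    """Group a flat set of propositions back into {relation: set of tuples}."""
    relations = {}
    for name, row in propositions:
        relations.setdefault(name, set()).add(row)
    return relations

relational = {
    "teaches": {
        ("donna_daring", "arch101"),
        ("bill_boring", "comp101"),
        ("cathy_careful", "comp225"),
        ("gary_grump", "comp235"),
        ("oren_overbearing", "comp257"),
        ("helen_heavenly", "comp310"),
    }
}

propositional = flatten(relational)
print(len(propositional))                      # 6
print(unflatten(propositional) == relational)  # True
```

One caveat of this particular encoding: a relation with an empty extension contributes no propositions, so the round trip recovers it only if the schema is kept alongside the flat set.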
A datum for a database with domain D and schema {r1,...,rn} is a (k+1)-tuple consisting
of a k-ary relation constant ri and k objects from D.
The Herbrand base for a database is the set of all data that can be formed from the
domain and the schema.
A database instance is a finite subset of the Herbrand base of the database.
§2.1 Objects
Objects include numbers, strings, and complex structures of various sorts.
The basis for a relational model is a conceptualization of an application area in
terms of the objects presumed or hypothesized to exist in that application area. The
notion of an object here is quite broad. Objects can be concrete (e.g. people, computers,
buildings) or abstract (e.g. courses, numbers, sets). Objects can be primitive (e.g.
individuals) or composite (e.g. organizations). Objects can even be fictional. In short, an
object can be anything about which we want to say something.
§2.2 Domains
Not all applications require that we consider all objects in the application area. In
some cases, only some of these objects are relevant. For example, in a directory for a
university, we would expect to see references to people, offices, buildings, phone
numbers, email addresses; but we would not expect to see information about the
university’s courses, its computers, its vehicles, and its financial transactions. The set of
objects of interest in a particular application is called the domain or, sometimes, the
universe of discourse.
§2.3 Relations
A relation is a property of individual objects or combinations of objects in an
application area. In the university context, consider the relations student, teaches, and
gradesheet. The student relation is a property of a person that holds if and only if that
person is a student. The teaches relation holds of a faculty member and a course if and
only if the faculty member teaches the course. The gradesheet relation holds
of a course, a student, and a grade if and only if the student received the grade in the
course.
The arity of a relation is the number of objects involved in any instance of that
relation. For example, every instance of the teaches relation involves two objects, viz. a
faculty member and a course; therefore, it has arity 2. Arity is an inherent property of a
relation and is the same in every state of the world.
The extension of a relation in a particular state of the world is the set of all objects
or combinations of objects that satisfy that relation in the given state of the world.
The cardinality of an extension of a relation is the number of objects or
combinations of objects that satisfy the relation in a particular state of the world. Unlike
arity, the cardinality of a relation can change as the state of the world changes.
In the theory of databases, it is common to conceptualize the extension of a
relation as a table. The number of columns in the table corresponds to the arity of the
relation and the number of rows corresponds to the cardinality of the extension.
For example, the extension of the relation teaches is a set of pairs of faculty
members and courses, one pair for each faculty member and each course that that faculty
member teaches. If we consider a state of the world in which there are 6 pairs of faculty
members and courses that they teach, we can visualize this extension as a table with 2
columns and 6 rows, as shown below. Here, the extension has arity 2 and cardinality 6.
teaches
donna_daring      arch101
bill_boring       comp101
cathy_careful     comp225
gary_grump        comp235
oren_overbearing  comp257
helen_heavenly    comp310
The student relation is a property of a person in and of itself, not with respect to
other people or other objects. Since there is just one object involved in any instance of
the relation, the table has just 1 column. By contrast, there are 12 rows, one row for each
student. In this case, the extension has arity 1 and cardinality 12.
student
aaron_aardvark
belinda_bat
calvin_carp
george_giraffe
kitty_kat
minnie_mouse
patty_panda
rory_rhino
sally_squirrel
tony_tuna
wanda_wolf
zack_zebra
The gradesheet relation is a relation among students and courses and grades; and
so the table requires three columns, as shown below. In this case, we have an extension
with arity 3 and cardinality 4.
gradesheet
arch101  aaron_aardvark  a
arch101  calvin_carp     b
comp101  aaron_aardvark  a
comp101  sally_squirrel  a
One thing that our tabular representation for relations makes clear is that, for a
finite domain, there is an upper bound on the number of possible extensions for an n-ary
relation. In particular, for a universe of discourse of size b, there are b^n distinct n-tuples.
Every n-ary relation is a subset of these b^n tuples. Therefore, an n-ary relation must be
one of at most 2^(b^n) possible sets.
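The counting argument can be verified by brute force on a small case. The sketch below is illustrative only; the function names are assumptions, and the enumeration is feasible only for very small b and n because the number of extensions grows doubly exponentially.

```python
from itertools import combinations, product

def count_tuples(b, n):
    """Number of distinct n-tuples over a domain of size b."""
    return b ** n

def count_extensions(b, n):
    """Number of possible extensions of an n-ary relation over that domain."""
    return 2 ** (b ** n)

def all_extensions(domain, n):
    """Enumerate every subset of the set of n-tuples over the domain."""
    tuples = list(product(sorted(domain), repeat=n))
    for size in range(len(tuples) + 1):
        for subset in combinations(tuples, size):
            yield frozenset(subset)

domain = {"a", "b"}
extensions = list(all_extensions(domain, 2))
print(count_tuples(2, 2))      # 4
print(count_extensions(2, 2))  # 16
print(len(extensions))         # 16
```

For a two-object domain and a binary relation there are 4 pairs and therefore 16 possible extensions, ranging from the empty set to the set of all 4 pairs.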
One thing to bear in mind is that the order of columns in a relational table is
crucial. It would make no sense, for example, to put a student or a course in the third
column of the gradesheet table. In some cases, it might. More problematic …
The order of rows, by contrast, does not matter.
In fact, we sometimes talk about the extension of a relation as a set of tuples, each
tuple representing a single row of the corresponding table.
As an example, consider the teaches relation. We can think of the extension
illustrated above as a set of 2-tuples, as shown below.
{(donna_daring, arch101), (bill_boring, comp101), (cathy_careful, comp225),
(gary_grump, comp235), (oren_overbearing, comp257), (helen_heavenly, comp310)}
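Treating an extension as a set of tuples makes arity, cardinality, and the remarks about row and column order easy to check. The following Python sketch is one possible encoding, with the helper names arity and cardinality chosen here for illustration.

```python
teaches = {
    ("donna_daring", "arch101"),
    ("bill_boring", "comp101"),
    ("cathy_careful", "comp225"),
    ("gary_grump", "comp235"),
    ("oren_overbearing", "comp257"),
    ("helen_heavenly", "comp310"),
}

def arity(extension):
    """Length shared by all tuples, or None if the extension is empty."""
    lengths = {len(row) for row in extension}
    if not lengths:
        return None
    if len(lengths) != 1:
        raise ValueError("tuples in one extension must all have the same length")
    return lengths.pop()

def cardinality(extension):
    """Number of tuples in the extension."""
    return len(extension)

print(arity(teaches), cardinality(teaches))  # 2 6

# Row order is irrelevant: a set built in another order is the same set.
print(set(reversed(sorted(teaches))) == teaches)  # True

# Column order matters: swapping the columns gives a different extension.
swapped = {(course, teacher) for teacher, course in teaches}
print(swapped == teaches)  # False
```

Since arity belongs to the relation rather than to any one extension, it cannot be read off an empty extension; the helper returns None in that case rather than guessing.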
The generality of relations can be determined by comparing their extensions. For
example, the student relation is less general than the person relation, since its extension is
always a subset of the extension of the person relation. Note that it is possible for the
extensions of some relations to be empty, and it is possible for the extensions of some
relations to consist of all n-tuples over the domain.
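For a single state of the world, comparing generality amounts to a subset test on extensions. The sketch below assumes a small person extension, which the text does not list; that data and the function name are hypothetical.

```python
from itertools import product

def at_least_as_general(general, specific):
    """True if every tuple of `specific` also appears in `general`."""
    return specific <= general

student = {("aaron_aardvark",), ("belinda_bat",), ("calvin_carp",)}
# Hypothetical person extension, invented for this illustration.
person = student | {("donna_daring",), ("bill_boring",)}

print(at_least_as_general(person, student))  # True
print(at_least_as_general(student, person))  # False

# The two extreme cases mentioned above, for a unary relation.
domain = {obj for (obj,) in person}
empty = set()
full = set(product(domain, repeat=1))
print(at_least_as_general(full, person), at_least_as_general(person, empty))  # True True
```

This checks one state only. The claim in the text is stronger: the student extension is a subset of the person extension in every state of the world.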
Before leaving our introduction to relations, we need to look at the similarities
and differences between the definitions given here and the definitions given elsewhere.
In mathematics, a relation is defined to be a set of tuples. This has two
consequences. First of all, a mathematical relation can never correspond to different sets
of tuples in different states of the world. Those would be different relations. Second, if
two sets of tuples are the same, they are the same mathematical relation. In the version
presented here, it is possible for a single relation to have different extensions in different
states of the world. Moreover, it is possible for two distinct relations to have the same
extension without being the same relation.
In non-logical treatments of databases, relations are defined in a different way
(though all algorithms and results remain the same).
§2.4 Schemas
A schema for a database is a finite set {r1, ..., rn} of relations with associated
arities.
§2.5 Database Instances
An instance of a database with domain D and schema {r1, ..., rn} is a mapping that
gives an extension for each relation, i.e. that assigns to each k-ary relation ri a subset of
D^k.
This worries me. Is it an instance if it does not satisfy the constraints? It is an
instance of something but maybe not an instance of the database.
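The definition of an instance as a mapping can be turned into a simple membership check. This is a sketch under assumed encodings (a schema as a dictionary from relation names to arities, an instance as a dictionary from relation names to sets of tuples); is_instance is a hypothetical name.

```python
def is_instance(mapping, domain, schema):
    """Check that each k-ary relation in the schema gets a set of k-tuples over the domain."""
    if set(mapping) != set(schema):
        return False  # every relation in the schema needs exactly one extension
    return all(
        len(row) == schema[relation] and all(obj in domain for obj in row)
        for relation, rows in mapping.items()
        for row in rows
    )

domain = {"a", "b"}
schema = {"p": 1, "q": 2}

good = {"p": {("a",)}, "q": {("a", "b"), ("b", "b")}}
bad_arity = {"p": {("a", "b")}, "q": set()}
bad_object = {"p": {("c",)}, "q": set()}

print(is_instance(good, domain, schema))        # True
print(is_instance(bad_arity, domain, schema))   # False
print(is_instance(bad_object, domain, schema))  # False
```

The check covers only the definition given here. Whether a mapping that violates the constraints should still count as an instance is the open question raised above, and this sketch does not settle it.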
§2.6 Static Constraints
A static constraint for a database is an arbitrary set of instances of the database.
Functional dependencies. gradesheet
Type constraints. Gradesheet – student, course, grade
Inclusion dependencies.
Referential integrity constraints – combination of functional dependency and
inclusion dependency.
Many different types of constraints in subsequent chapters.
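A set of instances can be represented by its membership test, so a static constraint can be sketched as a function from an instance to true or false. The example below checks a functional dependency on gradesheet, assuming that course and student together determine the grade; that particular dependency is an assumption, since the notes name the relation but do not spell the dependency out.

```python
def satisfies_fd(rows, determinant, dependent):
    """True if rows that agree on the determinant columns agree on the dependent columns."""
    seen = {}
    for row in rows:
        key = tuple(row[i] for i in determinant)
        value = tuple(row[i] for i in dependent)
        # Remember the first value seen for this key; any later mismatch is a violation.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Columns: 0 = course, 1 = student, 2 = grade.
gradesheet = {
    ("arch101", "aaron_aardvark", "a"),
    ("arch101", "calvin_carp", "b"),
    ("comp101", "aaron_aardvark", "a"),
    ("comp101", "sally_squirrel", "a"),
}

print(satisfies_fd(gradesheet, determinant=(0, 1), dependent=(2,)))  # True

# Giving the same student a second grade in the same course breaks the dependency.
violating = gradesheet | {("arch101", "aaron_aardvark", "b")}
print(satisfies_fd(violating, determinant=(0, 1), dependent=(2,)))   # False
```

The static constraint itself is then the set of all instances for which the function returns true.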
§2.7 Database Histories
A database history is a sequence of instances.
§2.8 Dynamic Constraints
A dynamic constraint for a database is a set of histories.
Example.
Preferences.
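In the same spirit, a dynamic constraint can be sketched as a test on histories. The constraint used below, that no proposition is ever retracted, is a hypothetical example chosen for illustration and does not come from the notes; a history is encoded as a list of instances, each a set of propositions.

```python
def never_retracts(history):
    """True if every instance in the history contains the instance before it."""
    return all(earlier <= later for earlier, later in zip(history, history[1:]))

s0 = {("teaches", ("donna_daring", "arch101"))}
s1 = s0 | {("teaches", ("bill_boring", "comp101"))}

print(never_retracts([s0, s1]))  # True
print(never_retracts([s1, s0]))  # False
```

The dynamic constraint is the set of all histories that pass the test. A history with zero or one instances passes trivially.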
§2.9 Relational Nets
The domain of the database is a set of objects.
Or should we do constraints structurally rather than in terms of sets?
Version 1 - The static constraints on a database can be defined implicitly via a
relational net. The dynamic constraints can be modeled via a relational automaton.
Version 2 - We can also write both static and dynamic constraints as formulas in
Logic / GDL.