Given a database, we define a datum (or proposition) to be a structure consisting of an n-ary relation from the schema and n objects from the domain. If r is an n-ary relation and a1, ..., an are entities in the domain, we write r(a1, ..., an) to denote the proposition that r “holds” of the objects a1, ..., an.

The Herbrand base for a database is the set of all propositions that can be formed from the relations in the schema and the entities in the domain. For example, for domain D = {a, b} and schema {p, q}, where p has arity 1 and q has arity 2, the Herbrand base is {p(a), p(b), q(a,a), q(a,b), q(b,a), q(b,b)}. An instance of a database is a finite subset of its Herbrand base.

In the standard notation for relational logic, variables begin with upper case letters, and constants begin with lower case letters or digits; so all of the arguments here are variables.

p(t1,…,tn) :- p1(t11,…,t1n),…, pk(tk1,…,tkn)

Here, p and p1, …, pk are names for relations, and the various tij are variables or constants. p(t1,…,tn) is called the head of the rule, and the expressions p1(t11,…,t1n),…, pk(tk1,…,tkn’) are called subgoals.

§3.3 Proposition Databases

It is possible to develop the full theory of relational logic based on just objects and relations. In fact, this is typically the way it is done in courses on databases. However, the theory is somewhat simpler if we consider instead a slight variation on the relational model. The essence of this variation is the notion of a proposition. Intuitively, a proposition is an n-ary relation together with n objects; equivalently, it is a tuple together with a relation. Clearly, this version and the relational model are isomorphic. For every … The reason this version is desirable is that it allows us to think of a database as a set of propositions rather than a collection of relations, each of which is a set of tuples.
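The Herbrand base example and the two views of a database can be sketched in a few lines of Python. This is only an illustration under assumed encodings, not part of the formal development: a proposition is modeled as a pair (relation name, tuple of objects), a schema as a dictionary from relation names to arities, and the helper names herbrand_base, flatten, unflatten and show are hypothetical.

```python
from itertools import product


def herbrand_base(domain, schema):
    """Return every proposition that can be formed from the schema and the domain.

    Assumed encoding: `schema` maps each relation name to its arity, and a
    proposition is a pair (relation name, tuple of objects).
    """
    return {
        (rel, args)
        for rel, arity in schema.items()
        for args in product(sorted(domain), repeat=arity)
    }


def flatten(relations):
    """Relational view -> propositional view: one proposition per table row."""
    return {(rel, row) for rel, rows in relations.items() for row in rows}


def unflatten(propositions, schema):
    """Propositional view -> relational view: group the tuples by relation."""
    relations = {rel: set() for rel in schema}
    for rel, row in propositions:
        relations[rel].add(row)
    return relations


def show(proposition):
    """Render a proposition in the r(a1,...,an) notation used in the text."""
    rel, args = proposition
    return f"{rel}({','.join(args)})"


# The example from the text: D = {a, b}, p has arity 1, q has arity 2.
domain = {"a", "b"}
schema = {"p": 1, "q": 2}
base = herbrand_base(domain, schema)
print(sorted(show(x) for x in base))
# ['p(a)', 'p(b)', 'q(a,a)', 'q(a,b)', 'q(b,a)', 'q(b,b)']

# An arbitrary instance, first as a collection of relations, then flattened.
relations = {"p": {("a",)}, "q": {("a", "b"), ("b", "b")}}
instance = flatten(relations)
assert instance <= base                           # a subset of the Herbrand base
assert unflatten(instance, schema) == relations   # the round trip loses nothing
```

The round trip in the last line is one way to see why the two models carry the same information: grouping propositions by relation recovers exactly the original tables.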
Although they are equivalent, “flattening” the notion of a database makes it easier to think about, define, and analyze many of the concepts of deductive databases. For this reason, we use the propositional model in what follows. It should, however, be understood that everything applies equally well to the relational model.

As an example, consider the teaches relation. We can think of the extension illustrated above as a set of data, as shown below.

{teaches(donna_daring,arch101),
 teaches(bill_boring,comp101),
 teaches(cathy_careful,comp225),
 teaches(gary_grump,comp235),
 teaches(oren_overbearing,comp257),
 teaches(helen_heavenly,comp310)}

A datum for a database with domain D and schema {r1,...,rm} is a (k+1)-tuple consisting of a k-ary relation constant ri and k objects from D. The Herbrand base for a database is the set of all data that can be formed from the domain and the schema. A database instance is a finite subset of the Herbrand base of the database.

§2.2 Objects

Objects include numbers, strings, and complex structures of various sorts.

The basis for a relational model is a conceptualization of an application area in terms of the objects presumed or hypothesized to exist in that application area. The notion of an object here is quite broad. Objects can be concrete (e.g. people, computers, buildings) or abstract (e.g. courses, numbers, sets). Objects can be primitive (e.g. individuals) or composite (e.g. organizations). Objects can even be fictional. In short, an object can be anything about which we want to say something.

§2.2 Domains

Not all applications require that we consider all objects in the application area. In some cases, only some of these objects are relevant. For example, in a directory for a university, we would expect to see references to people, offices, buildings, phone numbers, and email addresses; but we would not expect to see information about the university’s courses, its computers, its vehicles, or its financial transactions.
The set of objects of interest in a particular application is called the domain or, sometimes, the universe of discourse.

§2.3 Relations

A relation is a property of individual objects or combinations of objects in an application area. In the university context, consider the relations student, teaches, and gradesheet. The student relation is a property of a person that holds if and only if that person is a student. The teaches relation holds of a faculty member and a course if and only if the faculty member teaches the course. The gradesheet relation holds of a course, a student, and a grade if and only if the student received the grade in the course.

The arity of a relation is the number of objects involved in any instance of that relation. For example, every instance of the teaches relation involves two objects, viz. a faculty member and a course; therefore, it has arity 2. Arity is an inherent property of a relation and is the same in every state of the world.

The extension of a relation in a particular state of the world is the set of all objects or combinations of objects that satisfy that relation in the given state of the world. The cardinality of an extension of a relation is the number of objects or combinations of objects that satisfy the relation in a particular state of the world. Unlike arity, the cardinality of a relation's extension can change as the state of the world changes.

In the theory of databases, it is common to conceptualize the extension of a relation as a table. The number of columns in the table corresponds to the arity of the relation and the number of rows corresponds to the cardinality of the extension. For example, the extension of the relation teaches is a set of pairs of faculty members and courses, one pair for each faculty member and each course that the faculty member teaches.
If we consider a state of the world in which there are 6 pairs of faculty members and courses that they teach, we can visualize this extension as a table with 2 columns and 6 rows, as shown below. Here, the extension has arity 2 and cardinality 6.

teaches
donna_daring       arch101
bill_boring        comp101
cathy_careful      comp225
gary_grump         comp235
oren_overbearing   comp257
helen_heavenly     comp310

The student relation is a property of a person in and of itself, not with respect to other people or other objects. Since there is just one object involved in any instance of the relation, the table has just 1 column. By contrast, there are 12 rows, one row for each student. In this case, the extension has arity 1 and cardinality 12.

student
aaron_aardvark
belinda_bat
calvin_carp
george_giraffe
kitty_kat
minnie_mouse
patty_panda
rory_rhino
sally_squirrel
tony_tuna
wanda_wolf
zack_zebra

The gradesheet relation is a relation among courses, students, and grades; and so the table requires three columns, as shown below. In this case, we have an extension with arity 3 and cardinality 4.

gradesheet
arch101   aaron_aardvark   a
arch101   calvin_carp      b
comp101   aaron_aardvark   a
comp101   sally_squirrel   a

One thing that our tabular representation for relations makes clear is that, for a finite domain, there is an upper bound on the number of possible extensions for an n-ary relation. In particular, for a universe of discourse of size b, there are b^n distinct n-tuples. Every extension of an n-ary relation is a subset of these b^n tuples. Therefore, the extension of an n-ary relation must be one of at most 2^(b^n) possible sets.

One thing to bear in mind is that the order of columns in a relational table is crucial. It would make no sense, for example, to put a student or a course in the third column of the gradesheet table. In some cases, it might. More problematic

Order of rows does not matter. In fact, we sometimes talk about the extension of a relation as a set of tuples, each tuple representing a single row of the corresponding table.
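The counting argument above can be checked by brute force on a small case. The following Python sketch reads the bound as 2 raised to the power b^n, with b the size of the domain and n the arity; the function names all_tuples, all_extensions and count_extensions are hypothetical, and the two-object domain is chosen only to keep the enumeration small.

```python
from itertools import combinations, product


def all_tuples(domain, arity):
    """All n-tuples over the domain; there are b**n of them for b objects."""
    return list(product(sorted(domain), repeat=arity))


def count_extensions(domain_size, arity):
    """Upper bound on the number of extensions: 2 ** (b ** n)."""
    return 2 ** (domain_size ** arity)


def all_extensions(domain, arity):
    """Enumerate every subset of the n-tuples (feasible only for tiny cases)."""
    tuples = all_tuples(domain, arity)
    return [
        frozenset(rows)
        for size in range(len(tuples) + 1)
        for rows in combinations(tuples, size)
    ]


domain = {"a", "b"}

# b = 2, n = 2: there are 2**2 = 4 pairs and 2**4 = 16 possible extensions.
assert len(all_tuples(domain, 2)) == 4
assert len(all_extensions(domain, 2)) == count_extensions(2, 2) == 16

# Row order is irrelevant because an extension is a set of tuples ...
assert {("a", "b"), ("b", "a")} == {("b", "a"), ("a", "b")}
# ... but column order matters because each row is an ordered tuple.
assert ("a", "b") != ("b", "a")
```

The bound grows very quickly: under the same reading, a binary relation over just 6 objects already admits 2^36 possible extensions.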
As an example, consider the teaches relation. We can think of the extension illustrated above as a set of 2-tuples, as shown below.

{⟨donna_daring, arch101⟩,
 ⟨bill_boring, comp101⟩,
 ⟨cathy_careful, comp225⟩,
 ⟨gary_grump, comp235⟩,
 ⟨oren_overbearing, comp257⟩,
 ⟨helen_heavenly, comp310⟩}

The generality of relations can be determined by comparing their extensions. For example, the student relation is less general than the person relation, since its extension is always a subset of the extension of the person relation.

Note that it is possible for the extensions of some relations to be empty, and it is possible for the extensions of some relations to consist of all n-tuples over the domain.

Before leaving our introduction to relations, we need to look at the similarities and differences between the definitions given here and the definitions given elsewhere.

In mathematics, a relation is defined to be a set of tuples. This has two consequences. First of all, a mathematical relation can never correspond to different sets of tuples in different states of the world. Those would be different relations. Second, if two sets of tuples are the same, they are the same mathematical relation. In the version presented here, it is possible for a single relation to have different extensions in different states of the world. Moreover, it is possible for two distinct relations to have the same extension without being the same relation.

In non-logical treatments of databases, relations are defined in a different way (though all algorithms and results remain the same).

§2.4 Schemas

A schema for a database is a finite set {r1, ..., rm} of relations with associated arities.

§2.5 Database Instances

An instance of a database with domain D and schema {r1, ..., rm} is a mapping that gives an extension for each relation, i.e. that assigns to each k-ary relation ri a subset of D^k.

This worries me. Is it an instance if it does not satisfy the constraints?
It is an instance of something but maybe not an instance of the database.

§2.6 Static Constraints

A static constraint for a database is an arbitrary set of instances of the database.

Functional dependencies. gradesheet
Type constraints. gradesheet – student, course, grade
Inclusion dependencies.
Referential integrity constraints – combination of functional dependency and inclusion dependency.
Many different types of constraints in subsequent chapters.

§2.7 Database Histories

A database history is a sequence of instances.

§2.8 Dynamic Constraints

A dynamic constraint for a database is a set of histories. Example. Preferences.

§2.9 Relational Nets

The domain of the database is a set of objects.

Or should we do constraints structurally rather than in terms of sets?

Version 1 - The static constraints on a database can be defined implicitly via a relational net. The dynamic constraints can be modeled via a relational automaton.

Version 2 - We can also write both static and dynamic constraints as formulas in Logic / GDL.
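One possible reading of the set-based definitions in §2.6 through §2.8 is sketched below in Python: a constraint given as a set of instances (or of histories) is represented by its membership test. Everything specific here is an assumption made for illustration: the encoding of a datum as a pair (relation name, tuple of objects), the functional dependency chosen for gradesheet (course and student determine grade), the "never retracts" dynamic constraint, and the function names.

```python
def satisfies_fd(instance, relation, determinant, dependent):
    """Check a functional dependency within one relation of an instance.

    `determinant` and `dependent` are tuples of argument positions: any two
    data of `relation` that agree on the former must agree on the latter.
    """
    seen = {}
    for rel, args in instance:
        if rel != relation:
            continue
        key = tuple(args[i] for i in determinant)
        value = tuple(args[i] for i in dependent)
        # setdefault records the first value seen for this key.
        if seen.setdefault(key, value) != value:
            return False
    return True


def grade_is_functional(instance):
    """Static constraint (assumed): course and student determine the grade."""
    return satisfies_fd(instance, "gradesheet", (0, 1), (2,))


def never_retracts(history):
    """Dynamic constraint (assumed): each instance contains its predecessor."""
    return all(earlier <= later for earlier, later in zip(history, history[1:]))


good = frozenset({
    ("gradesheet", ("arch101", "aaron_aardvark", "a")),
    ("gradesheet", ("arch101", "calvin_carp", "b")),
})
# A second, conflicting grade for the same course and student.
bad = good | {("gradesheet", ("arch101", "calvin_carp", "a"))}

assert grade_is_functional(good)
assert not grade_is_functional(bad)

# A history is a sequence of instances.
assert never_retracts([frozenset(), good])
assert not never_retracts([good, frozenset()])
```

Representing a constraint by a test rather than by an explicit set avoids enumerating instances, which is impractical given how many subsets a Herbrand base has. Whether this is closer to "Version 1" or "Version 2" above is left open.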