Direct SQL is rarely used: usually, SQL is embedded in some application code.
We need some method to reference SQL statements.
But: there is an impedance mismatch problem.
So: we use cursors.
Host language + Embedded SQL → Host language + function calls → Host language program
The host language manipulates variables, values, and pointers.
SQL manipulates relations.
There is no construct in the host language for manipulating relations.
Why not use only one language?
• Forgetting SQL: definitely not a good idea!
• SQL cannot do everything that the host language can do.
Values get passed through shared variables.
Colons precede shared variables when they occur within the SQL statements.
EXEC SQL: precedes every SQL statement in the host language.
The variable SQLSTATE provides error messages and status reports (e.g., 00000 says that the operation completed with no problem).
EXEC SQL BEGIN DECLARE SECTION;
    char productName[30];
EXEC SQL END DECLARE SECTION;
void simpleInsert() {
    EXEC SQL BEGIN DECLARE SECTION;
    char productName[20], company[30];
    char SQLSTATE[6];
    EXEC SQL END DECLARE SECTION;
    /* get values for productName and company somehow */
    EXEC SQL INSERT INTO Product(name, company)
             VALUES (:productName, :company);
}
void getPrice() {
    EXEC SQL BEGIN DECLARE SECTION;
    char productName[20], company[30];
    int price;
    char SQLSTATE[6];
    EXEC SQL END DECLARE SECTION;
    /* read value of product name */
    EXEC SQL SELECT price
             INTO :price
             FROM Product
             WHERE Product.name = :productName;
    /* print out value of price */
}
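The same idea of passing values through shared variables can be sketched in a call-level interface. The example below uses Python's standard sqlite3 module, where named placeholders such as :productName play a role similar to the colon-prefixed shared variables. The three-column Product table and the sample values ("Gizmo", "GizmoWorks", 25) are assumptions made for illustration only.

```python
import sqlite3

# Assumed schema for illustration: Product(name, company, price).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Product (name TEXT, company TEXT, price INTEGER)")


def simple_insert(product_name, company, price):
    # The dictionary keys match the :name placeholders in the statement.
    conn.execute(
        "INSERT INTO Product(name, company, price) "
        "VALUES (:productName, :company, :price)",
        {"productName": product_name, "company": company, "price": price},
    )


def get_price(product_name):
    # Single-row SELECT: the result is copied into a host-language value.
    row = conn.execute(
        "SELECT price FROM Product WHERE Product.name = :productName",
        {"productName": product_name},
    ).fetchone()
    return None if row is None else row[0]


simple_insert("Gizmo", "GizmoWorks", 25)
print(get_price("Gizmo"))
```

Here a missing product is reported as None rather than through a status variable such as SQLSTATE; that is a choice of this sketch, not part of embedded SQL.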
EXEC SQL DECLARE cursorName CURSOR FOR
    SELECT …. FROM …. WHERE …. ;
EXEC SQL OPEN cursorName;
while (1) {
    EXEC SQL FETCH FROM cursorName INTO :variables;
    if (NO_MORE_TUPLES) break;   /* e.g., SQLSTATE is "02000" (no data) */
    /* do something with values */
}
EXEC SQL CLOSE cursorName;
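The open / fetch / close pattern carries over to most database APIs. As a minimal sketch, the loop below uses a cursor from Python's standard sqlite3 module; the Product table, its rows, and the function name expensive_names are hypothetical and serve only to make the example runnable. The comments map each call loosely onto the embedded SQL statements.

```python
import sqlite3

# Assumed sample data for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Product (name TEXT, price INTEGER)")
conn.executemany(
    "INSERT INTO Product VALUES (?, ?)",
    [("Gizmo", 25), ("Widget", 120), ("Gadget", 80)],
)


def expensive_names(threshold):
    names = []
    cursor = conn.cursor()                   # roughly: DECLARE
    cursor.execute(                          # roughly: OPEN
        "SELECT name, price FROM Product ORDER BY name"
    )
    while True:
        row = cursor.fetchone()              # roughly: FETCH ... INTO
        if row is None:                      # no more tuples
            break
        name, price = row
        if price > threshold:                # host-language work per tuple
            names.append(name)
    cursor.close()                           # roughly: CLOSE
    return names


print(expensive_names(50))
```

The relation is consumed one tuple at a time, which is how the cursor bridges the gap between set-valued SQL results and the host language's single values.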
• Cursors can modify a relation as well as read it.
• We can determine the order in which the cursor will get tuples by the ORDER BY keyword in the SQL query.
• Cursors can be protected against changes to the underlying relations.
• The cursor can be a scrolling one: it can move forward, backward, by a relative offset (+n, -n), or to an absolute position (Abs(n), Abs(-n)).
Another formalism for expressing queries:
- cleaner
- closer to a “logic” notation
- more convenient for analysis
- equivalent in power to relational algebra
- will later allow us to consider queries with recursion
- relations are represented by predicates
- tuples are represented by atoms.
Purchase( “joe”, “bob”, “Nike Town”, “Nike Air”, 2/2/98)
- arithmetic atoms:
X < 100, X+Y+5 > Z/2
- negated atoms:
NOT Product(“Brooklyn Bridge”, $100, “Microsoft”)
A datalog rule has the following form: head :- atom1, atom2, …, atomn
ExpensiveProduct(X) :- Product(X,Y,P) & P > $100
BritishProduct(X) :- Product(X,Y,P) & Company(P, “UK”, SP)
P(X,Y) :- Between(X,Y,Z) & NOT Direct(X,Z)
A single rule expresses exactly the class of select-from-where queries.
ExpensiveProduct(X) :- Product(X,Y,P) & P > $100
Consider every assignment from the variables in the body to the constants in the database.
If each of the atoms in the body is made true by the assignment, then add the tuple for the head into the relation of the head.
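The meaning of a rule described above can be spelled out as a deliberately naive evaluator. The sketch below enumerates every assignment of the body variables X, Y, P to constants in the database and keeps the head tuples whose body is true. The sample Product tuples and the column order (name, category, price) are assumptions for illustration; a real engine would use joins rather than brute-force enumeration.

```python
from itertools import product as assignments

# Assumed extensional data: Product(name, category, price).
PRODUCT = {
    ("Gizmo", "gadgets", 25),
    ("PowerGizmo", "gadgets", 150),
    ("Camera", "photo", 300),
}


def expensive_product(product_rel, limit=100):
    # Active domain: every constant that occurs in the database.
    constants = {value for row in product_rel for value in row}
    result = set()
    # Try every assignment of the body variables X, Y, P to constants.
    for x, y, p in assignments(constants, repeat=3):
        body_true = (
            (x, y, p) in product_rel               # Product(X,Y,P)
            and isinstance(p, int) and p > limit   # P > $100
        )
        if body_true:
            result.add((x,))                       # add the head tuple
    return result


print(sorted(expensive_product(PRODUCT)))
```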
Every variable that appears anywhere in the query must appear also in a relational, nonnegated atom in the query.
Q(X,Y,Z) :- R1(X,Y) & X < Z   (not safe: Z occurs in no relational atom)
Q(X,Y,Z) :- R1(X,Y) & NOT R2(X,Y,Z)   (not safe: Z occurs only in the head and in a negated atom)
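The safety condition is easy to check mechanically. The sketch below assumes a hypothetical encoding in which each atom is just the tuple of variable names it mentions; the function is_safe and its parameters are illustrative and not part of any datalog system.

```python
def is_safe(head_vars, positive_atoms, negated_atoms=(), arithmetic_vars=()):
    """Return True if every variable occurs in a relational, nonnegated atom.

    Hypothetical encoding: each atom is a tuple of the variable names in it.
    """
    # Variables bound by relational, nonnegated atoms in the body.
    covered = {v for atom in positive_atoms for v in atom}
    # Every variable that appears anywhere else in the rule.
    used = set(head_vars) | set(arithmetic_vars)
    for atom in negated_atoms:
        used |= set(atom)
    return used <= covered


# Q(X,Y,Z) :- R1(X,Y) & X < Z
print(is_safe(["X", "Y", "Z"], [("X", "Y")], arithmetic_vars=["X", "Z"]))
# Q(X,Y,Z) :- R1(X,Y) & NOT R2(X,Y,Z)
print(is_safe(["X", "Y", "Z"], [("X", "Y")], negated_atoms=[("X", "Y", "Z")]))
# ExpensiveProduct(X) :- Product(X,Y,P) & P > $100
print(is_safe(["X"], [("X", "Y", "P")], arithmetic_vars=["P"]))
```

The first two calls report the unsafe rules from the text, and the third reports the ExpensiveProduct rule as safe.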
Extensional predicates: represent relations appearing in the database.
Intensional predicates: defined by rules. These can be thought of as views.
Datalog rules may be composed in order to express more complex queries.
With composition:
• it becomes easier to express certain queries (define views)
• we can define queries including unions
• we can define recursive queries.
We can translate any relational algebra operation to datalog:
- projection
- selection
- union
- intersection
- join
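To make the correspondence concrete, the sketch below pairs each operation in the list with one possible datalog rule (in the comments) and a Python set comprehension that computes the same result. The relations R(X,Y), S(X,Y), T(Y,Z) and their contents are assumptions chosen for illustration.

```python
# Assumed sample relations, each a set of tuples.
R = {(1, "a"), (2, "b"), (3, "c")}
S = {(2, "b"), (4, "d")}
T = {("a", "x"), ("b", "y")}

# Projection:    Q(X) :- R(X,Y)
projection = {(x,) for (x, y) in R}

# Selection:     Q(X,Y) :- R(X,Y) & X > 1
selection = {(x, y) for (x, y) in R if x > 1}

# Union (two rules with the same head):
#                Q(X,Y) :- R(X,Y)
#                Q(X,Y) :- S(X,Y)
union = {(x, y) for (x, y) in R} | {(x, y) for (x, y) in S}

# Intersection:  Q(X,Y) :- R(X,Y) & S(X,Y)
intersection = {(x, y) for (x, y) in R if (x, y) in S}

# Join:          Q(X,Y,Z) :- R(X,Y) & T(Y,Z)
join = {(x, y, z) for (x, y) in R for (y2, z) in T if y == y2}

print(sorted(join))
```

Union is the one case that needs more than a single rule, which is why composing rules adds expressive power.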
Product ( name, price, category, maker)
Purchase (buyer, seller, store, product)
Company (name, stock price, country)
Person( name, phone number, city)
Ex #1: Find people who bought telephony products.
Ex #2: Find names of people who bought American products.
Ex #3: Find names of people who bought American products and did not buy French products.
Ex #4: Find names of people who bought American products and live in Seattle.
Ex #5: Find people who bought stuff from Joe or bought products from a company whose stock price is more than $50.
Suppose we are representing a graph by a relation Edge(X,Y):
Edge(a,b), Edge(a,c), Edge(b,d), Edge(c,d), Edge(d,e)
I want to express the query:
Find all nodes reachable from a.
Path( X, Y ) :- Edge( X, Y )
Path( X, Y ) :- Path( X, Z ), Path( Z, Y ).
Semantics: evaluate the rules until a fixed point is reached:
Iteration #0: Edge: {(a,b), (a,c), (b,d), (c,d), (d,e)}
Path: {}
Iteration #1: Path: {(a,b), (a,c), (b,d), (c,d), (d,e)}
Iteration #2: Path gets the new tuples:
(a,d), (b,e), (c,e)
Iteration #3: Path gets the new tuple:
(a,e)
Iteration #4: Nothing changes -> we stop.
Note: the number of iterations depends on the data; it cannot be anticipated by looking at the query alone!
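The iteration can be reproduced with a few lines of code. The following is a minimal sketch of naive bottom-up evaluation for the two Path rules, run on the Edge relation from the example; the function name path_fixpoint and the history list are illustrative additions.

```python
EDGE = {("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "e")}


def path_fixpoint(edge):
    """Naive bottom-up evaluation; returns (Path, new tuples per iteration)."""
    path = set()
    history = []
    while True:
        # Path(X,Y) :- Edge(X,Y)
        derived = set(edge)
        # Path(X,Y) :- Path(X,Z), Path(Z,Y)
        derived |= {(x, y) for (x, z) in path for (z2, y) in path if z == z2}
        new = derived - path
        if not new:
            # Nothing changed in this iteration: fixed point reached.
            return path, history
        history.append(new)
        path |= new


path, history = path_fixpoint(EDGE)
for i, new in enumerate(history, start=1):
    print(f"Iteration #{i}: new tuples {sorted(new)}")
print("Reachable from a:", sorted(y for (x, y) in path if x == "a"))
```

Three iterations add tuples and a fourth detects that nothing changes. On a longer chain of edges the same program would run for more iterations, which is the data dependence noted above.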
We distinguish two types of relations in our database:
• Extensional relations (EDB): their extent is stored in the database just like in ordinary relational databases.
• Intensional relations (IDB): their extension is defined by a set of possibly recursive datalog rules.
Intensional relations can either be materialized or computed on demand.
Note: a query and a definition of an intensional predicate look exactly the same (i.e., they’re both datalog programs).
Hard problem: how do we optimize queries in the presence of recursion?
Harder problem: do we really need recursion?
Limited forms of recursion are considered important.
Linear recursion: only 1 occurrence of a recursive predicate in the body
Path( X, Y ) :- Edge( X, Y )
Path( X, Y ) :- Edge( X, Z ), Path( Z, Y ).
WITH RECURSIVE
    Pairs(origin, dest) AS (SELECT origin, dest FROM Edge),
    Path(origin, dest) AS (
        SELECT origin, dest FROM Pairs
        UNION
        SELECT Pairs.origin, Path.dest
        FROM Pairs, Path
        WHERE Pairs.dest = Path.origin
    )
SELECT * FROM Path;
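As a sketch of how the linear-recursive Path query can be run in practice, the example below executes it with Python's standard sqlite3 module, using SQLite's WITH RECURSIVE form. The Edge table is loaded with the graph from the earlier example; the helper reachable_from and the :start parameter are hypothetical additions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Edge (origin TEXT, dest TEXT)")
conn.executemany(
    "INSERT INTO Edge VALUES (?, ?)",
    [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "e")],
)

# Linear recursion: Path occurs exactly once in the recursive branch.
REACHABLE_FROM = """
WITH RECURSIVE Path(origin, dest) AS (
    SELECT origin, dest FROM Edge
    UNION
    SELECT Edge.origin, Path.dest
    FROM Edge, Path
    WHERE Edge.dest = Path.origin
)
SELECT dest FROM Path WHERE origin = :start ORDER BY dest
"""


def reachable_from(start):
    rows = conn.execute(REACHABLE_FROM, {"start": start}).fetchall()
    return [dest for (dest,) in rows]


print(reachable_from("a"))
```

UNION (rather than UNION ALL) removes duplicates, so the recursion stops once no new Path tuples appear, mirroring the fixed-point semantics.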