Embedded SQL and Datalog

Embedded SQL

Direct SQL is rarely used: usually, SQL is embedded in some application code.

We need some method to reference SQL statements.

But: there is an impedance mismatch problem

So: we use cursors.

Programs with SQL

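Host language + embedded SQL -> Host language + function calls -> Host language program

(Figure: each stage is translated into the next. A preprocessor replaces the embedded SQL with function calls, and the result is compiled as an ordinary host-language program.)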

The Impedance Mismatch

Problem

The host language manipulates variables, values, pointers

SQL manipulates relations.

There is no construct in the host language for manipulating relations.

Why not use only one language?

• Forgetting SQL: definitely not a good idea!

• SQL cannot do everything that the host language can do.

Interface: SQL / Host Language

Values get passed through shared variables.

Colons precede shared variables when they occur within SQL statements.

EXEC SQL: precedes every SQL statement in the host language.

The variable SQLSTATE provides error messages and status reports (e.g., 00000 says that the operation completed with no problem).

EXEC SQL BEGIN DECLARE SECTION;
    char productName[30];
EXEC SQL END DECLARE SECTION;

Using Shared Variables

void simpleInsert() {
    EXEC SQL BEGIN DECLARE SECTION;
        char productName[20], company[30];
        char SQLSTATE[6];
    EXEC SQL END DECLARE SECTION;

    /* get values for productName and company somehow */

    /* the INSERT belongs inside the function body */
    EXEC SQL INSERT INTO Product(name, company)
             VALUES (:productName, :company);
}
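The same idea can be sketched outside of C. The example below is a minimal illustration, not part of the original slides: it assumes Python's standard sqlite3 module and an in-memory Product table whose columns are chosen here for illustration. Named placeholders such as :productName play the role of the colon-prefixed shared variables.

```python
import sqlite3

def simple_insert(conn, product_name, company):
    # :productName and :company are bound from host-language values,
    # much like shared variables in the embedded SQL version.
    conn.execute(
        "INSERT INTO Product(name, company) VALUES (:productName, :company)",
        {"productName": product_name, "company": company},
    )
    conn.commit()

# Hypothetical in-memory database and sample row, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Product(name TEXT, company TEXT, price INTEGER)")
simple_insert(conn, "Nike Air", "Nike")
rows = conn.execute("SELECT name, company FROM Product").fetchall()
print(rows)
```

Here the host program and the SQL statement never share memory directly; values cross the boundary only through the bound parameters.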

Single-Row Select Statements

void getPrice() {
    EXEC SQL BEGIN DECLARE SECTION;
        char productName[20], company[30];
        int price;
        char SQLSTATE[6];
    EXEC SQL END DECLARE SECTION;

    /* read value of product name */

    EXEC SQL SELECT price
             INTO :price
             FROM Product
             WHERE Product.name = :productName;

    /* print out value of price */
}
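A single-row select can be sketched the same way. This is a hedged illustration, assuming Python's sqlite3 module; the table contents and the price 120 are made up for the example. Returning None when no row matches stands in for checking SQLSTATE after the SELECT ... INTO.

```python
import sqlite3

def get_price(conn, product_name):
    # fetchone() returns one row, or None when the query produced no tuple.
    row = conn.execute(
        "SELECT price FROM Product WHERE Product.name = :productName",
        {"productName": product_name},
    ).fetchone()
    return None if row is None else row[0]

# Hypothetical sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Product(name TEXT, company TEXT, price INTEGER)")
conn.execute("INSERT INTO Product VALUES ('Nike Air', 'Nike', 120)")

price = get_price(conn, "Nike Air")
missing = get_price(conn, "No Such Product")
print(price, missing)
```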

Cursors

EXEC SQL DECLARE cursorName CURSOR FOR
    SELECT …. FROM …. WHERE …. ;

EXEC SQL OPEN cursorName;

while (1) {
    EXEC SQL FETCH FROM cursorName INTO :variables;
    /* NO_MORE_TUPLES: e.g., a macro that tests SQLSTATE against "02000" (no data) */
    if ( NO_MORE_TUPLES ) break;

    /* do something with values */
}

EXEC SQL CLOSE cursorName;
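The declare / open / fetch / close pattern above can be mirrored with a database cursor in a host language. The sketch below assumes Python's sqlite3 module; the table, the sample rows, and the minPrice parameter are hypothetical.

```python
import sqlite3

# Hypothetical sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Product(name TEXT, company TEXT, price INTEGER)")
conn.executemany(
    "INSERT INTO Product VALUES (?, ?, ?)",
    [("p1", "c1", 50), ("p2", "c1", 150), ("p3", "c2", 300)],
)

cursor = conn.cursor()                      # roughly: DECLARE ... CURSOR
cursor.execute(                             # roughly: OPEN
    "SELECT name, price FROM Product WHERE price > :minPrice ORDER BY price",
    {"minPrice": 100},
)

seen = []
while True:
    row = cursor.fetchone()                 # roughly: FETCH ... INTO
    if row is None:                         # roughly: NO_MORE_TUPLES
        break
    name, price = row                       # one tuple at a time
    seen.append((name, price))

cursor.close()                              # roughly: CLOSE
print(seen)
```

The loop is where the impedance mismatch is bridged: the relation is consumed one tuple at a time, each tuple unpacked into ordinary host variables.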

More on Cursors

• Cursors can modify a relation as well as read it.

• We can determine the order in which the cursor will get tuples by the ORDER BY keyword in the SQL query.

• Cursors can be protected against changes to the underlying relations.

• The cursor can be a scrolling one: it can move forward or backward, by a relative offset (+n, -n), or to an absolute position (Abs(n), Abs(-n)).

Datalog

Another formalism for expressing queries:

- cleaner

- closer to a “logic” notation

- more convenient for analysis

- equivalent in power to relational algebra

- will later allow us to consider queries with recursion

Predicates and Atoms

- relations are represented by predicates

- tuples are represented by atoms.

Purchase( “joe”, “bob”, “Nike Town”, “Nike Air”, 2/2/98)

- arithmetic atoms:

X < 100, X+Y+5 > Z/2

- negated atoms:

NOT Product(“Brooklyn Bridge”, $100, “Microsoft”)

Datalog Rules and Queries

A datalog rule has the following form: head :- atom1, atom2, …., atom,…

In the examples below, & is used like the comma: both denote the conjunction of the body atoms.

ExpensiveProduct(X) :- Product(X,Y,P) & P > $100

BritishProduct(X) :- Product(X,Y,P) & Company(P, “UK”, SP)

P(X,Y) :- Between(X,Y,Z) & NOT Direct(X,Z)

A single rule can express exactly select-from-where queries.

The Meaning of Datalog Rules

ExpensiveProduct(X) :- Product(X,Y,P) & P > $100

Consider every assignment from the variables in the body to the constants in the database.

If each of the atoms in the body is made true by the assignment, then add the tuple for the head into the relation of the head.
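This semantics can be followed literally in a few lines. The sketch below is an illustration under stated assumptions: the Product relation and its three sample tuples are invented, and the rule evaluated is ExpensiveProduct(X) :- Product(X,Y,P) & P > 100. It enumerates every assignment of database constants to X, Y, P, which is hopelessly inefficient but mirrors the definition.

```python
import itertools

# Hypothetical extensional relation Product(X, Y, P), for illustration only.
Product = {
    ("Nike Air", "shoes", 120),
    ("Pencil", "office", 1),
    ("Laptop", "computers", 900),
}

# All constants appearing anywhere in the database.
constants = {value for row in Product for value in row}

expensive_product = set()
# Consider every assignment of constants to the body variables X, Y, P.
for x, y, p in itertools.product(constants, repeat=3):
    # Both body atoms must be true under the assignment. The relational
    # atom is checked first, so p is known to be a number before p > 100.
    if (x, y, p) in Product and p > 100:
        expensive_product.add((x,))   # add the head tuple

print(sorted(expensive_product))
```

A real evaluator would iterate over the tuples of Product directly instead of over all assignments; the result is the same.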

Rule Safety

Every variable that appears anywhere in the query must appear also in a relational, nonnegated atom in the query.

Q(X,Y,Z) :- R1(X,Y) & X < Z                 not safe (Z appears only in an arithmetic atom)

Q(X,Y,Z) :- R1(X,Y) & NOT R2(X,Y,Z)         not safe (Z appears only in a negated atom)
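The safety condition is easy to check mechanically. The sketch below uses a hypothetical rule encoding chosen for this example: a rule is a list of head variables plus a list of body atoms, each tagged as relational ("rel"), negated ("neg"), or arithmetic ("arith").

```python
def is_safe(head_vars, body):
    """Return True if every variable of the rule occurs in some
    relational, nonnegated body atom.

    body is a list of (kind, variables) pairs, where kind is
    'rel', 'neg', or 'arith' (an encoding assumed for this sketch).
    """
    all_vars = set(head_vars)
    positive_vars = set()
    for kind, variables in body:
        all_vars.update(variables)
        if kind == "rel":
            positive_vars.update(variables)
    return all_vars <= positive_vars

# Q(X,Y,Z) :- R1(X,Y) & X < Z
rule1 = (["X", "Y", "Z"], [("rel", ["X", "Y"]), ("arith", ["X", "Z"])])
# Q(X,Y,Z) :- R1(X,Y) & NOT R2(X,Y,Z)
rule2 = (["X", "Y", "Z"], [("rel", ["X", "Y"]), ("neg", ["X", "Y", "Z"])])
# ExpensiveProduct(X) :- Product(X,Y,P) & P > 100
rule3 = (["X"], [("rel", ["X", "Y", "P"]), ("arith", ["P"])])

results = [is_safe(*rule1), is_safe(*rule2), is_safe(*rule3)]
print(results)
```

The first two rules are rejected because Z is never bound by a positive relational atom, so infinitely many values of Z would satisfy the body.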

Composing Datalog Rules

Extensional predicates: represent relations appearing in the database.

Intensional predicates: defined by rules. These can be thought of as being views.

Datalog rules may be composed in order to express more complex queries.

With composition:

• it becomes easier to express certain queries (define views)

• we can define queries including unions,

• we can define recursive queries.

From Relational Algebra to Datalog

We can translate any relational algebra operation to datalog:

- projection

- selection

- union

- intersection

- join
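One possible translation of each listed operation is sketched below, with the datalog rule in a comment and an equivalent Python set expression underneath. The relations R(A,B), S(A,B), T(B,C) and their contents are hypothetical, chosen only to make the example runnable.

```python
# Hypothetical relations, for illustration only.
R = {(1, 2), (3, 7)}          # R(A, B)
S = {(3, 7), (4, 9)}          # S(A, B)
T = {(7, "x"), (2, "y")}      # T(B, C)

# Projection:    Q(A) :- R(A,B)
projection = {(a,) for (a, b) in R}

# Selection:     Q(A,B) :- R(A,B) & B > 5
selection = {(a, b) for (a, b) in R if b > 5}

# Union:         Q(A,B) :- R(A,B)
#                Q(A,B) :- S(A,B)      (two rules with the same head)
union = R | S

# Intersection:  Q(A,B) :- R(A,B) & S(A,B)
intersection = {(a, b) for (a, b) in R if (a, b) in S}

# Join:          Q(A,B,C) :- R(A,B) & T(B,C)   (shared variable B)
join = {(a, b, c) for (a, b) in R for (b2, c) in T if b == b2}

print(projection, selection, union, intersection, join)
```

Note that union is the one operation here that needs more than one rule: a single rule body is a conjunction, so disjunction is expressed by repeating the head.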

Exercises

Product ( name, price, category, maker)

Purchase (buyer, seller, store, product)

Company (name, stock price, country)

Person( name, phone number, city)

Ex #1: Find people who bought telephony products.

Ex #2: Find names of people who bought American products

Ex #3: Find names of people who bought American products and did not buy French products

Ex #4: Find names of people who bought American products and live in Seattle.

Ex #5: Find people who bought stuff from Joe or bought products from a company whose stock price is more than $50.

Transitive Closure

Suppose we are representing a graph by a relation Edge(X,Y):

Edge(a,b), Edge(a,c), Edge(b,d), Edge(c,d), Edge(d,e)

(Figure: the graph on nodes a, b, c, d, e with the edges listed above.)

I want to express the query:

Find all nodes reachable from a.

Recursion in Datalog

Path( X, Y ) :- Edge( X, Y ).

Path( X, Y ) :- Path( X, Z ), Path( Z, Y ).

Semantics: evaluate the rules repeatedly until a fixed point is reached:

Iteration #0: Edge: {(a,b), (a,c), (b,d), (c,d), (d,e)}

Path: {}

Iteration #1: Path: {(a,b), (a,c), (b,d), (c,d), (d,e)}

Iteration #2: Path gets the new tuples:

(a,d), (b,e), (c,e)

Iteration #3: Path gets the new tuple:

(a,e)

Iteration #4: Nothing changes -> We stop.

Note: number of iterations depends on the data. Cannot be anticipated by only looking at the query!
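The iteration above can be reproduced with a naive fixed-point loop. This is a minimal sketch of naive evaluation for the two Path rules, using the Edge tuples from the example; the function name and the iteration counter are choices made for this illustration.

```python
Edge = {("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "e")}

def transitive_closure(edge):
    """Naive evaluation of
         Path(X,Y) :- Edge(X,Y).
         Path(X,Y) :- Path(X,Z), Path(Z,Y).
    Re-applies both rules until Path stops changing."""
    path = set()
    iterations = 0
    while True:
        iterations += 1
        # Apply both rules to the current contents of Path.
        new_path = set(edge) | {
            (x, y) for (x, z) in path for (z2, y) in path if z == z2
        }
        if new_path == path:        # fixed point: nothing changed
            return path, iterations
        path = new_path

path, iterations = transitive_closure(Edge)
reachable_from_a = {y for (x, y) in path if x == "a"}
print(iterations, sorted(reachable_from_a))
```

On this graph the loop stops in the fourth iteration, matching the trace above. On a longer chain of edges it would take more rounds, which is the point of the note: the number of iterations is a property of the data, not of the query.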

Deductive Databases

We distinguish two types of relations in our database:

• Extensional relations (EDB): their extent is stored in the database just like in ordinary relational databases.

• Intensional relations (IDB): their extension is defined by a set of possibly recursive datalog rules.

Intensional relations can either be materialized or computed on demand.

Note: a query and a definition of an intensional predicate look exactly the same (i.e., they’re both datalog programs).

Hard problem: how do we optimize queries in the presence of recursion?

Harder problem: do we really need recursion?

Recursion in SQL-3

Limited forms of recursion are considered important.

Linear recursion: only 1 occurrence of a recursive predicate in the body

Path( X, Y ) :- Edge( X, Y )

Path( X, Y ) :- Edge( X, Z ), Path( Z, Y ).

WITH
    Pairs AS SELECT origin, dest FROM Edge,
    RECURSIVE Path(origin, dest) AS
        Pairs
        UNION
        (SELECT Pairs.origin, Path.dest
         FROM Pairs, Path
         WHERE Pairs.dest = Path.origin)
SELECT * FROM Path;
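A linear recursive query of this kind can be tried in a system that supports recursive common table expressions. The sketch below assumes Python's sqlite3 module and SQLite's WITH RECURSIVE syntax, which differs slightly from the SQL-3 draft syntax shown above: the Pairs view is inlined as a plain SELECT on Edge. The Edge table holds the graph from the transitive closure example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Edge(origin TEXT, dest TEXT)")
conn.executemany(
    "INSERT INTO Edge VALUES (?, ?)",
    [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "e")],
)

# Linear recursion: Path occurs exactly once in the recursive branch.
query = """
WITH RECURSIVE Path(origin, dest) AS (
    SELECT origin, dest FROM Edge
    UNION
    SELECT Edge.origin, Path.dest
    FROM Edge, Path
    WHERE Edge.dest = Path.origin
)
SELECT origin, dest FROM Path ORDER BY origin, dest
"""
paths = conn.execute(query).fetchall()
print(paths)
```

UNION (rather than UNION ALL) removes duplicate tuples, which is also what makes the recursion terminate on graphs that contain cycles.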
