Garcia-Molina, Ullman, and Widom Database Systems Chapter 2 Data Models • A data model is a notation for describing the structure of the data in a database , – along with the constraints on that data. • a day of the week is an integer between 1 and 7 • The data model also normally provides a notation for describing operations on that data : – queries and data modifications. Relational Model • Relations are tables representing information . – The set of tuples for a given relation is an instance of that relation • Columns are headed by attributes ; – each attribute has an associated domain , or data type. • Rows are called tuples , and – a tuple has one component for each attribute of the relation. Schemas • A relation name , together with the attributes of that relation and their types , form the relation schema. – Movies(title, year, length, genre) • A collection of relation schemas forms a database schema. • Particular data for a relation or collection of relations is called an instance of that relation schema or database schema. Keys • An important type of constraint on relations is the assertion that an attribute or set of attributes forms a key for the relation. – known as key constraints • No two tuples of a relation can agree on all attributes of the key , – although they can agree on some of the key attributes. Semistructured Data Model • In this model , data is organized in a tree or graph structure . • XML is an important example of a semistructured data model – Tags similar to those used in HTML define the role played by different pieces of data • much as the column headers do in the relational model SQL • The language SQL is the principal query language for relational database systems . • The current standard is called SQL-99 . • Commercial systems generally vary from this standard but adhere to much of it . Relation implementation • Stored relations, called tables, exist in the database and can be modified by changing its tuples, as well as queried. • Views are relations defined by a computation – these relations are not stored, but are constructed, in whole or in part, when needed • Temporary tables are constructed by SQL during the execution of queries and data modification – after the transaction these are thrown away Data Definition • SQL has statements to declare elements of a database schema . • The CREATE TABLE statement allows us to declare the schema for stored relations – specifying the attributes , their types , default values , and keys. Altering Relation Schemas • We can change parts of the database schema with an ALTER statement . • These changes include – adding and removing attributes from relation schemas and – changing the default value associated with an attribute . • We may also use a DROP statement to completely eliminate relations or other schema elements. Modifying relations examples DROP TABLE R; Relation R is no longer part of the database schema, and its tuples can no longer be accessed ALTER TABLE MovieStar ADD phone CHAR(16); ALTER TABLE MovieStar DROP birthdate; CREATE TABLE MovieStar ( name CHAR(30), address VARCHAR(255), gender CHAR(1), birthdate DATE, PRIMARY KEY (name) ); Algebra • An algebra consists of operators and atomic operands. • In relational algebra, the atomic operands are – Variables that stand for relations – Constants which are finite relations Relational Algebra • This algebra underlies most query languages for the relational model. • Its principal operators are – union , intersection, difference , selection , – projection , Cartesian product , – natural join , theta-join , and – renaming . Section 2.4 Relational algebra and SQL • From the text – Relational algebra is not used today as a query language in commercial DBMS’s, although some of the early prototypes did use this algebra directly. – SQL incorporates relational algebra at its center, and many SQL programs are really “syntactically sugared” expressions of relational algebra. Some SQL CREATE TABLE Product ( maker CHAR(30), model CHAR(10) PRIMARY KEY, type CHAR(15) ); CREATE TABLE PC ( model CHAR(30), speed DECIMAL(4,2), ram INTEGER, hd INTEGER, price DECIMAL(7,2) ); CREATE TABLE Laptop ( model CHAR(30), speed DECIMAL(4,2), ram INTEGER, hd INTEGER, screen DECIMAL(3,1), price DECIMAL(7,2) ); CREATE TABLE Printer ( model CHAR(30), color BOOLEAN, type CHAR (10), price DECIMAL(7,2) ); Selection and Projection • The selection operator produces a result consisting of all tuples of the argument relation that satisfy the selection condition . • Projection removes undesired columns from the argument relation to produce the result Some Data for Product M a k e r M o d e l T y p e A A A A A A B B B B C D D 1001 1002 1003 2004 2005 2006 1004 1005 1006 2007 1007 1008 1009 pc pc pc laptop laptop laptop pc pc pc laptop pc pc pc D D D E E E E E E E E E F F G H H 1010 3004 3005 1011 1012 1013 2001 2002 2003 3001 3002 3003 2008 2009 2010 3006 3007 pc printer printer pc pc pc laptop laptop laptop printer printer printer laptop laptop laptop printer printer Some data for PC model 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 speed 2.66 2.10 1.42 2.80 3.20 3.20 2.20 2.20 2.00 2.80 1.86 2.80 3.06 ram 1024 512 512 1024 512 1024 1024 2048 1024 2048 2048 1024 512 hd 250 250 80 250 250 320 200 250 250 300 160 160 80 price 2114 995 478 649 630 1049 510 770 650 770 959 649 529 What PC models have a speed of at least 3.00? R1 := σspeed ≥ 3.00 (PC) R2 := πmodel(R1) model 1005 1006 1013 Some data for Laptop model 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 speed 2.00 1.73 1.80 2.00 2.16 2.00 1.83 1.60 1.60 2.00 ram 2048 1024 512 512 1024 2048 1024 1024 512 2048 hd 240 80 60 60 120 80 120 100 80 160 screen 20.1 17.0 15.4 13.3 17.0 15.4 13.3 15.4 14.1 15.4 price 3673 949 549 1150 2500 1700 1429 900 680 2300 Which manufacturers make laptops with a hard disk of at least 100GB? R1 := σhd ≥ 100 (Laptop) R2 := Product (R1) R3 := πmaker (R2) maker E A B F G Joins • We join two relations by comparing tuples , one from each relation. • In a natural join , we splice together those pairs of tuples that agree on all attributes common to the two relations. • In a theta-join , pairs of tuples are concatenated if they meet a selection condition associated with the theta-join. Constraints in Relational Algebra • Many common kinds of constraints can be expressed as – the containment of one relational algebra expression in another , or – as the equality of a relational algebra expression to the empty set.