Introduction to relational databases Introduction to the database field: Applications, concepts and terminology Introduction to the relational model FEN 2014-02-02 1 Databases and Database Applications Textual and numerical databases Multimedia Databases GIS (Geographic Information Systems) Data Warehouse Real-time and active databases And many-many more FEN 2014-02-02 2 What is a Database? A database is a collection of related data. A database is a logical coherent collection of data with some inherent meaning. Hence a random collection of data is not a database. A database is designed, built and populated with data for a specific purpose for a specific group of users with requirements for specific applications. A database represents some aspect of the real word, sometimes called the miniworld or domain. FEN 2014-02-02 3 What is a DBMS? Sometimes called a Database Server or Database Engine A DBMS (Database Management System) is a collection of programs that enables the users to create and maintain databases. This involves Defining a database (data types, structures and constraints) and handling meta data. Storing the data on some suitable storage medium and controlling the storage. Manipulating the database (executing queries and updating the database) Sharing the database allowing multiple users and application to access the database simultaneously. Providing interfaces to interactive users and/or application programs. FEN 2014-02-02 Do you know some DBMS? For instance: Oracle MySQL MS SQL Server PostgreSQL and many more 4 What is a Database System? A database system is a database and a database management system put together: Database + DBMS == Database system FEN 2014-02-02 5 Overview Database DBMS Database System FEN 2014-02-02 6 Data Models Database systems are build upon a data model. A data model is a collection of concepts used to describe the structure of a database. A data model provides data abstraction, hides storage details and gives users (and developers) a conceptual view of the database. A data model should provide means to describe: Structure Data types Relationships Constraints Operations FEN 2014-02-02 7 Data Models Legacy (pre-relational) Hierarchical Network Record based Relational Post-relational So the relational model is very central. Object oriented Object-relational Temporal XML Cubes in Data Warehouse ... FEN 2014-02-02 8 Relational Databases All data is organised in tables with atomic values Associations are represented by primary key/foreign key connections Every operation operates on tables and returns tables Small exercise (5 min..): • • Describe step by step how you will retrieve this information: When did Mr Smith attend the “Intro to Computer Science” course, who was the instructor and what grade did he receive? FEN 2014-02-02 9 The Relational Model A sound theoretical data model (Codd, 1970). Based on the mathematical theory of relations, sets and first order predicate logic. De facto standard since the late eighties. Many-many implementations – most SQL-based. The Notorious Supplier-Part Database (Date) For instance: Oracle MySQL MS SQL Server PostgreSQLA FEN 2014-02-02 10 The Relational Model: Concepts The Notorious Supplier-Part Database (Date) Central concepts: Tables (relations). Columns (attributes). Type (domain). Rows (tuples). Tuples are unordered. Tuples are unique. A relation is a set (mathematical) of tuples. Primary and foreign keys FEN 2014-02-02 11 The Relational Model Data is organised in a number of tables (relations). Each table has a number (>=1) columns (attributes). Attributes are atomic and defined over some domain. A table holds a number (maybe none) rows (tuples). Tuples are unordered. Tuples are unique (existence of a key is guaranteed). A relation is a set (mathematical) of tuples. FEN 2014-02-02 12 Attributes and Domains A domain defines the valid value of an attribute. Domains are based on the built-in standard data types (int, char etc.) offered by the DBMS. Theoretically it should be possible to define problem specific domains as CPR numbers, account numbers, IP addresses etc. and complex aggregate (structured) domain as maps, diagrams, pictures, sound bites, video clips etc. More attributes may be defined over the same domain. An attribute may have the value “empty” (not known /not defined for this instance). Empty is notated NULL. FEN 2014-02-02 13 Properties of a Relation Follows from the fact that relations are (mathematically) sets: Tuples must be unique within a relation (hence a primary key always exists) Tuples are unordered (vertically) Attributes are unordered (horizontally) Attribute values are atomic Note the difference to the usual notion of a table FEN 2014-02-02 14 Keys A key is a combination of attributes that is: Unique and Minimal An attribute combination that is unique, but not minimal is called a superkey The set of all attributes will always be a superkey, hence a superkey (and a key) always exists. A relation (table) may have several candidate keys. One these is appointed primary key. Any primary keys here? FEN 2014-02-02 15 Associations Between Relations Is represented by foreign keys. A foreign key is an attribute (combination) that corresponds to an attribute (combination) of the primary key of some other relation. A foreign key references a tuple in another relation and indicates that here is more information about the entity. Foreign key attributes and corresponding primary key attributes must be defined over compatible domains (or even the same domain). Any foreign keys here? FEN 2014-02-02 16 Integrity Constraints Domain constraints Attributes may only hold valid values Entity Integrity Primary key attributes may not hold NULL-values Referential Integrity (foreign key constraint) A foreign key must either be NULL or reference an existing primary key in the other relation Semantic Integrity Constraints depending on the problem domain Any constraints here? FEN 2014-02-02 17 Example: MiniBank Two tables: Customers Accounts Associated: An account must belong to exactly one customer Association Any constraints here? (primary keys – foreign keys) FEN 2014-02-02 18 Example: MiniBank What happens if: We try to insert a customer with an existing custNo? We try to insert an account with a not existing custNo? Let’s try in MS SQL Server FEN 2014-02-02 19 Example: MiniBank Table definitions (schemas): Constraint FEN 2014-02-02 20 Quering a relational database Database Languages: Data Definition DDL Should provide constructs for defining all the previous (as “create table) Data Manipulation DML (queries, insert, delete, update) procedural (How?) nonprocedural (What?) The Relational Algebra is a procedural DML SQL includes a (sort of) nonprocedural DML FEN 2014-02-02 21 The Relational Algebra Data Manipulation in the Relational Model Operates on relations, which are input to the operations is tables and the result is a table Operations Row selection (RESTRICT/SELECT) Column selection (PROJECT) Combining tables (JOIN) Set operations (UNION, INTERSECTION, DIFFERENCE, PRODUCT) More advanced operations (OUTER (LEFT/RIGTH) JOIN) FEN 2014-02-02 22 Relational Algebra - Overview FEN 2014-02-02 23 Example: MiniBank Retrieve information about customer number 3: Row selection on custNo = 3 from Customer Retrieve account number, balance and customer number for accounts with a balance between 1000 and 2000: Row selection on 1000 <= balance and balance <= 2000 from Account Column selection on accNo, balance, custNo Retrieve information about customer Tommy and his accounts: Row selection on name = ‘Tommy’ from Customer Join with Account on custNo FEN 2014-02-02 24