Intelligent Information Systems (MA-INF 3203)
WS 2015/16
Prof. Dr. Rainer Manthey

Wednesday
Lecture: 10:30 – 12:00
Exercises: 12:45 – 14:15

© 2015 Prof. Dr. Rainer Manthey

Chapter 1: Organisation and Motivation

Vita Rainer Manthey

• 1953: Wilhelmshaven
• 1973: Kiel. University of Kiel, Informatics/Mathematics: student (Diploma 1979), research assistant (PhD 1984)
• 1984: München. European Computer-Industry Research Centre (ECRC): researcher/team leader
• 1992: Bonn. University of Bonn: professor

Modules Offered by the IDB Group

IDB (Intelligent Databases) Group:
• Prof. Dr. Rainer Manthey
• PD Dr. Andreas Behrend
• Sahar Vahdati, MSc

Modules:
• WS: Intelligent Information Systems (MA-INF 3203)
• WS+SS: Seminar Selected Topics in Intelligent IS (MA-INF 3210)
• SS: Temporal Information Systems (MA-INF 3302), Intelligent Information Systems II (MA-INF 3104)
• Lab Intelligent Information Systems (MA-INF 3313)

IDB: Perspectives for the Next Three Years

Prof. Dr. Rainer Manthey:
• SS 2016: "sabbatical" semester, i.e., research only, no teaching!
• 28.2.2019: day of retirement
• => 5 semesters of teaching left (after this semester): WS 16/17, SS 17, WS 17/18, SS 18, WS 18/19
• => Supervision of master's theses: 1.10.2016 – 30.9.2018

PD Dr. Andreas Behrend:
• PD/Habilitation: full qualification for any kind of academic teaching (incl. thesis supervision), independent teaching schedule
• Position at Uni Bonn ends on 31.12.2017 (at the latest)
• => At most 3 semesters of teaching left (after this semester): SS 16, WS 16/17, SS 17

Organisation

Weekly Schedule

(Nearly) every Wednesday during this semester:
• 10:30 – 12:00 Lecture
• 12:00 – 12:45 Long break (45 mins)
• 12:45 – 14:15 Exercises

Schedule WS 2015/16

Wednesdays:
• October: 21, 28 (begin of exercises)
• November: 4, 11, 18, 25
• December: 2 (Dies academicus), 9, 16
• Christmas break
• January: 13, 20, 27
• February: 3, 10 (end of exercises)

In total: 13 lectures, 12 exercises.

Exercises and Exams: "Rules of the Game"

• Exercises:
  • In the same room every Wednesday, following the lecture after a 45-minute break, for the entire auditorium, no small groups.
  • Exercises held by Prof. Manthey and/or Mrs. Sahar Vahdati.
  • Goals:
    • To make you fit for the exam!
    • To provide some hands-on experience with theoretically introduced concepts.
  • Participation will not be checked, but is strongly recommended!
  • No prerequisites for admission to the exams!
  • No "homework" to be handed in, but motivation/encouragement for individual activity is provided in the exercises.
  • No individual feedback possible.
• Exams:
  • Written exams on both exam dates (MSc CS: 6 credits, MSc MI: 4 credits)
  • Exam dates to be determined: most likely end of February and end of March
  • Registration in BASIS (MSc CS only): three-week period in December, to be announced
  • Registration using special forms for all others: same period (forms available in the exercises or by download)

IIS Homepage

http://www.iai.uni-bonn.de/III//lehre/vorlesungen/IntelligentIS/WS15/

Slides are available for download.

No Book, just Slides!

There is no textbook that could be recommended for this lecture. The slides serve as a substitute (representing a compromise between a good background presentation and too much text).

A Word of Warning

But: only a small fraction of the attendees will have a chance to get a place in seminars (and labs) or to write a master's thesis in this area!

IIS 2014: 101 participants in the exam!

Background

Information Systems: The DB-centered View

[Diagram: an Information System consisting of external media of communication, application-specific methods, and a Database System]

This is the most commonly agreed view of the concept of an IS in informatics, provided people agree on the meaning of DBS!

Databases and Database Systems

[Diagram: users and application programmes access a Database System, which consists of a DBMS and one or more databases (DB)]

DBMS: Database Management System (many powerful application-independent services: schema management, query optimization, storage management, transaction management, etc.)

IDBS rather than IIS

More accurately, this lecture is concerned with Intelligent Database Systems (IDBS) rather than with Intelligent Information Systems (IIS).

The naming of the module is a matter of convention rather than precision!

Query Languages vs. Programming Languages

[Diagram: an imperative programming language outside the DBMS; inside the DBMS, an interpreter for the declarative query language; the DB contains the data dictionary]

• "Real" DBMS support a separate kind of DB-specific "programming language" for accessing and manipulating data in the DB: the query language.
• In contrast to the external imperative programming languages, a query language is usually a declarative language whose performance is optimised by the DBMS.
• "Programs" of the query language may be stored in the data dictionary within the DB.

Relational Data Model and SQL

• The most widely used data model nowadays is the relational model (introduced around 1970). Relations are the mathematical basis for data represented in tables (rows/columns).
• All relational DBMS support one predominant declarative query language based on logical and algebraic operators: SQL (Structured Query Language).

Background in Relational Databases and SQL: Strictly Necessary!

A good background in relational databases and in SQL is expected from everybody attending this lecture!

SQL will frequently be used during the semester, even though we are going to learn a different relational language!

Material for self-study (in case your background is weak, dated, or missing):
• Extra slides via the IIS homepage
• Cheap and easy tutorials from the Schaum's series

Motivation

Intelligent Database or Intelligent Database System?

[Diagram: a DBS consisting of a DBMS and a DB, with a question mark]

Where is "intelligence" located? In the DB or in the DBMS? Or even outside the DBS?

IDBS: "Intelligent Services" in a DBMS

[Diagram: DBS with DBMS (generic) and DB (specific)]

Certainly required: "intelligent" behaviour of the DBS, i.e., generic (application-independent) services inside the DBMS that are able to "simulate intelligence".

IDBS: "Knowledge" Inside a DB

[Diagram: DBS with DBMS (generic) and DB (specific); the DB contains the DD]

Also certainly required: "knowledge" about the respective application domain in the DD (Data Dictionary).

"Knowledge": rules from the application domain as a basis for drawing intelligent conclusions from stored data.

IDBS: "Traditional" Approach with External System Components

[Diagram: external components (inference system, agent system, expert system; knowledge base, rule base) connected to the DBS (DBMS and DB) by "loose coupling"]

Preferred by many: move "intelligence" and "knowledge" out of the DBS.

IDBS: Our Approach: "In-Database Intelligence"!

[Diagram: DBS with DBMS and DB, "tight coupling"]

Approach favoured by our research group (and thus in this lecture):
• Try to achieve as much "intelligence" as possible using existing DB technology!
• Identify weaknesses of this technology and think about reasonable extensions, without leaving the DB context!

At the Core of IIS: Theory and Practice of Deductive Databases

This approach, which is a special one, explains the drawing on the title slide of this lecture.

Therefore: theory and practice of the established research area of "Deductive Databases" will be at the core of this lecture.

The essence of this area of research can be described as follows: how to analyse data using stored queries (in SQL: views) that serve as declarative analytical programs?

Datalog and SQL

• Research in deductive databases has a history of nearly 40 years (as long as that of SQL), but most of the time it has used a different declarative language (not SQL!), strongly influenced by the logic programming language PROLOG: Datalog.
• Nearly all publications in this area have used Datalog. That is why we will use Datalog during this lecture, too (and you will have to learn it!).
• Many results of DDB research have recently been transferred to the SQL world!
That is why SQL will also appear in various places throughout the lecture.

SQL:
• used in industry and commerce
• supported by many DBMS products
• standardized
• user-friendly ("controlled English")
• rich set of syntactic features

Datalog:
• used in academia only
• just a few academic prototypes
• no standards
• mathematical style
• minimalistic syntax

Datalog vs. SQL: Comparison in a Nutshell

Datalog rules:

    s(X) ← p(X,Y).
    s(X) ← r(Y,X).
    t(X,Y,Z) ← p(X,Y), r(Y,Z).
    w(X) ← s(X), not q(X).

SQL views:

    CREATE VIEW s AS
      (SELECT a FROM p) UNION (SELECT b FROM r);

    CREATE VIEW t AS
      SELECT a, b, c FROM p, r WHERE p.b = r.a;

    CREATE VIEW w AS
      (TABLE s) MINUS (TABLE q);

Datalog Basics on a Single Slide

Facts (the arguments are constants):

    p(1,a).  p(2,b).  p(3,c).
    q(2).  q(5).
    r(a,1).  r(a,2).  r(b,3).

Rules (X, Y, Z are variables; "," denotes conjunction, "not" denotes negation):

    s(X) ← p(X,Y).
    s(X) ← r(Y,X).
    t(X,Y,Z) ← p(X,Y), r(Y,Z).
    w(X) ← s(X), not q(X).

Relation names:
• p, q, r: base relations
• s, t, w: derived relations

Structure of the Course

This is how the lecture will be structured. The number of lectures per chapter might vary slightly in "real life".

1. Organisation and Motivation: 1 lecture
2. Deduction in Datalog and SQL: 3 lectures
3. Semantics of Deductive Databases: 4 lectures
4. Efficient Query Evaluation in DDBs: 4 lectures
5. Perspectives: 1 lecture

[Slide annotation: "relevant for exam"]

Timetable:
• Chapter 1: today
• Chapter 2: Oct/Nov
• Chapter 3: Nov/Dec
• Chapter 4: Jan/Feb
• Chapter 5: Feb

Static Analysis Followed by Dynamic Analysis

• WS: Intelligent Information Systems (MA-INF 3203). Focusses on the foundations of declarative languages and on applying queries and views for analysing static scenarios (i.e., individual DB states).
• SS: Intelligent Information Systems II (MA-INF 3104). Continues IIS by adding the deductive analysis of updates and transactions, thus analysing dynamic scenarios (i.e., sequences of changes of DB states).

Static Analysis: A Motivational Case Study (1)

In the next lecture, we will discuss a typical example of a relational database that provides a lot of opportunities for static analysis of data using stored queries: genealogical databases.

Genealogy is the discipline of exploring family relationships between persons and their ancestors/descendants.

As a case study, we will use a well-known noble family.

Static Analysis: A Motivational Case Study (2)

How to "put a family tree into a (relational) database"?

Static Analysis: A Motivational Case Study (3)

One possible relational format for family trees: just two tables!

Is this a "good" DB schema?

Static Analysis: A Motivational Case Study (4)

• A lot of implicit data about the respective family is "hidden" in the family tree data!
• There is plenty of well-known genealogical terminology, such as the various forms of being relatives or relatives-in-law, each of which can be specified by means of DB queries.
• Applying such declarative specifications to the stored data is a typical example of static analysis of a database.

E.g.: how to specify the concept of being an uncle of somebody in SQL?
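One possible way to approach the uncle question is sketched below, purely as an illustration and not as the schema or solution used in the lecture. It assumes two hypothetical tables, person(name, sex) and parent_of(child, parent), filled with invented sample names, and it reads "uncle" narrowly as a male sibling of a parent (uncles by marriage are left out, and half-siblings count as siblings). In Datalog style, the idea would be sibling(X,Y) ← parent_of(X,P), parent_of(Y,P), X ≠ Y and uncle_of(U,N) ← parent_of(N,P), sibling(U,P), person(U,m). The sketch uses Python's standard sqlite3 module only to make the two SQL views runnable.

```python
import sqlite3


def build_family_db():
    """Create an in-memory DB with a hypothetical two-table family schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Base relations (assumed schema, not taken from the lecture)
        CREATE TABLE person (name TEXT PRIMARY KEY, sex TEXT);
        CREATE TABLE parent_of (child TEXT, parent TEXT);

        -- Derived relation: two different persons sharing at least one parent
        CREATE VIEW sibling AS
            SELECT DISTINCT a.child AS x, b.child AS y
            FROM parent_of a, parent_of b
            WHERE a.parent = b.parent AND a.child <> b.child;

        -- Derived relation: a male sibling of one of the parents
        CREATE VIEW uncle_of AS
            SELECT DISTINCT s.x AS uncle, c.child AS nephew_or_niece
            FROM parent_of c, sibling s, person m
            WHERE s.y = c.parent AND m.name = s.x AND m.sex = 'm';
    """)
    # Invented sample data: gerda and hans have three children;
    # lena (with paul) has a daughter mia, karl has a son nils.
    conn.executemany("INSERT INTO person VALUES (?, ?)", [
        ("gerda", "f"), ("hans", "m"), ("karl", "m"), ("lena", "f"),
        ("otto", "m"), ("paul", "m"), ("mia", "f"), ("nils", "m"),
    ])
    conn.executemany("INSERT INTO parent_of VALUES (?, ?)", [
        ("karl", "gerda"), ("karl", "hans"),
        ("lena", "gerda"), ("lena", "hans"),
        ("otto", "gerda"), ("otto", "hans"),
        ("mia", "lena"), ("mia", "paul"),
        ("nils", "karl"),
    ])
    conn.commit()
    return conn


def uncles(conn):
    """Return all (uncle, nephew_or_niece) pairs derived by the view."""
    return conn.execute(
        "SELECT uncle, nephew_or_niece FROM uncle_of "
        "ORDER BY uncle, nephew_or_niece"
    ).fetchall()


if __name__ == "__main__":
    for row in uncles(build_family_db()):
        print(row)
```

The uncle relationships are never stored; they are derived from the two base tables each time the view is queried, which is the sense in which a stored query acts as a declarative analytical program. A different base schema would need different view definitions, but the layering of one view on top of another would stay the same.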