CS3431 – Database Systems I
Introduction
Instructor: Murali Mani
mmani@cs.wpi.edu
Mani-CS3431 1

What is a Database System?
• Database: a large collection of related data
  - usually too large to fit in computer memory at once
  - usually data is important for the application
  - usually many users may need fast access to data
• Focus: information, rather than computation

Database Applications
Have you ever used a database application?
• E-commerce: books etc. at Amazon, B&N
• Banks: your valuable $$ and ATM transactions
• Airlines: manage flights to get you places
• Universities: manage student enrollment
• GIS (Maps): find restaurants closest to WPI
• WWW (World Wide Web): blogs, wikis, etc.
• Bio-informatics (genome data)
Datasets are increasing in diversity and volume everywhere!

Example Database: Relational
Tabular View of Data: Airline System

Flight
flightNo  start  destination  miles
101       BOS    LAX          3000
102       PVD    LAX          2900

Passenger
pName  ffNumber  DoB   milesEarned
Joe    1001      1980  12000
Mary   1002      1981  11000

FlewIn
flightNo  ffNumber  date
101       1001      Jan 4
102       1002      Jan 5

The tabular view of data is called the Relational Model.

Basic Terminology
• Data Model: a collection of "concepts" used for describing data
• Schema: describes "structures" for a particular application, using the given data model
• Database: collection of actual data that conforms to the given schema
• Database Management System: software that allows us to create, use and maintain a database (conforming to the given model)

Relational Data Models
The relational model of data
• The most widely used model today.
• Main concept: relation, basically a table with rows and columns.
• Every relation has a schema, which describes the columns, or fields.
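To make the terminology concrete, here is a minimal sketch of the airline example in Python, using the standard-library sqlite3 module as a stand-in DBMS. The slides do not name a particular system, so the choice of SQLite and the column types are assumptions for illustration; the table names, column names and rows come from the example above.

```python
import sqlite3

# SQLite (in memory) plays the role of the DBMS. This is an assumption for
# illustration; the slides do not prescribe a specific system.
conn = sqlite3.connect(":memory:")

# Schema: the structure of each relation. Column types are assumed,
# since the slides only list column names.
conn.executescript("""
CREATE TABLE Flight    (flightNo INTEGER, "start" TEXT, destination TEXT, miles INTEGER);
CREATE TABLE Passenger (pName TEXT, ffNumber INTEGER, DoB INTEGER, milesEarned INTEGER);
CREATE TABLE FlewIn    (flightNo INTEGER, ffNumber INTEGER, "date" TEXT);
""")

# Database: the actual data, which must conform to the schema.
conn.executemany("INSERT INTO Flight VALUES (?, ?, ?, ?)",
                 [(101, "BOS", "LAX", 3000), (102, "PVD", "LAX", 2900)])
conn.executemany("INSERT INTO Passenger VALUES (?, ?, ?, ?)",
                 [("Joe", 1001, 1980, 12000), ("Mary", 1002, 1981, 11000)])
conn.executemany("INSERT INTO FlewIn VALUES (?, ?, ?)",
                 [(101, 1001, "Jan 4"), (102, 1002, "Jan 5")])

# Query: which passenger flew which route, and how many miles was it?
rows = conn.execute("""
    SELECT p.pName, f."start", f.destination, f.miles
    FROM FlewIn AS w
    JOIN Passenger AS p ON p.ffNumber = w.ffNumber
    JOIN Flight    AS f ON f.flightNo = w.flightNo
    ORDER BY p.pName
""").fetchall()

for row in rows:
    print(row)
```

The CREATE TABLE statements correspond to the schema, the inserted rows to the database, and the sqlite3 engine to the DBMS. The query combines the three relations through their shared columns (ffNumber and flightNo).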

Levels of Abstraction
• External schema (view): describes how users see the data
• Logical schema: describes the logical structures used
• Physical schema: describes files and indexes
[Figure: View1, View2 and View3 at the top, above the Logical Schema, above the Physical Schema, above the disk]

Levels of Abstraction: Example
• Logical (Conceptual) Schema:
  - Flight, Passenger, FlewIn tables
• Physical Schema:
  - Flight table stored as a sorted file
  - Index on flightNo attribute for Flight relation
• Views (External Schema):
  - NoOfPassengers (flightNo, date, numPassengers)

Why use DBMS, and not files?
• Data independence (robustness under change)
• Efficient access even on huge data sets
• Reduced application development time
• Data integrity: ensures consistency of data even with multiple users
• Recovery from crashes, security, etc.

Who uses databases?
• End users
• DB application programmers
• Database Administrators
  - Database design
  - Security, Authorization
  - Data availability, crash recovery
  - Database tuning (for performance)

Summary: Why study DBMS?
• The need to process large amounts of data is increasing
  - Video, WWW, computer games, geographic information systems (GIS), genome data, digital libraries, etc.
• DB administrators and programmers hold rewarding jobs.
• DBMS research is one of the most exciting areas in Computer Science!

Introductory Material
Sets, Relations and Functions

Sets
• Unordered collection of objects
• Characteristics
  - Unordered
  - No duplicates (no object appears more than once in a set)
• Eg: set of passengers, set of flights
• Recall the main set operations
  - Union, intersection, complement
  - Check subset

Relations
• Given multiple sets A1, A2, …, An, a relation is a set of n-tuples of the form (a11, a12, …, a1n), where a11 is an element of A1, a12 is an element of A2, and so on.
• Eg: suppose the set of courses = {DB1, DB2} and the set of TAs = {Hong, Song}; then a relation between these two sets could be {(DB1, Hong), (DB1, Song), (DB2, Hong)}

Functions
• Given two sets A, B, a function f from A to B is denoted as f: A → B. It maps every value of A to exactly one value of B.
• Eg: consider a function from faculty members to depts: {(Mike Gennert, CS), (Peter Hansen, Humanities)}
• Characteristics
  - A is called the domain
  - B is called the range (also known as the codomain)
  - No value of A can map to multiple B's.

Functions
• Injection (one to one):
  - No 2 values in A map to the same B
  - Eg: set of husbands → set of wives
• Surjection (onto):
  - Every value in B has at least 1 value in A that maps to it
• Bijection:
  - One to one and onto
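
The definitions above can be checked mechanically. The following is a small Python sketch that represents a relation as a set of pairs and tests whether it is a function, an injection or a surjection. The helper names (is_function, is_injective, is_surjective) and the extra department "Math" are hypothetical, introduced only for illustration; the course/TA and faculty/dept data come from the slides.

```python
def is_function(pairs, domain):
    """True if every value in `domain` is mapped to exactly one value."""
    firsts = [a for a, _ in pairs]
    # Every domain value appears, and none appears more than once.
    return set(firsts) == set(domain) and len(firsts) == len(set(firsts))


def is_injective(pairs):
    """True if no two values of A map to the same value of B."""
    seconds = [b for _, b in pairs]
    return len(seconds) == len(set(seconds))


def is_surjective(pairs, codomain):
    """True if every value of B has at least one value of A mapping to it."""
    return set(codomain) <= {b for _, b in pairs}


# Relation between courses and TAs: DB1 is related to two TAs,
# so this relation is not a function from courses to TAs.
courses = {"DB1", "DB2"}
assists = {("DB1", "Hong"), ("DB1", "Song"), ("DB2", "Hong")}
print(is_function(assists, courses))          # False

# Function from faculty members to departments.
faculty = {"Mike Gennert", "Peter Hansen"}
dept_of = {("Mike Gennert", "CS"), ("Peter Hansen", "Humanities")}
print(is_function(dept_of, faculty))          # True
print(is_injective(dept_of))                  # True

# Onto {CS, Humanities}, but not onto a range that also contains the
# hypothetical department "Math", which no faculty member maps to.
print(is_surjective(dept_of, {"CS", "Humanities"}))          # True
print(is_surjective(dept_of, {"CS", "Humanities", "Math"}))  # False
```

A bijection is simply a function for which both is_injective and is_surjective hold, as dept_of does with respect to {CS, Humanities}.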