Tools for Memory: Database Management Systems
Susan D. Urban
School of Computing and Informatics
Department of Computer Science and Engineering
Arizona State University
7/12/2016 CPI 101

What is a Database?
- A database is a structured collection of items that is used for the operation of an enterprise:
  - Financial data
  - Customer information
  - Sales records
  - Maps and directions
  - Medical records
  - Scientific and biological data
  - Historical data
- A database supports:
  - Correctness and integrity of data
  - Integrated retrieval of related data
  - Concurrent user access to shared data
  - Analysis and decision making
  - Data mining and discovery
  - Security of data

Types of Data
- Textual
- Numeric
- Image
- Audio
- Video
- Geographic

What is a Database Management System?
- A database management system (DBMS) is the software that assists in the creation and maintenance of a database as well as the retrieval of data.
- A DBMS includes:
  - A data model and data definition language (DDL) for specifying a database schema (i.e., the types, structures, and constraints on the data to be stored)
  - A database catalog for storing the database schema
  - A data manipulation language (DML) for creating and maintaining the database (i.e., inserting, modifying, and deleting data)
  - A query language for retrieving data
  - A transaction processing system for running application programs
  - Other features to ensure security and integrity of the data

DBMS Architectures
[Diagram: levels of data abstraction, from top to bottom]
- User View 1, User View 2, ..., User View N
- Conceptual Schema
- Implementation Schema
- DBMS
Logical data independence separates the user views from the conceptual schema. Physical data independence separates the conceptual schema from the implementation schema.

Different Types of Database Systems
- Legacy Systems (originated in the 1960s)
  - Hierarchical
  - Network
- Relational Database Systems
  - Models data in the form of tables
  - Accounts for the majority of new database projects
  - A $14 billion industry
- Object-Oriented Database Systems
  - Models data using object-oriented concepts
- Object-Relational Database Systems
  - Extends relational database systems with object-oriented features
- Geographic Database Systems
  - Databases specifically for modeling 3D, geographic data
- Multimedia Database Systems
  - Databases for modeling data in different forms (audio, video, image)

Relational Database Products
- Commercial
  - Oracle 10g
  - IBM DB2
  - Microsoft Access
  - Microsoft SQL Server
  - Informix
  - Sybase
- Open Source
  - MySQL
  - Ingres
  - PostgreSQL
  - Berkeley DB
  - Cloudscape/Derby

Data Models and the Database Design Process
- A database is initially designed by using a conceptual data model.
- A conceptual data model:
  - Is DBMS-independent
  - Provides a more logical way of viewing data
- A conceptual design can then be mapped to DBMS-dependent models.
- The DBMS handles the internal implementation details of the representation.
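
The DBMS components listed under "What is a Database Management System?" (DDL, catalog, DML, query language, transaction processing) can be seen in miniature in the sketch below. It is only an illustration under stated assumptions: it uses SQLite through Python's standard sqlite3 module, which is not one of the products named on these slides, and the emp table, its constraint, and the salary values are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# DDL: define the schema (types, structure, and constraints).
# The CHECK constraint is an assumed integrity rule for this sketch.
conn.execute(
    "CREATE TABLE emp ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " sal REAL CHECK (sal >= 0))"
)

# Catalog: SQLite keeps the schema in its built-in sqlite_master table.
catalog = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()

# DML: insert and modify data. "with conn" commits if the block succeeds.
with conn:
    conn.execute("INSERT INTO emp VALUES (123, 'Joe', 30000)")
    conn.execute("INSERT INTO emp VALUES (124, 'Sue', 29000)")
    conn.execute("UPDATE emp SET sal = 31000 WHERE id = 123")

# Transaction + integrity: a failing statement rolls back the whole block.
try:
    with conn:
        conn.execute("DELETE FROM emp WHERE id = 124")
        conn.execute("INSERT INTO emp VALUES (125, 'Ann', -1)")  # violates CHECK
except sqlite3.IntegrityError:
    pass  # the DELETE is undone together with the rejected INSERT

# Query language: retrieve data declaratively.
rows = conn.execute("SELECT id, name, sal FROM emp ORDER BY id").fetchall()
print(catalog)
print(rows)
```

After the failed transaction, employee 124 is still present, which shows the DBMS protecting the integrity of the data without any extra application code.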

Relationship Between Data Modeling Concepts
[Diagram: classification of database models]
- Implementation Models
  - Legacy Models
    - Hierarchical Model
    - Network Model
  - Relational Model
  - Object-Oriented Model
  - Object-Relational Model
- Conceptual Data Models
  - Entity-Relationship Model
  - Enhanced Entity-Relationship Model
  - Semantic Models
  - Binary Model
  - Functional Model
  - Unified Modeling Language Class Diagrams

Conceptual Modeling with the Entity-Relationship Model
[ER diagram]
- Entity EMP with attributes ID, NAME, SAL
- Entity DEPT with attributes DNUM, DNAME, MGRID
- Entity COURSE with attributes CRSID, CNAME, INST, LENGTH
- Relationship WORKS between EMP and DEPT
- Relationship TAKES between EMP and COURSE, with attribute DATE

The Relational Data Model
- Developed by Ted Codd at IBM in 1970
- Models data in the form of tables
- Tables are based on the theoretical concept of mathematical relations:
  - Set theory
  - First-order predicate logic
- Uses SQL as a declarative query language for data retrieval
  - Describes what you want to retrieve and not how to retrieve it (the DBMS figures out the "how to" part for you)

Relational Database Concepts
A table (relation) is made up of columns (attributes) and rows (tuples):

  ID   Name  Addr  Phone  Age
  123  Joe   Phx   9991   30
  124  Sue   Phx   8888   29
  125  Ann   Mesa  7772   25

- Each column has a specific domain (integer, string, real, etc.)

Relational Database Example
- Course: crsid, cname, inst, length
- Takes: id, crsid, date
- Dept: dnum, dname, mgrid
- Emp: id, name, sal, dnum
Names in bold are primary keys. Names in italics are foreign keys used to define relationships between tables. These relationships help to support meaningful queries over the data.

SQL (Structured Query Language)
Find the name of the department that each employee works in.
  select name, dname
  from emp, dept
  where emp.dnum = dept.dnum

Find the average salary of employees in each department.
  select dnum, avg(sal)
  from emp
  group by dnum
  order by dnum

Data Import and Export
- Data import: getting large amounts of data into the database
- Data export: getting large amounts of data out of the database
- Import and export are often needed to transport data from one database to another.
- XML (Extensible Markup Language) has become a standard for data transport.
- Most current database systems are XML-enabled, providing a way to import and export data using XML and to query XML data types.

XML Example
  <employees>
    <emp>
      <id>123</id>
      <name>Joe</name>
      <sal>30K</sal>
      <dnum>D1</dnum>
    </emp>
    ….
  </employees>
More about this next week.

Distributed and Internet-Scale Database Systems
[Diagram: one query ("Query?") posed over structured data (DB, DB), semistructured data (XML), and unstructured data (…)]
Querying data from different data sources in an integrated manner is a challenging task for the area of information science!
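
The two SQL queries and the XML export idea from the slides above can be tried end to end with a short Python sketch. Several things here are assumptions made for illustration: SQLite (via Python's standard sqlite3 module) stands in for the DBMS, the key declarations are a guess at the intended design, and all rows other than employee 123 (Joe, department D1) are invented. Salaries are stored as plain numbers rather than the "30K" text shown in the XML slide, and an "order by name" is added to the first query only to make the output order predictable.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")

# Schema follows the Dept and Emp tables of the relational example.
# Sample rows are illustrative, apart from employee 123 (Joe, D1).
conn.executescript("""
    CREATE TABLE dept (dnum TEXT PRIMARY KEY, dname TEXT, mgrid INTEGER);
    CREATE TABLE emp  (id INTEGER PRIMARY KEY, name TEXT, sal REAL,
                       dnum TEXT REFERENCES dept(dnum));
    INSERT INTO dept VALUES ('D1', 'Sales', 123), ('D2', 'Research', 125);
    INSERT INTO emp VALUES (123, 'Joe', 30000, 'D1'),
                           (124, 'Sue', 40000, 'D1'),
                           (125, 'Ann', 50000, 'D2');
""")

# Query 1: the name of the department that each employee works in.
works_in = conn.execute(
    "select name, dname from emp, dept "
    "where emp.dnum = dept.dnum order by name"
).fetchall()

# Query 2: the average salary of employees in each department.
avg_sal = conn.execute(
    "select dnum, avg(sal) from emp group by dnum order by dnum"
).fetchall()

# Export: write every emp row as an <emp> element under <employees>.
root = ET.Element("employees")
for row in conn.execute("select id, name, sal, dnum from emp order by id"):
    emp = ET.SubElement(root, "emp")
    for tag, value in zip(("id", "name", "sal", "dnum"), row):
        ET.SubElement(emp, tag).text = str(value)
xml_text = ET.tostring(root, encoding="unicode")

print(works_in)
print(avg_sal)
print(xml_text)
```

Note that neither query says how to find the matching rows. The join condition and the grouping only describe the desired result, and the DBMS chooses how to compute it.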