Introduction to Database Systems Purpose of Database Systems Views of Data Data Models Data Definition Language Data Manipulation Language Transaction Management Storage Management Database Administrator Database Users Overall System Structure 3/22/2016 1 Database Management System (DBMS) Collection of interrelated data Set of programs to access the data DBMS contains information about a particular enterprise DBMS provides an environment that is both convenient and efficient to use. 3/22/2016 2 Purpose of Database Systems Difficulties in conventional file-processing systems: (DBMS attempts to solve) Data redundancy and inconsistency Difficulty in accessing data Data isolation -- multiple files and formats Integrity problems Atomicity of updates Concurrent access by multiple users Security problems 3/22/2016 3 View of Data View level View 1 View 2 …... View n Logical level Physical level 3/22/2016 4 Levels of Abstraction Physical level: describes how a record (eg.: customer) is stored in terms of block#, sector#, byte# etc. Logical level: describes data stored in database, and the relationship among the data: customer { string name; string street; integer city; } View level: Application programs/GUI etc hide the actual representation and present only the required data in a convenient way. Eg: Visual forms, graphical forms. 3/22/2016 5 Instances and Schemas Schema is a description and an instance is a set of data that fits the description. Schema : logical structure of the database (eg. Set of customers and accounts and the relationship between them) Instance : actual content of the database at a particular point in time. One schema may have many instances. Analogy : type and variable in a programming language. 3/22/2016 6 Data Independence Ability to modify a schema definition in one level without affecting a scheme definition in the next higher level. Interfaces between the various levels and components should be well defined so that changes in some parts do not seriously influence others: Two levels of data independence: Physical data independence Logical data independence 3/22/2016 7 Data Models A collection of representations for describing: data, data relationships, data semantics, data constraints Object-based logical models Entity-Relationship (ER) model Object-oriented model Semantic model Functional model Record-based logical models Relational model (e.g. SQL/DS, DB2) Network model Hierarchical Model 3/22/2016 Object relational model 8 Entity-Relationship Model SSN street number balance city name customer depositor account entity attribute relationship 3/22/2016 9 Relational Model Customer Table Name Johnson SSN 192-83-7465 street Alma city Palo Alto account no. A-101 Account Table Account No. A-101 3/22/2016 Balance 500 10 Data Definition Language (DDL) Specification notation for defining database scheme DDL compiler generates a set of tables in a data dictionary Data dictionary contains metadata (data about data) Data storage and definition language - special type of DDL in which storage and access methods used by the dbms are specified. SQL create, define tables 3/22/2016 11 Data Manipulation Language Language for accessing and manipulating the data organized by appropriate data model Two classes of languages: Procedural - user specifies what data is required and how to get those data Non-procedural -- user specifies what data is required without specifying how to get those data Query language (sequential query 3/22/2016 language: sql) : insert data, update data 12 Relational Algebra Tables representing relations Row uniquely identified by a primary key Related tables are associated using foreign keys Special operators are defined and axioms: join, project etc.: relational algebra http://db.grussell.org/section010.html 3/22/2016 13 Transaction Management A transaction is a collection of operations that perform a single logical function in a database application Example: Withdraw $100 from Acct Number A-101 Verify the balance Update account table (& other tables) Deliver money 3/22/2016 14 Transaction Management (contd.) Transaction-management component ensures that the db remains in a correct state despite system failures (e.g. power failures and operating system crashes) and transaction failures. ACID property: Atomicity, Consistency, Isolation, Durability Concurrency-control manager controls the interaction among the concurrent transactions. 3/22/2016 15 Storage Management A storage manager provides the interface between the low-level data and the application programs and queries submitted to the system. The storage manager is responsible for the following tasks: interaction with file-manager efficient storing, retrieving, and updating of data 3/22/2016 16 Overall System Model Naïve users Query processing Application interfaces Object code Application programmers Sophisticated users Application programs query Embedded DML compiler DML compiler DBA Database scheme DDL interpreter Query evaluation engine Transaction manager DBMS System Buffer manager File manager Data files 3/22/2016 indices Statistical data Data dictionary Disk storage 17 Database Administrator Coordinates all the activities of the database system; DBA should have a good understanding of the enterprise’s information resources and needs. DBA’s duties include: Scheme definition Storage structure and access method definition Granting user authority to access the database Specifying integrity constraints Acting as liaison with users Monitoring performance and responding to changes in requirements 3/22/2016 18 Database Users Differentiated by the how they interact with the system Naïve users: invoke pre-determined application with high-level user interface Application programmers; Use DML calls, embedded calls Sophisticated users: managers, decision support systems: request using query languages Specialized users: design and write specialized applications. 3/22/2016 19 Links/support https://wiki.cse.buffalo.edu/services/content/how-use-jdbc-oracle 3/22/2016 20 Simple Database Application A P P L I C A T I O N 3/22/2016 DBMS DB Ex: Access Oracle MySql B.Ramamurthy 21 Multi-Databases A P P L I C A T I O N DBMS 1 DBMS 2 DBMS 3 3/22/2016 B.Ramamurthy DB DB DB 22 Standard Access to DB A P P L I C A T I O N D R I V E R M G R DBMS Driver 1 DBMS 1 DB DBMS Driver 2 DBMS 2 DB DBMS 3 DB DBMS Driver 3 3/22/2016 B.Ramamurthy 23 ODBC Architecture Application Class1 Class2 ODBC Driver Manager DriverType1 DataSource1 3/22/2016 DriverType2 DataSource2 B.Ramamurthy DriverType3 DataSource3 24 Open Database Connectivity (ODBC) Standard ODBC standard is an interface by which application programs can access and process SQL databases in a DBMS-independent manner. It contains: A Data Source that is the database, its associated DBMS, operating system and network platform A DBMS Driver that is supplied by the DBMS vendor or independent software companies A Driver Manager that is supplied by the vendor of the O/S platform where the application is running 3/22/2016 B.Ramamurthy 25 ODBC Interface It is a system independent interface to database environment that requires an ODBC driver to be provided for each database system from which you want to manipulate data. The database driver bridges the differences between your underlying system calls and the ODBC interface functionality. 3/22/2016 B.Ramamurthy 26 An Example Application DriverManager DB2 driver 3/22/2016 MYSQL driver B.Ramamurthy Oracle driver 27 Application in Java Application in Java DriverManager Access driver 3/22/2016 MYSQL driver B.Ramamurthy Oracle driver 28 Java Support for ODBC : JDBC When applications written in Java want to access data sources, they use classes and associated methods provided by Java DBC (JDBC) API. JDBC is specified an an “interface”. An interface in Java can have many “implementations”. So it provides a convenient way to realize many “drivers” 3/22/2016 B.Ramamurthy 29 Java Support for SQL Java supports embedded SQL. Also it provides an JDBC API as a standard way to connect to common relational databases. You need a JDBC:ODBC bridge for using the embedded SQL in Java. Java.sql package and an extensive exception hierarchy. We will examine incorporating this bridge using sample code. 3/22/2016 B.Ramamurthy 30 Data Source Local relational database; Ex: Oracle Remote relational database on a server; Ex: SQLserver On-line information service; Ex: Dow Jones, Customer database 3/22/2016 B.Ramamurthy 31 Data Source and Driver Data source is the data base created using any of the common database applications available. Your system should have the driver for the database you will be using. For example your Windows system should have the MS Access Driver. There are a number of JDBC drivers available. On the IDE you are using you will install the appropriate driver. 3/22/2016 B.Ramamurthy 32 JDBC Components Driver Manager: Loads database drivers, and manages the connection between application & driver. Driver: Translates API calls to operations for a specific data source. Connection: A session between an application and a driver. Statement: A SQL statement to perform a query or an update operation. Metadata: Information about the returned data, driver and the database. Result Set : Logical set of columns and rows returned by executing a statement. 3/22/2016 B.Ramamurthy 33 JDBC Classes Java supports DB facilities by providing classes and interfaces for its components DriverManager class Connection interface (abstract class) Statement interface (to be instantiated with values from the actual SQL statement) ResultSet interface 3/22/2016 B.Ramamurthy 34 Driver Manager Class Provides static, “factory” methods for creating objects implementing the connection interface. Factory methods create objects on demand when a connection is needed to a DB driver, DriverManager does it using it factory methods. 3/22/2016 B.Ramamurthy 35 Connection interface Connection class represents a session with a specific data source. Connection object establishes connection to a data source, allocates statement objects, which define and execute SQL statements. Connection can also get info (metadata) about the data source. 3/22/2016 B.Ramamurthy 36 Statement interface Statement interface is implemented by the connection object. Statement object provides the workspace for SQL query, executing it, and retrieving returned data. SELECT {what} FROM {table name} WHERE {criteria} ORDER BY {field} Queries are embedded as strings in a Statement object. Types: Statement, PreparedStatement, CallableStatement B.Ramamurthy 3/22/2016 37 ResultSet interface Results are returned in the form of an object implementing the ResultSet interface. You may extract individual columns, rows or cell from the ResultSet using the metadata. 3/22/2016 B.Ramamurthy 38 JDBC Application Architecture Application Connection Result Set Statement Driver Manager Driver DataSource 3/22/2016 Driver DataSource B.Ramamurthy Driver DataSource 39 JDBC Programming Steps Import necessary packages; Ex: import java.sql.*; Load JDBC driver(driver should have been installed) Data source and its location should have been registered. Allocate Connection object, Statement object and ResultSet object Execute query using Statement object Retrieve data from ResultSet object Close Connection object. 3/22/2016 B.Ramamurthy 40 Identifying Data Sources It is specified using URL format. <scheme>: <sub_scheme>:<scheme-specificpart> Example(for local source): jdbc:odbc:tech_books Alternatively, for remote connection, jdbc:odbc://bina.cse.buffalo.edu:4333/tech_books 3/22/2016 B.Ramamurthy 41 Summary We learned the inner details of the relational database model We also studied methods for programmatically accessing the data sources from Java language This is just one persistence model. We will revisit data/storage later on in the semester in the “big-data” context. 3/22/2016 42