Database Application Development CS348 Introduction to Database Management Systems Fall 2014 CS348: Database applicaiton development 1 Communicating with a Database Sever • How have you interacted with a database server? – Web interface – Management console that comes with the DBMS – Your custom written program (java/C++) • Each of the above is a software that is written in a procedural language – How does it communicate in SQL with the DBMS CS348: Database application development 2 Outline • Embedded SQL – SQL libraries for programming languages • Call level interfaces – JDBC – ODBC • Stored Procedures – Programming in SQL • Modern developments – Language integrated querying (LINQ) – Use of SQL in procedural languages CS348: Database application development 3 Embedded SQL • Augment a host procedural programming language with functionality to support SQL – Use a pre-compiler to parse SQL constructs – Embed the result generated machine • Example: SQLJ, PRO*C/C++ CS348: Database application development 4 Embedded SQL example #include <stdio.h> EXEC SQL INCLUDE SQLCA; main() { EXEC SQL WHENEVER SQLERROR GOTO error; EXEC SQL CONNECT TO sample; EXEC SQL UPDATE Employee SET salary = 1.1*salary WHERE empno = '000370'; EXEC SQL COMMIT WORK; EXEC SQL CONNECT RESET; return(0); error: printf("update failed, sqlcode = %1d\n",SQLCODE ); EXEC SQL ROLLBACK return(-1); } CS348: Database application development 5 Embedded SQL: Host variables CS348: Database application development 6 Host Variables • The pre-compiler handles most of the grunt work of converting the query into an executable command – E.g., INT in SQL vs. the programming language – What is a NULL? – Note: The actual SQL statement sent could be different than what is in the program because we are not fully aware of what the compiler is doing CS348: Database application development 7 Host variables example EXEC SQL BEGIN DECLARE SECTION; char deptno[4]; char deptname[30]; char mgrno[7]; char admrdept[4]; char location[17]; EXEC SQL END DECLARE SECTION; /* program assigns values to variables */ EXEC SQL INSERT INTO Department(deptno,deptname,mgrno,admrdept,location) VALUES (:deptno,:deptname,:mgrno,:admrdept,:location); CS348: Database application development 8 Embedded SQL: SQLJ import java.sql.*; public class SimpleDemoSQLJ { …//other methods in class public Address getEmployeeAddress(int empno) throws SQLException { Address addr; #sql { SELECT office_addr INTO :addr FROM employees WHERE empnumber = :empno }; return addr; } …//other methods in class } // end of class CS348: Database application development 9 Cursors • If a query returns multiple rows cursors need to be used to retrieve results • A cursor is like a pointer/iterator that refers to some row of the result. At any time, a cursor may be in one of three places – Before first tuple – On a tuple – After last tuple CS348: Database application development 10 Cursor management • 4 steps to using cursors – – – – DECLARE the cursor and associate it with a query OPEN the cursor (conceptually) causing the query to be evaluated FETCH one or more rows from the cursor CLOSE the cursor • Concept is simple but there are many issues to consider between initiating a query and consuming results of a query. Consider: – – – – Delayed Fetching of 1 billion rows from a remote low bandwidth server Aggressive fetching (fetch all) and memory management issues Locking cursors Looping through dataset while creating other cursors CS348: Database application development 11 Embedded SQL Takeaway • Embedded SQL is not extremely popular – Requires a very specific and “intelligent” pre-compiler – DBMS upgrades and improvements do not necessarily get passed to the pre-compiler (optimizations may require a recompile) – Original source code is generic but the final executable is specific to a particular pre-compiler created for a specific database system • Lessons learnt: – Two completely different programming language paradigms have to be “merged” in one way or the other before execution – Cursor management, memory management, latency, locks are challenges that still exist today (at both the database server side and the application side) CS348: Database application development 12 Call level interfaces • Vendor neutral standard of communicating with a database server – Queries do not get pre-compiled and the database server itself takes care of all optimizations – No need to recompile a program in case you have to use a different database server product from a different vendor – Can query different servers at the same time • Most common examples of SQL CLI: – JDBC (Java database Connectivity) – ODBC (Open Database Connectivity) CS348: Database application development 13 Much simpler • Still the same challenges of writing SQL in procedural code – Queries are treated as strings – Results of executing a query (in case of select queries) are treated as containers of data • Often called ResultSet (java) or Dataset (C#) – Syntax of the program does not contain SQL (e.g., the code is fully compliant java/C++ compiler compatible code) • Interface (driver) is intelligent enough to determine data types of returned data and the programmer can focus on application development CS348: Database application development 14 Java Example: Assignment 0 import java.sql.*; import java.util.Properties; public class Test { private static final String CONNECTION_STRING = "jdbc:mysql://127.0.0.1/tpch?user=root&password=cs348&Database=tpch;"; public static void main(String[] args) throws ClassNotFoundException,SQLException { Connection con = DriverManager.getConnection(CONNECTION_STRING); String query = "SELECT COUNT(*) FROM LINEITEM AS CNT"; Statement stmt = con.createStatement(); ResultSet rs = stmt.executeQuery(query); while (rs.next()) { System.out.println(rs.getString(1)); } con.close(); } } CS348: Database application development 15 Example: C++ sql::mysql::MySQL_Driver *driver; sql::Connection *con; sql::Statement *stmt; driver = sql::mysql::get_mysql_driver_instance(); con = driver->connect("tcp://127.0.0.1:3306", "user", "password"); stmt = con->createStatement(); stmt->execute("USE " EXAMPLE_DB); stmt->execute("DROP TABLE IF EXISTS test"); stmt->execute("CREATE TABLE test(id INT, label CHAR(1))"); stmt->execute("INSERT INTO test(id, label) VALUES (1, 'a')"); delete stmt; delete con; /*taken from MySQL samples*/ CS348: Database application development 16 Prepared Statements • Can also create statement templates called “prepared statements” CODE: String updateString = "update " + dbName + ".COFFEES " + "set SALES = ? where COF_NAME = ?"; updateSales = con.prepareStatement(updateString); //…Later on in the program updateSales.setInt(1, 100); updateSales.setString(2, "French_Roast"); updateSales.executeUpdate(); CS348: Database application development 17 Why use prepared statements? • Can pre-compile and optimize the query before execution – Need to do compile and optimize only once! DBMS can optimize and make a query plan ahead of time knowing what kind of query it will be expecting – Typically used where a single query will be executed multiple times with different parameters – Dynamically adjusting query execution plans is expensive so prepared statements prepare the database for upcoming queries. CS348: Database application development 18 Prepared statement benefits • Prepared statements are resilient to SQL injection attacks • Warning: Don’t let users directly query your DB! conn = pool.getConnection( ); String sql = "select * from user where username='" + username +"' and password='" + password + "'"; stmt = conn.createStatement(); rs = stmt.executeQuery(sql); if (rs.next()) { loggedIn = true; System.out.println("Successfully logged in"); } else { System.out.println("Username/password not recognized"); } CS348: Database application development 19 SQL Injection • Directly inserting user input into SQL queries is dangerous! SELECT * FROM USER WHERE USERNAME = 'X' AND PASSWORD = 'Y' -- What if a hacker enters username = admin' OR '1'='1 -- Any password will work! SELECT * FROM USER WHERE USERNAME = 'admin' OR '1'='1' AND PASSWORD = 'Y' CS348: Database application development 20 Stored procedures • Yet another approach to client-server programming • How it works – The programmer/application passes only the parameters to the SQL server and let it do everything else – SQL on the server itself is sufficient to do everything CS348: Database application development 21 Stored procedures • Stored procedures (SP) are: – Functions, procedures, and routines within the database server that can be invoked by applications – SPs are written in SQL and are typically long sequences of SQL code that many applications can use • Avoid rewrite of SQL code across several applications by merging them into SPs – Low maintenance cost, hiding schema for programmers and providing another interface for them to use • Why should a programmer have to write SQL code in an application that may need maintenance if the table structure changes – Why not Object.savetodatabase(); CS348: Database application development 22 SP Example (MySQL) CREATE PROCEDURE GetOfficeByCountry(IN countryName VARCHAR(255)) BEGIN SELECT * FROM offices WHERE country = countryName; END CS348: Database application development 23 Executing SPs (MySQL) Using SQL interface: CALL GetOfficeByCountry('Canada') Using application: stmt.executeQuery("CALL GetOfficeByCountry('Canada')"); CS348: Database application development 24 Programming in SPs (MySQL) CREATE FUNCTION IncomeLevel ( monthly_value INT ) RETURNS varchar(20) BEGIN DECLARE income_level varchar(20); IF monthly_value <= 4000 THEN SET income_level = 'Low Income'; ELSEIF monthly_value > 4000 AND monthly_value <= 7000 THEN SET income_level = 'Avg Income'; ELSE SET income_level = 'High Income'; END IF; RETURN income_level; END; -- Call this procedure after creating it CALL IncomeLevel(40000) CS348: Database application development 25 Stored procedures • Of course SPs can be “called” in other SPs (recursively) • Depending on what an SP returns we can write SELECT statements that can utilize the return parameters • Other lesser powerful variants such as “functions” also exist in most DBMS CS348: Database application development 26 Stored procedures • Programming language of stored procedures is not widely standardized but is very similar – PL/SQL (Oracle), Transact-SQL (Microsoft), SQL/PSM (MySQL), SQL PL (DB2), even java is used in some DBMS for writing stored procedures • In addition many DBMS allow you to execute your own externally written program (i.e., external call to a program written in C/C++) – Can be slow because of overhead but used in many situations where external software interacting with a server is required! CS348: Database application development 27 Stored procedures • Lots of benefits – Keep SQL code stored in the SQL server – Programmers only pass parameters and retrieve datasets – Maintenance and upgrading all pieces of software that rely on a common database is easy to do because we have eliminated querying in applications altogether (i.e., no SELECT statement in application necessary, only CALL) CS348: Database application development 28 LINQ • Relatively new, future unknown • Why not take the best of SQL and integrate it within a programming language • C# Example: Let us say we have an array of objects and instead of looping through the array a programmer could write a declarative (SQL-like) query against the array Car[] carList = ... var carMake = from item in carList where item.Model == "bmw" select item.Make; CS348: Database application development 29 LINQ Example (C#) class IntroToLINQ { static void Main() { // The Three Parts of a LINQ Query: // 1. Data source. int[] numbers = new int[7] { 0, 1, 2, 3, 4, 5, 6 }; // 2. Query creation. // numQuery is an IEnumerable<int> var numQuery = from num in numbers where (num % 2) == 0 select num; // 3. Query execution. foreach (int num in numQuery) { Console.Write("{0,1} ", num); } } } CS348: Database application development 30 Why LINQ • Procedural code is long (expensive) and full of bugs • Let the programmer declare what he wants instead of writing a long procedure to extract it • Code can become difficult to read for non-expert programmers and LINQ may not catch on with non-DB progammers CS348: Database application development 31