Introduction to Databases: DB Review & ADO.NET Models – Connected and Disconnected Introduction to Databases A database is a collection of data related to a particular topic. Databases can range from a few kilobytes to several terabytes and beyond. The Integrated Student Information System (ISIS) used at Uludag University is a large database. Google’s database is an example of a massive database. NSA has a massive database called a data warehouse that collects information from a variety of other databases. Social Security has massive data base – imagine what is stored here A Database Management System (DBMS) is software that is used to create, access and maintain a database. Microsoft Access Microsoft SQL Server Oracle Introduction to Databases A relational database stores data and the connections (relationships) within the data. A relational database uses tables to store data and manipulate it. A table can be visualized as a 2-dimensional grid that contains rows and columns. Each table contains one or more records (rows) that contain information about a single entry. It uses a relational data model. Records contain a set of related fields that are used to store data. Each record contains one or more fields (columns) that contain a single piece of data. Introduction to Databases A table represents an entity in a problem domain. For example, in the integrated system information system ISIS, a table may represent a student entity or a course entity A record is an instance of an entity. In ISIS, a specific student record (you, for example) would be an instance of the student entity (table). Also, a specific course “CIS3309 Section 001” would be an instance of the course entity. Database Tables are models of application domain entities – those “things” that need to be manipulated in our system These things are no different in concept from the entities we model in VB Only here we use classes for our models rather than DB Tables Introduction to Databases Most tables include a primary key, which is a field that uniquely identifies each record in a table. Storing student records by name in a large university will make it difficult to find students with the same name. Giving each student a StudentID uniquely identifies each student. SSN uniquely identify each American in the U.S. The tables in a relational database are related to each other through their key columns. A column that identifies a related row in another table is called a foreign key. Illustration – Primary and Foreign Keys Product Table Intro to Databases – Naming Conventions A database named Employees with one table named tblEmployees containing the following fields: fldEmployeeID – (number) unique employee id number fldFirstName and fldLastName – (strings) employee's first and last name, respectively fldTelephone – (string or number) the employee's telephone number fldDateHired – (date value) stores date that the employee was hired fldWage – (number) contains the hourly wage for the employee fldDeductions – (number) contains the number of tax deductions claimed by the employee fldNotes is a string containing notes pertaining to the employee fldStatus – (Boolean) If True the employee is still employed. If False the employee is a former employee. fldType – (string) employee (valid values: fulltime and parttime) Populated Database Table Naming Database Tables and Fields Standard (Hungarian) prefixes are commonly used to name database tables and fields The prefix "tbl" denotes a table The prefix "fld" denotes a field Relational Databases Data stored in one table can be related to data stored in another table in a few ways: One-To-One Relationship One-To-Many Relationship. Many-To-Many Relationship. One-To-Many Relationship means a record in one table is related to many records in another table via primary and foreign keys. For example, a course record in the Course table for Temple university can relate to many students in the Student table. Relational Databases Many-To-Many Relationship means a record in Table A can have many records in Table B, and vice versa. It can be thought of as two separate one-to-many relationships. For example, a writer can write many books, and a book can have many writers. Thus, a Book table can contain a record that relates to an Author table by having multiple authors for a book, and an author record in the Author table can relate to the Book table by having a author that wrote many books. Student-course example? What do we do to many-many relationships? One-many. Many-one. One-To-One Relationship means a record in one table is related to one and only one record in another table. For example, an employee has a single resume, and each resume only belongs to one employee. Introduction to SQL Structured Query language (SQL) is an English – like language that is used to manage and manipulate data in the database. There are 4 basic SQL statements that are needed to work with data in a database. SELECT – used to query a table and find specific information based on some criteria. It selects one or more rows (records) from a table. INSERT – used to add one or more rows (records) to a table. UPDATE – used for modifying data in one or more rows (records) in a table. DELETE – used to delete one or more rows (records) in a table. SQL: Select Statement The result of a Select statement is called a result set, or result table, which is a logical set of rows that consists of all of the columns and rows requested by the Select statement. Syntax: SELECT column_name [, column_name2]... FROM table_name [WHERE selection-criteria] [ORDER BY column_name [ASC|DESC] [, column_name2[ASC|DESC]]...] SQL: Select Statement Example Example: SELECT ProductID, Name, UnitPrice FROM Products WHERE CategoryID = “props” ORDER BY UnitPrice DESC The asterisk (*) can be used to select all fields of a table. Example: SELECT * FROM Products Result Set shown above. Click button to view full table. SQL: Insert Statement The Insert statement is used to add a new row (record) to a table. It can be used to set fields to specified values as it adds the new record. Syntax: INSERT [INTO] table_name [(columns_list)] VALUES (values_list) The order of the items in columns list must match the order of the items in the values list. SQL: Insert Statement Example Example: INSERT INTO Products (ProductID, ShortDescription, UnitPrice) VALUES (“p101”, “Toy”, “19.99”) SQL: Update Statement The Update statement is used to modify/edit existing data within a table. Syntax: UPDATE table_name SET column_name = value [, column_name2 = value 2]... WHERE selection-criteria Syntax: UPDATE Products SET UnitPrice = “25.79”, OnHand = “3” WHERE ProductID=“arm01” SQL: Delete Statement The Delete statement is used to remove a row or rows from a table. Syntax: DELETE FROM table_name WHERE selection-criteria Example: DELETE FROM Products WHERE ProductID = “arm01” SELECT Example Table used in SQL Examples: Return to last viewed slide Introduction to DB Access in VS - ADO.NET The ActiveX Data Objects (ADO.NET) is part of the .NET framework that provides a multi-layered set of data access namespaces (set of classes), which allow us to access and modify data stored in a database. ADO.NET is the primary data access API for the .NET framework. It is a framework inside a framework. This framework’s responsibility is to allow our programs to work with databases by providing the abstraction against which we can program ADO.NET supports many different database platforms ADO.NET supports both a connected and disconnected data architecture. ADO.NET Libraries ADO.NET supports multiple database platforms SqlClient provides support for Microsoft SQL Server 7.0 and later. Import System.Data.SqlClient OracleClient provides support for Oracle 8.1.7 and later. Import System.Data.OracleClient OleDb (Object Linking & Embedding) provides support for conecting to OleDb sources like SQL server, Oracle, and Jet 4.0 (Access). Import System.Data.OleDb ODBC (Object Database Connection) provides support for connection to ODBC source. Only used when you are not connecting to an OleDb source. Import System.Data.ODBC The ADO.Net Platforms Microsoft JET Database Engine (Jet) is a database engine that is the main component for storing, parsing, optimizes, and processes queries. Jet stands for Joint Engine Technology and it is now part of the Windows operating system. The database engine used for Microsoft Access DBMS 2003 and earlier. It is a File-Server System DBMS where the work is done on the individual computer where the Access DBMS exists. Microsoft SQL Server is a client-server DBMS in which the client requests information from the database and the server processes the request. Provides functionalities beyond those available through Jet. Improved multi-user access and transaction-based processing. The ADO.NET Platforms - 2 Microsoft ACE Database Engine (Ace) is a database engine for the Access DBMS 2007 and later. Newest database engine for working with Access “.accdb” files. Oracle is another client-server DBMS used for large databases. Visual Studio & .NET have tools for working with all these database management systems. These tools are part of the sub-namespaces of the System.Data namespace More on ADO.NET ADO.NET has three main abstractions: The ADO.NET DataProviders are components (namespaces containing classes) specifically designed for working with a particular database platform. There is a namespace (a set of classes) for working with Access, SQL Server, one set specifically for Oracle, etc… The ADO.NET DataSet is a component used for working with data separate from the data source. DataProviders DataSets DataReaders An integral part of the disconnected data architecture. The ADO.NET DataReader is an abstraction that is the main feature for working in the connected mode of DB processing ADO.NET Components The DataAdapter is a component that is used to load data into the DataTable of a DataSet. Data Adapter’s major function is to manage the exchange of data between the database and the DataSet. The Command component is used to issue SQL commands for the DataAdapter to execute while connected to a database. The Connection component is used to establish a connection to a database for the DataAdapter to use. The DataReader component is used for read-only, forward-only stream of data coming from a database. This component is part of a connected data architecture. Introduction to Database Connections Object Linking & Embedding Database (OLE DB) connectivity system based on Active X technology. In .NET, ODBC and OLE DB are replaced by “providers” called Data Providers ActiveX is Microsoft technology used for developing reusable object oriented software components. Libraries of code provide all necessary communication between your code and database systems. Providers are an integral part of ADO.NET. Providers exist for the main database systems such as SQL Server, Access, and Oracle. For this course, we will be working in the provider world mostly with SQL Server or Access providers. ADO.NET Some Background A couple of important “connection systems” preceded the “provider” system that is now used with ADO.NET. Open Database Connectivity (ODBC) is a standard software API (interface) implemented by Microsoft. A common system that client software used as far back as the early 1990s to connect to databases and other simpler data sources. Client code didn’t have to deal with the specifics needed to connect to and communication with a database. This was all included in the ODBC driver, which allowed ODBCcompliant programs to access a data source. ODBC provided an abstraction through which a physical DB could be accessed (more on this to come) Disconnected vs. Connected The major advantage of the disconnected approach is that it improves system performance due to the fact that it uses less resources for maintaining connections when working with a data source. However, there is a disadvantage to this approach that occurs when two or more users try to update the same row of a table, which is called a concurrency problem. Database server User 1 DBMS User 2 Products data table Products table Products data table Disconnected vs. Connected ADO.NET contains two different approaches to work with data in a database: Connected Data Architecture and the Disconnected Data Architecture. Connected Data Architecture: Represents the objects that insist on having an open connection available for them to work and interact with the data source. ADO.NET provides classes to communicate directly with a data source. Disconnected Data Architecture: Represents the objects that open and close a connection to the data source as needed. Your application works with data that was copied from the data source. Any changes made to the “copied” data in the Dataset will be later reconciled with the data source. Connected applications alone don't fulfill the demands of today’s distributed applications. Improved performance, more efficient and more complex than the connected approach. Disconnected vs. Connected Concurrency problems occur because once data is retrieved from the database, the connection is dropped. As a result, the DBMS can’t manage the update process. There are two ways that ADO.NET can handle concurrency problems: 1. Optimistic Concurrency – The program checks to see if there has been a change to the row since it was retrieved. If it has, the update or deletion will be refused and a concurrency exception will be thrown that your program must handle. 2. “Last In Wins” – Since no checking is done, the row that was updated by the last user will overwrite changes made to the row by the previous user. Another way to avoid concurrency problems is to retrieve and update only one row at a time. ADO.NET Explained ADO.NET Components for Disconnect Approach: Dataset .NET data provider Data table Data adapter Command Database server Connection Database ADO.NET Explained ADO.NET Components for the Connected Approach: .NET data provider Data reader Select command Database server Connection Insert, Delete, or Update command Database Steps to Working with ADO.NET First, a database connection is established Then the database is opened SQL commands are sent over the open connection ADO.NET builds an in-memory representation of the returned data from any SELECT command and changes are made to this representation When processing is complete, the database connection is closed It is suggested that you use the Open and Close methods to explicitly open and close your connections. In this way, you keep your connections open only as long as needed to complete the required processing It then closes the database until the next processing is to be done Processing Database Elements All elements of a data base, tables (entities), rows (sets of values), columns (database fields), and individual cells (database values) are represented as abstractions in .NET In other words, every table, row, and column are represented as instances of the abstractions (classes) that are part of the ADO.NET support structure We will see how all this fits together later An Underlying Framework – The Connection Keeping a mental picture of this process in your head is important It all begins with the connection Already visited this idea for files (Chapter 10). Now we revisit it a third time for databases Remember … In your programs, you manipulate objects using methods belonging to the class that defines the type of the object The class (an abstract data type) is an abstraction (model) of the object being manipulated The Object Framework 1 Recall, too … We must have a class to model the entity to be manipulated The class is said to define an “abstract data type” We can then create objects (instances of the class we just created) Next comes the tricky part … We now have to connect that object to the actual entity to be drawn on The Object Framework 2 For databases, the object is an instance of a class that provides an abstract view of the database including an abstraction of database tables, rows, and columns All of these database features are modeled using ADO.NET classes So we program against elements of the ADO.NET classes (see the first slide on ADO.NET) without concern for the underlying structures The Object Framework 3 – The Connection The database abstraction (methods and data stores) that you program against (DB commands, internal datasets and data tables) The connection – via creation of a database connection object The actual, physical database to be manipulated ‘Example - Connection to a Microsoft Access Database Dim myConnectionObj As New OleDbConnection(ConnectString) ‘Example - Connection to SQL Server Database Dim myConnectionSqlObj As New SqlConnection(SqlConnectString) Remember … Why do this – what is this all about? The abstraction enables us to manipulate database objects in a uniform, consistent way, regardless of the DBMS (Access, MySQL, SQL Server, Oracle) The underlying DBMS is hidden under several layers of ADO.NET software which do all the work Connection Strings – more details 1 All we need to do is connect the program’s ADO.NET database object to the underlying database to be used Examples: From Dr. Lefkovitz ‘SQL Server Connection Dim SqlConnectString As String = _ "server=dwarf.cis.temple.edu;Database=fa06_c342132;” & _ "User id=fa06_c342132;Password=jimbo“ Dim myConnectionSql As New SqlConnection(SqlConnectString) ‘Access Connection Dim ConnectString As String = "provider=Microsoft.Jet.OLEDB.4.0;” _ “Data Source=c:\users1\dl\Inventory.mdb“ Dim myConnection As New OleDbConnection(ConnectString) Programmer layer – build commands (SELECT, INSERT, DELETE, and UPDATE) for execution against the database Logical layer – uses a DataAdapter to send commands from your software (across the data adapter) to the database getting back tables to be processed (we will visit this in a later section of the Chapter 11 Lectures) Physical layer – manages the actual (physical) execution of the commands on the database Connection Strings – more detail 2 Documentation for Visual Studio contains includes an article entitled “Working with Connection Strings” which you can find by using on-line help index “connection strings [ADO.NET]” You can also go to http://www.connectionstrings.com to locate lots more information about connection strings Connection Strings – more detail 3 A connection string … A connection string consists of key-value pairs An equals sign separates a key and value A semicolon separates a key-value pair Connection strings differ between database providers Example: Dim ConnectionString As String ConnectionString = _ "Provider=Microsoft.Jet.OLEDB.4.0;" & _ "Password=""""; User ID=Admin;" & _ "Data Source= |DataDirectory|\Employees.mdb" The OleDbConnection Class A connection is represented by the System.Data.OleDb.OleDbConnection class Properties The ConnectionString property contains a string that determines how ADO.NET will establish the database connection. One of the benefits of ADO.NET is that if a program is migrated to use a different database, all of the other code remains intact. Only the ConnectionString needs to be changed. The ConnectionTimeout specifies the number of seconds that can elapse to establish a connection. The DataSource property gets the location and file name of the database. This file name is embedded into the ConnectionString. The Provider property defines the database provider. This information is also embedded into the ConnectionString property. The State property gets the state of the connection. A Few Methods The Open and Close methods open and close the connection, respectively. An Event The StateChanged event fires when the connection state changes. Database Processing Sequence Establish a connection to your data source using the Connection object. Create an SQL command statement either using a string or the Command object. Execute the Command object within the context of the connected database. Process the results of the commands by retrieving/storing the data when necessary. Use the DataReader to read the records. Use the DataAdapter and DataSet for storage of data retrieved from the database, and for use in your program. Close the connection and release resources utilized by all the objects. ADO.NET Terms Summary DataSet: DataAdapter: Executes commands (SQL statements) against the data source. Data Providers: Connects to the data source. Command: Manages the exchange of data between the Dataset and a database. Allowing a DataSet and DataTable to be filled from the data source and later reconciled with the data source. Connection: Data in the Dataset is independent of the database that was used to retrieve it. The ADO.NET classes responsible for working directly with a specific database. Data Reader Retrieves query results in a read-only, forward-only connected result set. ADO.NET Resources Connection Strings: OleDbConnection Class: http://msdn.microsoft.com/en-us/library/system.data.oledb.oledbcommand.aspx OleDbDataAdapter Class: http://msdn.microsoft.com/enus/library/system.data.oledb.oledbconnection.aspx OleDbCommand Class: http://www.connectionstrings.com http://msdn.microsoft.com/enus/library/system.data.oledb.oledbdataadapter.aspx OleDbDataReader Class: http://msdn.microsoft.com/enus/library/system.data.oledb.oledbdatareader.aspx