Introduction to Database Systems Objectives: •Introduction to Database Concepts •Introduce Microsoft Access •Tables, Queries, Forms, Reports etc •Creating a simple database table •Indexing CS&E 1111 AcIntro Data, Information, & Database Data : A collection of numbers and text : 273448926 Information : Meaning derived from data SSN # 273-44-8926 Database: A collection of related data stored in a specific format to simplify retrieval. The phone book and dictionary are examples of databases. John Smith’s SSN # 273-44-8926 DBMS – Database Management Systems are programs that help in the storage and retrieval of information from databases. (Access, Oracle) CS&E 1111 AcIntro DBMS Vs. Store and retrieve data Provide support for data “organizing” and selecting Can do simple calculations Efficient data handling Customers of VISA with balance > 1000 Spreadsheets Store and analyze data Provide support for complex calculations Can do simple data organizing and selecting Inefficient data handling What was the net profit of VISA CS&E 1111 AcIntro Access DBMS Objects CS&E 1111 AcIntro Access DBMS “objects”: Tables - a list of data organized into fields and records Queries - question structures to sort, filter and select specific information Forms - structures for screen views of data Reports - structures for written output of data Program Modules - program code to perform specific actions CS&E 1111 AcIntro A Database Management System Data Tables Relationships Data Input: Onto Tables Onto Forms From files Reports DBMS Query Results Queries Program Modules Output Files CS&E 1111 AcIntro An Inventory Database System Forms: Order Transactions New Vendors & New Customers Shipments Program Modules Tables: Current Inventory New Products Vendor List Orders Order Details Customer Accounts Program Modules Daily Ship List Reports & Queries Customer Invoices Accounts Payable Inventory Low Message Output File: CS&E 1111 AcIntro “Objects”: Tables Data is stored in objects called Tables • Data on tables are listed in rows called records •A record consists of one or more ordered categories called fields •Field types include Text. Number, Currency, Date etc. Certain fields are required fields and must be filled • Field Properties – format, validation rule, required, size, masks, defaults, ranges A table is a listing of multiple records, all records in a table have the same fields CS&E 1111 AcIntro Primary Key A field, or combination of fields, which uniquely identifies a record in a database CS&E 1111 AcIntro Table Datasheet View: Bank Customers Customers: Table acct# 456 543 544 999 1112 2456 3064 3575 4456 5323 ssn 123456789 139555002 157745969 123456789 178301771 201529842 227776183 257436001 290951508 328824082 Primary Key lname Sommer Suyama Lebihan Sommer Berglund Trujillo Moos Citeaux Callahan Moreno fname Martín Michael Laurence Martín Christina Ana Hanna Frédérique Laura Antonio Field address 5400 E. Broad ST 4110 Old Redmond Rd. 342 S. Drexel 5400 E. Broad ST 707 Oxford Rd. 7 Houndstooth Rd. 29 King's Way 3400 - 8th Avenue Suite 210 Coventry House Miner Rd. 49 Gilbert St. city Reynoldsburg Worthington Bexley Reynoldsburg Columbus Marysville Groveport Columbus Dublin Cleveland state OH OH OH OH OH OH OH OH OH OH zipcode 43213 43201 43209 43213 43222 42001 43287 43210 43255 40032 Record CS&E 1111 AcIntro A Second Table - “Transactions” keeps track of all deposits and withdrawals • What’s the Primary key on this table? Account# 456 544 2456 3064 4456 5323 5566 456 544 1112 4456 date 4/1/2012 4/1/2012 4/2/2012 4/2/2012 4/3/2012 4/3/2012 4/4/2012 4/9/2012 4/9/2012 4/11/2012 4/12/2012 transaction$ $1,231.39 $3,366.82 $2,301.68 $5,087.66 $12,838.42 $167.80 $641.92 ($1,186.26) ($3,243.43) $159.66 ($12,367.91) CS&E 1111 AcIntro Walkthrough: Setting up a Table using Design View CS&E 1111 AcIntro Divide Tables into Inseparable Fields Address as 1 field – 17 Main St. New York, New York 10002 Address as 4 fields Street Address - 17 Main St. City - New York State - New York Zip code – 10002 If you wanted to filter the table to get a list of only New York City residents how would this work with these two different Table structures CS&E 1111 AcIntro Defining Properties for each Field in a Table For a person’s social security number use: What field type? Text, Number - Short Integer, Number- Long Integer etc. Should it be optional or required? Does the value need to be within certain limits or from a predefined list? Is there a default value? Would an input mask be appropriate CS&E 1111 AcIntro Memory and Field Size Why adjust the field size for Social security number? Text: Memo: Up to 255 characters Up to 65,535 characters Numbers: Integer - 2 bytes - 16 bits Long Integer - 4 bytes - 32 bits (*) Byte (character) - 1 byte Single - 4 bytes - -precision 7 Double - 8 bytes - -precision 15 CS&E 1111 AcIntro How should you decide what information goes on which table? If a fact appears in more than one record of a table, then this fact should probably be defined in another table. Each fact change should change in only one place Example: Address Calculations shouldn’t be part of the database Example: Account number Example: Current Balance Select a Primary Key where applicable so you can relate your tables Example: Account number CS&E 1111 AcIntro Tools: Sorting and Filtering Sorting - allows the user to temporarily order the records by a specific field Ascending or Descending order Single or multiple sort fields Filtering - allows the user to view only specific records that meet the criteria Filter by form or filter by selection Specify a single criteria or use Boolean and/or for multiple criteria in multiple fields CS&E 1111 AcIntro Storing and Retrieving Records from Tables How are DBMS systems designed to efficiently handle data? Data is stored on magnetic or optical disk in a linear fashion To retrieve a specific record one would have to search them one at a time until the desired record is found. To make data retrieval more efficient one can “index” a table based on a specific field. Search routines could then be used on that field to more efficiently find the record Example: Alphabetically sorting a dictionary and then analyzing the first letter of the criteria in the sorted list. CS&E 1111 AcIntro Linear Search on Un-indexed Field Find the name of student for ss#606147775 ssn 606147775 178301771 988544806 290951508 257436001 874775203 419975418 371616449 536370683 774094122 157745969 474621100 227776183 328824082 684995818 123456789 139555002 201529842 lname Anders Berglund Buchanan Callahan Citeaux Davolio Dodsworth Fuller Hardy King Lebihan Leverling Moos Moreno Peacock Sommer Suyama Trujillo fname Maria Christina Steven Laura Frédérique Nancy Anne Andrew Thomas Robert Laurence Janet Hanna Antonio Margaret Martín Michael Ana •Go to the first record to see if 178301771 matches 606147775. If no match then check the next record until a match is found. •Here it must check 11 items before you find the correct one. On average it will have to check #entries/2 CS&E 1111 AcIntro A Binary Search on an Indexed Field Find name of student for ss#606147775 This table is sorted by SSN ssn 123456789 139555002 157745969 178301771 201529842 227776183 257436001 290951508 328824082 371616449 419975418 474621100 536370683 606147775 684995818 774094122 874775203 988544806 lname Sommer Suyama Lebihan Berglund Trujillo Moos Citeaux Callahan Moreno Fuller Dodsworth Leverling Hardy Anders Peacock King Davolio Buchanan fname Martín Michael Laurence Christina Ana Hanna Frédérique Laura Antonio Andrew Anne Janet Thomas Maria Margaret Robert Nancy Steven 1. First go to the middle record of the table and compare values to see if 606147775>328824082. 2. If it is greater, continue checking only from this midpoint to the end. Otherwise continue checking only from the beginning to the midpoint 3. Then go to the midpoint of this subsection. The process continues until a match is found Indexing can aid in querying efficiency using algorithms such as binary search (1/2,1/2 again) vs. a linear search (1 at time from top to bottom). A binary search cuts the processing time significantly for large databases CS&E 1111 AcIntro Indexing vs. Sorting Sorting your table & saving permanently changes the record order.. Use an index instead. ssn#606147775 is associated with record 14 ssn 123456789 139555002 157745969 178301771 201529842 227776183 257436001 290951508 328824082 371616449 419975418 474621100 536370683 606147775 684995818 774094122 874775203 988544806 lname Sommer Suyama Lebihan Berglund Trujillo Moos Citeaux Callahan Moreno Fuller Dodsworth Leverling Hardy Anders Peacock King Davolio Buchanan fname Martín Michael Laurence Christina Ana Hanna Frédérique Laura Antonio Andrew Anne Janet Thomas Maria Margaret Robert Nancy Steven CS&E 1111 AcIntro From Tables to a Relational Database We have carefully looked at tables in our database how to set them up, define fields etc. The real advantages of a DBMS however are in their ability to relate information. Thus far we have customer names on one list and transactions on a separate list. How do we match a customer’s name to specific transaction? We need a way to relate these two tables to extract useful information. We can relate these two tables by matching the account numbers. CS&E 1111 AcIntro One to Many Relationship Customers: Table acct# 456 543 544 999 1112 2456 ssn 123456789 139555002 157745969 123456789 178301771 201529842 lname Sommer Suyama Lebihan Sommer Berglund Trujillo fname Martín Michael Laurence Martín Christina Ana Acct#/Account No. is the Foreign Key address 5400 E. Broad ST 4110 Old Redmond Rd. 342 S. Drexel 5400 E. Broad ST 707 Oxford Rd. 7 Houndstooth Rd. Account# 456 544 2456 3064 4456 5323 5566 456 544 1112 4456 city Reynoldsburg Worthington Bexley Reynoldsburg Columbus Marysville date 4/1/1999 4/1/1999 4/2/1999 4/2/1999 4/3/1999 4/3/1999 4/4/1999 4/9/1999 4/9/1999 4/11/1999 4/12/1999 state OH OH OH OH OH OH zipcode 43213 43201 43209 43213 43222 42001 transaction$ $1,231.39 $3,366.82 $2,301.68 $5,087.66 $12,838.42 $167.80 $641.92 ($1,186.26) ($3,243.43) $159.66 ($12,367.91) CS&E 1111 AcIntro A Foreign Key is a field that defines the relationship between 2 tables: A valid foreign key must have all of the properties listed below: 1. Must be a primary key (unique) in at least one of the tables 2. The field names on each table do not have to match as long as the information is the same. Fields with the same name does not necessarily mean they are the foreign key. 3. The related fields must be the same data type (number, text etc) CS&E 1111 AcIntro Walkthrough: Establishing relationships Primary Key on Accounts table: acct# Foreign Key: acct#/Account# CS&E 1111 AcIntro Once Relationships are established you can gather information from one or more tables to answer questions like: Create a list of account numbers and owner names and total transactions What are the total deposits made by accounts starting with 5? What is the total balance of all accounts held by Jane Doe ? These requests are known as Queries CS&E 1111 AcIntro “Objects”: Queries To extract information from the Database use a Query which is a “question” or “request” The query is not the data that results but a set of instructions specifying how specific records or combinations of records should be extracted - it lets the user: prepare lists, sort, filter choose records to met specific criteria do calculations on the data Summarize data by a specific grouping match up the data to related information CS&E 1111 AcIntro Walkthrough: A Simple Query Create a query to list the first name, last name and transaction amount for each transaction We will spend the next 1-2 weeks learning, in greater detail, how to use the query tool. CS&E 1111 AcIntro “Objects”: Forms A form is not data that results but a set of instructions specifying a screen view format of the data. These forms are designed to simplify data display, inputting and editing. Walkthrough: Setting up a form CS&E 1111 AcIntro “Objects”: Reports A report is not the data that results but a set of instructions specifying the format of written output. Reports allow users to vary : The type of text formatting Report & Page headers, footers, titles Which data fields (from tables, queries and/or calculations) to be used Sorts, Filters, Groupings of fields Walkthrough: Setting up an AutoReport CS&E 1111 AcIntro CS&E 1111 AcIntro “Objects”: Macros &Modules Programmed instructions updating tables/queries from inputs customer address account balance retrieving data from other sources performing tasks if a certain criterion is met monthly bank statement balance > 20000 send letter offering estate planning service This course will not cover the Macros & Modules - we can refer you to more Advanced books on Access & Visual Basic CS&E 1111 AcIntro A Review of Database Theory Information is stored in objects known as tables consisting of records or related information categorized into fields. A field that uniquely identifies a table is known as a primary key field Tables can be related to each other using valid foreign key fields. Queries, Reports, Forms are objects which can be created using the data on tables to dynamically extract/display information in a specified format. CS&E 1111 AcIntro