Database Systems
Information Systems Intermediate 2

Data and Information

Data is raw, unprocessed facts and figures. Data is collected, stored and processed by computers.
Examples: 368, HR101FE, 010468, Baker, 25168

Information is processed data with structure or meaning. Information is useful to humans.
Examples:
Age: 36 years 8 months
Post Code: HR10 1FE
Date of Birth: 01/04/68
Occupation: Baker
Total Spent: £251.68

What is a database?

A database is a collection of related information about a set of persons or objects.
Traditionally, databases have been manual, paper-based systems.
Example: the “Yellow Pages”

What is a database management system?

A database management system (DBMS) is a software package which is used to create, manipulate and present data from electronic databases.
Examples of DBMSs include Microsoft Access and FileMaker Pro.

Traditional databases

Storage of paper records was very bulky.
It was easy to mis-file a record, or for records to be lost or damaged.
Data was often duplicated in several records.
Keeping records up to date was difficult and time consuming, and often resulted in data inconsistency, where duplicated values were updated in one record but not in others.
Many people were employed to maintain the records, which was costly.
Searching for records was time consuming.
Producing reports, such as sorted lists or data collated from several sources, was extremely time consuming, if not impossible.

Case Study: DVD Rentals

Member Number  Title  Forename  Surname     Telephone No.
1012           Miss   Isobel    Ringer      293847
1034           Mr     John      Silver      142536
1056           Mr     Fred      Flintstone  817263
1097           Mrs    Annette   Kirton      384756

Case Study: DVD Rentals

DVD Code  Title          Cost   Date Out  Date Due  Member Number  Name
002       Finding Nemo   £2.50  03/09/04  04/09/04  1034           John Silver
003       American Pie   £2.50  27/08/04  28/08/04  1056           F Flintstone
                                01/09/04  02/09/04                 Isobel Ringer
008       The Pianist    £2.50  04/09/04  06/09/04  1097           Annette Kirton
011       Notting Hill   £2.50  27/08/04  28/08/04  1012           I Ringer
                                04/09/04  06/09/04  1086           F Flintstone
014       Prime Suspect  £2.00  27/08/04  28/08/04                 Annette Kirton
015       Shrek          £1.50  10/09/04  11/09/04  1034           Joan Silver

Benefits of computerised databases

Searching, sorting and calculating operations can be performed much more quickly and easily.
Information is more easily available to users, due to improved methods of data retrieval.
Data integrity is improved, resulting in more accurate information.

Types of computerised database

Flat file
Relational

Flat file databases

DVD Code  Title          Cost   Date Out  Date Due  Member Number  Name             Telephone Number
002       Finding Nemo   £2.50  03/09/04  04/09/04  1034           John Silver      142536
003       American Pie   £2.50  27/08/04  28/08/04  1056           Fred Flintstone  817263
003       American Pie   £2.50  01/09/04  02/09/04  1012           Isobel Ringer    293847
008       The Pianist    £2.50  04/09/04  06/09/04  1097           Annette Kirton   384756
011       Notting Hill   £2.50  27/08/04  28/08/04  1012           Isobel Ringer    293847
011       Notting Hill   £2.50  04/09/04  06/09/04  1056           Fred Flintstone  817263
014       Prime Suspect  £2.00  27/08/04  28/08/04  1097           Annette Kirton   384756
015       Shrek          £1.50  10/09/04  11/09/04  1034           Joan Silver      142536

Limitations of flat file databases

Data is very likely to be duplicated.
The duplication of data leads to the possibility of data inconsistency.
It is not possible to store information about a member without entering details of a DVD. This is called an insertion anomaly.
Removing a DVD from the database may remove the only record which stores details of a member. This is called a deletion anomaly.

Relational databases

A relational database stores data in more than one table.
The idea is to ensure that data is only entered and stored once, so removing the possibility of data duplication and inconsistency.

Entities and Data Relationships

An entity represents a person or object, e.g. Member, DVD Rental.
Each entity has a set of attributes which describe examples or instances of that entity.
The attributes of the DVD Rental entity are code, title, cost, date out, date due and member number.
The attributes of the Member entity are member number, name and telephone number.

Entities, Attributes and Instances

MEMBER
Member Number  Member Name      Telephone Number
1012           Isobel Ringer    293847
1034           John Silver      142536
1056           Fred Flintstone  817263
1097           Annette Kirton   384756

The Member entity is the whole table.
Each column stores one attribute, e.g. Member Name.
Each row stores one instance, e.g. Member 1034.
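One way to picture the entity, attribute and instance vocabulary is a minimal Python sketch. The class name `Member` and the field names `member_number`, `name` and `telephone` are illustrative choices, not part of the course material; the values are the four rows of the MEMBER table.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Member:
    """The entity: describes what every member record looks like."""
    member_number: int  # attribute
    name: str           # attribute
    telephone: str      # attribute (kept as text; it is never used in arithmetic)


# Each object is one instance of the entity, i.e. one row of the MEMBER table.
members = [
    Member(1012, "Isobel Ringer", "293847"),
    Member(1034, "John Silver", "142536"),
    Member(1056, "Fred Flintstone", "817263"),
    Member(1097, "Annette Kirton", "384756"),
]

# The attributes correspond to the columns of the table.
attribute_names = [f.name for f in fields(Member)]

# Pick out a single instance, e.g. Member 1034.
member_1034 = next(m for m in members if m.member_number == 1034)

print(attribute_names)
print(member_1034)
```

Here the class plays the role of the whole table, each field is a column, and each object in `members` is a row.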
Data Relationships

Three types of relationship:
One-to-one: VEHICLE and REGISTRATION NUMBER
One-to-many: MEMBER and DVD RENTAL
Many-to-many: PUPIL and TEACHER

More than one table

Member Number  Name             Telephone Number
1012           Isobel Ringer    293847
1034           John Silver      142536
1056           Fred Flintstone  817263
1097           Annette Kirton   384756

DVD Code  Title           Cost   Date Out  Date Due
002       Finding Nemo    £2.50  03/09/04  04/09/04
003       American Pie 3  £2.50  01/09/04  02/09/04
008       The Pianist     £2.50  04/09/04  06/09/04
011       Notting Hill    £2.50  04/09/04  06/09/04
014       Prime Suspect   £2.00  27/08/04  28/08/04
015       Shrek           £1.50  10/09/04  11/09/04
003       American Pie 3  £2.50  27/08/04  28/08/04
011       Notting Hill    £2.50  27/08/04  28/08/04

but there’s a problem…

More than one table

Member Number  Member Name      Telephone Number  DVD Code
1012           Isobel Ringer    293847            003
1034           John Silver      142536            ?
1056           Fred Flintstone  817263            011
1097           Annette Kirton   384756            ?

More than one table

Member Number  Member Name      Telephone Number
1012           Isobel Ringer    293847
1034           John Silver      142536
1056           Fred Flintstone  817263
1097           Annette Kirton   384756

DVD Code  Title           Cost   Date Out  Date Due  Member Number
002       Finding Nemo    £2.50  03/09/04  04/09/04  1034
003       American Pie 3  £2.50  01/09/04  02/09/04  1012
008       The Pianist     £2.50  04/09/04  06/09/04  1097
011       Notting Hill    £2.50  04/09/04  06/09/04  1056
014       Prime Suspect   £2.00  27/08/04  28/08/04  1097
015       Shrek           £1.50  10/09/04  11/09/04  1034

Keys

A key is a field, or set of fields, whose values uniquely identify a record.
In any table, there may be more than one field, or set of fields, which can uniquely identify each record; these are called candidate keys.
The candidate key which is chosen to be used is called the primary key.

Member Number is a candidate key for the Member entity.
MEMBER(Member Number, Name, Telephone Number)

DVD Code is a candidate key for the DVD Rental entity.
DVD RENTAL(DVD Code, Title, Cost, Date Out, Date Due, *Member Number)
Member Number is called a foreign key.
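The MEMBER and DVD RENTAL tables and their keys can be sketched with Python's built-in `sqlite3` module. This is only an illustration under stated assumptions: the course itself uses Microsoft Access or FileMaker Pro, the snake_case table and column names are invented here, dates are stored as ISO text, and the rejected rental (code 020, member 9999) is a hypothetical record added to show the foreign key being enforced.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("""
    CREATE TABLE member (
        member_number INTEGER PRIMARY KEY,   -- primary key
        name          TEXT NOT NULL,
        telephone     TEXT NOT NULL
    )""")
conn.execute("""
    CREATE TABLE dvd_rental (
        dvd_code      TEXT PRIMARY KEY,      -- primary key
        title         TEXT NOT NULL,
        cost          REAL NOT NULL,
        date_out      TEXT NOT NULL,
        date_due      TEXT NOT NULL,
        member_number INTEGER NOT NULL
                      REFERENCES member(member_number)  -- foreign key
    )""")

conn.executemany("INSERT INTO member VALUES (?, ?, ?)", [
    (1012, "Isobel Ringer", "293847"),
    (1034, "John Silver", "142536"),
    (1056, "Fred Flintstone", "817263"),
    (1097, "Annette Kirton", "384756"),
])
conn.executemany("INSERT INTO dvd_rental VALUES (?, ?, ?, ?, ?, ?)", [
    ("002", "Finding Nemo", 2.50, "2004-09-03", "2004-09-04", 1034),
    ("003", "American Pie 3", 2.50, "2004-09-01", "2004-09-02", 1012),
    ("008", "The Pianist", 2.50, "2004-09-04", "2004-09-06", 1097),
    ("011", "Notting Hill", 2.50, "2004-09-04", "2004-09-06", 1056),
    ("014", "Prime Suspect", 2.00, "2004-08-27", "2004-08-28", 1097),
    ("015", "Shrek", 1.50, "2004-09-10", "2004-09-11", 1034),
])

# The foreign key lets each rental be matched to its member's details,
# which are stored only once in the member table.
rows = conn.execute("""
    SELECT d.title, m.name
    FROM dvd_rental AS d
    JOIN member AS m ON m.member_number = d.member_number
    ORDER BY d.dvd_code""").fetchall()

# A rental that points at a member who does not exist is rejected.
try:
    conn.execute(
        "INSERT INTO dvd_rental VALUES (?, ?, ?, ?, ?, ?)",
        ("020", "Example Title", 2.00, "2004-09-12", "2004-09-13", 9999),
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

for title, name in rows:
    print(title, "-", name)
print("Unknown member rejected:", rejected)
```

Because each member's name and telephone number appear in one row only, correcting them there corrects every rental that refers to that member.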
A foreign key is a field which is not a primary key in its own table, but is a primary key in another table.
Member Number is a foreign key in the DVD Rental table, because it is the primary key in the Member table.

Here is the data model:
MEMBER(Member Number, Name, Telephone Number)
DVD RENTAL(DVD Code, Title, Cost, Date Out, Date Due, *Member Number)

Implementation

3 steps:
Set up the tables
Populate the tables
Manipulate and present the data

Setting up the tables

Which tables are required?
The tables correspond directly to the entities in the data model. In this case, there will be two tables, Member and DVD Rental.

Which fields are required?
The fields in each table are the attributes in each entity in the data model.

What are the properties of each field?
Its name: be consistent!
Its data type: text; numeric (integer, real, currency); date or time; Boolean (yes or no); link; object
Validation: presence check, restricted choice check, range check

Populating the tables

Take care to be accurate.
Validation: make sure the data is sensible.
Verification: make sure the data is correct.
Verification methods: bar codes, OCR

Manipulating the Data

Searching records
Sorting records
Calculating values
Presenting results

Searching

Which fields will be used to identify the records required?
What are the search conditions for identifying the records required?
Which fields will be displayed?
E.g. Search for Test 3 = 10
“Test 3 = 10” is called the search condition.

Searching: comparison operators

Operator  Meaning                                                    Examples
=         equal to                                                   Age = 16; Surname = “Smith”
<>        not equal to                                               Height <> 1.70; Certificate <> “PG”
>         greater than, or after                                     Age > 17; Surname > “N”; Date of Birth > 01/05/1952
<         less than, or before                                       Height < 1.9; Surname < “N”; Date of Birth < 30/06/1990
>=        greater than or equal to, or after and including           Age >= 17; Postcode >= “EH30”; Date of Birth >= 01/05/1952
<=        less than or equal to, or before and including             Height <= 1.95; Postcode <= “EH20”; Date of Birth <= 30/06/1990

Searching: wildcard characters

*    Matches any number of characters (zero or more). It can be used as the first or last character in the character string.
     Example: wh* matches what, when, where, who, why, white, etc.
?    Matches any single alphabetic character.
     Example: b?ll matches ball, bell, bill and bull
[]   Matches any single character within the brackets.
     Example: b[ae]ll matches ball and bell but not bill or bull
!    Matches any character not in the brackets.
     Example: b[!ae]ll matches bill and bull but not ball or bell
-    Matches any one of a range of characters. You must specify the range in ascending order (A to Z, not Z to A).
     Example: b[a-c]d matches bad, bbd, and bcd
#    Matches any single numeric character.
     Example: 1#3 matches 103, 113, 123, etc.
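The wildcard characters can be tried out with a small Python sketch built on the standard `fnmatch` module. This is an approximation rather than the behaviour of any particular DBMS: `fnmatch` has no `#` wildcard, so the hypothetical helper `wildcard_match` rewrites it as `[0-9]`, and `fnmatch` lets `?` match any single character, not only alphabetic ones. The word lists are made up for illustration.

```python
from fnmatch import fnmatchcase


def wildcard_match(value: str, pattern: str) -> bool:
    """Return True if value matches a pattern using *, ?, [], !, - and #."""
    # fnmatch has no '#', so rewrite it as a bracket range for one digit.
    return fnmatchcase(value, pattern.replace("#", "[0-9]"))


def matching(pattern: str, candidates: list[str]) -> list[str]:
    """Return the candidates that match the pattern, in their original order."""
    return [c for c in candidates if wildcard_match(c, pattern)]


words = ["ball", "bell", "bill", "bull"]

print(matching("b?ll", words))       # any single character in second place
print(matching("b[ae]ll", words))    # a or e in second place
print(matching("b[!ae]ll", words))   # anything except a or e
print(matching("b[a-c]d", ["bad", "bbd", "bcd", "bed"]))  # range a to c
print(matching("1#3", ["103", "113", "1a3"]))             # one digit
print(matching("*son", ["Johnson", "Smith", "Wilson"]))   # ends in "son"
```

The matching here is case sensitive, which may differ from how a given database package compares text.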
Wildcard Searches

Search for Surname = “*son”

Complex Searches

A complex search involves more than one search condition (and usually more than one field).
Search for Test 3 = 10 AND Average > 6
Search for Test 3 = 10 OR Average > 6
Search for Test 3 > 5 AND Test 3 < 8
Search for Test 3 < 2 OR Test 3 > 9

Sorting

Which field will be used to decide the order of records? This is called the sort key.
For the sort key, will the order of sorting be ascending or descending?

For a list of people with the tallest first, sort in descending order of height.
For a list of people with the youngest first, sort in ascending order of age.
For alphabetical order, sort in ascending order of surname.
“ascending order of surname” is called the sort condition.

Complex Sorting

A complex sort involves more than one sort condition involving two or more fields.
The main sort key is called the primary sort key, and the second one is called the secondary sort key.
“Telephone book” order: ascending order of Surname, then ascending order of Forename

Calculating

Use formulas or expressions to calculate a value for a record based on other values in the record.

Presenting

Use layouts (FileMaker Pro)
Use forms and reports (Microsoft Access)
Which fields are required?
Perform a search and/or sorting operation and present the results.
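The calculating, complex search and complex sort operations described above can be sketched in Python. The pupil records, the field names (`test1`, `test2`, `test3`, `average`) and the marks are all invented for illustration; a DBMS would offer the same operations through its own query and sort tools.

```python
from operator import itemgetter

# Hypothetical records: one dictionary per pupil.
pupils = [
    {"forename": "Ann", "surname": "Smith", "test1": 8, "test2": 6, "test3": 10},
    {"forename": "Bob", "surname": "Jones", "test1": 2, "test2": 3, "test3": 10},
    {"forename": "Alan", "surname": "Smith", "test1": 7, "test2": 9, "test3": 6},
    {"forename": "Cara", "surname": "Brown", "test1": 9, "test2": 8, "test3": 10},
]

# Calculating: a value for each record based on other values in that record.
for p in pupils:
    p["average"] = (p["test1"] + p["test2"] + p["test3"]) / 3

# Complex search: Test 3 = 10 AND Average > 6 (both conditions must hold).
both = [p for p in pupils if p["test3"] == 10 and p["average"] > 6]

# Complex search: Test 3 = 10 OR Average > 6 (either condition is enough).
either = [p for p in pupils if p["test3"] == 10 or p["average"] > 6]

# Complex sort, "telephone book" order:
# primary sort key surname, secondary sort key forename, both ascending.
phone_book = sorted(pupils, key=itemgetter("surname", "forename"))

# Simple sort in descending order of the calculated field.
best_first = sorted(pupils, key=itemgetter("average"), reverse=True)

# Presenting: choose which fields to display for the sorted results.
for p in phone_book:
    print(f'{p["surname"]}, {p["forename"]}: average {p["average"]:.1f}')
```

Note that the AND search returns fewer records than the OR search on the same two conditions, since AND narrows a search while OR widens it.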