DATABASE SUMMARY NOTE

WHAT IS A DATABASE?
A database is a store of data that has been organised (sorted) in some way.

File: A collection of records and fields is called a file.

Table: An organised store of related data and information arranged in a grid of rows and columns.

WHAT IS A RECORD?
A database record holds all the data about one person or object, for example all the information about one student.

Field: A single field holds one piece of data, e.g. first name.

TYPES OF DATABASE

a) FLAT-FILE DATABASE
A flat-file database stores its data in one table, which is organised using rows and columns. Flat-file databases are only suitable for very simple databases.

Advantages of flat-file database
• All records are stored in one place.
• It is an excellent option for small databases.
• It requires fewer hardware and software components.
• It is very simple to set up.

Disadvantages of flat-file database
The problems with using a flat-file database are as follows:
• Duplicated data is often entered unnecessarily, i.e. the same data is stored many times.
• Database storage space is wasted on this duplicated data.
• Duplicated data takes a long time to enter and update (unnecessarily).
• More time is taken to enter new data or edit existing data in the database.
• It takes more time to locate or update data in the database.
• A flat-file database is harder to update.
• It is harder to change the data format.
• It handles complex queries poorly.
• It increases redundancy and inconsistency.

b) RELATIONAL DATABASE
A relational database stores the data in more than one linked table within the file. It uses two or more tables linked together (to form a relationship). Each table within a relational database has a key field which holds unique data.

ADVANTAGES OF RELATIONAL DATABASE
The advantages of using a relational database instead of a flat-file database are as follows:
• Duplicated data is reduced.
• Database storage space is not wasted on unnecessary duplicated data.
• Quicker to enter data as there are fewer duplicates.
• Quicker to update data.
• Less time is taken to enter new data or edit existing data in the database.
• It takes less time to locate or update data in the database.
• The same data is not stored multiple times.
• Less chance of data entry errors.

DISADVANTAGES OF RELATIONAL DATABASE
• It is expensive to set up and maintain.
• Maintenance of the relational database becomes more difficult over time as the amount of data increases.
• There are limits to how well relational databases scale as the volume of data grows.

PRIMARY KEY
A primary key is a field that contains unique data which can be used to identify every record in a table. A primary key field cannot contain blank values or duplicates.

FOREIGN KEY
A foreign key is a regular field in one table which is used as the primary key in another table. Foreign keys provide the link (relationship) between the tables.

COMPOUND KEY
A compound key is a primary key created from a number of foreign key fields, rather than a single field, which together create unique data. A compound key is created when two or more primary keys from different tables are present as foreign keys within a table. The foreign keys are used together to uniquely identify each record.

COMPOSITE KEY
A composite key is a special type of primary key which uses two or more fields from the same table to create a unique value that can be used to identify every record in that table.

TYPES OF DATABASE RELATIONSHIP

a) One-to-One Relationship
Such a relationship exists when each record of one table is related to only one record of the other table, e.g. a driver has only one driver’s license, and a passenger can be assigned only one seat on a plane.

b) One-to-Many or Many-to-One Relationship
Such a relationship exists when each record of one table can be related to one or more records of the other table.
Example: In a bank, one customer can have many accounts, but one account can only be held by one customer.

c) Many-to-Many Relationship
Such a relationship exists when each record of the first table can be related to more than one record of the second table, and a single record of the second table can be related to more than one record of the first table. A many-to-many relationship can be seen as two one-to-many relationships joined by a 'linking table' or 'associative table'. The linking table links the two tables by having fields which are the primary keys of the other two tables.
Example: If there are two entity types, ‘Customer’ and ‘Product’, then each customer can buy more than one product and a product can be bought by many different customers.

REASONS WHY AN ERROR MAY BE GENERATED WHILE CREATING A RELATIONSHIP BETWEEN TWO TABLES
a.) The primary key field and the foreign key field in the related tables have been set up with different data types.
b.) Data has been entered in the foreign key field of one table that does not exist in the primary key field of the related table.

REFERENTIAL INTEGRITY
Referential integrity forces table relationships to be consistent and avoids redundant data. This means that the data in the foreign key field must match the data in the primary key field. It prevents a user from accidentally changing or deleting data in one table without the same action happening to the related data in another table. Referential integrity requires that whenever a foreign key value is used, it must reference a valid, existing primary key in the parent table.

IMPORTANCE OF USING REFERENTIAL INTEGRITY
• Prevents the entry of duplicate data.
• Prevents one table from pointing to a non-existent record in another table.
• Guarantees consistency between related tables.
• Prevents the deletion of a record that contains a value referred to by a foreign key in another table.
• Prevents the addition of a record that contains a foreign key unless there is a matching primary key in the linked table.

DATABASE QUERIES
Being able to search a database using queries is what makes it such a powerful tool. Imagine a database held by a bank. The bank wants to send out a letter to everyone who is overdrawn on their account. Searching manually through every account would be impractical. A query would enable the bank to find thousands of customers within a few seconds.
A database query is a request to access data from a database in order to manipulate it or retrieve it.

TYPES OF DATABASE QUERY

• SIMPLE QUERY
A simple query is a query that searches using just one parameter. It might use all of the fields in a table, or only the fields about which information is required, but it still uses just one parameter (search criterion).

• COMPLEX QUERY
A complex query is a query that searches using two or more parameters (criteria). It often uses Boolean operators such as AND, OR or NOT, or a mixture of these. For example, a query might search for event = birthday AND paid = yes, or for event = lunch OR event = birthday.

• NESTED QUERY
A nested query is a query within another query, often referred to as a subquery. Nested queries let you use the result of one query as an input parameter of another. The innermost subquery is executed first, then the next level, until the main query is reached.

• SUMMARY QUERY
Summary queries are used to summarise the contents of a database table. They extract aggregate values for a group of records rather than a detailed set of records. They are also called Group-By queries or aggregate queries, and use aggregate functions such as SUM, AVERAGE, MIN, MAX and COUNT.
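The four query types above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the booking table, its field names and its rows are made up for the example, and SQL spells the AVERAGE function as AVG.

```python
import sqlite3

# Hypothetical table and data, invented for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE booking (id INTEGER PRIMARY KEY, event TEXT, paid TEXT, cost REAL)"
)
conn.executemany(
    "INSERT INTO booking (event, paid, cost) VALUES (?, ?, ?)",
    [
        ("birthday", "yes", 120.0),
        ("birthday", "no", 80.0),
        ("lunch", "yes", 40.0),
        ("wedding", "yes", 900.0),
    ],
)

# Simple query: a single criterion.
simple = conn.execute(
    "SELECT id FROM booking WHERE event = 'birthday' ORDER BY id"
).fetchall()

# Complex queries: two criteria combined with a Boolean operator.
complex_and = conn.execute(
    "SELECT id FROM booking WHERE event = 'birthday' AND paid = 'yes' ORDER BY id"
).fetchall()
complex_or = conn.execute(
    "SELECT id FROM booking WHERE event = 'lunch' OR event = 'birthday' ORDER BY id"
).fetchall()

# Nested query: the inner query finds the average cost,
# and the outer query uses that result as its criterion.
nested = conn.execute(
    "SELECT id FROM booking WHERE cost > (SELECT AVG(cost) FROM booking) ORDER BY id"
).fetchall()

# Summary (Group-By) query: one row of aggregates per event type.
summary = conn.execute(
    "SELECT event, COUNT(*), SUM(cost) FROM booking GROUP BY event ORDER BY event"
).fetchall()

print(simple)       # [(1,), (2,)]
print(complex_and)  # [(1,)]
print(complex_or)   # [(1,), (2,), (3,)]
print(nested)       # [(4,)]
print(summary)      # [('birthday', 2, 200.0), ('lunch', 1, 40.0), ('wedding', 1, 900.0)]
```

Note how the summary query returns one line per group instead of one line per record.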
STATIC PARAMETER QUERY
• In a static parameter query, the parameter is a fixed value that is hard coded into the query. This means that every time the query is run, it will only ever search for records which contain that fixed parameter. In order to change the parameter to another value, the query has to be either recreated or edited in query design view.

DYNAMIC PARAMETER QUERY
• With a dynamic parameter query, every time the query is run a box appears that allows the user to input a new parameter, which the query then uses to search the database for matching records. This saves the effort of opening the query in design view to change the parameter, or of recreating the query with another parameter.

Note: A parameter is the value that is used by the query to select the records. Examples of parameters include >5, which retrieves all records where the field value is greater than 5, or M*, which returns all records where the field value begins with the letter M.

NORMALISATION
Normalisation is a technique that is used to reduce the duplication of data in a relational database. It is the process of reorganising data in a database so that it meets two basic requirements: there is no redundancy of data (all data is stored in only one place), and data dependencies are logical (all related data items are stored together).

Advantages of Normalisation
1. It reduces data duplication and redundancy and ensures that each piece of data is stored only once.
2. It reduces insertion, deletion and update anomalies.
3. It reduces the file size of the database, as there is no redundant data.
4. It groups data logically and reduces inconsistent data in tables by enforcing referential integrity.
5. It provides data consistency within the database.
6. It enforces referential integrity on data.
7. Making changes to a table is easier as there is less data to alter.
8. With no duplicated data there will be fewer errors in the data.

Disadvantages of Normalisation
1. The processing of data can be slower with a greater number of tables and links to navigate.
2. With the larger number of tables, setting up complex queries can be more difficult.
3. It requires skill and experience to do well.

DATA DICTIONARY
A data dictionary is a file or document that contains descriptions of, and information about, the structure of the data stored in a database. It contains metadata, i.e. data about the database. The data dictionary contains important information about database objects such as tables, indexes, columns, data types and relationships.

COMPONENTS OF A DATA DICTIONARY
Components of the data dictionary include:
- Table names
- Field names
- Data types
- Field size / field length
- Primary key and foreign key fields
- Validation rules
- Input masks, etc.

Example 1: The data dictionary for the jobs table is as shown below:

Example 2: Create a data dictionary to represent each of the tables below.

ANSWER:

Table name: Boat

Field name | Data type         | Field size | Key field
-----------|-------------------|------------|----------
Boat ID    | Numeric/integer   |            | PK
CustID     | Alphanumeric/text | 6          | FK
Make       | Alphanumeric/text | 30         |
Type       | Alphanumeric/text | 7          |
Price      | Numeric (2 d.p.)  |            |
Length     | Numeric/integer   |            |
Year       | Numeric/integer   |            |
Catamaran  | Boolean           |            |
BuiltHere  | Boolean           |            |
Bought     | Date/Time         |            |
Sold       | Date/Time         |            |

Other metadata (validation rules, input mask, etc.): LL00 000

Table name: Customers

Field name | Data type         | Field size | Key field | Other metadata (validation rules, input mask, etc.)
-----------|-------------------|------------|-----------|----------------------------------------------------
CustomerID | Alphanumeric/text | 6          | PK        |
FirstName  | Alphanumeric/text | 30         |           |
FamilyName | Alphanumeric/text | 50         |           |
Telephone  | Alphanumeric/text | 13         |           |
Email      | Alphanumeric/text | 30         |           |
Discount   | Numeric/percent   |            |           | >0 and <=0.2

ENTITY-RELATIONSHIP DIAGRAM (ERD)
An entity-relationship diagram is a diagram that shows the relationships or connections between the entities in a database.

TYPES OF ENTITY-RELATIONSHIP DIAGRAM (ERD)

1. Conceptual ERD: A conceptual ERD only shows the relationships between existing entities within a business and the attributes/field names of each entity.
It is only used during the analysis phase of the system lifecycle. It is simpler than a logical or physical design and does not contain the details needed to implement the database to meet the business needs.

An example of a conceptual ER diagram

2. Logical ERD: A logical ERD is an extension of the conceptual ER diagram that also includes the data types of the attributes. It includes the relationships between existing entities within a business, and the attributes/field names and data types of each entity. It is also only used during the analysis phase of the system lifecycle and does not contain the details needed to create the database to meet the business needs.

An example of a logical ER diagram

3. Physical ERD: The physical ER diagram is the one used to create the database, so it must include the entities (tables) and attributes (fields) as well as the data types, field lengths, key fields (i.e. primary key fields and foreign key fields), types of relationships, indexes, constraints and all the other details required for the physical implementation of the database to meet the business needs. Its purpose is to describe the data that will be used in a relational database that meets the business requirements.

An example of a physical ER diagram
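As a rough sketch of what a physical design leads to, the Python example below uses the built-in sqlite3 module to create cut-down versions of the Customers and Boat tables from the data dictionary, with a primary key, a foreign key and a validation rule, and then shows referential integrity rejecting inconsistent data. The reduced field list, the sample rows, the spelling BoatID and the helper name try_sql are assumptions made for illustration only; a real system may use different database software and settings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Parent table: CustomerID is the primary key (unique, never blank).
conn.execute("""
    CREATE TABLE Customers (
        CustomerID TEXT PRIMARY KEY NOT NULL CHECK (length(CustomerID) <= 6),
        FirstName  TEXT,
        FamilyName TEXT,
        Discount   REAL CHECK (Discount > 0 AND Discount <= 0.2)
    )
""")

# Child table: CustID is a foreign key that links each boat to a customer.
conn.execute("""
    CREATE TABLE Boat (
        BoatID INTEGER PRIMARY KEY,
        CustID TEXT REFERENCES Customers(CustomerID),
        Make   TEXT,
        Price  REAL
    )
""")

# Made-up sample rows.
conn.execute("INSERT INTO Customers VALUES ('AB1234', 'Ada', 'Okoro', 0.1)")
conn.execute("INSERT INTO Boat VALUES (1, 'AB1234', 'Example Make', 15000.00)")


def try_sql(sql):
    """Run one statement; return True if accepted, False if a rule rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# Foreign key value with no matching primary key in the parent table.
orphan_boat_ok = try_sql("INSERT INTO Boat VALUES (2, 'ZZ9999', 'Other Make', 9000.00)")
# Deleting a customer that a boat record still refers to.
delete_parent_ok = try_sql("DELETE FROM Customers WHERE CustomerID = 'AB1234'")
# Duplicate primary key value.
duplicate_pk_ok = try_sql("INSERT INTO Customers VALUES ('AB1234', 'Bo', 'Eze', 0.05)")
# Value outside the validation rule >0 and <=0.2.
bad_discount_ok = try_sql("INSERT INTO Customers VALUES ('CD5678', 'Cy', 'Ali', 0.5)")

print(orphan_boat_ok, delete_parent_ok, duplicate_pk_ok, bad_discount_ok)
# False False False False
```

Each rejected statement corresponds to one of the rules described earlier: a foreign key must match an existing primary key, a referenced record cannot be deleted, a primary key cannot be duplicated, and a validation rule limits the values a field accepts.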