Database Fundamentals Introduction • • • • • • Introduction to database systems Database Management Systems (DBMS) Type of Database Database Design Database Design Considerations Component of Database Systems Introduction to Database Systems • A database is a software program that stores information which relates to a particular activity or purpose • Examples: – A bank needs to store the information relating to customer accounts – A hospital needs to keep data about patients and the medication dispensed, – A university needs to maintain record of its students, – A internet sales business needs to .. Introduction to Database Systems • A database includes tools to help organise and maintain the data. • A database design engineer must know and be able to use these tools in order to make the database efficient and accurate. • In a properly designed database the information is updated once and all other occurrences of this within the database will be updated automatically. • A principal requirement of any database system is the ability to generate, store and retrieve data efficiently. Basic components of a database system In order to convert data into useful information a set of software tools are need, SQL, Form, etc.. Database Fundamentals Introduction • • • • • • Introduction to database systems Database Management Systems (DBMS) Type of Database Database Design Database Design Considerations Component of Database Systems Database Management Systems (DBMS) • DBMS is a collection of programs provided by the vendor, which enable the data to be accessed, filtered and generally processed efficiently, in order to yield useful information for the user Database Management Systems (DBMS) • In principle, user will generate a question (query) and DBMS will search the dB to find the answer, which will be returned to user • A DBMS will include additional tools to provide graphical user interfaces and report generation programs to facilitate the delivery of information to the user. • It will also provide tools to create, edit, process, and modify data and also to program the database. SQL is a standard language accessing relational databases. What is a Database Management System? • A Database Management System (DBMS) is a program that manages a database: – Supports a high-level access language (e.g. SQL). – Application describes database accesses using that language. – DBMS interprets statements of language to perform requested database access. SQL examples: • Example 1: SELECT Course_Number FROM StudentRecords WHERE student_name = ‘ ‘; • Example 2: SELECT student_number FROM StudentRecords WHERE BEngCourse = ‘2388’ AND average_mark >70; • These examples show that SQL is a very useful tool for interrogating a database Database Fundamentals Introduction • • • • • • Introduction to database systems Database Management Systems (DBMS) Type of Database Database Design Database Design Considerations Component of Database Systems Types of Database • Classification by user – Single user databases: Access was initially designed as a single user database – If Single user database system used for multiple user, there are two problems need to be concern: Data corruption and Performance & Speed. Types of Database • Classification by user – Single user databases: If there are less than 50 users than this type of database is referred to as a workgroup database, if the users number is bigger than 50, it is referred to as an Enterprise database Microsoft’s solutions for the multiuser market • File-server: Access database is configured as a network share so that multiple users can use it simultaneously • Client/Server: Access work as a client, there is MS SQL server in the server side • Database replication: MS networking supports replication through its directory services. • Web-based database solutions. ASP/Access or ASP/SQL server Classification by location • Centralised Database: In a centralised database system there is only one database which is shared between users. – Q: What is the main disadvantage of centralised databases? How do we solve it? – A: Perhaps the main disadvantage of this type of database is that a failure of the server running the database will prevent all users from using the database until the server is back on-line. To overcome this problem larger database applications tend to be of a distributed type – Q: What is the main disadvantage of centralised databases? How do we solve it? – A: Perhaps the main disadvantage of this type of database is that a failure of the server running the database will prevent all users from using the database until the server is back on-line. To overcome this problem larger database applications tend to be of a distributed type Classification by location Distributed Database Classification by location Distributed Database • A distributed database is commonly found in networks. Here the current version of the database is kept on different servers. Any changes are replicated across the network so that all copies of the database are current. – Q: What are the main disadvantages and disadvantages of distributed databases? • The main advantage of this arrangement is perhaps in fault tolerance. If one of the servers is down, others can take over the workload. These type of databases are common in enterprise databases. • The main disadvantage is cost, both in terms of software licensing and the cost of the hardware in a network. In heavy utilisation environment network traffic may also be a problem. Classification by type • Classification by type distinguishes between a transactional database and a data warehouse. • The distinction is fairly intuitive with the transactional database being used to process day-to-day transaction information and maintain a record of decisions made relating to this data. Due to the fact that many transactions can occur in this database system, and that these transactions will impact on decisions made on that day, these types of databases are treated as time critical. • On the other hand, the data warehouse is used to generate strategic decisions and the data stored tends to be longer term. Consequently this data is considered to be NON-time-critical. • In terms of the number of applications the transactional database (also referred to as production database) is the more common of the two database types. Classification by type What is a Transaction? • When an event in the real world changes the state of the enterprise, a transaction is executed to cause the corresponding change in the database state – With an on-line database, the event causes the transaction to be executed in real time • A transaction is an application program with special properties - discussed later - to guarantee it maintains database correctness What is a Transaction Processing System? • Transaction execution is controlled by a TP monitor – Creates the abstraction of a transaction, analogous to the way an operating system creates the abstraction of a process – TP monitor and DBMS together guarantee the special properties of transactions • A Transaction Processing System consists of TP monitor, databases, and transactions transactions Transaction Processing System DBMS database DBMS database TP Monitor System Requirements • High Availability: on-line => must be operational while enterprise is functioning • High Reliability: correctly tracks state, does not lose data, controlled concurrency • High Throughput: many users => many transactions/sec • Low Response Time: on-line => users are waiting System Requirements (con’t) • Long Lifetime: complex systems are not easily replaced – Must be designed so they can be easily extended as the needs of the enterprise change • Security: sensitive information must be carefully protected since system is accessible to many users – Authentication, authorization, encryption OLTP vs. OLAP • On-line Transaction Processing (OLTP) – Day-to-day handling of transactions that result from enterprise operation – Maintains correspondence between database state and enterprise state • On-line Analytic Processing (OLAP) – Analysis of information in a database for the purpose of making management decisions OLAP • Analyzes historical data (terabytes) using complex queries • Due to volume of data and complexity of queries, OLAP often uses a data warehouse • Data Warehouse - (offline) repository of historical data generated from OLTP or other sources • Data Mining - use of warehouse data to discover relationships that might influence enterprise strategy Examples - Supermarket • OLTP – Event is 3 cans of soup and 1 box of crackers bought; update database to reflect that event • OLAP – Last winter in all stores in northeast, how many customers bought soup and crackers together? • Data Mining – Are there any interesting combinations of foods that customers frequently bought together? Database Fundamentals Introduction • • • • • • Introduction to database systems Database Management Systems (DBMS) Type of Database Database Design Database Design Considerations Component of Database Systems Database Design Very basic level database components Database Design • The design of a database is concerned with the way that these components are organised and this determines how efficient the design solution will be. • To design the data structure we need to ask ourselves a few questions, for example: – 1. Do we know all the fields that we need to include so that our records and files contain all the necessary data? – 2. What fields will each of our records contain. – 3. Do we allow fields to duplicate in records? This will create data redundancy. – 4. What records are needed in files? Database Fundamentals Introduction • • • • • • Introduction to database systems Database Management Systems (DBMS) Type of Database Database Design Database Design Considerations Component of Database Systems Database design considerations • Data dependence and Structural dependence – Data is organised by the designer into a structure that suits its application. Thus, there is a structural dependence associated with data in a database. – Structural dependence deals with the mechanism for changing database structure. – Good database design should ensure that it is relatively easy to make the necessary modifications to the structure. – For example, the addition of a field to an existing file structure. – data dependence and is concerned with the issues of data itself, rather than the structure. – For example, data dependence will be concerned with how easy it is to change a field from say being an integer to a decimal. Database design considerations • Data Redundancy – Data Redundancy is perhaps the most common problem in dB design. This is to say that the same data is stored in a number of locations. For example, a student’s address is kept in the finance file as well as enrolment and exam results file. – Good database design will ensure that the data is not duplicated in this manner. This will save on storage space (memory) and also improve the speed of data access and processing. – data integrity, refers to the consistency of data which is duplicated. – For example, if our design requires us to maintain a student’s email address in a number of different locations, then failing to update a change in e-mail address in all locations will result in data inconsistency. Database design considerations • Data anomalies – Modification, insertion and deletion anomalies all relate to data in a one-to-many relationship. For example, there is one personal tutor to many students. So this is a one-to-many relationship and the database will keep a record indicating the personal tutor for each student. If the tutor’s telephone number changes then, because the relationship is one-to-many, all student files have to be updated with this new information. Failure to update will result in a data anomaly, which in this case is referred to as a modification anomaly. – Similarly we face problems with for example inserting new tutor, when this is required and deleting an old tutor. All the students files have to be updated accordingly, and failure to do this will result in an insertion anomaly or conversely a deletion anomaly. Database Fundamentals Introduction • • • • • • Introduction to database systems Database Management Systems (DBMS) Type of Database Database Design Database Design Considerations Component of Database Systems Components of Database Systems Components of Database Systems • A Database system will comprise of hardware and software components. – The hardware part includes the computer hardware but it is worth noting that besides the computer and peripheral hardware, computer networks hardware plays a significant role in modern database systems. – The software components include the operating system (O/S), the database engine (DBMS) and the additional tools including application and utility software. • The operating system refers to the software that makes all the resources available to users. The resources usually include hardware such as printers, hard disks, network interface cards, multimedia equipment, but this can also include software, such as applications that need to be shared between users. Common operating systems include UNIX, Linux, Microsoft DOS and Windows, Mac and MVS from IBM. The database engine sits above the O/S and uses it to communicate with resources. It has its own software tools to help the users in their efforts to communicate with the database and obtain the required information. Common DMS include Microsoft access, Oracle, IBMs DB2.