Chapter 1
File Systems and Databases
Prof. Sin-Min Lee
Dept. of Computer Science

Introducing the Database: Major Database Concepts
Data and information
  – Data: raw facts
  – Information: processed data
Data management
Database
Metadata
Database management system (DBMS)

Figure 1.1: Sales per Employee for Each of ROBCOR’s Two Divisions

Introducing the Database: Importance of DBMS
It helps make data management more efficient and effective.
Its query language allows quick answers to ad hoc queries.
It provides end users better access to more and better-managed data.
It promotes an integrated view of the organization’s operations (the “big picture”).
It reduces the probability of inconsistent data.

Figure 1.2: The DBMS Manages the Interaction Between the End User and the Database

Introducing the Database: Why Is Database Design Important?
A well-designed database facilitates data management and becomes a valuable information generator.
A poorly designed database is a breeding ground for uncontrolled data redundancies.
A poorly designed database generates errors that lead to bad decisions.

Historical Roots: Why Study File Systems?
It provides historical perspective.
It teaches lessons that help avoid the pitfalls of data management.
Its simple characteristics make the design complexity of a database easier to understand.
It provides useful knowledge for converting a file system to a database system.

Figure 1.3: Contents of the CUSTOMER File

Table 1.1: Basic File Terminology
Data: “Raw” facts that have little meaning unless they have been organized in some logical manner. The smallest piece of data that can be “recognized” by the computer is a single character, such as the letter A, the number 5, or a symbol such as ' ? > * +. A single character requires one byte of computer storage.
Field: A character or group of characters (alphabetic or numeric) that has a specific meaning.
A field might define a telephone number, a birth date, a customer name, a year-to-date (YTD) sales value, and so on.
Record: A logically connected set of one or more fields that describes a person, place, or thing. For example, the fields that make up a record for a customer named J. D. Rudd might consist of J. D. Rudd’s name, address, phone number, date of birth, credit limit, unpaid balance, and so on.
File: A collection of related records. For example, a file might contain data about ROBCOR Company’s vendors, or a file might contain the records for the students currently enrolled at Gigantic University.

Figure 1.4: Contents of the AGENT File

Figure 1.5: A Simple File System

File System Critique: File System Data Management
File systems require extensive programming in a third-generation language (3GL).
As the number of files expands, system administration becomes difficult.
Making changes in existing file structures is important and difficult.
Security features to safeguard data are difficult to program and are usually omitted.
The difficulty of pooling data creates islands of information.

File System Critique: Structural and Data Dependence
Structural dependence
  – A change in any file’s structure requires the modification of all programs that use that file.
Data dependence
  – A change in any file’s data characteristics requires changes in all data access programs.
  – The significance of data dependence lies in the difference between the logical data format (how a person views the data) and the physical data format (how the computer sees the data).
  – Data dependence makes file systems extremely cumbersome from a programming and data management point of view.

File System Critique: Field Definitions and Naming Conventions
A good (flexible) record definition anticipates reporting requirements by breaking up fields into their components.
Example:
  – Customer Name: Last Name, First Name, Initial
  – Customer Address: Street Address, City, State

FIELD          CONTENTS
CUS_LNAME      Customer last name
CUS_FNAME      Customer first name
CUS_INITIAL    Customer initial
CUS_AREACODE   Customer area code
CUS_PHONE      Customer phone
CUS_ADDRESS    Customer street address or box number
CUS_CITY       Customer city
CUS_STATE      Customer state

File System Critique: Field Definitions and Naming Conventions
Selecting proper field names is very important.
Names must be as descriptive as possible within restrictions.
Naming must reflect the designer’s documentation needs and the user’s reporting and processing requirements.

File System Critique: Data Redundancy
Uncontrolled data redundancy sets the stage for:
  – Data inconsistency (lack of data integrity)
  – Data anomalies
      – Modification anomalies
      – Insertion anomalies
      – Deletion anomalies

Figure 1.6

Figure 1.7: The Database System Environment

Database Systems: The Database System Components
Hardware
  – Computer
  – Peripherals
Software
  – Operating systems software
  – DBMS software
  – Applications programs and utilities software

Database Systems: The Database System Components
People
  – Systems administrators
  – Database administrators (DBAs)
  – Database designers
  – Systems analysts and programmers
  – End users
Procedures
  – Instructions and rules that govern the design and use of the database system
Data
  – Collection of facts stored in the database

Database Systems: The Database System Components
The complexity of database systems depends on various organizational factors:
  – Organization’s size
  – Organization’s function
  – Organization’s corporate culture
  – Organizational activities and environment
Database solutions must be cost effective AND strategically effective.
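
Worked sketch: data redundancy and anomalies
To make the data redundancy critique above concrete, here is a minimal Python sketch of a modification anomaly and a deletion anomaly. It assumes a flat customer file in which each record repeats its agent's phone number. The records, the AGENT_NAME and AGENT_PHONE fields, and the helper functions are hypothetical and are not taken from the figures in this chapter.

```python
from collections import defaultdict

# Hypothetical flat CUSTOMER file: the agent's phone number is repeated in
# every customer record (uncontrolled redundancy).
customer_file = [
    {"CUS_LNAME": "Rudd", "AGENT_NAME": "Agent A", "AGENT_PHONE": "555-0100"},
    {"CUS_LNAME": "Smith", "AGENT_NAME": "Agent A", "AGENT_PHONE": "555-0100"},
    {"CUS_LNAME": "Jones", "AGENT_NAME": "Agent B", "AGENT_PHONE": "555-0199"},
]


def inconsistent_agents(records):
    """Return agents that appear with more than one phone number."""
    phones = defaultdict(set)
    for rec in records:
        phones[rec["AGENT_NAME"]].add(rec["AGENT_PHONE"])
    return {name: sorted(nums) for name, nums in phones.items() if len(nums) > 1}


def known_agents(records):
    """Return the set of agents the file still has any data about."""
    return {rec["AGENT_NAME"] for rec in records}


# Modification anomaly: the new phone number reaches only one of the copies,
# so the file now holds two conflicting values for Agent A.
customer_file[0]["AGENT_PHONE"] = "555-0142"
modification_result = inconsistent_agents(customer_file)

# Deletion anomaly: removing Agent B's only customer also erases Agent B.
after_delete = [rec for rec in customer_file if rec["CUS_LNAME"] != "Jones"]
deletion_result = known_agents(after_delete)

print(modification_result)
print(deletion_result)
```

An insertion anomaly is the mirror image of the deletion case: in this layout a new agent cannot be recorded until that agent has at least one customer.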
Database Systems: Types of Database Systems
Number of users
  – Single-user
      – Desktop database
  – Multiuser
      – Workgroup database
      – Enterprise database
Scope
  – Desktop
  – Workgroup
  – Enterprise

Database Systems: Types of Database Systems
Location
  – Centralized
  – Distributed
Use
  – Transactional (production)
  – Decision support
  – Data warehouse
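
Worked sketch: structural dependence
Returning to the structural dependence critique from the file system slides, the following Python sketch shows why a change in a file's structure forces changes in the programs that read it. It is a simplified illustration under stated assumptions, not how any particular DBMS works: the comma-separated layouts, the sample values, and both functions are hypothetical, and only the field names CUS_LNAME, CUS_CITY, and CUS_PHONE come from the field table earlier in the chapter.

```python
import csv
import io

# Hypothetical CUSTOMER file, old structure: last name, phone.
OLD_FILE = "Rudd,555-0100\nSmith,555-0111\n"
OLD_LAYOUT = ["CUS_LNAME", "CUS_PHONE"]

# New structure after a city field is inserted: last name, city, phone.
NEW_FILE = "Rudd,Nashville,555-0100\nSmith,Knoxville,555-0111\n"
NEW_LAYOUT = ["CUS_LNAME", "CUS_CITY", "CUS_PHONE"]


def phones_by_position(text):
    """File-system style: the program hard-codes the phone as column 1."""
    return [row[1] for row in csv.reader(io.StringIO(text))]


def phones_by_name(text, layout):
    """Metadata-driven style: the layout is supplied as data, so the
    program asks for the field by name instead of by position."""
    reader = csv.DictReader(io.StringIO(text), fieldnames=layout)
    return [row["CUS_PHONE"] for row in reader]


# The positional program works on the old structure...
print(phones_by_position(OLD_FILE))
# ...but silently returns cities once the structure changes.
print(phones_by_position(NEW_FILE))
# The name-based program keeps working when given the new layout.
print(phones_by_name(NEW_FILE, NEW_LAYOUT))
```

The design point is that the layout lives outside the program in the second function, which is a rough stand-in for the role metadata plays in a database system.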