DATABASE SYSTEMS
Chapter 4

Objectives
After studying this chapter, you should:
● Understand the operational problems inherent in the flat-file approach to data management that gave rise to the database approach
● Understand the relationships among the fundamental components of the database concept
● Recognize the defining characteristics of three database models: hierarchical, network, and relational
● Understand the operational features and associated risks of deploying centralized, partitioned, and replicated database models in the DDP environment
● Be familiar with the audit objectives and procedures used to test data management

DATA MANAGEMENT APPROACH
FLAT-FILE APPROACH
● The flat-file approach is most often associated with so-called legacy systems.
The flat-file environment promotes a single-user view of data management in which end users own their data files rather than share them with other users. Data files are therefore structured, formatted, and arranged to suit the specific needs of the owner or primary user of the data.
● Data redundancy: replication of essentially the same data in multiple files.

THREE SIGNIFICANT PROBLEMS IN THE FLAT-FILE ENVIRONMENT
Data redundancy contributes to three problems:
● Data storage
● Data updating
● Currency of information
A fourth problem, not caused by redundancy:
● Task-data dependency

DATA MANAGEMENT APPROACH
THE DATABASE APPROACH
This approach centralizes the organization's data into a common database that is shared by the user community.
● Access to the data resource is controlled by a database management system (DBMS).
The DBMS is a special software system that is programmed to know which data elements each user is authorized to access.
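
The statement that the DBMS "knows which data elements each user is authorized to access" can be pictured as a lookup against an authority table. The sketch below is only an illustration under assumed names: the users, the data elements, and the helpers `can_access` and `read_element` are hypothetical and do not describe how any particular DBMS works.

```python
# Hypothetical authority table: user -> data elements that user may read.
AUTHORITY_TABLE = {
    "ar_clerk": {"customer_name", "account_balance"},
    "sales_rep": {"customer_name", "credit_limit"},
}

# Shared database: one copy of each data element serves every user.
DATABASE = {
    "customer_name": "J. Smith",
    "account_balance": 1200.00,
    "credit_limit": 5000.00,
}


def can_access(user: str, element: str) -> bool:
    """Return True if the authority table lists the element for the user."""
    return element in AUTHORITY_TABLE.get(user, set())


def read_element(user: str, element: str):
    """Return the element's value, or raise PermissionError if not authorized."""
    if not can_access(user, element):
        raise PermissionError(f"{user} may not access {element}")
    return DATABASE[element]
```

Both users read the same stored `customer_name`, so there is no second copy to fall out of date, while each user is limited to the elements listed for them.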

KEY ELEMENTS OF THE DATABASE ENVIRONMENT
● DBMS
● USERS
● DATABASE ADMINISTRATOR (DBA)
● PHYSICAL DATABASE

Database Management System
● Typical Features
1. PROGRAM DEVELOPMENT - The DBMS contains application development software. Both programmers and end users may employ this feature to create applications to access the database.
2. BACKUP and RECOVERY - During processing, the DBMS periodically makes backup copies of the physical database.
3. DATABASE USAGE REPORTING - This feature captures statistics on what data are being used, when they are used, and who uses them.
4. DATABASE ACCESS - The most important feature of a DBMS is to permit authorized user access, both formal and informal, to the database.

THREE SOFTWARE MODULES
● DATA DEFINITION LANGUAGE (DDL) is a programming language used to define the database to the DBMS. It defines the database at three levels:
● Physical view/internal view
● Conceptual view/logical view (schema)
● External view/user view (subschema)

THREE SOFTWARE MODULES
● DATA MANIPULATION LANGUAGE (DML) is the proprietary programming language that a particular DBMS uses to retrieve, process, and store data. Selected DML commands can be inserted into programs that are written in universal languages such as COBOL and FORTRAN.

THREE SOFTWARE MODULES
● QUERY LANGUAGE
The second method of database access is the informal method of queries. This feature allows authorized users to process data independently of professional programmers by providing a "friendly" environment for integrating and retrieving data to produce query reports.

[Figure: DBMS operation]

DATABASE ADMINISTRATOR
The DBA is responsible for managing the database resource. The sharing of a common database by multiple users requires organization, cooperation, rules, and guidelines to protect the integrity of the database.
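
As a rough illustration of the three software modules described above (data definition, data manipulation, and query), the sketch below uses Python's built-in `sqlite3` module. SQLite is an assumption chosen for convenience and is not named in the slides; in it, a single language (SQL) plays all three roles, unlike the proprietary DML the slides describe. The table, view, and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition: describe the conceptual view (schema) to the DBMS,
# plus a restricted user view (subschema) that hides the balance column.
conn.execute(
    "CREATE TABLE customer ("
    " cust_num INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " balance REAL NOT NULL)"
)
conn.execute(
    "CREATE VIEW customer_names AS SELECT cust_num, name FROM customer"
)

# Data manipulation: store and update records from within a host program.
conn.executemany(
    "INSERT INTO customer (cust_num, name, balance) VALUES (?, ?, ?)",
    [(1, "Smith", 250.0), (2, "Jones", 0.0)],
)
conn.execute("UPDATE customer SET balance = balance + 100 WHERE cust_num = 2")
conn.commit()

# Query: ad hoc retrieval to produce a simple report.
owing = conn.execute(
    "SELECT name, balance FROM customer WHERE balance > 0 ORDER BY cust_num"
).fetchall()

# Columns visible through the user view.
view_columns = [
    col[0] for col in conn.execute("SELECT * FROM customer_names").description
]
```

A user who is given only `customer_names` sees customer numbers and names but not balances, which is the role the slides assign to the external view.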

FUNCTIONS OF THE DATABASE ADMINISTRATOR
Database Planning:
● Develop organization's database strategy
● Define database environment
● Define data requirements
● Develop data dictionary
Design:
● Schema
● Subschema
● Internal view of databases
● DB controls
Implementation:
● Determine access policy
● Implement security controls
● Specify test procedures
● Establish programming standards
Operation and Management:
● Evaluate database performance
● Reorganize database as user needs demand
● Review standards and procedures
Change and Growth:
● Plan for change and growth
● Evaluate new technology

[Figure: Organizational interactions of the DBA]

Typical File Processing Operations
1. Retrieve a record from the file based on its primary key value.
2. Insert a record into a file.
3. Update a record in the file.
4. Read a complete file of records.
5. Find the next record in a file.
6. Scan a file for records with common secondary keys.
7. Delete a record from a file.

Data Structures
Data structures are the bricks and mortar of the database. The data structure allows records to be located, stored, and retrieved, and enables movement from one record to another.
Two fundamental components:
● Organization
● Access method

Data Organization
The organization of a file refers to the way records are physically arranged on the storage device. This may be either sequential or random.
Data Access Methods
Access methods are computer programs that are part of the operating system and are used to locate records and to navigate through the database.

The criteria that influence the selection of the data structure include:
1. Rapid file access and data retrieval
2. Efficient use of disk storage space
3. High throughput for transaction processing
4. Protection from data loss
5. Ease of recovery from system failure
6. Accommodation of file growth

DBMS Models
A data model is an abstract representation of the data about entities of interest. These include resources (assets), events (transactions), and agents (personnel, customers, etc.) and their relationships in an organization. The purpose of a data model is to represent entities and define their attributes in a way that is understandable to users.
Three common models are the hierarchical, the network, and the relational models. Because of certain conceptual similarities, we shall examine the hierarchical and network models first. These are termed navigational models because of the explicit links or paths among their data elements.

Database Terminology
Entity and Record Type. An entity is anything about which the organization wishes to capture data. A record type is a physical database representation of an entity. Database designers group together into tables (files) the record types that pertain to specific entities.
Occurrence. The term occurrence relates to the number of records represented by a particular record type.
Attributes. Entities are defined by attributes.
Database. A database is the set of record types that an organization needs to support its business processes.
Associations. Record types that constitute a database exist in relation to other record types. This is called an association. Three basic associations are one-to-one, one-to-many, and many-to-many.

The Hierarchical Model
The earliest DBMSs were based on the hierarchical data model.
This was a popular method of data representation because it reflected, more or less accurately, the many aspects of an organization that are hierarchical in relationship. IBM's Information Management System (IMS) is the most prevalent example of a hierarchical database. It was introduced in 1968 and is still a popular database model over 40 years later.
The hierarchical model is constructed of sets that describe the relationship between two linked files. Each set contains a parent and a child. Files at the same level with the same parent are called siblings. This structure is also called a tree structure. The highest level in the tree is the root segment, and the lowest file in a particular branch is called a leaf.

Navigational Databases
The hierarchical data model is called a navigational database because traversing the files requires following a predefined path.

[Figure: Data integration in a hierarchical database]

Limitations of the Hierarchical Model
The hierarchical model presents an artificially constrained view of data relationships. Because it rests on the proposition that all business relationships are hierarchical (or can be represented as such), this model does not always reflect reality. The following rules, which govern the hierarchical model, reveal its operating constraints:
1. A parent record may have one or more child records.
2. No child record can have more than one parent.

The Network Model
The Conference on Data Systems Languages (CODASYL) formed a Database Task Group to develop standards for database design. Its specifications, published around 1970, defined the network model for databases.
The most popular example of the network model is the Integrated Database Management System (IDMS), which Cullinane/Cullinet Software introduced into the commercial market in the 1980s.

The Relational Model
E. F. Codd originally proposed the principles of the relational model in the late 1960s. The formal model has its foundation in relational algebra and set theory, which provide the theoretical basis for most of the data manipulation operations used.
The relational model portrays data in the form of two-dimensional tables. Properly designed tables possess the following four characteristics:
1. All occurrences at the intersection of a row and a column are a single value. No multiple values (repeating groups) are allowed.
2. The attribute values in any column must all be of the same class.
3. Each column in a given table must be uniquely named.
4. Each row in the table must be unique in at least one attribute. This attribute is the primary key.

DATABASE in a distributed environment
Two Basic Options:
● Centralized Databases
● Distributed Databases

DATABASE in a distributed environment
Centralized Databases
● The first approach involves retaining the data in a central location. Remote IT units send requests for data to the central site, which processes the requests and transmits the data back to the requesting IT unit.

Data currency in a DDP environment
During data processing, account balances pass through a state of temporary inconsistency in which their values are incorrectly stated. To achieve data currency, simultaneous access to individual data elements by multiple IT units must be prevented. The solution to this problem is to employ a database lockout, which is a software control (usually a function of the DBMS) that prevents multiple simultaneous accesses to data.
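
The lockout idea can be sketched in a few lines of Python. This is an illustration only, under stated assumptions: two threads stand in for two IT units, a `threading.Lock` stands in for the DBMS lockout, and the `Account` class and `run_unit` helper are hypothetical names, not part of any real DBMS.

```python
import threading


class Account:
    """A single shared data element (an account balance) with a lockout."""

    def __init__(self, balance: int) -> None:
        self.balance = balance
        self._lock = threading.Lock()  # stands in for the DBMS lockout

    def post(self, amount: int) -> None:
        # Only one unit at a time may read, change, and rewrite the balance,
        # so no unit ever sees the temporarily inconsistent value.
        with self._lock:
            current = self.balance
            self.balance = current + amount


def run_unit(account: Account, amounts: list[int]) -> None:
    """Post a batch of transactions on behalf of one IT unit."""
    for amount in amounts:
        account.post(amount)


account = Account(1000)
units = [
    threading.Thread(target=run_unit, args=(account, [10] * 1000)),
    threading.Thread(target=run_unit, args=(account, [-5] * 1000)),
]
for unit in units:
    unit.start()
for unit in units:
    unit.join()

final_balance = account.balance
```

With the lock in place, every posting is applied to the latest balance, so the final figure equals the opening balance plus all postings regardless of how the two units interleave.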

DATABASE in a distributed environment
Distributed Databases
● Distributed databases fall into two categories: partitioned databases and replicated databases. This section examines the issues, features, and trade-offs that need to be evaluated in deciding the disposition of the databases.

Distributed databases
● Partitioned databases
The partitioned database approach splits the central database into segments or partitions that are distributed to their primary users. The advantages of this approach:
● Having data stored at local sites increases users' control.
● Transaction processing response time is improved by permitting local access to data and reducing the volume of data that must be transmitted between IT units.
● Partitioned databases can reduce the potential effects of a disaster. By locating data at several sites, the loss of a single IT unit does not eliminate all data processing by the organization.

The Deadlock Phenomenon
In a distributed environment, it is possible for multiple sites to lock each other out of the database, thus preventing each from processing its transactions. A deadlock is a permanent condition that must be resolved by special software that analyzes each deadlock condition to determine the best solution.

Deadlock Resolution
Resolving a deadlock usually involves terminating one or more transactions to complete processing of the other transactions in the deadlock. Some of the factors that are considered in this decision follow:
● The resources currently invested in the transaction.
● The transaction's stage of completion.
● The number of deadlocks associated with the transaction.

Replicated Databases
Replicated databases are effective in companies where there is a high degree of data sharing but no primary user.
Because common data are replicated at each IT unit site, the data traffic between sites is reduced considerably.

Concurrency Control
Database concurrency is the presence of complete and accurate data at all user sites. System designers need to employ methods to ensure that transactions processed at each site are accurately reflected in the databases of all the other sites. A commonly used method for concurrency control is to serialize transactions.

Database Distribution Methods and the Accountant
The decision to distribute databases is one that should be entered into thoughtfully. There are many issues and trade-offs to consider. Here are some of the most basic questions to be addressed:
● Should the organization's data be centralized or distributed?
● If data distribution is desirable, should the databases be replicated or partitioned?
● If replicated, should the databases be totally replicated or partially replicated?
● If the database is to be partitioned, how should the data segments be allocated among the sites?

CONTROLLING AND AUDITING DATA MANAGEMENT SYSTEMS
Controls over data management systems fall into two categories: access controls and backup controls.
● ACCESS CONTROLS
In the shared database environment, access control risks include corruption, theft, misuse, and destruction of data. These threats originate from both unauthorized intruders and authorized users who exceed their access privileges.
User Views
The user view or subschema is a subset of the total database that defines the user's data domain and provides access to the database.

Inference Controls
One advantage of the database query capability is that it provides users with summary and statistical data for decision making.
To preserve the confidentiality and integrity of the database, inference controls should be in place to prevent users from inferring, through query features, specific data values that they are not authorized to access.
Inference controls attempt to prevent three types of compromises to the database:
1. Positive compromise - the user determines the specific value of a data item.
2. Negative compromise - the user determines that a data item does not have a specific value.
3. Approximate compromise - the user is unable to determine the exact value of an item but is able to estimate it with sufficient accuracy to violate the confidentiality of the data.

Audit Procedures for Testing Database Access Controls
● Responsibility for authority tables and subschemas
● Appropriate access authority
● Biometric controls
● Inference controls
● Encryption controls

Backup Controls
Data can be corrupted and destroyed by malicious acts from external hackers and disgruntled employees, and by disk failures, program errors, fires, floods, and earthquakes. To recover from such disasters, organizations must implement policies, procedures, and techniques that systematically and routinely provide backup copies of critical files.

Audit Objective Relating to Flat-File Backup
Verify that the backup controls in place are effective in protecting data files from physical damage.
Audit Procedures for Testing Flat-File Backup Controls
• Sequential file backup
• Backup transaction files
• Off-site storage

Backup
The backup feature makes a periodic backup of the entire database. This is an automatic procedure that should be performed at least once a day.
Transaction Log (Journal)
Provides an audit trail of all processed transactions.
It lists transactions in a transaction log file and records the resulting changes to the database in a separate database change log.

Checkpoint Feature
The checkpoint facility suspends all data processing while the system reconciles the transaction log and the database change log against the database.
Recovery Module
Uses the logs and backup files to restart the system after a failure.

Audit Objective Relating to Database Backup
Verify that controls over the data resource are sufficient to preserve the integrity and physical security of the database.
Audit Procedures for Testing Database Backup Controls
● The auditor should verify that backup is performed routinely and frequently to facilitate the recovery of lost, destroyed, or corrupted data without excessive reprocessing.
● The auditor should verify that automatic backup procedures are in place and functioning and that copies of the database are stored off-site for further security.

Chapter 4
Thank you for listening
GROUP 4
MONTOYA, ALEISSA GRACE
SACMAN, CARMELA
SORIANO, VANESSA
SY, ANGELIKA JOYCE