ASPECTS OF FILE PROCESSING APPROACH USERS HAVE REQUIREMENTS FOR INFORMATION SYSTEM APPLICATIONS. USERS EITHER “COMMISSION” IS STAFF TO DEVELOP APPLICATION, OUTSOURCE IT, OR INCREASINGLY, DEVELOP IT THEMSELVES USING PC SOFTWARE. REGARDLESS OF WHO DOES IT, PROGRAMMING PROCESS IS ERROR-PRONE AND TIME-CONSUMING. EACH APPLICATION REQUIRES ITS OWN DATA IN ITS OWN FORMATS STORED IN ITS OWN FILES. Orders Department Program A Program B Accounting Department Program C Order Filling System Customer Master file Inventory Master file Program A Program B Invoicing System Back Order file Inventory Pricing file Customer Master file Payroll Department Program A Program C Payroll System Employee Master file PROBLEMS WITH FILE PROCESSING ENVIRONMENT UNCONTROLLED REDUNDANCY - Data - Programs DATA INCONSISTENCY - Redundant data is difficult to keep “in synch” LIMITED SHARING OF DATA - File structure incompatibility (ex: ISAM vs Direct Access) - System software incompatibility (ex: Oracle vs M/S Access) INFLEXIBILITY OF AVAILABLE INFORMATION - Files tend to isolate information and restrict what can be retrieved POOR ENFORCEMENT OF STANDARDS - “Random” naming conventions (ex: homonyms and synonyms) INFORMATION RESOURCE MANAGEMENT MORE DIFFICULT - Program maintenance (exs: “Which systems use which data?” “If we convert from 5-digit to 9-digit zip codes, what programs need to be changed?”; “What files and programs are not Y2K compatible?”) - Data security: multiple files ==> more vulnerability ELEMENTS OF DB APPROACH (1) DATA AS A RESOURCE Data considered as a resource in same way hardware/software/personnel are considered as resources. (2) DATA INDEPENDENCE Data format/structure independent of particular application. User view of data not tied to physical representation of data. (3) DATABASE MANAGEMENT SYSTEM (DBMS) DBMS is software which coordinates data resources in the same way that an operating system (OS) coordinates system resources. (4) DATABASE MANAGEMENT The management of hardware/software/ DBMS/data which implements DB approach. DATABASE APPROACH (1) Put data into common databases that can be accessed and shared by multiple users. Appl. Prog. 1 DB1 DB2 DB3 Appl. Prog. 2 Shared Data Resource (2) Provide uniform storage, access, and retrieval methodology for data. Ex: High level commands like STORE, UPDATE, RETRIEVE, etc. (3) Provide generalized, user-friendly interfaces so non-DP personnel may interact directly with data without programmer’s aid. Ex: Query language (SQL), DB application generators (PowerBuilder, M/S Access) BENEFITS OF DATABASE PROCESSING MINIMAL DATA REDUNDANCY Redundancy reduced (not eliminated) ==> controlled redundancy. CONSISTENCY OF DATA Inconsistency managed via single, centralized source of data. INTEGRATION OF DATA Data are organized into single, logical structure with logical relationships between “files”; Files can be “joined”, or integrated, where it makes sense; files no longer in isolation. SHARING OF DATA All programs access same data using standardized commands; DBMS supports multiple file structures; Separate “views” of a database can be created for each user. DATA ACCESSIBILITY AND RESPONSIVENESS DBMS supports programmed and unanticipated queries and reports via SQL; UNIFORM SECURITY, PRIVACY, AND INTEGRITY CONTROLS Centralized data facilitates standards and controls; DBMS provides data integrity features (ex: security, concurrency, transaction recovery, etc.). SUPPORTS IRM AND REDUCED PROGRAM MAINTENANCE Dictionary, or repository, stores data about information resources; APPLICATION AND DATA INDEPENDENCE Data can be changed without changing application programs EASE OF APPLICATION DEVELOPMENT Database application environments (e.g., PowerBuilder for Oracle) CHALLENGES OF DATABASE APPROACH ORGANIZATIONAL CONFLICT Data ownership (“Information is power.”); Implementation of data standards. SPECIALIZED PERSONNEL Database personnel (database analysts, DBA, etc) require special skills and training. EXTENSIVE OVERHEAD OF MAINTAINING DATA INTEGRITY Security, privacy, concurrency, recovery and backups. DISTRIBUTED DATA MANAGEMENT Data management is centralized in its approach whereas distributed computing over networks is decentralized. • MAKING DATA AVAILABLE ON THE WEB Data warehouses are used to store large amounts of time-series data for management analysis and decision-making. COMPONENTS OF A DATABASE SYSTEM DATABASE - User Data - Metadata - Indexes - Application Metadata • DATABASE MANAGEMENT SYSTEM (DBMS) - Design tools subsystem - Run-Time subsystem - DBMS Engine Data integrity Data dictionary APPLICATIONS - Forms - Queries - Reports - Menus - Application Programs KINDS OF DATA IN A DATABASE RECORD-BASED SOURCE DATA BITS => BYTES => FIELDS => RECORDS => FILES => DB META-DATA DATA ABOUT DATA IN THE DATABASE. (E.g., information about the data fields, their data types, and acceptable data values, etc.) DATA ABOUT APPLICATIONS (E.g., forms, queries, reports, etc.) OVERHEAD DATA INDEXES, KEYS, ETC. OTHER SOURCE DATA (Bit Map Data) VOICE IMAGES ANIMATION VIDEO FUNCTIONS OF A DBMS DATA DEFINITION Define and store database structure called the Schema (Metadata + Overhead Data) Load source data into database Provide multiple views of data DATA MANIPULATION Provide variety of access methods to tables (ISAM, direct, B-tree, sequential, etc.) Query processing (SQL) Provide database interface commands in existing programming languages such as C++, Java, C, Cobol. DATA INTEGRITY AND CONTROL Security Concurrent processing Backup and recovery Referential integrity APPLICATION ENVIRONMENT Programming language (e.g., VisualBasic) Menus, screens, forms and reports APPLICATION ENVIRONMENTS FOR DB PROCESSING LOOSELY-COUPLED vs TIGHTLY-COUPLED Access Basic vs Visual Basic or C++ DESIGN TOOLS Table creation tool Form creation tool Query creation tool Report creation tool Procedural language compiler RUN-TIME TOOLS Form processor Query processor Report writer Procedural language run-time