Data Management Options
Dr. Merle P. Martin
MIS Department
CSU Sacramento

Acknowledgments
  Dr. Russell Ching (MIS Dept) - Source Material / Graphics
  Edie Schmidt (UMS) - Graphic Design
  Prentice Hall Publishing (Permissions)
    Martin, Analysis and Design of Business Information Systems, 1995

Agenda
  Why manage data?
  Definitions
  Typical problems
  Data Administrator
  The DBMS
  Distributing data

Why Manage Data?
  Delayed output (paycheck)
  Locate a resource
    Where is the stock item stored?
    Where does the employee work?

Why Manage Data?
  Make resource decisions
    Should we turn account over to collection agency?
    Should we send customer letter asking why he / she hasn’t shopped here in 6 months?
    Should we give employee overtime?

Why Manage Data?
  Determine resource status
    Is there enough stock in warehouse to satisfy this customer’s order?
    How much should I order?
  What is the value of this resource?
    balance sheet

Definitions
  File: resource inventory
    Material
    People
      Employees, customers
    Funds
      Customer balances
      Accounts Payable

Definitions
  Data Organization
    Bit / byte
    Character
    Field
    Record
    File
    DBMS

Data Hierarchy for Stereos to Go
  [Figure: Database > File > Record (12345 Smith John A 123 Main Street Sacramento CA 95819) > Field (Smith) > Character / Byte (10110011) > Bit (1)]

Definitions
  Views:
    Physical - how stored
    Logical - how viewed and used
  Volatility: % of records that change
  Immediacy: rapidity of change

Storage Problems
  Redundancy
  Accuracy
  Security
  Lack of data sharing
  Report inflexibility
  Inconsistent data definitions
  Too much data
    information overload

Data Administrator
  Clean up data definitions
  Control shared data
  Manage distributed data
  Maintain data quality

Clean Up Definitions
  Synonyms / aliases
  Standard data definitions
    names and formats
    Date of Birth (AJIS)
      mm/dd/yy (courts)
      dd/mm/yy (corrections)
  Data Dictionary
    COBOL

Control Shared Data
  Local - used by one unit
  Shared - used by two or more activities
  Impact of proposed program changes on shared data
    Program-to-data element matrix
  Control or clearinghouse?

Manage Distributed Data
  Geographically dispersed
    whether shared data or not
  Different levels of detail
    different management levels

  [Figure: information characteristics by management level, ranging from Operational Control through Management Control to Strategic Planning]

  Characteristic         Operational Control    Strategic Planning
  Source                 Internal               External
  Scope                  Well defined           Wide
  Level of Aggregation   Detailed               Aggregate
  Time Horizon           Historical             Future
  Currency               Highly current         Quite old
  Required Accuracy      High                   Low
  Frequency of Use       Very frequent          Infrequent

Maintain Data Quality
  Put owners in charge of data
    verify data accuracy and quality
  Fairbanks Court example
  Who owns the data?

Issue
  Should the Data Administrator control ALL data, or just the data that crosses organizational boundaries?
  WHAT DO YOU THINK?

The DBMS
  Data Base Management System: software that permits a firm to:
    centralize data
    manage them efficiently
    provide access to applications such as payroll, inventory

DBMS Components
  Data Design Language (DDL)
  Data Manipulation Language (DML)
  Inquiry Language (IQL)
  Teleprocessing Interface (TP)

  [Figure: Martin, Figure 16-5. Labels: Designers, DDL, Database, DML, Update, Application Software, Programmers, IQL, Retrieve, End-Users, Teleprocessing Interface]

IQL Language
  [Diagram labels: Data Base, IQL]

    SELECT EMP-ID, EMP-FIRSTNAME, EMP-LASTNAME, EMP-YTD-PAY
    FROM EMPLOYEE
    WHERE EMP-ID=1234 .

3-level Database Model
  James Martin
  Sprague / McNurlin, Fig. 7-2, pg. 207

External Level (1)
  User views (logical)
  By application program
  Each has unique view
  Schema / subschema

Schema and Subschemas
  [Figure: Physical Database > DBMS Software > Schema (overall view of the data) > Subschemas (individual views) > Users]

Enterprise Level (2)
  Under control of Data Administrator
  DBMS
  Implementation data removed
    passwords
    report views

Physical Level (3)
  Schema
  Pointers (e.g., next record)
  Flags (e.g., record frozen)

Traditional Data Models
  Hierarchical - one parent
  Network
    more than one parent
    student to course, major
  Relational (tables)

Hierarchical Model
  [Figure: Project 1 at the top; Dept. A, Dept. B, Dept. C beneath it; Employees 1 through 6 beneath the departments]

Network Model
  [Figure: nodes John Smith, Jane Smith, Savings, Mortgage, Checking]

Relational
  Customer: Account Number, First Name, Middle Initial, Last Name, ..., Credit Limit
  Orders: Order Number, Order Date, Account Number, Date Shipped
  Line Items: Order Number, Line Item Number, Product Code, Quantity
  Products: Product Code, Product Name, Price, Unit, Manufacturer Code
  Manufacturer: Manufacturer Code, Manufacturer Name

Object-oriented DBMS
  An object is:
    a piece of data
    PLUS procedures performed on data
    PLUS attributes describing data
    PLUS relationship between object and other objects

Distributed Data
  Goals:
    move processing as close to users as possible
    allow several applications to run simultaneously on same data

Distributed Types
  Fragmented
    distribute data without duplication
    users unaware of where data located
  Segmented
    data duplicated
    one site has master file
    problem with data synchronization

Why Distribute?
  Save money
    offload DB processes to less expensive machines (PCs)
  Lower telecommunications costs
    DB closer to users
  Decrease dependence on a single computer manufacturer

Why Distribute?
  Move control closer to owner
  Increased DBMS scope
    more varied types of data
    link at workstations
  Permit storage of multimedia data

True Distributed DB
  Local autonomy (ownership)
  No reliance on central site
  Continuous operations
    not affected by another site
  Data transparency
  Independence

Independence
  Fragmentation
  Replication
  Hardware
  Software
  Networks
  Database

Problems With Distributed Databases
  Security
  Shared data
    simultaneous update
  Complexity
  Need telecommunications infrastructure

Issue
  Is data in your organization totally distributed?
    How?
  Should it be?
    Why or why not?

Points to remember
  Definition
  Typical problems
  Role of Data Administrator
  The DBMS
  Distributing data
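
Worked example (Clean Up Definitions)
  A minimal sketch, in Python, of how a data dictionary entry could reconcile the two Date of Birth formats mentioned for AJIS (mm/dd/yy for courts, dd/mm/yy for corrections). The dictionary layout, the function name standardize_dob, and the sample dates are assumptions made for illustration; they are not taken from the original material.

```python
from datetime import date, datetime

# Hypothetical data dictionary entry for the shared element "Date of Birth":
# each source system is mapped to the local format it stores the value in.
DATE_OF_BIRTH_FORMATS = {
    "courts": "%m/%d/%y",       # mm/dd/yy
    "corrections": "%d/%m/%y",  # dd/mm/yy
}


def standardize_dob(raw: str, source: str) -> date:
    """Convert a source-specific Date of Birth string to one standard date.

    Raises KeyError if the source system has no dictionary entry, and
    ValueError if the string does not match that system's format.
    """
    fmt = DATE_OF_BIRTH_FORMATS[source]
    # Caution: two-digit years are ambiguous. Python's %y maps 69-99 to
    # 1969-1999 and 00-68 to 2000-2068, so a real system would need a
    # four-digit year or an explicit century rule.
    return datetime.strptime(raw, fmt).date()


# Made-up sample values: the same birthday as each system would record it.
courts_dob = standardize_dob("03/04/75", "courts")
corrections_dob = standardize_dob("04/03/75", "corrections")

print(courts_dob.isoformat(), corrections_dob.isoformat())
```

  Without the dictionary, "03/04/75" would be read as March 4 by one system and April 3 by the other, which is the inconsistent-definition problem the Data Administrator is asked to clean up.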