Introduction to Data Management
Chapter 1, Pratt & Adamski

Data and Information
  DATA: Facts concerning people, objects, events or other entities. Databases store data.
  INFORMATION: Data presented in a form suitable for interpretation. Data is converted into information by programs and queries. Data may be stored in files or in databases. Neither one stores information.
  KNOWLEDGE: Insights into appropriate actions based on interpreted data.

Knowledge Generation
  [Figure: DATA, INFORMATION]

Basic Principles
  DATABASE: A shared collection of interrelated data designed to meet the varied information needs of an organization.
  DATABASE MANAGEMENT SYSTEM: A collection of programs to create and maintain a database.
    - Define
    - Construct
    - Manipulate

Advantages of Database Processing
  - More information from same data
  - Shared data
  - Balancing conflicts among users
  - Controlled redundancy
  - Consistency
  - Integrity
  - Security
  - Increased productivity
  - Data independence

Disadvantages of Database Processing
  - Increased size
  - Increased complexity
  - More expensive personnel
  - Increased impact of failure
  - Difficulty of recovery
  - Cost, especially for server and mainframe systems

Objectives of the DBMS Approach
  - SELF-DESCRIBING
  - DATA INDEPENDENCE
  - MULTIPLE VIEWS
  - MULTIPLE USERS

What is a Database Management System?
  - Data Files
  - Directory
  - Access Engine
  - Utility Programs
  [Figure: Database containing DATA, METADATA, ACCESS ENGINE, UTILITIES]

Files and Databases

Metadata
  "Data about data"
  - Description of fields
  - Display and format instructions
  - Structure of files and tables
  - Security and access rules
  - Triggers and operational rules

Database Access
  [Figure: USER INTERFACE, DATABASE PROGRAM]

History of Database Management
  - File Management Systems
  - Hierarchical Model: IBM "Information Management System (IMS)", 1966
  - Network Model: Charles Bachman's "Integrated Data Store (IDS)", 1965; Conference on Data Systems Languages / DataBase Task Group, CODASYL/DBTG (1971)
  - Relational Model: E.F. Codd, 1970

File Management Systems
  Provided facilities to extract data and share files, but did not implement any way to connect records in one file to those in another. Relationships had to be implemented in application code.

Database vs File Systems
  [Figure. FILE SYSTEM: Program 1 with Meta-Data, Program 2 with Meta-Data, Program 3 with Meta-Data; Data. DATABASE: Program 1, Program 2, Program 3; MetaData, Data.]

Structured Databases
  Relationships were implemented by physical pointers (called "sets"), which connected records in different files. Hierarchical databases allow only one parent set; networks allow several. Sets permit efficient processing, but they must be constructed on data entry and cannot be rearranged later.

Relational Models
  Relational models implement relationships with matched data values in related files (called primary and foreign keys). Any attributes can be matched. The connection is established at retrieval, so interconnections can be developed as needed.

Hierarchy
  [Figure: SECTION, STUDENT, COLLEGE, INSTRUCTOR, COLLEGE]
  Each file can have only one parent. To implement a second "parent" (COLLEGE) we have to implement a shadow copy.

Network
  [Figure: SECTION, STUDENT, INSTRUCTOR, COLLEGE]
  Each file can have several parents. Both SECTION and COLLEGE are "parent" files.

Relational
  [Figure: SECTION; SECTION-STUDENT (SECTION-KEY, STUDENT-KEY); SECTION-INSTRUCTOR (SECTION-KEY, INSTRUCTOR-KEY); STUDENT (COLLEGE-KEY); INSTRUCTOR (COLLEGE-KEY); COLLEGE]
  Each file can have several parents. Both SECTION and COLLEGE are "parent" files.
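
The relational idea above, relationships carried by matching key values and resolved only at retrieval, can be sketched with Python's built-in sqlite3 module. This is a minimal illustration and not part of the original slides: the table layout loosely follows the SECTION / SECTION-STUDENT / STUDENT / COLLEGE labels, while the column names, the sample rows, and the helper names build_demo_db and students_in_section are assumptions made for the example.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database loosely modeled on the slide's files.

    All column names and sample rows are hypothetical.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE college (college_key TEXT PRIMARY KEY, name TEXT);
        CREATE TABLE student (
            student_key INTEGER PRIMARY KEY,
            name        TEXT,
            college_key TEXT REFERENCES college(college_key)  -- foreign key
        );
        CREATE TABLE section (section_key TEXT PRIMARY KEY, title TEXT);
        -- Linking file: one row per (section, student) pairing.
        CREATE TABLE section_student (
            section_key TEXT REFERENCES section(section_key),
            student_key INTEGER REFERENCES student(student_key),
            PRIMARY KEY (section_key, student_key)
        );
    """)
    conn.executemany("INSERT INTO college VALUES (?, ?)",
                     [("BUS", "Business"), ("ENG", "Engineering")])
    conn.executemany("INSERT INTO student VALUES (?, ?, ?)",
                     [(1, "Ana", "BUS"), (2, "Ben", "ENG"), (3, "Cy", "BUS")])
    conn.executemany("INSERT INTO section VALUES (?, ?)",
                     [("DB101", "Data Management"), ("AC200", "Accounting")])
    conn.executemany("INSERT INTO section_student VALUES (?, ?)",
                     [("DB101", 1), ("DB101", 2), ("AC200", 3)])
    return conn


def students_in_section(conn, section_key):
    """Return (student name, college name) pairs for one section.

    No pointers are stored between the files; the connection is made
    at retrieval time by matching key values in the JOIN conditions.
    """
    return conn.execute("""
        SELECT student.name, college.name
        FROM section_student
        JOIN student ON student.student_key = section_student.student_key
        JOIN college ON college.college_key = student.college_key
        WHERE section_student.section_key = ?
        ORDER BY student.name
    """, (section_key,)).fetchall()


conn = build_demo_db()
print(students_in_section(conn, "DB101"))
```

Because the match happens in the query, a different question (for example, all sections taken by students of one college) needs only a different JOIN, not a restructured file.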
Relational Terminology
  Entity: Person, place, thing or event about which we wish to keep data
  Attribute: Property of an entity
  Relationship: An association among entities (entity records)

KERR MCGEE'S LIFE CYCLE
  Columns: STAGE, PROCESS MODEL, DATA MODEL
  Initialization: Report; Report
  Feasibility: Report; High Level DFD; Process Analysis (Business Chart); High Level E/R Diagram
  Requirements: General DFD; High Level Dictionary; Top Down E/R; File Specifications
  Requirements: Logical DFD; Data Dictionary; File Specifications; Process Logic; Bottom Up E/R; Action Diagrams
  System Design: Structure Charts; Module IPO Specification; Screen/Report Layouts; Cleanup; Volume/Usage Analysis; Physical Schema; Index/Record Specs
  Coding/Testing: Test Plan; Logs and Documentation; Code
  Implementation: Installation Plan; Population Plan

Data Management
  Designing and managing information in a database environment requires:
  - Understanding the principles of data modeling in system design.
  - Using SQL for data manipulation.
  - Understanding the concepts of managing data in a database environment.

Information System Modeling Approaches
  PROCESS MODELING: The traditional method of designing systems by following the changes to data flows.
  DATA MODELING: An approach to system development that specifies the file structure that conforms to the things important to the organization.
  PROTOTYPING: An iterative approach that focuses on building small operating
  OBJECT MODELING (Event driven design): Defines objects that contain data and associated processing rules encapsulated together.
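
As a rough illustration of the object modeling definition above, the sketch below bundles data with the rules that govern it. The Section class, its capacity rule, and the student names are hypothetical and chosen only to echo the SECTION and STUDENT files used earlier; they do not come from the slides.

```python
class Section:
    """Hypothetical object: enrollment data plus the rules that guard it."""

    def __init__(self, section_key, capacity):
        self.section_key = section_key
        self.capacity = capacity
        self._students = []  # data kept inside the object

    def enroll(self, student_name):
        """Processing rule stored with the data: no duplicates, no overfilling."""
        if student_name in self._students:
            raise ValueError(f"{student_name} is already enrolled")
        if len(self._students) >= self.capacity:
            raise ValueError(f"{self.section_key} is full")
        self._students.append(student_name)

    def roster(self):
        """Return a copy so callers cannot bypass the enrollment rules."""
        return list(self._students)


section = Section("DB101", capacity=2)
section.enroll("Ana")
section.enroll("Ben")
print(section.roster())
```

Callers change the roster only through enroll, so the capacity rule travels with the data instead of being repeated in every program that touches it.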