Chapter 5: Organizing Data & Information
IS for Management

Data & Databases
Data consists of raw facts that, when organized, may be transformed into information
A database is a collection of data organized to meet users' needs
A Database Management System (DBMS) is a group of programs that manipulate the database & provide an interface between the database & the user of the database or other application programs

The Hierarchy of Data (Figure 5.1)
Database Management System
Database
File (table)
Record (entity, row)
Field (characteristic, column)
Byte (character)

Data Entities, Attributes, & Keys
Entity: A generalized class of people, places, or things for which data is collected, stored, & maintained
• Examples: customers, employees
Attribute: A characteristic of an entity; something the entity is identified by
• Examples: customer name, employee name
Key: A field or set of fields in a record that is a unique identifier of a record
• Examples: social insurance number, customer number

Keys & Attributes (Figure 5.2)
Employee Number   Last Name   First Name   Hire Date   Department Number
005-10-6321       Johns       Francine     10-7-65     257
549-77-1001       Buckley     Bill         2-17-79     650
098-40-1370       Fiske       Steven       1-5-85      598
Attributes: all five columns. Key field: Employee Number.

The Traditional Approach (Figure 5.3)
Separate files are created & stored for each application program

Drawbacks to the Traditional Approach
Data redundancy – Duplication of data in separate files
Lack of data integrity – Data integrity is the degree to which the data in any one file is accurate
Program-data dependence – A situation in which programs & data organized for one application are incompatible with programs & data organized differently for another application
Inability to link data

The Database Approach (Figure 5.4)
A pool of related data
is shared by multiple applications. Rather than having separate data files, each application uses a collection of data that is either joined or related in the database.

Advantages to the Database Approach
Improved strategic use of corporate data
Reduced data redundancy
Improved data integrity
Easier modification & updating
Data & program independence
Better access to data & information
Standardization of data access
A framework for program development
Better overall protection of the data
Shared data & information resources

Disadvantages to the Database Approach
Relatively high cost of purchasing & operating a DBMS in a mainframe operating environment
Increased cost of specialized staff
Increased vulnerability

Database Design
Logical design precedes physical design
– Abstract model of how data should be structured & arranged
– Users should assist in creating logical design
Physical design starts with the logical design
– What specific hardware/software will be used
– Fine-tuning of logical design for performance/cost considerations
– Planned Data Redundancy
  • A way of organizing data in which the logical database design is altered so that certain data entities are combined
    – Summary totals are carried in the data records rather than calculated from elemental data
    – Some data attributes are repeated in more than one data entity to improve database performance

Data Modeling
Data Model – A map or diagram of entities & their relationships
Enterprise data modeling – Data modeling done at the level of the entire organization
Entity-Relationship (ER) diagrams – A data model that uses basic graphical symbols to show the organization of & relationships between data (Figure 5.5)

Database Models
Hierarchical (Figure 5.6): A data model in which the data is organized in a top-down or inverted tree structure
Network (Figure
5.7): An expansion of the hierarchical database model with an owner-member relationship in which a member may have many owners
Relational (Figure 5.8): All data elements are placed in two-dimensional tables, called relations, that are the logical equivalent of files

A Relational Database (3 tables)

Relational Database Terminology
Domain: Allowable values for attributes
Selecting: Data manipulation that eliminates rows (records) according to user-defined criteria
Projecting: Data manipulation that eliminates columns (attributes) in a table
Joining: Data manipulation that combines two or more tables
Linking: Relating tables in a relational database together by a common attribute(s)

Schemas & Subschemas
Schema
– View of the entire database
– Includes logical & physical structure & relationships among all data
Subschema
– User view of a portion of the database
– Can have many subschemas for one database

Data Definition Language & Dictionary
Data Definition Language (DDL)
– A collection of instructions & commands used to define & describe data & data relationships in a database
Data Dictionary
– A detailed description of all data used in the database
  • Provides a standard definition of terms & data elements
  • Assists programmers in designing & writing programs
  • Simplifies database modification
  • Reduces data redundancy
  • Increases data reliability
  • Faster program development
  • Easier modification of data & information

Logical & Physical Access Paths (Figure 5.14)
Logical access path: An application (management inquiries, other software, application programs) requires information from the DBMS
Physical access path: The DBMS accesses a storage device to retrieve data

Manipulating Data
Concurrency Control – A method of dealing with a situation in which two or more people
need to access the same record in a database at the same time
Data Manipulation Language (DML) – The commands that are used to manipulate the data in a database
Structured Query Language (SQL)
– A standardized data manipulation language for querying a database
– Most modern databases are SQL compliant

DBMS Selection Criteria
Database size
Number of concurrent users
Performance
Integration
Features
Vendor
Cost

Database Developments (1)
Distributed Database
– A database in which the actual data may be spread across several smaller databases connected via telecommunications devices
– Transparent to user (user does not know where data is)
Replicated Database
– Duplicate of original database (saves telecom time/$$)

Database Developments (2)
Data Warehouse – A relational database management system designed specifically to support management decision making
Data Mart – A subset of a data warehouse for small & medium-size businesses or departments within larger companies
Data Mining
– Automated discovery of patterns & relationships in a data warehouse
– Built-in analysis tools

Database Developments (3)
On-line Transaction Processing (OLTP) – Transaction processing (TP) happens at the time of the transaction
On-line Analytical Processing (OLAP) – Supports high-speed analysis of data involving complex relationships
Multidimensional Databases – Data can include graphics, photographs, sound files, etc.
Open Database Connectivity (ODBC) – Software written in compliance with ODBC standards can be used with any ODBC-compliant database

Object-Relational Database Management Systems
Can manipulate audio, video, & graphical data
Hypertext: Users can search & manipulate alphanumeric data in an unstructured way
Hypermedia: Users can search & manipulate multimedia forms of data
Spatial Data Technology: Use of an object-relational database to store & access data according to the location it describes & to permit spatial queries & analysis

Case: US West
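
Worked Example: Selecting, Projecting, & Joining
The relational operations defined under Relational Database Terminology can be tried out in a few lines of SQL. What follows is a minimal sketch, not part of the original slides. It assumes Python's built-in sqlite3 module as the DBMS and reuses the three employee records from Figure 5.2. The department table, its column names, & the department names are hypothetical, added only so there is a second table to join. The CREATE TABLE statements play the role of the DDL; the INSERT & SELECT statements are DML.

```python
import sqlite3

# In-memory database, so the sketch leaves nothing on disk.
conn = sqlite3.connect(":memory:")

# DDL: define two tables (relations).
# The department table and its contents are hypothetical illustration data.
conn.executescript("""
    CREATE TABLE department (
        dept_number INTEGER PRIMARY KEY,
        dept_name   TEXT NOT NULL
    );
    CREATE TABLE employee (
        employee_number TEXT PRIMARY KEY,  -- key field: unique per record
        last_name       TEXT NOT NULL,
        first_name      TEXT NOT NULL,
        hire_date       TEXT NOT NULL,     -- kept as text, as shown in Figure 5.2
        dept_number     INTEGER REFERENCES department(dept_number)
    );
""")

# DML: load the records. Employee rows are taken from Figure 5.2.
conn.executemany(
    "INSERT INTO department VALUES (?, ?)",
    [(257, "Accounting"), (650, "Marketing"), (598, "Production")],
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?, ?)",
    [
        ("005-10-6321", "Johns", "Francine", "10-7-65", 257),
        ("549-77-1001", "Buckley", "Bill", "2-17-79", 650),
        ("098-40-1370", "Fiske", "Steven", "1-5-85", 598),
    ],
)

# Selecting: eliminate rows that do not meet a user-defined criterion.
selected = conn.execute(
    "SELECT * FROM employee WHERE dept_number = 650"
).fetchall()

# Projecting: eliminate columns, keeping only the attributes of interest.
projected = conn.execute(
    "SELECT last_name, first_name FROM employee ORDER BY last_name"
).fetchall()

# Joining: combine two tables, linked by the common attribute dept_number.
joined = conn.execute("""
    SELECT e.last_name, d.dept_name
    FROM employee AS e
    JOIN department AS d ON e.dept_number = d.dept_number
    ORDER BY e.last_name
""").fetchall()

# Key: the DBMS rejects a second record with an existing employee number.
try:
    conn.execute(
        "INSERT INTO employee VALUES ('005-10-6321', 'Doe', 'Pat', '1-1-90', 257)"
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(selected)
print(projected)
print(joined)
print(duplicate_rejected)
```

Selecting returns only Bill Buckley's record, projecting returns the two name columns for all three employees, & joining pairs each last name with its (hypothetical) department name. The final insert fails because Employee Number is the key field, which shows how a DBMS can help protect data integrity.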