Organizing Data and Information
Chapter 3
Fundamentals of Information Systems, Second Edition

Learning Objectives
– Define general data management concepts and terms, highlighting the advantages and disadvantages of the database approach to data management.
– Name three database models and outline their basic features, advantages, and disadvantages.
– Identify the common functions performed by all database management systems and identify three popular end-user database management systems.
– Identify and briefly discuss recent database applications.

The Hierarchy of Data

Data Entities, Attributes, and Keys
– Entity: A generalized class of people, places, or systems for which data is collected (e.g., employees, customers)
– Attribute: A characteristic of an entity (e.g., first name, last name)
– Key: A set of fields used to identify an entity
– Primary key: A key that uniquely identifies the entity

Keys and Attributes

The Traditional Approach to Data Management
– New files are created for each application
– Data redundancy: the same data is duplicated in separate files
– Data integrity: duplicated copies can become inconsistent

The Database Approach to Data Management

Advantages of the Database Approach (1)
• Improved strategic use of corporate data
  – Accurate information always available
• Reduced data redundancy
  – Data is stored in one place
• Improved data integrity
  – Changes are reflected throughout
• Easier modification and update
  – No need to know where the data is

Advantages of the Database Approach (2)
• Data and program independence
  – Accurate information always available
• Better access to data and information
  – Simple instructions to access data
• Standardization of data access
  – Each DBMS uses the same set of instructions
• Standardization for programmers
  – Programmers only need to know how to access the DBMS

Advantages of the Database Approach (3)
• Better protection of data
  – Access to the data requires authorization
• Shared data resources
  – Set up the database once
  – Several applications can use it

Disadvantages of the Database Approach
• Costly
  – Specialized DBMS software
  – Specialized DBMS administrators and operators
• Increased vulnerability
  – Single point of failure
  – Target for attacks

Data Modeling
• Planned data redundancy
  – To have data available in more than one place
  – To improve system performance
• Data model
  – A diagram of entities and their relationships
• Enterprise data modeling
  – Data modeling done at the level of the entire enterprise
• Entity-relationship diagrams
  – Use graphical symbols to show how data is organized and how it is related

Entity-Relationship Diagram for a Customer Ordering Database
Diagram labels: entity; relationship (one-to-many); relationship (many-to-one); relationship (one-to-one)

Database Models
• Hierarchical (tree)
  – Data is organized top-down
• Network
  – Owner-member relationship
  – A member can have many owners
• Relational
  – Uses a tabular format with two-dimensional tables (relations)
  – Relations resemble files

Hierarchical Database Model

Network Database Model

Relational Database Model

Relational Models
Describe data using a standard tabular format with all data elements placed in two-dimensional tables, called relations, that are the logical equivalent of files.
– Rows represent data entities
– Columns represent attributes

Relational Models
– Domain: The set of values an attribute can have
  • Age: between 0 and 100
  • Gender: male or female
– Selecting
  • Pick rows based on certain criteria
  • Example: select the rows whose gender is female
– Projecting
  • Create a new table with a subset of the attributes
– Joining
  • Combine two or more tables

Linking Database Tables to Answer an Inquiry

Building and Modifying a Relational Database

Database Management Systems

Providing a User View
• Schema: a description of the entire database
  – First create a schema, then create the tables
• Subschema: a file that contains a description of a subset of the database and identifies which users can modify the data items in that subset
  – Example: a sales representative needs to see the data for her office, not the company's stock data

The Use of Schemas and Subschemas

Creating and Modifying the Database
• Data definition language (DDL): a collection of instructions and commands used to define and describe data and data relationships in a specific database
  – Used to define the schemas
• Data dictionary: a detailed description of the data in a database
  – Create a data dictionary when defining the schemas

Typical Uses of a Data Dictionary
• Provide a standard definition of terms and data elements
• Assist programmers in designing and writing programs
• Simplify database modification
• Reduce data redundancy
• Increase data reliability
• Speed program development
• Ease modification of data and information

Storing and Retrieving Data

Data Access
• Concurrency control: lock a record so that only one application can access it at a time
• Data manipulation language (DML)
• Structured Query Language (SQL)
  – SELECT * FROM Project WHERE Project_number = '155'
  – UPDATE Project SET Project_number = '156' WHERE Project_number = '155'

Structured Query Language

Database Output

Popular Database Management Systems
• Oracle
• MySQL
• Paradox database
• FileMaker Pro
• Microsoft Access
• Lotus 1-2-3 Spreadsheet

Worldwide Database Market Share (2001)

Selecting a Database Management System (1)
• Database size: number of records in the database
• Number of concurrent users: people or applications that will access it at the same time
• Performance: how fast can the DBMS access or update records?

Selecting a Database Management System (2)
• Integration: which operating systems can it run under?
• Features: which security procedures or privacy policies are in place?
• Vendor: size and reputation of the vendor
• Cost: initial cost, maintenance costs, hardware costs, personnel costs

Database Applications

Data Warehouses, Data Marts, and Data Mining
• Data warehouse: a database that collects business information from many sources in the enterprise, covering all aspects of the company’s processes, products, and customers.
• Data mart: a subset of a data warehouse.
  – For small and medium-sized businesses
  – Used mostly for decision support systems
• Data mining: an information analysis tool that involves the automated discovery of patterns and relationships in a data warehouse.

Elements of a Data Warehouse

Common Data Mining Applications

Common Data Mining Applications (1)
• Branding and positioning of products
• Customer churn
  – Which customers are likely to switch to competitors?
• Direct marketing
  – Who would respond to telemarketing?
• Fraud detection
  – Predict which transactions are likely to be illegal

Common Data Mining Applications (2)
• Market-basket analysis
  – Which products are bought at the same time (e.g., diapers, beer, chips)?
• Market segmentation
  – Group customers based on the similarity of the products they buy
• Trend analysis
  – Analyze how variables change over time (e.g., sales)

Business Intelligence
Gathering enough of the right information in a timely manner and usable form.
– Competitive intelligence
  • What others are doing
– Counterintelligence
  • Define trade secret information
– Knowledge management
  • Capture the company’s collective expertise wherever it resides
  • Record knowledge and share it

Others
– Distributed databases
  • Data is spread across several databases
– On-line analytical processing (OLAP)
  • Programs used to store and deliver data
  • Used to analyze millions of customer records
– Open database connectivity (ODBC) standards

Comparison of OLAP and Data Mining

Advantages of ODBC

Object-Relational Database Management System
• Stores the following types of data as objects:
  – audio
  – images
  – unstructured text
  – spatial data

Spatial Technology

Summary
• Data: one of the most valuable resources a firm possesses.
• Entity: a generalized class of objects for which data is collected, stored, and maintained.
• Attribute: a characteristic of an entity.
• DBMS: a group of programs used as an interface between a database and application programs.
• Data mining: the automated discovery of patterns and relationships in a data warehouse.
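
Worked Example: Relational Operations in SQL
The selecting, projecting, and joining operations, and the SELECT and UPDATE statements from the Data Access slide, can be tried out with a small in-memory database. The following is a minimal sketch, not the textbook's own example. It uses Python's built-in sqlite3 module. Only the Project table and its Project_number column come from the slides; the Department table, the other column names, and all sample rows are hypothetical and were made up for illustration.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two small tables (DDL + sample rows).

    Only Project and Project_number appear in the slides; everything else
    here (Department, Description, Dept_number, the rows) is hypothetical.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Department (
            Dept_number TEXT PRIMARY KEY,   -- primary key: uniquely identifies a row
            Dept_name   TEXT NOT NULL
        );
        CREATE TABLE Project (
            Project_number TEXT PRIMARY KEY,
            Description    TEXT NOT NULL,
            Dept_number    TEXT REFERENCES Department(Dept_number)
        );
        INSERT INTO Department VALUES ('257', 'Accounting');
        INSERT INTO Department VALUES ('632', 'Manufacturing');
        INSERT INTO Project VALUES ('155', 'Payroll', '257');
        INSERT INTO Project VALUES ('498', 'Widgets', '632');
        """
    )
    return conn


def select_project(conn, number):
    """Selecting: pick the rows that match a criterion."""
    sql = "SELECT * FROM Project WHERE Project_number = ?"
    return conn.execute(sql, (number,)).fetchall()


def project_descriptions(conn):
    """Projecting: keep only a subset of the attributes (columns)."""
    sql = "SELECT Description FROM Project ORDER BY Description"
    return conn.execute(sql).fetchall()


def join_projects_departments(conn):
    """Joining: combine two tables through their shared Dept_number attribute."""
    sql = """
        SELECT p.Description, d.Dept_name
        FROM Project AS p
        JOIN Department AS d ON p.Dept_number = d.Dept_number
        ORDER BY p.Project_number
    """
    return conn.execute(sql).fetchall()


def renumber_project(conn, old, new):
    """DML update, as on the Data Access slide; returns the number of rows changed."""
    sql = "UPDATE Project SET Project_number = ? WHERE Project_number = ?"
    with conn:  # commits on success, rolls back if an exception is raised
        cursor = conn.execute(sql, (new, old))
    return cursor.rowcount


if __name__ == "__main__":
    db = build_demo_db()
    print(select_project(db, "155"))
    print(project_descriptions(db))
    print(join_projects_departments(db))
    print(renumber_project(db, "155", "156"))
```

The queries pass values as parameters (the ? placeholders) rather than pasting them into the SQL text. The slides do not cover this; it is simply the usual way to pass values safely through sqlite3.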