Chapter 5
Data Resource Management

McGraw-Hill/Irwin
Copyright © 2008, The McGraw-Hill Companies, Inc. All rights reserved.

Learning Objectives
• Explain the business value of implementing data resource management processes and technologies in an organization
• Outline the advantages of a database management approach to managing the data resources of a business, compared to a file processing approach
• Explain how database management software helps business professionals and supports the operations and management of a business
• Provide examples to illustrate the following concepts
  • Major types of databases
  • Data warehouses and data mining
  • Logical data elements
  • Fundamental database structures
  • Database development

Case 1: Sharing Business Databases
• Amazon’s data vault
  • Product descriptions
  • Prices
  • Sales rankings
  • Customer reviews
  • Inventory figures
  • Countless other layers of content
• Took 10 years and a billion dollars to build
• Amazon opened its data vault in 2002
  • 65,000 developers, businesses, and entrepreneurs have tapped into it
  • Many have become ambitious business partners
• eBay opened its $3 billion databases in 2003
  • 15,000 developers and others have registered to use it and to access software features
  • 1,000 new applications have appeared
  • 41 percent of eBay’s listings are uploaded to the site using these resources
• Google recently unlocked access to its desktop and paid-search products
  • Dozens of Google-driven services cropped up
  • Developers can grab 1,000 search results a day for free; anything more requires permission
  • In 2005, the AdWords paid-search service was opened to outside applications

Case Study Questions
• What are the business benefits to Amazon and eBay of opening up some of their databases to developers and entrepreneurs?
  • Do you agree with this strategy?
• What business factors are causing Google to move slowly in opening up its databases?
  • Do you agree with its go-slow strategy?
• Should other companies follow Amazon and eBay’s lead and open up some of their databases to developers and others?
  • Defend your position with an example of the risks and benefits to an actual company

[Figure: Logical Data Elements]

Logical Data Elements
• Character
  • A single alphabetic, numeric, or other symbol
• Field or data item
  • Represents an attribute (characteristic or quality) of some entity (object, person, place, event)
  • Examples: salary, job title
• Record
  • Grouping of all the fields used to describe the attributes of an entity
  • Example: payroll record with name, SSN, pay rate
• File or table
  • A group of related records
• Database
  • An integrated collection of logically related data elements

[Figure: Electric Utility Database]

Database Structures
• Common database structures
  • Hierarchical
  • Network
  • Relational
  • Object-oriented
  • Multidimensional

Hierarchical Structure
• Early DBMS structure
• Records arranged in tree-like structure
• Relationships are one-to-many

Network Structure
• Used in some mainframe DBMS packages
• Many-to-many relationships

Relational Structure
• Most widely used structure
• Data elements are stored in tables
• Row represents a record; column is a field
• Can relate data in one file with data in another, if both files share a common data element

Relational Operations
• Select
  • Create a subset of records that meet a stated criterion
  • Example: employees earning more than $30,000
• Join
  • Combine two or more tables temporarily
  • Looks like one big table
• Project
  • Create a subset of columns in a table

Multidimensional Structure
• Variation of relational model
• Uses multidimensional structures to organize data
• Data elements are viewed as being in cubes
• Popular for analytical
  databases that support Online Analytical Processing (OLAP)

[Figure: Multidimensional Model]

Object-Oriented Structure
• An object consists of
  • Data values describing the attributes of an entity
  • Operations that can be performed on the data
• Encapsulation
  • Combine data and operations
• Inheritance
  • New objects can be created by replicating some or all of the characteristics of parent objects

[Figure: Object-Oriented Structure]
Source: Adapted from Ivar Jacobson, Maria Ericsson, and Agneta Jacobson, The Object Advantage: Business Process Reengineering with Object Technology (New York: ACM Press, 1995), p. 65. Copyright © 1995, Association for Computing Machinery. By permission.

Object-Oriented Structure
• Used in object-oriented database management systems (OODBMS)
• Supports complex data types more efficiently than relational databases
  • Examples: graphic images, video clips, web pages

Evaluation of Database Structures
• Hierarchical
  • Works for structured, routine transactions
  • Can’t handle many-to-many relationships
• Network
  • More flexible than hierarchical
  • Unable to handle ad hoc requests
• Relational
  • Easily responds to ad hoc requests
  • Easier to work with and maintain
  • Not as efficient/quick as hierarchical or network

Database Development
• Database Administrator (DBA)
  • In charge of enterprise database development
  • Improves the integrity and security of organizational databases
  • Uses Data Definition Language (DDL) to develop and specify data contents, relationships, and structure
  • Stores these specifications in a data dictionary or a metadata repository

Data Dictionary
• A data dictionary
  • Contains data about data (metadata)
  • Relies on a specialized software component to manage a database of data definitions
• It contains information on
  • The names and descriptions of all types of data records and their interrelationships
  • Requirements for end users’ access and use of application programs
  • Database maintenance
  • Security

[Figure: Database Development]

Data Planning Process
• Database development is a top-down process
  • Develop an enterprise model that defines the basic business process of the enterprise
  • Define the information needs of end users in a business process
  • Identify the key data elements that are needed to perform specific business activities (entity relationship diagrams)

[Figure: Entity Relationship Diagram]

Database Design Process
• Data relationships are represented in a data model that supports a business process
• This model is the schema or subschema on which to base
  • The physical design of the database
  • The development of application programs to support business processes
• Logical Design
  • Schema: overall logical view of relationships
  • Subschema: logical view for specific end users
  • Data models for DBMS
• Physical Design
  • How data are to be physically stored and accessed on storage devices

[Figure: Logical and Physical Database Views]

Data Resource Management
• Data resource management is a managerial activity
  • Uses data management, data warehousing, and other IS technologies
  • Manages data resources to meet the information needs of business stakeholders

Case 2: Emerson & Sanofi, Data Stewards
• Data stewards
  • Dedicated to establishing and maintaining the quality of data
  • Need business, technology, and diplomatic skills
  • Focus on data content
  • Judgment is a big part of the job

Case Study Questions
• Why is the role of a data steward considered to be innovative?
• What are the business benefits associated with the data steward program at Emerson?
• How does effective data resource management contribute to the strategic goals of an organization?
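
Worked example: schema, subschema, and data dictionary
Looking back at the Database Development, Data Dictionary, and Database Design Process slides, the ideas of DDL, schema, subschema, and metadata can be tried out in a few lines. The following is only a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in DBMS. The payroll tables, the staff_directory view, and all column names are hypothetical and chosen purely for illustration. A view is used here as one common way to approximate a subschema, and SQLite's sqlite_master catalog plays the part of a very small data dictionary.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind on disk.
conn = sqlite3.connect(":memory:")

# DDL statements define the schema (the overall logical view).
# All table, view, and column names below are hypothetical.
conn.executescript("""
CREATE TABLE department (
    dept_id   INTEGER PRIMARY KEY,
    dept_name TEXT NOT NULL
);
CREATE TABLE employee (
    emp_id   INTEGER PRIMARY KEY,
    name     TEXT NOT NULL,
    pay_rate REAL NOT NULL,
    dept_id  INTEGER NOT NULL REFERENCES department(dept_id)
);
-- A view standing in for a subschema: a restricted logical view
-- for one group of end users (a staff directory without pay rates).
CREATE VIEW staff_directory AS
    SELECT e.name, d.dept_name
    FROM employee AS e
    JOIN department AS d ON d.dept_id = e.dept_id;
""")

conn.execute("INSERT INTO department VALUES (1, 'Accounting')")
conn.execute("INSERT INTO employee VALUES (10, 'A. Jones', 28.50, 1)")

# Metadata ("data about data"): SQLite records every table and view
# in its sqlite_master catalog, which can be queried like any table.
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY name"
).fetchall()

# What a user of the subschema sees: names and departments only.
cursor = conn.execute("SELECT * FROM staff_directory")
directory_columns = [col[0] for col in cursor.description]
directory_rows = cursor.fetchall()

print(catalog)
print(directory_columns, directory_rows)
```

The physical design (how rows are laid out on storage) is left entirely to SQLite in this sketch, which is the separation between logical and physical views that the slides describe.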

[Figure: Types of Databases]

Operational Databases
• Stores detailed data needed to support business processes and operations
• Also called subject area databases (SADB), transaction databases, and production databases
• Database examples: customer, human resource, inventory

Distributed Databases
• Distributed databases are copies or parts of databases stored on servers at multiple locations
• Improves database performance at worksites
• Advantages
  • Protection of valuable data
  • Data can be distributed into smaller databases
  • Each location has control of its local data
  • All locations can access any data, anywhere
• Disadvantages
  • Maintaining data accuracy
• Replication
  • Look at each distributed database and find changes
  • Apply changes to each distributed database
  • Very complex
• Duplication
  • One database is master
  • Duplicate the master after hours, in all locations
  • Easier to accomplish

External Databases
• Databases available for a fee from commercial online services, or free from the Web
• Examples: hypermedia databases, statistical databases, bibliographic and full text databases
• Search engines like Google or Yahoo are external databases

Hypermedia Databases
• A hypermedia database contains
  • Hyperlinked pages of multimedia
  • Interrelated hypermedia page elements, rather than interrelated data records

[Figure: Components of Web-Based System]

Data Warehouses
• Stores static data that has been extracted from other databases in an organization
• Central source of data that has been cleaned, transformed, and cataloged
• Data is used for data mining, analytical processing, analysis, research, decision support
• Data warehouses may be divided into data marts
  • Subsets of data that focus on specific aspects of a company (department or business process)

[Figure: Data Warehouse Components]

[Figure: Applications and Data Marts]

Data Mining
• Data in data warehouses are analyzed to reveal hidden patterns and trends
  • Market-basket analysis to identify new product bundles
  • Find root cause of quality or manufacturing problems
  • Prevent customer attrition
  • Acquire new customers
  • Cross-sell to existing customers
  • Profile customers with more accuracy

Traditional File Processing
• Data are organized, stored, and processed in independent files
• Each business application designed to use specialized data files containing specific types of data records
• Problems
  • Data redundancy
  • Lack of data integration
  • Data dependence (files, storage devices, software)
  • Lack of data integrity or standardization

[Figure: Traditional File Processing]

Database Management Approach
• The foundation of modern methods of managing organizational data
• Consolidates data records formerly in separate files into databases
• Data can be accessed by many different application programs
• A database management system (DBMS) is the software interface between users and databases

[Figure: Database Management Approach]

Database Management System
• In mainframe and server computer systems, a software package that is used to
  • Create new databases and database applications
  • Maintain the quality of the data in an organization’s databases
  • Use the databases of an organization to provide the information needed by end users

Common DBMS Software Components
• Database definition
  • Language and graphical tools to define entities, relationships, integrity constraints, and authorization rights
• Nonprocedural access
  • Language and graphical tools to access data without complicated coding
• Application development
  • Graphical tools to develop menus, data entry forms, and reports
• Procedural language interface
  • Language that combines nonprocedural access with full capabilities of a programming language
• Transaction processing
  • Control mechanism prevents interference from simultaneous users and recovers lost data after a failure
• Database tuning
  • Tools to monitor,
    improve database performance

Database Management System
• Database Development
  • Defining and organizing the content, relationships, and structure of the data needed to build a database
• Database Application Development
  • Using DBMS to create prototypes of queries, forms, reports, Web pages
• Database Maintenance
  • Using transaction processing systems and other tools to add, delete, update, and correct data

[Figure: DBMS Major Functions]

Database Interrogation
• End users use a DBMS query feature or report generator
  • Response is video display or printed report
  • No programming is required
• Query language
  • Immediate response to ad hoc data requests
• Report generator
  • Quickly specify a format for information you want to present as a report
• SQL Queries
  • Structured, international standard query language found in many DBMS packages
  • Query form is SELECT…FROM…WHERE…
• Boolean Logic
  • Developed by George Boole in the mid-1800s
  • Used to refine searches to specific information
  • Has three logical operators: AND, OR, NOT
  • Example: Cats OR felines AND NOT dogs OR Broadway
• Graphical and Natural Queries
  • It is difficult to correctly phrase SQL and other database language search queries
  • Most DBMS packages offer easier-to-use, point-and-click methods
    • Translates queries into SQL commands
  • Natural language query statements are similar to conversational English

[Figure: Graphical Query Wizard]

Database Maintenance
• Accomplished by transaction processing systems and other applications, with the support of the DBMS
• Done to reflect new business transactions and other events
• Updating and correcting data, such as customer addresses

Application Development
• Use DBMS software development tools to develop custom application programs
• Not necessary to develop detailed data-handling procedures using conventional programming languages
• Can include data manipulation
  language (DML) statements that call on the DBMS to perform necessary data handling

Major DBMS Software
• MS Access
• MS SQL Server
• IBM DB2
• Oracle 9i
• MySQL (open source DBMS)

[Figure: MySQL DBMS Application]

Case 3: Acxiom Corp. Data
• Acxiom does three things really well
  • Manages large volumes of data
  • Cleans, transforms, and enhances that data
  • Distills business intelligence from that data to drive smart decisions
• Refined data is sold to customers
  • Developing telemarketing lists
  • Identifying prospects for credit card offers
  • Screening prospective employees
  • Detecting fraudulent financial transactions
• Primary business activities
  • Building its data library
  • Selling data
  • Managing other companies’ data and data centers

Case Study Questions
• Acxiom is in a unique type of business. How would you describe the business of Acxiom?
  • Are they a service- or product-oriented business?
• It is easy to see that Acxiom has focused on a wide variety of data from different sources.
  • How does Acxiom decide which data to collect, and for whom?
• Acxiom’s business raises many issues related to privacy.
  • Are the data collected by Acxiom really private?

Case 4: Protecting the Data Jewels
• Harrah’s Entertainment and other casino companies closely guard customer data
  • Both hard copy and electronic files
• Concerns
  • Broader access to CRM systems
  • More frequent job switching
• Protection methods
  • Nondisclosure, non-compete, and nonsolicitation agreements that specify customer lists
  • Trade-secret laws and legal action
  • Limiting access to sensitive information
  • Physical security
  • Strong password protection
  • Reinforcement of signed agreements during exit interviews
  • Monitoring electronic communication

Case Study Questions
• Why have developments in IT helped to increase the value of the data resources of many companies?
• How have these capabilities increased the security challenges associated with protecting a company’s data resources?
• How can companies use IT to meet the challenges of data resource security?
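
Worked example: relational operations and SQL queries
As a closing recap of the Relational Operations and Database Interrogation slides, the select, project, and join operations and the SELECT…FROM…WHERE query form can be tried directly. This is a minimal sketch, assuming Python's built-in sqlite3 module as the DBMS. The employee and department tables and every row in them are made-up sample data, not taken from the chapter. The last query combines AND, OR, and NOT with explicit parentheses, because SQL otherwise evaluates NOT first, then AND, then OR, which is easy to misread in an unparenthesized search such as the Boolean example on the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical sample data for illustration only.
conn.executescript("""
CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,
    name    TEXT,
    salary  REAL,
    dept_id INTEGER
);
INSERT INTO department VALUES (1, 'Sales'), (2, 'Finance');
INSERT INTO employee VALUES
    (1, 'Ames',  28000, 1),
    (2, 'Baker', 45000, 1),
    (3, 'Chen',  52000, 2);
""")

# Select: the subset of records that meet a stated criterion
# (employees earning more than $30,000).
selected = conn.execute(
    "SELECT * FROM employee WHERE salary > 30000 ORDER BY emp_id"
).fetchall()

# Project: a subset of the columns in a table.
projected = conn.execute(
    "SELECT name, salary FROM employee ORDER BY emp_id"
).fetchall()

# Join: temporarily combine two tables that share a common
# data element (dept_id) so they look like one big table.
joined = conn.execute("""
    SELECT e.name, d.dept_name
    FROM employee AS e
    JOIN department AS d ON d.dept_id = e.dept_id
    ORDER BY e.emp_id
""").fetchall()

# Boolean logic in a WHERE clause, with parentheses making the
# intended grouping of AND, OR, and NOT explicit.
filtered = conn.execute("""
    SELECT name FROM employee
    WHERE (salary > 30000 OR dept_id = 2) AND NOT name = 'Chen'
    ORDER BY emp_id
""").fetchall()

print(selected)
print(projected)
print(joined)
print(filtered)
```

A graphical query wizard of the kind mentioned in the slides would typically generate statements of this same shape behind the scenes.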