Chapter 5
Data Resource Management
McGraw-Hill/Irwin Copyright © 2009 by The McGraw-Hill Companies, Inc. All rights reserved.

Learning Objectives
• Explain the business value of implementing data resource management processes and technologies in an organization
• Outline the advantages of a database management approach to managing the data resources of a business, compared to a file processing approach
• Explain how database management software helps business professionals and supports the operations and management of a business
• Provide examples to illustrate the following concepts
  – Major types of databases
  – Data warehouses and data mining
  – Logical data elements
  – Fundamental database structures
  – Database development

Case 1: Cogent Communications, Intel, and Others
• IT integration and adoption issues can make or break merger and acquisition activities.
• Experts and IT managers agree that companies will feel the full impact of the merger and acquisition frenzy directly in their data centers.
• Companies that have data centers where the employees hold all the knowledge suffer greatly when, after a merger or acquisition, those people are let go.
• It is important to document the knowledge from those people and figure out how to make the processes work with only a handful of employees.
• Companies should have good information about what goes on in the data center in terms of systems and how they interact with each other and interface with the business.

Case Study Questions
1. Place yourself in the role of a manager at a company undergoing a merger or acquisition. What would be the most important things customers would expect from you while still in that process? What role would IT play in meeting those expectations? Provide at least three examples.
2. Focus on what Andi Mann in the case calls “tribal knowledge.” What do you think he means by that, and why is it so important to this process?
What strategies would you suggest for companies that are faced with the extensive presence of this issue in an acquired organization? Develop some specific recommendations.
3. Most of the discussion in the case focused on hardware and software issues. However, these are essentially enablers for underlying business processes developed by each of the companies involved. What different alternatives do companies have for merging their business processes, and what role would IT play in supporting those activities? Pay particular attention to data management and governance issues.

[Figure: Logical Data Elements]

Logical Data Elements
• Character
  – A single alphabetic, numeric, or other symbol
• Field or data item
  – Represents an attribute (characteristic or quality) of some entity (object, person, place, event)
  – Examples: salary, job title
• Record
  – Grouping of all the fields used to describe the attributes of an entity
  – Example: payroll record with name, SSN, pay rate
• File or table
  – A group of related records
• Database
  – An integrated collection of logically related data elements

[Figure: Electric Utility Database]

Database Structures
• Common database structures
  – Hierarchical
  – Network
  – Relational
  – Object-oriented
  – Multidimensional

Hierarchical Structure
– Early DBMS structure
– Records arranged in tree-like structure
– Relationships are one-to-many

Network Structure
– Used in some mainframe DBMS packages
– Many-to-many relationships

Relational Structure
• Most widely used structure
  – Data elements are stored in tables
  – Row represents a record; column is a field
  – Can relate data in one file with data in another, if both files share a common data element

Relational Operations
• Select
  – Create a subset of records that meet a stated criterion
  – Example: employees earning more than $30,000
• Join
  – Combine two or more tables temporarily
  – Looks like one big table
• Project
  – Create a subset
of columns in a table

Multidimensional Structure
• Variation of relational model
  – Uses multidimensional structures to organize data
  – Data elements are viewed as being in cubes
  – Popular for analytical databases that support Online Analytical Processing (OLAP)

[Figure: Multidimensional Model]

Object-Oriented Structure
• An object consists of
  – Data values describing the attributes of an entity
  – Operations that can be performed on the data
• Encapsulation
  – Combine data and operations
• Inheritance
  – New objects can be created by replicating some or all of the characteristics of parent objects
• Used in object-oriented database management systems (OODBMS)
• Supports complex data types more efficiently than relational databases
  – Examples: graphic images, video clips, web pages

[Figure: Object-Oriented Structure]
Source: Adapted from Ivar Jacobson, Maria Ericsson, and Agneta Jacobson, The Object Advantage: Business Process Reengineering with Object Technology (New York: ACM Press, 1995), p. 65. Copyright © 1995, Association for Computing Machinery. By permission.

Evaluation of Database Structures
• Hierarchical
  – Works for structured, routine transactions
  – Can’t handle many-to-many relationships
• Network
  – More flexible than hierarchical
  – Unable to handle ad hoc requests
• Relational
  – Easily responds to ad hoc requests
  – Easier to work with and maintain
  – Not as efficient/quick as hierarchical or network

Database Development
• Database Administrator (DBA)
  – In charge of enterprise database development
  – Improves the integrity and security of organizational databases
  – Uses Data Definition Language (DDL) to develop and specify data contents, relationships, and structure
  – Stores these specifications in a data dictionary or a metadata repository

Data Dictionary
• A data dictionary
  – Contains data about data (metadata)
  – Relies on a specialized software component to manage a database of data definitions
• It contains
information on:
  – The names and descriptions of all types of data records and their interrelationships
  – Requirements for end users’ access and use of application programs
  – Database maintenance
  – Security

[Figure: Database Development]

Data Planning Process
• Database development is a top-down process
  – Develop an enterprise model that defines the basic business process of the enterprise
  – Define the information needs of end users in a business process
  – Identify the key data elements that are needed to perform specific business activities (entity relationship diagrams)

[Figure: Entity Relationship Diagram]

Database Design Process
• Data relationships are represented in a data model that supports a business process
• This model is the schema or subschema on which to base
  – The physical design of the database
  – The development of application programs to support business processes
• Logical Design
  – Schema: overall logical view of relationships
  – Subschema: logical view for specific end users
  – Data models for DBMS
• Physical Design
  – How data are to be physically stored and accessed on storage devices

[Figure: Logical and Physical Database Views]

Data Resource Management
• Data resource management is a managerial activity
  – Uses data management, data warehousing, and other IS technologies
  – Manages data resources to meet the information needs of business stakeholders

Case 2: Applebee’s, Travelocity, and Others
• Apart from using data for basic business decisions, such as replenishing food supplies based on how much finished product was sold daily, Applebee’s is developing more sophisticated analyses that look at how well items are selling so the company can make better decisions about what to order and what products to promote.
• Today organizations are extensively aggregating and mining their data to make better decisions.
• Travelocity has launched a new project to help it mine almost 600,000 unstructured comments so that it can better monitor and respond to customer service issues.

Case Study Questions
1. What are the business benefits of taking the time and effort required to create and operate data warehouses such as those described in the case? Do you see any disadvantages? Is there any reason why all companies shouldn’t use data warehousing technology?
2. Applebee’s noted some of the unexpected insights obtained from analyzing data about “back-of-house” performance. Using your knowledge of how a restaurant works, what other interesting questions would you suggest to the company? Provide several specific examples.
3. Data mining and warehousing technologies use data about past events to inform better decision-making in the future. Do you believe this stifles innovative thinking, causing companies to become too constrained by the data they are already collecting to think about unexplored opportunities? Compare and contrast both viewpoints in your answer.
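
Before the chapter turns to the types of databases, the relational ideas covered earlier (tables of records and fields, the select, project, and join operations, and the data dictionary as "data about data") can be made concrete with a minimal sketch. It uses Python's built-in sqlite3 module; the table names, column names, and rows are hypothetical and chosen only for illustration, and SQLite's catalog table stands in here for a full data dictionary.

```python
import sqlite3

# Hypothetical schema and rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- DDL: define data contents, relationships, and structure.
    CREATE TABLE department (
        dept_id   INTEGER PRIMARY KEY,
        dept_name TEXT
    );
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,
        name    TEXT,
        salary  INTEGER,
        dept_id INTEGER REFERENCES department(dept_id)
    );
    INSERT INTO department VALUES (1, 'Sales'), (2, 'Finance');
    INSERT INTO employee VALUES
        (10, 'Ana', 42000, 1),
        (11, 'Ben', 28000, 1),
        (12, 'Chu', 35000, 2);
""")

# Select: a subset of records (rows) that meet a stated criterion.
selected = conn.execute(
    "SELECT * FROM employee WHERE salary > 30000 ORDER BY emp_id"
).fetchall()

# Project: a subset of columns (fields) in a table.
projected = conn.execute(
    "SELECT name, salary FROM employee ORDER BY emp_id"
).fetchall()

# Join: temporarily combine two tables that share a common data element.
joined = conn.execute("""
    SELECT e.name, d.dept_name
    FROM employee AS e
    JOIN department AS d ON e.dept_id = d.dept_id
    ORDER BY e.emp_id
""").fetchall()

# Metadata: SQLite keeps table definitions in its own catalog table.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
)]

print(selected)
print(projected)
print(joined)
print(tables)
```

The join works only because both tables carry dept_id, which is the "common data element" requirement stated for the relational structure.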
[Figure: Types of Databases]

Operational Databases
• Store detailed data needed to support business processes and operations
  – Also called subject area databases (SADB), transaction databases, and production databases
  – Database examples: customer, human resource, inventory

Distributed Databases
• Distributed databases are copies or parts of databases stored on servers at multiple locations
  – Improves database performance at worksites
• Advantages
  – Protection of valuable data
  – Data can be distributed into smaller databases
  – Each location has control of its local data
  – All locations can access any data, anywhere
• Disadvantages
  – Maintaining data accuracy
• Replication
  – Look at each distributed database and find changes
  – Apply changes to each distributed database
  – Very complex
• Duplication
  – One database is master
  – Duplicate the master after hours, in all locations
  – Easier to accomplish

External Databases
• Databases available for a fee from commercial online services, or free from the Web
  – Examples: hypermedia databases, statistical databases, bibliographic and full text databases
  – Search engines like Google or Yahoo are external databases

Hypermedia Databases
• A hypermedia database contains
  – Hyperlinked pages of multimedia
  – Interrelated hypermedia page elements, rather than interrelated data records

[Figure: Components of Web-Based System]

Data Warehouses
• Store static data that has been extracted from other databases in an organization
  – Central source of data that has been cleaned, transformed, and cataloged
  – Data is used for data mining, analytical processing, analysis, research, decision support
• Data warehouses may be divided into data marts
  – Subsets of data that focus on specific aspects of a company (department or business process)

[Figure: Data Warehouse Components]

[Figure: Applications and Data Marts]

Data Mining
• Data in data warehouses are analyzed to reveal hidden patterns and trends –
Market-basket analysis to identify new product bundles
  – Find root cause of quality or manufacturing problems
  – Prevent customer attrition
  – Acquire new customers
  – Cross-sell to existing customers
  – Profile customers with more accuracy

Traditional File Processing
• Data are organized, stored, and processed in independent files
  – Each business application designed to use specialized data files containing specific types of data records
• Problems
  – Data redundancy
  – Lack of data integration
  – Data dependence (files, storage devices, software)
  – Lack of data integrity or standardization

[Figure: Traditional File Processing]

Database Management Approach
• The foundation of modern methods of managing organizational data
  – Consolidates data records formerly in separate files into databases
  – Data can be accessed by many different application programs
  – A database management system (DBMS) is the software interface between users and databases

[Figure: Database Management Approach]

Database Management System
• In mainframe and server computer systems, a software package that is used to
  – Create new databases and database applications
  – Maintain the quality of the data in an organization’s databases
  – Use the databases of an organization to provide the information needed by end users

Common DBMS Software Components
• Database definition
  – Language and graphical tools to define entities, relationships, integrity constraints, and authorization rights
• Nonprocedural access
  – Language and graphical tools to access data without complicated coding
• Application development
  – Graphical tools to develop menus, data entry forms, and reports
• Procedural language interface
  – Language that combines nonprocedural access with full capabilities of a programming language
• Transaction processing
  – Control mechanism prevents interference from simultaneous users and recovers lost data after a failure
• Database tuning
  – Tools to monitor,
improve database performance

Database Management System
• Database Development
  – Defining and organizing the content, relationships, and structure of the data needed to build a database
• Database Application Development
  – Using DBMS to create prototypes of queries, forms, reports, Web pages
• Database Maintenance
  – Using transaction processing systems and other tools to add, delete, update, and correct data

[Figure: DBMS Major Functions]

Database Interrogation
• End users use a DBMS query feature or report generator
  – Response is video display or printed report
  – No programming is required
• Query language
  – Immediate response to ad hoc data requests
• Report generator
  – Quickly specify a format for information you want to present as a report
• SQL Queries
  – Structured, international standard query language found in many DBMS packages
  – Query form is SELECT…FROM…WHERE…
• Boolean Logic
  – Developed by George Boole in the mid-1800s
  – Used to refine searches to specific information
  – Has three logical operators: AND, OR, NOT
  – Example: Cats OR felines AND NOT dogs OR Broadway
• Graphical and Natural Queries
  – It is difficult to correctly phrase SQL and other database language search queries
  – Most DBMS packages offer easier-to-use, point-and-click methods
  – These methods translate queries into SQL commands
  – Natural language query statements are similar to conversational English

[Figure: Graphical Query Wizard]

Database Maintenance
• Accomplished by transaction processing systems and other applications, with the support of the DBMS
  – Done to reflect new business transactions and other events
  – Updating and correcting data, such as customer addresses

Application Development
• Use DBMS software development tools to develop custom application programs
  – Not necessary to develop detailed data-handling procedures using conventional programming languages
  – Can include data manipulation language
(DML) statements that call on the DBMS to perform necessary data handling

Case 3: Amazon, eBay, and Google
• Amazon’s data vault
  – Product descriptions
  – Prices
  – Sales rankings
  – Customer reviews
  – Inventory figures
  – Countless other layers of content
• Took 10 years and a billion dollars to build
• Amazon opened its data vault in 2002
  – 65,000 developers, businesses, and entrepreneurs have tapped into it
  – Many have become ambitious business partners
• eBay opened its $3 billion databases in 2003
  – 15,000 developers and others have registered to use it and to access software features
  – 1,000 new applications have appeared
  – 41 percent of eBay’s listings are uploaded to the site using these resources
• Google recently unlocked access to its desktop and paid-search products
  – Dozens of Google-driven services cropped up
  – Developers can grab 1,000 search results a day for free; anything more requires permission
  – In 2005, the AdWords paid-search service was opened to outside applications

Case Study Questions
1. What are the business benefits to Amazon and eBay of opening up some of their databases to developers and entrepreneurs? Do you agree with this strategy?
2. What business factors are causing Google to move slowly in opening up its databases? Do you agree with its go-slow strategy?
3. Should other companies follow Amazon and eBay’s lead and open up some of their databases to developers and others? Defend your position with an example of the risks and benefits to an actual company.

Case 4: Emerson & Sanofi, Data Stewards
• Data stewards
  – Dedicated to establishing and maintaining the quality of data
  – Need business, technology, and diplomatic skills
  – Focus on data content
• Judgment is a big part of the job

Case Study Questions
1. Why is the role of a data steward considered to be innovative?
2.
What are the business benefits associated with the data steward program at Emerson?
3. How does effective data resource management contribute to the strategic goals of an organization?
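
Returning to the database interrogation and maintenance material earlier in the chapter, the following is a minimal sketch of the SELECT…FROM…WHERE query form combined with the Boolean operators AND, OR, and NOT, followed by a DML statement that corrects a customer address. It uses Python's built-in sqlite3 module; the customer table, its columns, and its rows are hypothetical and exist only for this illustration.

```python
import sqlite3

# Hypothetical customer table, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    "cust_id INTEGER PRIMARY KEY, name TEXT, city TEXT, balance INTEGER)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?, ?)",
    [
        (1, "Ortiz", "Austin", 500),
        (2, "Lee", "Boston", 0),
        (3, "Khan", "Austin", 0),
        (4, "Moss", "Denver", 900),
    ],
)

# Interrogation: SELECT ... FROM ... WHERE, refined with AND, OR, NOT.
# Parentheses make the grouping of the Boolean operators explicit.
rows = conn.execute("""
    SELECT name, city
    FROM customer
    WHERE (city = 'Austin' OR city = 'Denver') AND NOT (balance = 0)
    ORDER BY cust_id
""").fetchall()

# Maintenance: a DML UPDATE asks the DBMS to correct a customer address.
conn.execute("UPDATE customer SET city = ? WHERE cust_id = ?", ("Chicago", 2))
conn.commit()
updated_city = conn.execute(
    "SELECT city FROM customer WHERE cust_id = 2"
).fetchone()[0]

print(rows)
print(updated_city)
```

Without parentheses, a search such as the slide's "Cats OR felines AND NOT dogs OR Broadway" depends on the precedence rules of the particular query tool, so grouping the terms explicitly is the safer habit.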