Data Warehouse Management

The Case for Data Warehousing
The Case Against Data Warehousing
Data Warehousing Gotchas
Data Warehousing Software Evaluation

March 13, 2000
Prof. Hwan-Seung Yong
Dept. of CSE, Ewha Womans Univ.
http://dblab.ewha.ac.kr/hsyong

Basic Reason for Data Warehousing
• In Text
  – To convert data into business intelligence
  – To make management decisions based on facts, not intuition
  – To get closer to the customers
  – To gain competitive advantage
• But
  – Data warehousing is only one step out of many on the long road toward the ultimate goal of accomplishing these highfalutin objectives
• More practical reasons are on the next slides

2000/3/9 H.S. Yong, Ewha Womans Univ. 2

The Case for Data Warehousing
• To perform querying and reporting on servers/disks not used by OLTP systems
• To use data models and/or server technologies that speed up querying and reporting and that are not appropriate for transaction processing
• To provide an environment where writing and maintaining queries and reports requires relatively little knowledge of the technical aspects of database technology, and/or to speed up the writing and maintaining of queries and reports by technical personnel
• To provide a repository of "cleaned up" transaction processing systems data that can be reported against and that does not necessarily require fixing the transaction processing systems

The Case for Data Warehousing
• To make it easier to query and report data from multiple transaction processing systems and/or from external data sources
• To prevent persons who only need to query and report transaction processing system data from having any access whatsoever to transaction processing system databases and the logic used to maintain those databases
  – Security issues
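The "cleaned up" repository idea can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `orders` and `fact_orders` tables, the sample rows, and the cleaning rules are hypothetical, and two in-memory SQLite databases stand in for the transaction processing system and the warehouse. The point is only that reporting runs against a separate, cleaned copy and the source system is left untouched.

```python
import sqlite3


def build_oltp() -> sqlite3.Connection:
    """Create a stand-in transaction processing database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount TEXT)"
    )
    # Deliberately messy sample data: inconsistent names, a non-numeric amount.
    conn.executemany(
        "INSERT INTO orders (customer, amount) VALUES (?, ?)",
        [("  Acme ", "100.50"), ("ACME", "20"), ("Globex", "n/a"), ("globex", "5.25")],
    )
    conn.commit()
    return conn


def clean(row):
    """Normalize one source row; return None when the amount is unusable."""
    order_id, customer, amount = row
    try:
        value = float(amount)
    except (TypeError, ValueError):
        return None
    return order_id, customer.strip().lower(), value


def load_warehouse(oltp: sqlite3.Connection) -> sqlite3.Connection:
    """Extract from the source, clean, and load into a separate reporting store."""
    warehouse = sqlite3.connect(":memory:")
    warehouse.execute(
        "CREATE TABLE fact_orders (order_id INTEGER, customer TEXT, amount REAL)"
    )
    rows = oltp.execute("SELECT id, customer, amount FROM orders").fetchall()
    cleaned = [r for r in map(clean, rows) if r is not None]
    warehouse.executemany("INSERT INTO fact_orders VALUES (?, ?, ?)", cleaned)
    warehouse.commit()
    return warehouse


oltp = build_oltp()
warehouse = load_warehouse(oltp)

# Reports query the warehouse only; the source rows are never modified.
totals = warehouse.execute(
    "SELECT customer, SUM(amount) FROM fact_orders GROUP BY customer ORDER BY customer"
).fetchall()
print(totals)
```

In this sketch the unusable row is silently dropped; a real load would more likely route rejected rows to a review table so the feeder system problem stays visible.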
The Case Against Data Warehousing
• Data warehousing systems, for the most part, store historical data generated by internal transaction processing systems. This is a small part of the universe of data available to manage a business, and sometimes this part has limited value.
• Data warehousing systems can complicate business processes significantly.
• Data warehousing may not be for your business if one or more of the following hold:
  – Most of your business needs are to report on data in one transaction processing system
  – All the historical data you need are in that system
  – The data in the system are clean
  – Your hardware can support reporting against the live system data
  – The structure of the system data is relatively simple
  – Your firm does not have much interest in end user ad hoc query/report tools
• Data warehousing can have a learning curve that may be too long for impatient firms.

The Case Against Data Warehousing
• Data warehousing can become an exercise in data for the sake of data.
• In certain organizations, ad hoc end user query/reporting tools do not "take".
• Many "strategic applications" of data warehousing have a short life span and require the developers to put together a technically inelegant system quickly. Some developers are reluctant to work this way.
• Only a limited number of people have worked through the full "life cycle" of a data warehousing system project.
• Data warehousing systems can require a great deal of "maintenance", which many organizations cannot or will not support.
• Sometimes the cost to capture data, clean it up, and deliver it in a format and time frame that is useful to the end users is too high to bear.

Data Warehousing Gotchas
• You are going to spend much time extracting, cleaning, and loading data.
• Despite best efforts at project management, data warehousing project scope will increase.
• You are going to find problems with the systems feeding the data warehouse.
• You will find the need to store data not being captured by any existing system.
• Some transaction processing systems feeding the warehousing system will not contain detail.
• You will underbudget for the resources skilled in the feeder system platforms.
• Many warehouse end users will be trained and never or seldom apply their training.

Data Warehousing Gotchas
• After end users receive query and report tools, requests for IS-written reports may increase.
• Your warehouse users will develop conflicting business rules.
• Large scale data warehousing can become an exercise in data homogenizing.
• 'Overhead' from precomputation can eat up great amounts of disk space.
• The time it takes to load the warehouse will expand to fill the available time window.
• You are going to have a tough problem with security, especially if you make your data warehouse Web-accessible.
• You will fail if you concentrate on resource optimization to the neglect of project, data, and customer management issues and an understanding of what adds value to the customer.
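The precomputation overhead mentioned in the gotchas can be made concrete with a rough calculation. The sketch below assumes a warehouse that precomputes an aggregate for every combination of its dimensions; with n dimensions there are 2^n such group-bys. The dimension names and cardinalities are hypothetical, and the row counts are upper bounds only, since an aggregate can never hold more rows than the underlying detail data.

```python
from itertools import combinations
from math import prod


def aggregate_row_bounds(cardinalities: dict[str, int]) -> dict[tuple[str, ...], int]:
    """Upper bound on row count for every possible group-by over the dimensions."""
    dims = list(cardinalities)
    bounds = {}
    for k in range(len(dims) + 1):
        for combo in combinations(dims, k):
            # The empty combination is the grand total: exactly one row.
            bounds[combo] = prod(cardinalities[d] for d in combo)
    return bounds


# Hypothetical dimension sizes: 1000 products, 50 stores, 365 days.
cards = {"product": 1000, "store": 50, "day": 365}
bounds = aggregate_row_bounds(cards)

print(len(bounds))           # number of distinct group-bys: 2 ** 3 = 8
print(sum(bounds.values()))  # upper bound on total precomputed rows: 18684666
```

Adding a fourth dimension doubles the number of group-bys, which is why warehouses usually precompute only a chosen subset of aggregates.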
Data Warehousing Software Evaluation
• Do the evaluation yourself
  – Do not rely solely on the ideas of someone outside your organization
  – You know your organization’s needs, expectations, limitations, and resources better than any outsider
• Always first ask whether technology already in-house can do the job
• Get references
  – Ask the software vendor for a complete list of referenceable sites
  – If this is a major decision for your company, call 5-6 sites
• If you are going to see multiple vendor demos, build a test case that each vendor will follow
• Go through the www.reviewbooth.com site to find published evaluations of the software
• Be skeptical of data warehousing pundits' endorsements or reviews of technology

Data Warehousing Software Evaluation
• Go to the vendor road shows to talk with other attendees
• Understand the tradeoffs the software makes
  – Tradeoffs among speed, capacity, computer resource consumption, ease of development, ease of use, and ease of maintenance
• Check the financial stability of the vendor
• Have a representative team perform the evaluation
• If you're evaluating an end user tool, let an end user lead the evaluation effort
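One simple way to compare products across the tradeoff criteria listed above is a weighted scoring sheet that the evaluation team fills in after each vendor runs the common test case. This is a sketch of one possible approach, not a method prescribed by the slides: the weights, vendor names, and 1-5 scores are all hypothetical and would come from your own team.

```python
# Hypothetical weights (percent) over the six tradeoff criteria; they sum to 100.
WEIGHTS = {
    "speed": 25,
    "capacity": 15,
    "resource_consumption": 10,
    "ease_of_development": 15,
    "ease_of_use": 20,
    "ease_of_maintenance": 15,
}


def weighted_score(scores: dict[str, int], weights: dict[str, int]) -> int:
    """Sum of weight * score over all criteria; every criterion must be scored."""
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"unscored criteria: {sorted(missing)}")
    return sum(weights[c] * scores[c] for c in weights)


# Hypothetical team scores on a 1 (poor) to 5 (excellent) scale.
vendors = {
    "Vendor A": {"speed": 5, "capacity": 4, "resource_consumption": 2,
                 "ease_of_development": 3, "ease_of_use": 2, "ease_of_maintenance": 3},
    "Vendor B": {"speed": 3, "capacity": 3, "resource_consumption": 4,
                 "ease_of_development": 4, "ease_of_use": 5, "ease_of_maintenance": 4},
}

totals = {name: weighted_score(s, WEIGHTS) for name, s in vendors.items()}
ranking = sorted(totals, key=totals.get, reverse=True)
print(totals)
print(ranking)
```

The numbers matter less than the discussion they force: agreeing on the weights before the demos makes the team state which tradeoffs it actually cares about.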