Group 8:
Chiu Wai Man, Vivian (wmchiu@cse)
Sia Ka Cheung, Richard (kcsia8@ie)
Yip Wai Fung, Johnny (wfyip8@ie)
Data warehouse and on-line analytical processing (OLAP) are essential elements of decision support, which has increasingly become focus of the database industry. Data warehouse collect data from source databases throughout an enterprise. After processing of such data, multidimensional views of data are then generated for a variety of OLAP servers to carry out OLAP queries.
Data warehouse is a place that stores a large amount data for analysis. To perform complex queries, joining of large tables, searching of data and computing result account for most of the time in OLAP queries. Since decision support is time critical, to be able to finish OLAP query in a reasonable time is important for analysts. This interests us to perform a survey in the area of improving query performance.
There are many issues in performance tuning in OLAP queries. In this survey, we will focus on speeding up OLAP queries using different view materialization methods.
Since maintenance is an essential task in data warehouse, we will also discuss several issues and strategies involved in data warehouse maintenance.
In the study of view materialization methods, we will drill into various algorithms improving OLAP query performance, including different index and optimal view generation methods. Index methods include Bit-Sliced indexes, Projection indexes and join index hierarchies [1][2][3]. As for optimal view generation methods, we will study greedy search, A* search, point intersection and genetic algorithm [4] [5][6]. Finally, we will study new data structures in the representation of data cube other than
traditional lattice structure, such as cubetrees [7].
As for data maintenance, there are various main areas in this issue, including data cleaning, data extraction form source databases, data updating strategies and data warehouse self-maintainability. We will focus on warehouse consistency (data cleaning)
[8][9] and self-maintainability. [10][11]
After studying related method mentioned above, we will try to compare the boost in performance among various algorithms in OLAP queries. For example, comparing index types for evaluating aggregate functions in queries.
References :
[1]
Patrick O’Neil, Dallan Quass. Improved query performance with variant indexes.
In Proc. ACM-SIGMOD 1997.
[2] Zhaohui Xie, Jiawei Han. Join Index Hierarchies for Supporting Efficient
Navigations in Object-Oriented Databases. In Proc. 20 th
VLDB Conference
1994.
[3]
Patrick O’Neil and Goetz Graefe. Multi-Table Joins Through Bitmapped Join
Indicies. SIGMOD Record, 1995.
[4] Nebojsa Stefanovic, Jiawei Han, Krzysztof Koperski. Object-Based Selective
Materialization for Efficient Implementation of Spatial Data Cubes. In IEEE
Transactions on Knowledge and Data Engineering, Vol. 12, No. 6
November/December 2000.
[5] Venky Harinarayan, Anand Rajaraman, Jeffrey D. Ullman. Implementing Data
Cubes Efficiently. In Proc. of the ACM-SIGMOD International Conference on
Management of Data, pages 205-216, 1996.
[6] H. Gupta and I. Mumick. Selection of Views to Materialize Under a
Maintenance-Time Constraint. In International Conference on Database Theory,
Jerusalam, Israel, January 1999.
[7] Yannis Kotidis, Nick Roussopoulos. An Alternative Storage Organization for
ROLAP Aggregate Views Based on Cubetrees. ACM-SIGMOD 1998.
[8] Y. Zhuge, H. Garcia-Molina, and J. L. Wiener. Consistency Algorithms for
Multi-Source Warehouse View Maintenance. Journal of Distributed and Parallel
Databases, volume 6, number 1, January, 1998, pages 7-40.
[9] Y. Zhuge, J. L. Wiener and H. Garcia-Molina. Multiple View Consistency for
Data Warehousing. In Proceedings of the International Conference on Data
Engineering, Binghamton, UK, April, 1997.
[10] D. Quass, A. Gupta, I. Mumick, and J. Widom "Making Views Self-Maintainable for Data Warehousing." In Proceedings of the Conference on Parallel and
Distributed Information Systems, Miami Beach, FL, December 1996.
[11] N. Huyn. "Efficient Self-Maintenance of Materialized Views." Technical Report,
Stanford University, 1996.