Project Proposal

advertisement

CSC 5120 Advance Topic in Database System

Project Proposal

Supervised by Dr. Ada Fu

Group 8:

Chiu Wai Man, Vivian (wmchiu@cse)

Sia Ka Cheung, Richard (kcsia8@ie)

Yip Wai Fung, Johnny (wfyip8@ie)

A survey on view materialization methods for OLAP and warehouse maintenance problem

Data warehouse and on-line analytical processing (OLAP) are essential elements of decision support, which has increasingly become focus of the database industry. Data warehouse collect data from source databases throughout an enterprise. After processing of such data, multidimensional views of data are then generated for a variety of OLAP servers to carry out OLAP queries.

Data warehouse is a place that stores a large amount data for analysis. To perform complex queries, joining of large tables, searching of data and computing result account for most of the time in OLAP queries. Since decision support is time critical, to be able to finish OLAP query in a reasonable time is important for analysts. This interests us to perform a survey in the area of improving query performance.

There are many issues in performance tuning in OLAP queries. In this survey, we will focus on speeding up OLAP queries using different view materialization methods.

Since maintenance is an essential task in data warehouse, we will also discuss several issues and strategies involved in data warehouse maintenance.

In the study of view materialization methods, we will drill into various algorithms improving OLAP query performance, including different index and optimal view generation methods. Index methods include Bit-Sliced indexes, Projection indexes and join index hierarchies [1][2][3]. As for optimal view generation methods, we will study greedy search, A* search, point intersection and genetic algorithm [4] [5][6]. Finally, we will study new data structures in the representation of data cube other than

traditional lattice structure, such as cubetrees [7].

As for data maintenance, there are various main areas in this issue, including data cleaning, data extraction form source databases, data updating strategies and data warehouse self-maintainability. We will focus on warehouse consistency (data cleaning)

[8][9] and self-maintainability. [10][11]

After studying related method mentioned above, we will try to compare the boost in performance among various algorithms in OLAP queries. For example, comparing index types for evaluating aggregate functions in queries.

References :

[1]

Patrick O’Neil, Dallan Quass. Improved query performance with variant indexes.

In Proc. ACM-SIGMOD 1997.

[2] Zhaohui Xie, Jiawei Han. Join Index Hierarchies for Supporting Efficient

Navigations in Object-Oriented Databases. In Proc. 20 th

VLDB Conference

1994.

[3]

Patrick O’Neil and Goetz Graefe. Multi-Table Joins Through Bitmapped Join

Indicies. SIGMOD Record, 1995.

[4] Nebojsa Stefanovic, Jiawei Han, Krzysztof Koperski. Object-Based Selective

Materialization for Efficient Implementation of Spatial Data Cubes. In IEEE

Transactions on Knowledge and Data Engineering, Vol. 12, No. 6

November/December 2000.

[5] Venky Harinarayan, Anand Rajaraman, Jeffrey D. Ullman. Implementing Data

Cubes Efficiently. In Proc. of the ACM-SIGMOD International Conference on

Management of Data, pages 205-216, 1996.

[6] H. Gupta and I. Mumick. Selection of Views to Materialize Under a

Maintenance-Time Constraint. In International Conference on Database Theory,

Jerusalam, Israel, January 1999.

[7] Yannis Kotidis, Nick Roussopoulos. An Alternative Storage Organization for

ROLAP Aggregate Views Based on Cubetrees. ACM-SIGMOD 1998.

[8] Y. Zhuge, H. Garcia-Molina, and J. L. Wiener. Consistency Algorithms for

Multi-Source Warehouse View Maintenance. Journal of Distributed and Parallel

Databases, volume 6, number 1, January, 1998, pages 7-40.

[9] Y. Zhuge, J. L. Wiener and H. Garcia-Molina. Multiple View Consistency for

Data Warehousing. In Proceedings of the International Conference on Data

Engineering, Binghamton, UK, April, 1997.

[10] D. Quass, A. Gupta, I. Mumick, and J. Widom "Making Views Self-Maintainable for Data Warehousing." In Proceedings of the Conference on Parallel and

Distributed Information Systems, Miami Beach, FL, December 1996.

[11] N. Huyn. "Efficient Self-Maintenance of Materialized Views." Technical Report,

Stanford University, 1996.

Download