A Speech Data Warehouse - Information & Software Engineering

Multidimensional Modeling Approaches for OLAP
Based on Extended Relational Concepts
O. Mangisengi, A M. Tjoa
Institute of Software Technology, Technical University of Vienna, Austria
Tel: 58801-18800, FAX: 58801-18899, E-mail: {oscar, tjoa}@ifs.tuwien.ac.at
R.R. Wagner
Institute of Applied Knowledge Processing, University of Linz, Austria
Tel: 43-732-2468-791, E-mail: wagner@ifs.uni-linz.ac.at
Abstract
The paper introduces two extended relational concepts for modeling and querying OLAP data. In the
first approach we use the concept of nested relations to model multidimensional OLAP data.
In the second approach we model OLAP data using the extended relational model
introduced by Codd. The general requirements for both extended
relational data models, which could serve as a foundation for multidimensional database systems,
are similar to those that made the relational model successful, namely the existence of an
implementation-independent formalism, the separation of structure and contents, and the existence
of a declarative query language. Both approaches are compared with the modeling based on
traditional flat relations, which is widely used in the OLAP community, i.e. the modeling of
OLAP star and snow-flake schemas.
1. Introduction
On-line Analytical Processing (OLAP) is emerging as the most important approach in Data
Warehousing. OLAP allows data to be modeled in a multidimensional way as a cube and to be queried
and analyzed from many different perspectives. Independent of the different implementation
aspects, OLAP data are presented to the user in a multidimensional data model.
There are several ways to formally define multidimensional models and their query languages.
However, there is as yet no commonly accepted formal multidimensional data model.
Such a model is necessary as a basis for an accepted standardized logical data model for OLAP
data. This would allow practitioners and researchers to specify their data warehouses in a unified
way.
The aim of this paper is to propose an approach for the conceptual modeling of multidimensional
data, which is entirely based on a relational data model. It seems to the authors that such a model
would very much correspond with the original intuition of Codd when he introduced the concept
of OLAP in his pioneering white paper [CCS93].
Mangisengi, Tjoa, and Wagner
Among the different ways to define a data cube, the star schema approach can be regarded as the
most dominant one. A data cube is defined as a collection of at least one fact table and a set of
dimension tables.
In this paper we introduce a meta model with the capability to describe these tables and the
OLAP queries based on two extended relational approaches, namely nested relations and the
extended relational model of Codd. The general requirements for a formal data model that
can serve as a foundation for multidimensional database systems are similar to those that made
the relational model successful, namely the existence of an implementation-independent
formalism, the separation of structure and contents, and the existence of a declarative query
language [BSHD98].
Recently a number of approaches have been proposed in the literature for the formal foundation of
multidimensional modeling. [BSHD98] compares and describes the most important modeling
approaches in this area, namely the approaches [AGS97] of Agrawal, Gupta, and Sarawagi,
[CT97a,CT97b] of Cabibbo and Torlone, [LW96] of Li and Wang, and [GL97] of Gyssens and
Lakshmanan.
The remainder of this paper is structured as follows. Section 2 briefly reviews the theory of nested
relations and of the extended relational model. In section 3 we present the modeling of OLAP data
by means of nested relations and of the extended relational model. The nest and unnest operators of
nested relations are well suited for the drilling-down and rolling-up OLAP operations, while the PATT
and STEP operators of the extended relational model are useful for OLAP data manipulations. Section 4
presents an extended example for the use of the two modeling approaches in a Data Warehouse. Our
conclusion and the comparison of the modeling approaches are given in section 5.
2. The Theory of Nested Relations and of the Extended Relational Model
In this section we first review the concept of nested relations given in [RKS88], which is used
to model the multidimensionality of OLAP data. The special feature of nested relations is the use
of the nest and unnest operators. Second, we review the concept of the extended relational model
proposed by Codd as the basis of the object-oriented model.
2.1. The Nested Relation
2.1.1. Basic Concept
A nested relation (Non First Normal Form relation) is a relation whose tuple components
are atomic values or repeating group instances. A flat relational database scheme is a collection
of rules of the form $R_j = (R_{j1}, R_{j2}, \ldots, R_{jn})$, where all the $R_{ji}$ are zero-order names. A nested
relational scheme may contain any combination of zero- or higher-order names on the right-hand side
of a rule, as long as the scheme remains nonrecursive. A nested relation is represented
simply as a higher-order name on the right-hand side of a rule.
Let $R$ be a name in a database scheme $S$. An instance of $R$, written $r$, is an ordered pair of the form
$\langle R, V_R \rangle$, where $V_R$ is a value for name $R$. If $R$ is a zero-order name, then $V_R$ is just any value from
the domain of $R$. If $R$ is a higher-order name, then $V_R$ must be expanded in terms of the names on
the right-hand side of rule $R$. A relation structure $\mathcal{R} = \langle R, r \rangle$ denotes a relation scheme $R$ and
instance $r$.
2.1.2. Operator of Nested Relation
Besides the extended algebra operations of the nested relation, such as the union, intersection,
Cartesian product, projection, and selection operators, there are two important operators of the
nested relation, namely the nest ($\nu$) and unnest ($\mu$) operators, which are useful for OLAP
operations. The detailed extended algebra of the nested relation is given in [RKS88]. We review
the definitions of the nest and unnest operators proposed by [RKS88].
1. Nest
Nest takes a relation structure $\mathcal{R} = \langle R, r \rangle$ and aggregates over equal data values in some subset of
the names in $R$. Formally, let $R$ be a relation scheme, in a database scheme $S$, which contains
$R = (A_1, A_2, \ldots, A_n)$ for the external name $R$. Let $\{B_1, B_2, \ldots, B_m\} \subset E_R$ and
$\{C_1, C_2, \ldots, C_k\} = E_R - \{B_1, B_2, \ldots, B_m\}$. Assume that either $B = (B_1, B_2, \ldots, B_m)$ is in $S$ or that $B$ does not
appear on the left-hand side in $S$ and $(B_1, B_2, \ldots, B_m)$ does not appear on the right-hand side in $S$. Then
$\nu_{B = (B_1, B_2, \ldots, B_m)}(\mathcal{R}) = \langle R', r' \rangle$, where
$R' = (C_1, C_2, \ldots, C_k, (B_1, B_2, \ldots, B_m)) = (C_1, C_2, \ldots, C_k, B)$ and $B = (B_1, B_2, \ldots, B_m)$ is appended to $S$ if it is
not already in $S$, and
$$r' = \{\, t \mid \text{there exists a tuple } u \in r \text{ such that } t[C_1 C_2 \ldots C_k] = u[C_1 C_2 \ldots C_k] \,\wedge\, t[B] = \{\, v[B_1 B_2 \ldots B_m] \mid v \in r \,\wedge\, v[C_1 C_2 \ldots C_k] = t[C_1 C_2 \ldots C_k] \,\} \,\}.$$
2. Unnest
Unnest takes a relation structure nested on some set of attributes and disaggregates the structure to
make it a flatter structure. Formally, let $R$ be a relation scheme, in a database scheme $S$, which
contains $R = (A_1, A_2, \ldots, A_n)$ for the external name $R$. Assume $B$ is some higher-order name in $E_R$ with
an associated $B = (B_1, B_2, \ldots, B_m)$. Let $\{C_1, C_2, \ldots, C_k\} = E_R - B$. Then
$\mu_{B = (B_1, B_2, \ldots, B_m)}(\mathcal{R}) = \langle R', r' \rangle$, where
$R' = (C_1, C_2, \ldots, C_k, B_1, B_2, \ldots, B_m)$ and $B = (B_1, B_2, \ldots, B_m)$ is removed from $S$ if it does not appear in
any other relation scheme, and
$$r' = \{\, t \mid \text{there exists a tuple } u \in r \text{ such that } t[C_1 C_2 \ldots C_k] = u[C_1 C_2 \ldots C_k] \,\wedge\, t[B_1 B_2 \ldots B_m] \in u[B] \,\}.$$
The nested relation scheme is illustrated by the following example [RKS88]. The flat relation $r$ over
the attributes A, C, D, and E is given in table 2.1. Applying the nest operation
$\nu_{B=(C,D)}(\nu_{E1=(E)}(r))$ yields a relation with scheme $R = (A, B, E1)$, $B = (C, D)$, $E1 = (E)$, which is shown in table 2.2.
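The two operators can be sketched in a few lines of Python. This is a minimal illustration under assumed conventions, not the algebra of [RKS88] itself: tuples are dictionaries, a nested attribute is a frozenset of value tuples, and the function names nest and unnest as well as the nested names B and E1 are chosen to mirror the example.

```python
def nest(rows, nested_name, attrs):
    """Nest the attributes `attrs` into one set-valued attribute `nested_name`.

    Tuples that agree on all remaining attributes are merged into one tuple.
    """
    groups = {}
    for row in rows:
        # The remaining attributes form the grouping key.
        rest = tuple(sorted((k, v) for k, v in row.items() if k not in attrs))
        groups.setdefault(rest, set()).add(tuple(row[a] for a in attrs))
    return [dict(rest, **{nested_name: frozenset(members)})
            for rest, members in groups.items()]


def unnest(rows, nested_name, attrs):
    """Flatten the set-valued attribute `nested_name` back into `attrs`."""
    flat = []
    for row in rows:
        rest = {k: v for k, v in row.items() if k != nested_name}
        for member in sorted(row[nested_name]):
            flat.append({**rest, **dict(zip(attrs, member))})
    return flat


# Table 2.1: the flat relation r over (A, C, D, E).
r = [
    {"A": "A1", "C": "C1", "D": "D1", "E": "E1"},
    {"A": "A1", "C": "C1", "D": "D2", "E": "E1"},
    {"A": "A1", "C": "C1", "D": "D2", "E": "E2"},
    {"A": "A2", "C": "C2", "D": "D1", "E": "E1"},
    {"A": "A2", "C": "C2", "D": "D1", "E": "E2"},
]

# First nest E into E1, then nest (C, D) into B, as in Table 2.2.
nested = nest(nest(r, "E1", ["E"]), "B", ["C", "D"])

# Unnesting in the reverse order restores the flat relation.
restored = unnest(unnest(nested, "B", ["C", "D"]), "E1", ["E"])
```

With the data of table 2.1 the nested relation has three tuples, and unnesting B and then E1 returns the five original tuples.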
2.2. The Extended Relational Model
2.2.1. The Concept of Extended Relational Model
The extended relational model proposed by Codd in [Codd79] is a suitable basis for an object-oriented
model of OLAP data, since it inherently uses the concept of object identifiers, a
typology of different object types (associations, characteristics, and kernel object types),
and a meta model description of the relationships between the different types. This model defines
different kinds of relations, namely the object relation, property relation, association relation,
characteristic relation, association-graph relation, and characteristic-graph relation. The detailed
definition of each relation is given in [Codd79]. These relations are used as the base relations of
the object-oriented OLAP data modeling. The model of OLAP data is given in detail in the next
section.
A    C    D    E
A1   C1   D1   E1
A1   C1   D2   E1
A1   C1   D2   E2
A2   C2   D1   E1
A2   C2   D1   E2
Table 2.1 The relation r

A    B          E1
     C    D     E
A1   C1   D1    E1
A1   C1   D2    E1
                E2
A2   C2   D1    E1
                E2
Table 2.2 The nested relation operation $\nu_{B=(C,D)}(\nu_{E1=(E)}(r))$
2.2.2. OLAP data manipulation
An n-dimensional cube can be obtained by using the PATT (Partitioning by attribute) as defined
in [Codd79] on the fact relation. PATT is introduced in the original paper as follows.
PARTITION BY ATTRIBUTE: PATT
Let R be a relation with attribute A (possibly compound). R may have attributes other than A.
PATT(R,A) delivers the set of relations obtained by partitioning R per all the distinct values of A.
For all relations R having an attribute A: R = UNION / PATT(R,A). The “/” operator here denotes
the COMPRESS-operator.
COMPRESS
Let f be an associative and commutative operator that maps a pair of relations into a relation (for
example, a join). Let Z be a set of relations such that f can be validly applied to every pair of
relations in Z. Then COMPRESS(f,Z) is the relation obtained by repeated pairwise application of
f to the relations in Z. An alternative notation for COMPRESS(f,Z) is f/Z.
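The identity R = UNION / PATT(R,A) can be checked with a small sketch. The representation (a relation as a list of dictionaries) and the helper names patt, compress, and union are assumptions made for illustration; the sample tuples are taken from the sales data used later in the paper.

```python
from functools import reduce


def patt(relation, attr):
    """PATT(R, A): one partition of R per distinct value of attribute A."""
    partitions = {}
    for row in relation:
        partitions.setdefault(row[attr], []).append(row)
    return list(partitions.values())


def compress(f, relations):
    """COMPRESS(f, Z), also written f/Z: fold f pairwise over the relations in Z."""
    return reduce(f, relations)


def union(r1, r2):
    """Set union of two relations held as lists of tuples (dictionaries)."""
    return r1 + [row for row in r2 if row not in r1]


R = [
    {"City": "SF", "Store": 4, "Sales": 100},
    {"City": "SF", "Store": 5, "Sales": 300},
    {"City": "LA", "Store": 2, "Sales": 200},
]

parts = patt(R, "City")           # two partitions: SF and LA
rebuilt = compress(union, parts)  # UNION / PATT(R, City)
```

Because union is associative and commutative, the order in which the partitions are folded does not affect the result.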
The partition operator has the same functionality as the partitioning operator of the quotient
relation (“/”) described in [FK77]. The use of the partitioning operator of the quotient relation for
OLAP operations is described in detail in [MT98]. Relevant roll-ups can be obtained by using the
STEP operator to find the next refinement level. The STEP operator is defined as follows:
STEP
Let R be an unlabeled digraph relation that does not have an attribute SEP (which stands for
separation). STEP(R) is the set of all tuples of the form (SUB:x,SUP:y,SEP:n), where
(SUB:x,SUP:y) belongs to R and n is the least number of edges of the graph which separate node
x from node y.
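A possible reading of STEP can be sketched as a breadth-first search over the digraph. The sketch rests on an assumption that goes beyond the definition quoted above: it reports the separation for every pair of nodes connected by a directed path, not only for pairs joined by a single edge. The function name step and the node labels Store, City, and State are illustrative.

```python
from collections import deque


def step(edges):
    """Return (SUB, SUP, SEP) triples for a digraph given as (SUB, SUP) edges.

    SEP is the least number of edges leading from SUB up to SUP.
    Assumption: all pairs connected by a directed path are reported.
    """
    graph = {}
    for sub, sup in edges:
        graph.setdefault(sub, []).append(sup)

    result = []
    for start in graph:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in graph.get(node, []):
                if nxt not in dist:  # first visit gives the shortest distance
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
                    result.append((start, nxt, dist[nxt]))
    return result


# A two-level hierarchy: Store rolls up to City, City rolls up to State.
edges = [("Store", "City"), ("City", "State")]
separations = step(edges)
```

Tuples with SEP = 1 identify the next level in the hierarchy; larger values identify levels further away.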
3. The modeling of OLAP-data by means of nested relations and the extended
relational model
In this section we show that the established extended relational models mentioned above, which
have the capability of grouping the tuples of a relation and the ability to ‘drill down’ and ‘roll up’
between the different levels of hierarchies, can be used as a conceptual model for OLAP data.
This has the considerable advantage that these models rest on a sound theoretical foundation and
have been studied for a long time. For an object-oriented implementation of these models, datablades
providing the operators defined by these models are serious candidates for an alternative
approach to OLAP databases.
3.1. The Nested Relation approach
In this section we propose the use of the nest and unnest operators for the
OLAP operations rolling-up and drilling-down.
1. Rolling-Up
The rolling-up operation in the nested relation can be applied using the nest operator $\nu$. We use
the sample of sales data based on the store dimension, which is given in table 3.1.
State  City  Store  Sales
ST     Ci    STO    Sa
CA     SF    4      100
CA     SF    5      300
CA     SF    6      500
CA     LA    2      200
CA     LA    3      400
CA     LA    7      600
NY     NYC   1      500
Table 3.1 The sample of sales data based on the dimension Store
To illustrate the rolling-up operation in the nested relation, we assume that the lowest level of
aggregation of table 3.1 is the attribute Store. We denote the relation of table 3.1 by q and introduce
the nested relation with the aggregation hierarchy Store → City → State. To roll up the sales data
from Store to City, the nest operation can be specified as follows:
$City = \nu_{STO1 = (STO, Sales)}(q)$.
The nest operation $City = \nu_{STO1 = (STO, Sales)}(q)$ produces the following table 3.2.
State  City  STO1           Sum(Sales,City)
             Store  Sales
ST     Ci    STO    Sa
CA     SF    4      100     900
             5      300
             6      500
CA     LA    2      200     1200
             3      400
             7      600
NY     NYC   1      500     500
Table 3.2 The nested operation $City = \nu_{STO1 = (STO, Sales)}(q)$
The Sum function can be used to provide the comparable OLAP functionality.
If a higher summarization of the sales data is required, then the attributes City and Store are
nested into a relation Ci1. The nest operation for rolling up from City to State is given as follows:
$State = \nu_{Ci1 = (Ci, STO, Sales)}(q)$.
The result of the nest operation $State = \nu_{Ci1 = (Ci, STO, Sales)}(q)$ is given in table 3.3.
State  Ci1                  Sum(Sales,City)  Sum(Sales,State)
       City  Store  Sales
ST     Ci    STO    Sa
CA     SF    4      100     900              2100
             5      300
             6      500
       LA    2      200     1200
             3      400
             7      600
NY     NYC   1      500     500              500
Table 3.3 The nested operation $State = \nu_{Ci1 = (Ci, STO, Sales)}(q)$
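The two roll-up steps can be reproduced with a short sketch on the data of table 3.1. The function names and the dictionary layout are assumptions for illustration; the derived attributes carry the column names used in the tables.

```python
# Table 3.1 as (State, City, Store, Sales) tuples.
sales = [
    ("CA", "SF", 4, 100), ("CA", "SF", 5, 300), ("CA", "SF", 6, 500),
    ("CA", "LA", 2, 200), ("CA", "LA", 3, 400), ("CA", "LA", 7, 600),
    ("NY", "NYC", 1, 500),
]


def roll_up_to_city(rows):
    """Nest (Store, Sales) per city and add the derived Sum(Sales,City)."""
    cities = {}
    for state, city, store, amount in rows:
        cities.setdefault((state, city), []).append((store, amount))
    return [
        {"ST": st, "Ci": ci, "STO1": members,
         "Sum(Sales,City)": sum(amount for _, amount in members)}
        for (st, ci), members in cities.items()
    ]


def roll_up_to_state(city_rows):
    """Nest the city-level tuples per state and add Sum(Sales,State)."""
    states = {}
    for row in city_rows:
        member = {k: v for k, v in row.items() if k != "ST"}
        states.setdefault(row["ST"], []).append(member)
    return [
        {"ST": st, "Ci1": members,
         "Sum(Sales,State)": sum(m["Sum(Sales,City)"] for m in members)}
        for st, members in states.items()
    ]


city_level = roll_up_to_city(sales)
state_level = roll_up_to_state(city_level)
city_sums = {row["Ci"]: row["Sum(Sales,City)"] for row in city_level}
state_sums = {row["ST"]: row["Sum(Sales,State)"] for row in state_level}
```

The city sums are 900, 1200, and 500 and the state sums are 2100 and 500, in agreement with tables 3.2 and 3.3.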
2. Drilling-down
According to the definition of unnest, which undoes the nesting of the nested structure on an
attribute set, the unnest operator $\mu$ can be applied as the drilling-down operation in the OLAP
model. To demonstrate the drilling-down operation, we use the sales data of table 3.3, which
is the result of the rolling-up from City to State. For drilling down from State to City, the unnest
operation can be used as follows:
$City = \mu_{Ci1}(State)$.
This unnest operation $City$ produces the result given in table 3.2.
To obtain a more detailed view of the data, namely at the Store level, the drilling-down operation
from City to Store can be realized using the following unnest operation.
$Store = \mu_{STO1}(City)$.
The result of this unnest operation $Store$ is again table 3.1.
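Drilling down can be sketched as two successive unnest steps on a nested structure. The literal below is an assumed encoding of the nested sales data (a state holds its cities, a city holds its stores); the derived sum attributes are omitted because they can be recomputed at any level.

```python
# Nested sales data: State -> Ci1 (cities) -> STO1 (store, sales) pairs.
state_level = [
    {"ST": "CA", "Ci1": [
        {"Ci": "SF", "STO1": [(4, 100), (5, 300), (6, 500)]},
        {"Ci": "LA", "STO1": [(2, 200), (3, 400), (7, 600)]},
    ]},
    {"ST": "NY", "Ci1": [
        {"Ci": "NYC", "STO1": [(1, 500)]},
    ]},
]


def drill_down_to_city(state_rows):
    """Unnest Ci1: one tuple per (state, city), stores still nested."""
    return [{"ST": s["ST"], **city} for s in state_rows for city in s["Ci1"]]


def drill_down_to_store(city_rows):
    """Unnest STO1: one flat tuple per store."""
    return [
        {"ST": c["ST"], "Ci": c["Ci"], "STO": store, "Sa": amount}
        for c in city_rows
        for store, amount in c["STO1"]
    ]


city_level = drill_down_to_city(state_level)
store_level = drill_down_to_store(city_level)
```

The store level contains the seven flat tuples of table 3.1.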
3.2. The Extended Relational Approach
The extended relational model proposed by Codd [Codd79] can be regarded as a forerunner of
object oriented models, having a clear concept of object identifiers, and a strict specification of
object types, which could easily be transformed to class-hierarchies. In this framework fact
relations can be modeled as association relations with participating dimension relation types, such
as the example of the association relation Sales given in table 3.4.
Sales_Oid  Time_Oid  Product_Oid  Store_Oid
1          1         1            1
2          2         2            2
:          :         :            :
n          n         n            n
Table 3.4 Association-Relation-Sales
Each object in this model is uniquely identified by its object-identifier (Oid). For every object type
there exists a single-attribute relation in which the object-id’s are collected (Obj-Rel-Relations).
These relations can be regarded as the domain of the object-identifiers of an object type. The
example of the object relation “Sales” is given in table 3.5.
The associative object type Sales has the subordinate characteristic object types Time, Store,
and Product in the meta relation for association, i.e. the Association-Graph-Relation
AG(SUP:m,SUB:n). We have the following extension of AG, which reflects this association
(Note: the domains of SUP and SUB are the relation names of the object-relations). The
association graph relation is shown in table 3.6.
Sales_Oid
1
2
:
n
Table 3.5 Object-Relation-Sales

SUP            SUB
Obj-Rel-Sales  Obj-Rel-Time
Obj-Rel-Sales  Obj-Rel-Store
Obj-Rel-Sales  Obj-Rel-Product
Table 3.6 Association-Graph-Relation
Each dimension can further be refined by characteristics and “characteristics of characteristics”
reflecting the drill-down hierarchy. In other words, this relation reflects the star schema with
the Obj-Rel-Sales representing the fact object and the Obj-Rel-Time, Obj-Rel-Store, and
Obj-Rel-Product as the dimensions.
Each characteristic entity type representing a dimension furthermore defines a tree of these
types corresponding to the possible roll-ups given by the hierarchy of the dimension. Every subordinate
characteristic relation which participates in the OLAP cube as a dimension with its
refinement hierarchy can now be presented in the corresponding characteristic relations. In the
following representation we describe the hierarchy Store → City → State.
Figure 3.1. Characteristic-Relations and Object-Relations of the dimension “Store”

City_Oid  Store_Oid
c1        1
c2        2
:         :
cn        n
Char-Rel-City-Store

State_Oid  City_Oid
s1         c1
s2         c2
:          :
sx         cn
Char-Rel-State-City

with the object relations:

Store_Oid      City_Oid      State_Oid
1              c1            s1
2              c2            s2
.              .             .
n              cn            sx
Obj-Rel-Store  Obj-Rel-City  Obj-Rel-State
In the meta relation for characteristic relations we have the following tuples in the CG-Relation
(Characteristic Graph Relation) denoting the refinement hierarchy. The example of the CG-Relation
for the dimension “Store” is given in table 3.7. Characteristic object types provide a description of a
given kernel entity representing the most aggregated granularity (in the case of Store: State) and
form a strict hierarchy, called the characteristic tree. For all dimensions of the example, the
complete characteristic tree is described by the CG-Relation given in table 3.8.
SUB            SUP
Obj-Rel-Store  Obj-Rel-City
Obj-Rel-City   Obj-Rel-State
Table 3.7 CG-Relation

SUB              SUP
Obj-Rel-Time     Obj-Rel-Month
Obj-Rel-Month    Obj-Rel-Year
Obj-Rel-Store    Obj-Rel-City
Obj-Rel-City     Obj-Rel-State
Obj-Rel-Product  Obj-Rel-Item
Obj-Rel-Item     Obj-Rel-Category
Table 3.8 CG-Relation for all dimensions
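The two graph relations are sufficient to derive the roll-up path of every dimension. The following sketch assumes a strict hierarchy (every SUB has exactly one SUP), as stated above; the function name rollup_chain and the Python encoding of tables 3.6 and 3.8 are illustrative.

```python
# Table 3.6: Association-Graph-Relation as (SUP, SUB) pairs.
AG = [
    ("Obj-Rel-Sales", "Obj-Rel-Time"),
    ("Obj-Rel-Sales", "Obj-Rel-Store"),
    ("Obj-Rel-Sales", "Obj-Rel-Product"),
]

# Table 3.8: Characteristic-Graph-Relation as (SUB, SUP) pairs.
CG = [
    ("Obj-Rel-Time", "Obj-Rel-Month"),
    ("Obj-Rel-Month", "Obj-Rel-Year"),
    ("Obj-Rel-Store", "Obj-Rel-City"),
    ("Obj-Rel-City", "Obj-Rel-State"),
    ("Obj-Rel-Product", "Obj-Rel-Item"),
    ("Obj-Rel-Item", "Obj-Rel-Category"),
]


def rollup_chain(start, cg):
    """Follow SUB -> SUP edges from `start` up to the most aggregated level."""
    parent = dict(cg)  # valid because the hierarchy is strict
    chain = [start]
    while chain[-1] in parent:
        chain.append(parent[chain[-1]])
    return chain


# One roll-up chain per dimension that participates in the fact Sales.
chains = {sub: rollup_chain(sub, CG) for sup, sub in AG if sup == "Obj-Rel-Sales"}
```

For the dimension Store the chain is Obj-Rel-Store, Obj-Rel-City, Obj-Rel-State, which corresponds to the hierarchy Store → City → State.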
4. An extended example for the use of the two modeling approaches in a Data
Warehouse
In this section we take a small data warehouse schema to show the use of nested
relations and of the extended relational model in a Data Warehouse. The Data Warehouse has a sales
fact relation (Sa) and three dimension relations. The dimension relations are the relations
Store (S), Product (P), and Time (T), as represented in Figure 4.1. Figure 4.1 uses the
multidimensional graphical notation developed by Bulos [Bulo96] and the ADAPT
modeling tool developed by Totok and Jaworski [TJ98]. A sample instantiation of the model in
figure 4.1 is given by the relations Sa, S, P, and T in figure 4.2.
Figure 4.1 the model of a Data Warehouse Example
To demonstrate the relevance of applying the concepts of nested relations and of the extended
relational model in a data warehouse environment, we take the following query example.
“Comparison of the sales of the product ‘Sweet tooth’ for stores in state CA for the years 1994
and 1995”.
4.1. The Nested Relation Approach
To apply nested relations in a Data Warehouse, we use the multidimensional query example given
above. The Data Warehouse contains the flat relation Sales as fact relation and
the nested relations Store, Time, and Product as dimension relations.
The steps used to process the multidimensional query using nested relations are:
1. Restriction, projection and selection of each dimension relation.
a. Restriction to the stores which are located in California (CA). The rolling-up from Store to State
produces the relation $Store(ST, Ci1(Ci, STO))$. $S_1 = \sigma_{ST = \text{'CA'},\, Ci,\, STO}(\nu_{Ci1}(Store))$. The result
of the operation $S_1$ is given in table 4.1.
Sales fact relation Sa:
Time  Product  Store  Sales
T     P        STO    SL
1     3        1      30
2     2        1      20
13    3        4      10
14    2        4      50
397   3        1      40
398   3        1      30
401   3        4      20
402   3        4      60
1     5        5      25
14    3        5      45

Dimension relation S:
Store  City           State
STO    Ci             ST
1      New York City  NY
4      Los Angeles    CA
5      San Francisco  CA
7      Austin         TX

Dimension relation T:
Time  Month  Year
T     M      Y
1     10     1994
2     10     1994
13    11     1994
14    11     1994
397   10     1995
398   10     1995
401   11     1995
402   11     1995

Dimension relation P:
Product  Item           Category
P        I              Ct
2        Lots of Nuts   Food
3        Sweet tooth    Food
4        Fizzy Light    Soft drinks
5        Fizzy Classic  Soft drinks

Figure 4.2 Illustration of the relations Sa, S, P, and T
b. Restriction to the product which has the item name Sweet tooth. The relation $Product(Ct, I, P1(P))$ is
obtained by rolling up to Item. $S_2 = \sigma_{Ct,\, I = \text{'Sweet tooth'},\, P}(\nu_{P1}(Product))$. Table 4.2
shows the result $S_2$.

ST   Ci1
     Ci   STO
CA   LA   4
     SF   5
Table 4.1 Result table $S_1$

Ct    I            P1
                   P
Food  Sweet tooth  3
Table 4.2 Result table $S_2$

c. Restriction of time to the years 1994 and 1995. The rolling-up to Year produces the relation
$Time(Y, M1(M, T))$. The relation $Time(Y, M1(M, T))$ is obtained by nesting.
$S_3 = \sigma_{(Y = \text{'1994'} \vee Y = \text{'1995'}),\, M,\, T}(\nu_{M1}(Time))$. The relation $S_3$ is shown in table 4.3.
2. Join between the sales fact and each restriction relation
a. Join between the sales fact (Sa) and $S_1$ and projection on the relevant attributes.
$S_4 = \pi_{T, P, ST, SL}(Sa \bowtie^{e} S_1)$. The joining operation $S_4$ produces the result given in table 4.4.
b. Join between $S_4$ and $S_2$ and projection on the required attributes.
$S_5 = \pi_{T, ST, I, SL}(S_4 \bowtie^{e} S_2)$. The result of the joining operation $S_5$ is given in table 4.5.
Y     M1
      M    T
1994  10   1
      10   2
      11   13
      11   14
1995  10   397
      10   398
      11   401
      11   402
Table 4.3 Result of operation $S_3$

Time  Product  State  Sales
T     P        ST     SL
13    3        CA     10
14    2               50
401   3               20
402   3               60
1     5               25
14    3               45
Table 4.4 Result of operation $S_4$

c. Join between $S_5$ and $S_3$ and projection on the relevant attributes.
$S_6 = \pi_{ST, I, Y, SL}(S_5 \bowtie^{e} S_3)$. The result of the joining operation $S_6$ is shown in table 4.6.

Time  State  Item         Sales
T     ST     I            SL
13    CA     Sweet tooth  10
401                       20
402                       60
14                        45
Table 4.5 The joining operation $S_5$

State  Item         Year  Sales
ST     I            Y     SL
CA     Sweet tooth  1994  10
                          45
                    1995  20
                          60
Table 4.6 The joining operation $S_6$
3. Introduction of the derived attribute Sum(Sales, Year) in $S_7$. The result of the rolling-up operation $S_7$ is
given in table 4.7. $S_7 = S_6(State, Year, Item, Sales, Sum(Sales, Year))$

State  Year  Item         Sales  Sum(Sales, Year)
ST     Y     I            SL
CA     1994  Sweet tooth  10     55
                          45
CA     1995  Sweet tooth  20     80
                          60
Table 4.7 Introduction of the aggregation function Sum(Sales, Year)
4. Roll-up of $S_7$ by ST and introduction of the derived attribute Sum(Sales, State). The result of the rolling-up
operation $S_8$ is given in table 4.8. $S_8 = \nu_{ST}(State, Year, Item, Sales, Sum(Sales, Year), Sum(Sales, State))$

State  Year  Item         Sales  Sum(Sales, Year)  Sum(Sales, State)
ST     Y     I            SL
CA     1994  Sweet tooth  10     55                135
                          45
       1995  Sweet tooth  20     80
                          60
Table 4.8 Roll-up and introduction of the aggregation function Sum(Sales, State)
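The figures of tables 4.4 to 4.8 can be checked with a flattened sketch of the same restrict, join, and aggregate pipeline. The sketch works on flat Python structures rather than on nested relations, so it illustrates the result of the steps and not the nested algebra itself; the variable names and the dictionary encoding of figure 4.2 are assumptions.

```python
# Figure 4.2: fact relation Sa as (T, P, STO, SL) and the three dimensions.
Sa = [
    (1, 3, 1, 30), (2, 2, 1, 20), (13, 3, 4, 10), (14, 2, 4, 50),
    (397, 3, 1, 40), (398, 3, 1, 30), (401, 3, 4, 20), (402, 3, 4, 60),
    (1, 5, 5, 25), (14, 3, 5, 45),
]
S = {1: ("New York City", "NY"), 4: ("Los Angeles", "CA"),
     5: ("San Francisco", "CA"), 7: ("Austin", "TX")}
P = {2: ("Lots of Nuts", "Food"), 3: ("Sweet tooth", "Food"),
     4: ("Fizzy Light", "Soft drinks"), 5: ("Fizzy Classic", "Soft drinks")}
T = {1: (10, 1994), 2: (10, 1994), 13: (11, 1994), 14: (11, 1994),
     397: (10, 1995), 398: (10, 1995), 401: (11, 1995), 402: (11, 1995)}

# Step 1: restrict each dimension (S1, S2, S3).
s1 = {sto for sto, (city, state) in S.items() if state == "CA"}
s2 = {p for p, (item, category) in P.items() if item == "Sweet tooth"}
s3 = {t for t, (month, year) in T.items() if year in (1994, 1995)}

# Step 2: join the fact relation with the restrictions, keep (Year, Sales).
s6 = [(T[t][1], sl) for t, p, sto, sl in Sa
      if sto in s1 and p in s2 and t in s3]

# Step 3: derived attribute Sum(Sales, Year).
sum_by_year = {}
for year, sl in s6:
    sum_by_year[year] = sum_by_year.get(year, 0) + sl

# Step 4: derived attribute Sum(Sales, State) for the single state CA.
sum_state = sum(sum_by_year.values())
```

The sketch yields 55 for 1994, 80 for 1995, and 135 for the state CA, as in tables 4.7 and 4.8.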
4.2. The Extended Relation Approach
The meta-information of the OLAP-cube is very properly reflected by this semantic model. Of
course, the “real data” are still to be represented by so called property relations.
Figure 4.3 The property relation “Sales” and the property relation of each dimension

Sales_Oid  Sales
1          30
2          20
3          10
4          50
5          40
6          30
7          20
8          60
9          25
10         45
Properties-Relation-Sales

Store_Oid  Store
1          1
2          4
3          5
4          7
Properties-Relation-Store

Time_Oid  Time
1         1
2         2
3         13
4         14
5         397
6         398
7         401
8         402
Properties-Relation-Time

Product_Oid  Product
1            2
2            3
3            4
4            5
Properties-Relation-Product
The property relation Sales corresponds to the relation Object-Relation-Sales in table 3.5. The
Association-Relation-Sales is shown in table 4.9.
Sales_Oid  Time_Oid  Product_Oid  Store_Oid
1          1         2            1
2          2         1            1
3          3         2            2
4          4         1            2
5          5         2            1
6          6         2            1
7          7         2            2
8          8         2            2
9          1         4            3
10         4         2            3
Table 4.9 Association-Relation-Sales
The steps for processing the multidimensional query based on the extended relational approach
are as follows. The following three operations create a relation $R_1$ with the object-id’s of all stores in
state CA.
$R_1 = \text{Prop-Rel-State}[State = \text{'CA'}][State\_Oid]$.
$R_1 = (\text{Char-Rel-State-City} \bowtie_{State\_Oid} R_1)[City\_Oid]$.
$R_1 = (\text{Char-Rel-City-Store} \bowtie_{City\_Oid} R_1)[Store\_Oid]$.
$R_1$ delivers the object-id’s of the stores in the state ‘CA’.
$R_2 = \text{Prop-Rel-Product}[Item = \text{'Sweet tooth'}][Item\_Oid]$.
$R_2 = (\text{Char-Rel-Item-Product} \bowtie_{Item\_Oid} R_2)[Product\_Oid]$.
$R_2$ contains the object-id’s of the product with the product name ‘Sweet tooth’.
$R_3 = \text{Prop-Rel-Year}[Year = \text{'1994'} \vee Year = \text{'1995'}][Year\_Oid]$.
$R_3 = (\text{Char-Rel-Year-Month} \bowtie_{Year\_Oid} R_3)[Month\_Oid]$.
$R_3 = (\text{Char-Rel-Month-Day} \bowtie_{Month\_Oid} R_3)[Time\_Oid]$.
$R_3$ contains all Time_Oid’s of the years 1994 or 1995.
$R_4 = (\text{Ass-Rel-Sales} \bowtie R_1 \bowtie R_2 \bowtie R_3)[Sales\_Oid, Time\_Oid, Product\_Oid, Store\_Oid, Year\_Oid]$.
$R_4$ reduces the fact relation (association relation) to the relevant Time, Product, and Store
restrictions (i.e. Year: 1994 or 1995, Product: “Sweet tooth”, and Store: located in state CA).
$R_4 = R_4[Sales\_Oid, Year\_Oid]$.
$R_5 = (\text{Prop-Rel-Sales} \bowtie_{Sales\_Oid} R_4)[Sales\_Oid, Year\_Oid, Sales]$.
$R_5$ contains the Year_Oid for the years 1994 or 1995 and the respective sales data.
$R_6 = PATT(R_5, Year\_Oid)$. The relation $R_5$ is partitioned by the attribute Year_Oid (PATT operator), so that
$R_6$ contains two relations, one for the year 1994 and one for the year 1995.
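The same query can be traced on the object-id level with the data of table 4.9 and figure 4.3. The sketch makes two simplifications that are not part of the paper: the characteristic relations are collapsed into direct lookups from Store_Oid, Product_Oid, and Time_Oid to state, item, and year (derived by combining figure 4.3 with figure 4.2), and the year value itself stands in for Year_Oid. All Python names are illustrative.

```python
# Table 4.9: (Sales_Oid, Time_Oid, Product_Oid, Store_Oid).
ass_rel_sales = [
    (1, 1, 2, 1), (2, 2, 1, 1), (3, 3, 2, 2), (4, 4, 1, 2), (5, 5, 2, 1),
    (6, 6, 2, 1), (7, 7, 2, 2), (8, 8, 2, 2), (9, 1, 4, 3), (10, 4, 2, 3),
]
# Properties-Relation-Sales: Sales_Oid -> Sales.
prop_rel_sales = {1: 30, 2: 20, 3: 10, 4: 50, 5: 40,
                  6: 30, 7: 20, 8: 60, 9: 25, 10: 45}

# Assumed lookups standing in for the characteristic relations.
state_of_store = {1: "NY", 2: "CA", 3: "CA", 4: "TX"}
item_of_product = {1: "Lots of Nuts", 2: "Sweet tooth",
                   3: "Fizzy Light", 4: "Fizzy Classic"}
year_of_time = {1: 1994, 2: 1994, 3: 1994, 4: 1994,
                5: 1995, 6: 1995, 7: 1995, 8: 1995}

# R1, R2, R3: object-id's that satisfy the three restrictions.
r1 = {oid for oid, state in state_of_store.items() if state == "CA"}
r2 = {oid for oid, item in item_of_product.items() if item == "Sweet tooth"}
r3 = {oid: year for oid, year in year_of_time.items() if year in (1994, 1995)}

# R4: reduce the association relation and keep (Sales_Oid, Year).
r4 = [(sales_oid, r3[time_oid])
      for sales_oid, time_oid, product_oid, store_oid in ass_rel_sales
      if store_oid in r1 and product_oid in r2 and time_oid in r3]

# R5: attach the sales values from the property relation.
r5 = [(sales_oid, year, prop_rel_sales[sales_oid]) for sales_oid, year in r4]

# R6 = PATT(R5, Year): one partition per year.
r6 = {}
for row in r5:
    r6.setdefault(row[1], []).append(row)

totals = {year: sum(sales for _, _, sales in rows) for year, rows in r6.items()}
```

The two partitions sum to 55 for 1994 and 80 for 1995, which agrees with the result of the nested relation approach in table 4.7.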
Drill-downs and roll-ups can be performed on the meta-level by using the meta-information in the
characteristic-graph relation. Here we have the STEP operator, which delivers the next level of
detail. The proper meta description with the use of graph relations allows us to
introduce hierarchy operations in a general and elegant manner. It is beyond the scope of this paper to
introduce the exact mechanism of using the meta-graph relations for OLAP navigation with
drill-down and roll-up operations.
5. Conclusion and the Comparison of Two OLAP Modeling Approaches
In this section we compare the most important OLAP features of the two extended relational
models and of flat relations. The comparison is given in table 5.1.
OLAP Functionality: Data-Cube representation
  Nested Relations: The data cube can be prepared by means of the nest operator. One special view
  can be given priority by special nesting.
  Extended Relational: Meta-information is delivered by associations for the facts. Dimensions are
  represented by a tree of “characteristics of characteristics”.
  Flat Relations: ROLAP representation by a star/snow-flake schema. No inherent possibility to
  define OLAP cubes of different granularity levels.

OLAP Functionality: Rolling-up / Drill-down
  Nested Relations: Use of the nest and unnest operators to build new OLAP cubes. Use of nested
  functional dependencies of the schema specification for drill-down/roll-up. Every dimension is
  represented by one relation.
  Extended Relational: Precise definition of dimensional hierarchies in the characteristic-graph
  relations. Use of object-id’s makes this concept appropriate for highly structured data which have
  to be stored in OODB’s. The meta-operator STEP can be used for drill-down on the meta-level.
  Flat Relations: Use of conventional relational operators on the relations of the snow-flake schema.
  Many joins are necessary. Use of SQL Group By’s.

OLAP Functionality: Pivoting
  Nested Relations: Necessity of a sequence of unnest and nest operators.
  Extended Relational: All dimensions are treated equally. All operations can be driven by the
  meta-information. The equal treatment of dimensions, which is a requirement of Codd’s 12 rules,
  is fulfilled.
  Flat Relations: Use of sorts on the new pivoting requirements.

Table 5.1 Comparison of two OLAP modeling approaches
The extended relational model of Codd has the finest semantic granularity. It is the only relational
model that distinguishes between the properties (attribute values transparent to the user), the
definition of object classes which just contain object-identifiers, and a typology of relationships
between object types (especially the introduction of characteristic object types with the possibility
to define “characteristics of characteristics” and the introduction of the association type for facts).
Another strength of this model is the proper introduction of meta-graph relations reflecting the
roles of participating object types within characteristic object types and association object types.
Furthermore, the extended relational model possesses many inherent operators (e.g. STEP, PATT)
which are extremely useful for OLAP data manipulations.
Summarizing the differences between the two models, we conclude that the extended relational
model has a considerable advantage because of its semantic rigor. Our remaining task is to find an
implementation approach with acceptable performance for this model. For some highly
structured applications this model could become a real alternative to conventional solutions.
Acknowledgements
The authors are very indebted to Andreas Kurz and Josef Schiefer for many creative remarks.
This work is supported by the Austrian Federal Bank, Project No. 6681. For the object-oriented
implementation we thank Informix for their “Innovative Software Grant”.
References
[AB84] S. Abiteboul and N. Bidoit. “Non First Normal Form Relations to Represent Hierarchically Organized Data”, Proceedings of the 3rd ACM SIGMOD Symposium on Principles of Database Systems, Waterloo, Ontario, Canada, 1984.
[AGS97] R. Agrawal, A. Gupta, and S. Sarawagi. “Modelling Multidimensional Databases”, Proceedings of the Int’l Conference on Data Engineering, 1997.
[BSHD98] M. Blaschka, C. Sapia, G. Höfling, and D. Dinter. “An overview of multidimensional data models for OLAP”, Proceedings of Database and Expert Systems Applications, IEEE Press, 1998.
[Bulo96] D. Bulos. “A New Dimension”, in Database Programming & Design, 1996.
[CCS93] E.F. Codd, S.B. Codd, and C.T. Salley. “Providing OLAP (On-line Analytical Processing) to User-Analysts: An IT Mandate”, E.F. Codd & Associates, White paper, 1993.
[Codd79] E.F. Codd. “Extending the Database Relational Model to Capture More Meaning”, ACM Transactions on Database Systems, Vol. 4, December 1979.
[CT97a] L. Cabibbo and R. Torlone. “A systematic Approach to Multidimensional Databases”, Proceedings of SEBD, 1997.
[CT97b] L. Cabibbo and R. Torlone. “Querying Multidimensional Databases”, Proceedings of the 6th DBPL, 1997.
[FK77] A.L. Furtado and L. Kerschberg. “An Algebra of Quotient Relations”, Proceedings of ACM SIGMOD, 1977.
[GL97] M. Gyssens and L.V.S. Lakshmanan. “A Foundation for Multi-Dimensional Databases”, Proceedings of Int’l Conference on Very Large Databases, Athens, Greece, 1997.
[Linn87] V. Linnemann. “Non First Normal Form Relations and Recursive Queries: An SQL-Based Approach”, Proceedings of 3rd International Conference on Data Engineering, Los Angeles, California, USA, 1987.
[LW96] C. Li and X.S. Wang. “A data model for supporting on-line analytical processing”, Proceedings Conference on Information and Knowledge Management, November 1996.
[MT98] O. Mangisengi and A M. Tjoa. “A Multidimensional Modeling Approach for OLAP within the Framework of the Relational Model based on Quotient Relations”, Proceedings of first International Workshop on Data Warehousing and OLAP (DOLAP), Washington, D.C., November 1998.
[OY87] Z.M. Ozsoyoglu and L.Y. Yuan. “A Design Method for Nested Relational Databases”, Proceedings of 3rd International Conference on Data Engineering, Los Angeles, California, USA, 1987.
[RKS88] M.A. Roth, H.F. Korth, and A. Silberschatz. “Extended Algebra and Calculus for Nested Relational Databases”, ACM Transactions on Database Systems, Vol. 13, No. 4, December 1988.
[SPS87] M.H. Scholl, H.B. Paul, and H.J. Schek. “Supporting Flat Relations by a Nested Relational Kernel”, Proceedings of the 13th International Conference on Very Large Data Bases, Brighton, England, 1987.
[SS84] H.J. Schek and M. Scholl. “An Algebra for the Relational Model with Relation-Valued Attributes”, TR DSVI 1984-T1, Technical University of Darmstadt, 1984 (in German).
[TJ98] A. Totok and R. Jaworski. “Modeling of Multidimensional Data Structures with ADAPT”, Technical Report, Technical University of Braunschweig, ISBN 3-930166-92-5, 1998 (in German).