A Data Model for Supporting On-Line Analytical Processing
DataBase Lab.
M.S. program, 1st semester, 홍은주

Abstract
  Formalize the MDD model for OLAP
  Develop an algebraic query language → grouping algebra
  An MDDB consists of
    a finite set of multidimensional cubes
    a finite set of relations

Contents
  Motivation
  Cubes and Grouping Relations
  MDDB and MDDB queries
  Multidimensional cube algebra
  Related Research
  Conclusion

Motivation
  Query 1: For each store, find the total sales amount for every day of this year
  Query 2: For each region (east, west, ...), find the top 5 in last year's sales ranking

Query 1 (two grouping methods)
  By attribute (i.e., store)
  By position in time (i.e., day)
    → computed as the sales amount at store 's' up to day 'd' of this year

Query 2 (two grouping methods)
  By attribute (product)
    compute last year's sales for each product
  From the computed sales, extract the top 5 in sales for each region

  Queries like these cannot be answered in a single query in a traditional relational query language
    → motivation for this work

Cubes and Grouping Relations
  R: attribute names
  D: dimension names
  Each attribute A takes its values from dom(A)

Definition 1
  V: the set of scalar values, including the null value
  An n-dimensional cube scheme is a set {(D_1,R_1),…,(D_n,R_n)}
  A cube is a pair of F and a value mapping, where
    F = {(D_1,r_1),…,(D_n,r_n)}, and each r_i is a relation on R_i
    the value mapping maps {{(D_1,t_1),…,(D_n,t_n)} | ∀ 1 ≤ i ≤ n : t_i ∈ r_i} to V

Analyze - Definition 1
  r_i is the dimension relation of D_i
  A multidimensional cube is a set of dimension relations together with a value mapping
  ※ The value mapping maps each combination of tuples, one from each dimension, to a scalar value
  Assume the dimension names of a cube are ordered:
    cube scheme list: <(D_1,R_1),…,(D_n,R_n)>
    a cube is defined on such a scheme

Example
  3-dimensional cube SALES = (r_s, r_p, r_d, amount)
  on the scheme <(Store, R_s), (Prod, R_p), (Date, R_d)>
    R_s = {loc} (location)
    R_p = {p, m} (item and manufacturer)
    R_d = {y, m, d} (year, month, day)
  amount(t_s, t_p, t_d): sales amount of product t_p reported by store t_s on day t_d

Definition 2
  S = {(D_1,R_1),…,(D_n,R_n)}: cube scheme
  G: grouping relation scheme
  G is said to be applicable to S if, for each non-dummy dimension attribute D.A ∈ G, A ∈ R_i and D_i = D for some i

Analyze - Definition 2
  Each non-dummy dimension attribute D.A must appear in the cube scheme
    D: a dimension name in the cube scheme
    A: an attribute in the relation scheme for dimension D

Definition 3
  G: grouping scheme
  g: a grouping relation on G
  X: a subset of G
  S = <(D_1,R_1),…,(D_n,R_n)>: cube scheme
  C: a cube on S with dimension relations r_1,…,r_n
  Each tuple t in the projection of g onto X gives the following set of coordinates, denoted X_{C,g}(t):
    {(t_1,…,t_n) | t_i ∈ r_i for each 1 ≤ i ≤ n, and there exists a tuple t' of g agreeing with t on X such that t'[R ∩ R_i] = t_i[R ∩ R_i] for each 1 ≤ i ≤ n}

Example - Definition 3

MDDB and MDDB queries
  MDDB
    a finite set of multidimensional cubes
    a finite set of grouping relations
  4 MDDB queries that illustrate the grouping algebra
  MDDB on the scheme (D, C, G)
    D = {Date, Prod, Store}, C = {Sales}, G = {Region}
    Region is on the scheme {reg, Store.loc}

  Q.1 Find the names of last year's (1994) top 5 selling products (across all manufacturers)
    uses the ordering operator O(g) with parameters begin, step, length
  Q.2 For each member store, find the year-to-date total sales amounts for each day this year (the daily cumulative sales amounts over 1995)
  Q.3 Find the year-to-date total sales amounts, in each region, of each product whose nation-wide total sales amount last year was ranked among the top five
  Q.4 For all products in the set of products manufactured by m1 and m2, find the total sales amounts of this year (1995)

Multidimensional cube algebra
  Cube algebra
    consists of 6 operations that map cubes to cubes
  Purpose
    construct data from local databases into suitable multidimensional cubes
  Operations
    1. add dimension
    2. transfer
    3. union
    4. cube aggregation
    5. rc-join
    6. construct

1. Operation - Add dimension
  Written with the new dimension name D applied to the input cube C
  Generates a cube C' from the input cube C
  C' has a new dimension named D
    the relation scheme of the new dimension is the empty set
    the relation for the dimension has only one tuple: the empty tuple []
  Coordinate (t_1,…,t_n,[]) in C' ↔ coordinate (t_1,…,t_n) in C
  Purpose: give the input cube the same dimensions as another cube so that the two cubes can be unioned

2. Operation - Transfer (1)
  Written with source D_1, A and target D_2, B applied to C
  Transfers attribute A of D_1 to a new attribute B of D_2
  ※ D: dimension; A, B: attributes
  Attribute A of the first dimension is projected out into the second dimension (by Cartesian product)

2. Operation - Transfer (2)
  EX) C' is obtained by adding a dimension Year to C and then transferring attribute y of Date to attribute y of Year
  Example of transfer operations (figure): transfer of D_1, A_1 to D'_1, A_1

3. Operation - Union
  Union cubes C_1 and C_2 along the dimension D_1
  The coordinates of the two cubes are unioned
  Each coordinate gets its original value
  Null values are used for the new coordinates

4. Operation - Aggregate
  $f^{R'_{i_1},\dots,R'_{i_m}}_{D_{i_1},\dots,D_{i_m}}(C)$
  The above cube aggregation gives a cube C'' on the scheme <(D_{i_1},R'_{i_1}),…,(D_{i_m},R'_{i_m})>
  Cube aggregation example (figure): sum over D, A_1

5. Operation - rc-join
  Join r (relation) into D_1 of C (cube)
  Result: a new cube with D_1

6. Operation - Construct
  Applied to r (relation), with parameters A and D
  Creates a cube from r (relation)

Example
  Suppose each store reports a cube C_i = (r_d, r_p, amount_i) to the headquarters
  These cubes, together with relations such as the location relation, are used to build the cube in the headquarters data warehouse

Related Research
  In DB research, systems similar to OLAP systems have been studied in the domain of statistical and scientific databases (SSDB)
  Optimization techniques
    pre-aggregation
  New grouping operation CUBE for the SQL group-by clause

Conclusion
  The query language is flexible in expressing many intuitive OLAP queries, including order-related queries.

Issued in Nov. 1996 by Chang Li and X. Sean Wang, George Mason University
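
Supplementary sketch
  The cube of Definition 1, the add-dimension and aggregation operations, and the year-to-date computation of Query 1 can be illustrated with a small Python sketch. This is not the authors' formalism or implementation. The class Cube, the functions add_dimension, aggregate_sum and year_to_date, and all sample sales figures are hypothetical names and data introduced here for illustration only. It assumes that (y, m, d) tuples compare chronologically and that a missing key in the value mapping stands for the null value.

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class Cube:
    dims: list   # dimension names, in scheme order
    rels: list   # rels[i] is the dimension relation r_i: a list of tuples
    value: dict  # value mapping: coordinate -> scalar; a missing key stands for null

    def coordinates(self):
        # Every combination of one tuple per dimension relation.
        return list(product(*self.rels))

    def get(self, coord):
        return self.value.get(coord)  # None plays the role of the null value


def add_dimension(cube, name):
    """Add a dimension whose relation holds only the empty tuple ()."""
    return Cube(cube.dims + [name], cube.rels + [[()]],
                {coord + ((),): v for coord, v in cube.value.items()})


def aggregate_sum(cube, dim, keep):
    """Sum values along `dim`, keeping only the attribute positions in `keep`."""
    i = cube.dims.index(dim)

    def project(t):
        return tuple(t[k] for k in keep)

    new_rel = sorted({project(t) for t in cube.rels[i]})
    new_value = {}
    for coord, v in cube.value.items():
        if v is None:
            continue
        key = coord[:i] + (project(coord[i]),) + coord[i + 1:]
        new_value[key] = new_value.get(key, 0) + v
    return Cube(cube.dims, cube.rels[:i] + [new_rel] + cube.rels[i + 1:], new_value)


def year_to_date(cube, year):
    """Query 1 sketch: per store and day of `year`, total sales up to that day."""
    s, d = cube.dims.index("Store"), cube.dims.index("Date")
    days = sorted(t for t in cube.rels[d] if t[0] == year)
    out = {}
    for store in cube.rels[s]:
        running = 0
        for day in days:
            running += sum(v for c, v in cube.value.items()
                           if c[s] == store and c[d] == day and v is not None)
            out[(store, day)] = running
    return out


# Hypothetical SALES cube on <(Store, {loc}), (Prod, {p, m}), (Date, {y, m, d})>.
sales = Cube(
    ["Store", "Prod", "Date"],
    [[("east",), ("west",)],
     [("p1", "m1"), ("p2", "m2")],
     [(1995, 1, 1), (1995, 1, 2)]],
    {(("east",), ("p1", "m1"), (1995, 1, 1)): 10,
     (("east",), ("p1", "m1"), (1995, 1, 2)): 5,
     (("west",), ("p2", "m2"), (1995, 1, 1)): 7},
)

with_year = add_dimension(sales, "Year")         # same values, one extra () coordinate
by_year = aggregate_sum(sales, "Date", keep=[0])  # keep only the year attribute y
ytd = year_to_date(sales, 1995)                   # cumulative daily totals per store
```

  In this sketch, grouping by attribute corresponds to aggregate_sum (tuples that agree on the kept attributes are merged), while grouping by position corresponds to the running total in year_to_date, which depends on the order of the days rather than on their attribute values.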