Scalable Trigger Processing*
Eric N. Hanson et al.
CSci 8701: Overview of Database Research
Paper Presentation, Group 4: Betsy George, Vijay Gandhi
*International Conference on Data Engineering, 1999

Presentation Outline
  - Motivation
  - Problem Definition
  - Related Work
  - Contributions
  - Key Concepts
  - Validation
  - Future Work
  - Rewrite Today
  - Summary

Motivation
  - Traditional uses of triggers
      - Constraint checking
      - Logging
      - Replication
  - Web-based applications
      - Create triggers interactively
      - E.g. stock ticker notification: if the share value of Google falls below 500, notify a person
  - Limitations of current trigger systems
      - Not scalable

Problem Definition
  - Given: a relational DBMS, trigger statements, a data stream (tokens)
  - Find: the triggers corresponding to each token
  - Objective: a scalable trigger processing system
  - Constraints:
      - The number of distinct structures of trigger expressions is small
      - All distinct structures of trigger expressions are small enough to fit in main memory

Related Work
  - ECA model (not scalable)
  - Indexing range predicates, marking-based [Hans96b, Ston90] (large memory, complicated storage)
  - Parallel processing [Gupt89, Hell98]
  - AI [Forg82, Mira87] (smaller rule sets)
  - The work proposed here combines improved versions of some of the modules mentioned above.
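To make the problem definition concrete, here is a minimal sketch of the baseline it argues against: every trigger condition is tested against every token, so the cost per token grows linearly with the number of triggers. The names `Trigger` and `naive_match` and the second trigger are hypothetical and used only for illustration; this is not code from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# A token is one update from the data stream, e.g. a new stock quote.
Token = Dict[str, object]


@dataclass
class Trigger:
    trigger_id: str
    condition: Callable[[Token], bool]  # hypothetical representation of the "when" clause


def naive_match(triggers: List[Trigger], token: Token) -> List[str]:
    """Test every trigger condition against the token (cost grows with len(triggers))."""
    return [t.trigger_id for t in triggers if t.condition(token)]


triggers = [
    # "If the share value of Google falls below 500, notify a person"
    Trigger("T1", lambda s: s["ticker"] == "GOOG" and s["value"] < 500),
    # Second trigger invented for illustration only
    Trigger("T2", lambda s: s["ticker"] == "MSFT" and s["value"] < 30),
]

fired = naive_match(triggers, {"ticker": "GOOG", "value": 495})
print(fired)
```

With a very large number of triggers, this per-token scan is the cost a scalable system has to avoid.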
Contribution
  - Data structures
      - If a large number of triggers are created, many of them have almost the same format
      - Predicate index structure (most important contribution)
  - Concurrent processing
      - Identified 4 levels of concurrency
      - Implemented token-level concurrency

Key Concepts – TriggerMan Architecture

Key Concepts – Trigger Structure
  Example: stock ticker notification
    Create trigger T1 from stock when stock.ticker = 'GOOG' and stock.value < 500 do notify_person(P1)
    Create trigger T2 from stock when stock.ticker = 'MSFT' and stock.value < 30 do notify_person(P2)
    Create trigger T3 from stock when stock.ticker = 'ORCL' and stock.value < 20 do notify_person(P3)
    Create trigger T4 from stock when stock.ticker = 'GOOG' do notify_person(P4)

Key Concepts – Expression Signature
  Common structures in the conditions of triggers
    T1: stock.ticker = 'GOOG' and stock.value < 500
    T2: stock.ticker = 'MSFT' and stock.value < 30
    T3: stock.ticker = 'ORCL' and stock.value < 20
    Expression Signature E1: stock.ticker = const1 and stock.value < const2

    T4: stock.ticker = 'GOOG'
    Expression Signature E2: stock.ticker = const3

Key Concepts – A-TREAT Network
  For each trigger condition, e.g. stock.ticker = const1 and stock.value < const2
    Root
    Node 1 (alpha-node), predicate: stock.ticker = const1
    Node 2 (alpha-node), predicate: stock.value < const2

Key Concepts – Expression Signature
  Expression Signature Table

    Ex. ID  Data Source  Signature Description  Constant Table  Number of Constants  Constant Organization
    E1      stock        …                      const_e1        2                    Main memory
    E2      stock        …                      const_e2        1                    Main memory

    E1: stock.ticker = const1 and stock.value < const2
    E2: stock.ticker = const3

Key Concepts – Constant Table
  Tables holding the constants that occur in the conditions of triggers

  const_e1
    Ex. ID  Trigger ID  Constant 1  Constant 2  Next Node
    E1      T1          GOOG        500         Node 2
    E1      T2          MSFT        30          Node 2
    E1      T3          ORCL        20          Node 2
  Rest

  const_e2
    Ex. ID  Trigger ID  Constant 1  Next Node
    E2      T4          GOOG        Null

    T1: stock.ticker = 'GOOG' and stock.value < 500
    T2: stock.ticker = 'MSFT' and stock.value < 30
    T3: stock.ticker = 'ORCL' and stock.value < 20
  Rest
    T4: stock.ticker = 'GOOG'

Key Concepts – Summary
  - Expression signature: common structure in a trigger
  - A-TREAT network: network for trigger condition testing
      - E1: stock.ticker = const1 and stock.value < const2
      - For a trigger to fire, all conditions must be true
  - Constant tables: constants for each expression signature

Key Concepts – Predicate Index

Key Concepts – Processing
  Update: Stock(ticker=GOOG, value=495)
  Figure labels: Root; Index of stock.ticker = const1; Other source predicate index …; E1, E2; const_e1, const_e2
    E1: stock.ticker = const1 and stock.value < const2

  const_e1
    Trigger ID  Constant 1  Constant 2  Next Node
    T1          GOOG        500         Node 2
    T2          MSFT        30          Node 2
    T3          ORCL        20          Node 2

Concurrency
  - Better scalability, even on a single processor
  - Identified elements that can be parallelized
      - Token-level: multiple tokens processed in parallel
      - Condition-level: multiple selection conditions tested concurrently
      - Rule-action-level: multiple rule actions fired at the same time
      - Data-level: set of data values in the network processed in parallel
  - Implemented token-level concurrency

Validation
  - Important fact: if a large number of triggers are created, many of them have almost the same format
  - Implemented as an Informix DataBlade
  - No experimental comparisons

Assumptions
  - If a large number of triggers are created, many of them have almost the same format
  - All distinct predicate structures fit into main memory

Rewrite Today
  - Validation
      - Provide experimental comparisons
      - Test on real datasets
  - Examples: execution trace
  - Remove sections on TriggerMan Command Language and Architecture
  - Describe the A-TREAT network

Summary
  - If a large number of triggers are created, many of them have almost the same format
  - The number of distinct signatures is small enough to fit into main memory
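The constant tables and the index lookup from the Key Concepts slides can be sketched roughly as follows. This is an illustration under stated assumptions, not TriggerMan's implementation: the in-memory lists, the hash index keyed on the equality constant, and the names `build_index` and `match` are all hypothetical. The last few lines imitate token-level concurrency with a thread pool, which is also an assumption about how one might process several tokens at once.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Constant table for E1: stock.ticker = const1 and stock.value < const2
# Rows are (trigger_id, const1, const2), taken from the slides.
CONST_E1 = [("T1", "GOOG", 500), ("T2", "MSFT", 30), ("T3", "ORCL", 20)]
# Constant table for E2: stock.ticker = const3
CONST_E2 = [("T4", "GOOG")]


def build_index():
    """Hash each signature's equality constant to the rows that use it (assumed layout)."""
    e1 = defaultdict(list)
    for trigger_id, ticker, limit in CONST_E1:
        e1[ticker].append((trigger_id, limit))
    e2 = defaultdict(list)
    for trigger_id, ticker in CONST_E2:
        e2[ticker].append(trigger_id)
    return e1, e2


def match(token, e1, e2):
    """Return the IDs of triggers whose whole condition holds for this token."""
    fired = []
    # E1: the index finds rows with a matching ticker; then test stock.value < const2.
    for trigger_id, limit in e1.get(token["ticker"], []):
        if token["value"] < limit:
            fired.append(trigger_id)
    # E2: the equality test is the entire condition, so every indexed row fires.
    fired.extend(e2.get(token["ticker"], []))
    return fired


e1_index, e2_index = build_index()
result = match({"ticker": "GOOG", "value": 495}, e1_index, e2_index)
print(result)

# Token-level concurrency, imitated with threads: each token is matched independently.
tokens = [
    {"ticker": "GOOG", "value": 495},
    {"ticker": "MSFT", "value": 25},
    {"ticker": "ORCL", "value": 20},
]
with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(lambda t: match(t, e1_index, e2_index), tokens))
print(results)
```

In this sketch the work per token depends on how many rows share the token's ticker rather than on the total number of triggers, which is the intuition behind grouping triggers by expression signature.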