APACHE SLING & FRIENDS TECH MEETUP
BERLIN, 22-24 SEPTEMBER 2014
adaptTo() 2014

Oak, the Architecture of the new Repository
Michael Dürig, Adobe Research

Design goals
  Scalable
    Big repositories
    Clustering
  Customisable, flexible
  OSGi friendly

Outline
  CRUD
  Changes
  Search

Tree model
  [Figure: tree with nodes a, b, c, d]

Updating
  [Figure: tree with nodes a, b, c, d, a new node x, and a "?" label]

MVCC
  [Figure: two revisions of the tree with a HEAD pointer; node labels: r1: /, r2: /, a, r1: /d, r2: /d, b, r1: /a/b, r2: /a/b, c, r2: /d/x]

Refresh and Garbage Collection

Refresh
  [Figure: revision graph with a part labelled "garbage"]

Garbage collection
  [Figure: revision graph with a part labelled "garbage"]

Concurrency and Conflicts

Concurrent updates
  [Figure: revisions r2a and r2b, both based on r1]

Merging
  [Figure: r2a and r2b, both based on r1, combined into r3; label: merge updates]

Conflict handling: serialisation
  Fully serialised
    Fail, no concurrent update
  Partially serialised
    Concurrent conflict free updates

Conflict handling strategies: merging
  Partial merge
    Conflict markers, deferred resolution
  Full merge
    Need to choose victim

Replicas and Sharding

Replica and caches
  [Figure: labels: master copy, full replica, cache]

Sharding strategies
  [Figure: labels: by path, by level, by hash with caching]

Implementations

MicroKernel / NodeStore
  Tree / Revision model implementation

  Responsible for      Not responsible for
  Clustering           Validation
  Sharding             Access control
  Caching              Search
  Conflict handling    Versioning

Current implementations

                      DocumentMK                      TarMK (SegmentMK)
  Persistence         MongoDB, JDBC                   Local FS
  Conflict handling   Partial serialisation           Full serialisation
  Clustering          MongoDB clustering              Simple failover
  Sharding            MongoDB sharding                N/A
  Node Performance    Moderate                        High
  Key use cases       Large deployments (>1TB),       Small/medium deployments,
                      concurrent deployments,         mostly read
                      mostly read

Access Control

Accessible paths
  [Figure: tree with nodes a, b, c, d]

Existentialism
  All paths traversable
  Node may not exist
  Decorator on NodeStore
    root.getChildNode("a").exists();                     ⟹ false
    root.getChildNode("a").getChildNode("b").exists();   ⟹ true

Comparing Revisions

Content diff
  What changed between trees
  Cornerstone for
    Validation
    Indexing
    Observation
    …

What changed?
  [Figure: two trees and their difference ∆]

Example: merging
  [Figure: revisions r1, r2a, r2b and the merged revision r3]
  r1 ➞ r2a ∆: “a” modified, “b” removed
  r1 ➞ r2b ∆: “d” modified, “x” added

Commit Hooks

Commit hooks
  Key plugin mechanism
  Higher level functionality
    Validation (node type, access control, …)
    Trigger (auto create, defaults, …)
    Updates (index, …)

Editing a commit
  [Figure: labels: ∆, ∆+x]

Commit hooks
  Based on content diff
    pass a commit
    fail a commit
    edit a commit
  Applied in sequence

Type of hooks

                       CommitHook   Editor      Validator
  Content diff         Optional     Always      Always
  Can modify           Yes          Yes         No
  Programming model    Simple       Callbacks   Callbacks
  Performance impact   High         Medium      Low

Observers

Observers
  Observe changes
  After commit
  Often does a content diff
  Asynchronous
  Optionally synchronous
  Local cluster node only

Examples
  JCR observation
  External index update
  Cache invalidation
  Logging

Search

Query Engine
  [Figure: query engine pipeline; stage labels: parse, execute, post process; example queries: SELECT WHERE x=y and /a//*; components: Parser, Index, Traverse]

Index Implementations
  Property (ordered)
  Reference
  Lucene
    In-content or file system
  Solr
    Embedded or external

Big Picture

Big picture
  [Figure: layer diagram; labels: JCR API, Oak JCR, Oak API, Plugins, Oak Core, NodeStore API, MicroKernel]

Resources
  http://jackrabbit.apache.org/oak/

Appendix

Resources
  http://jackrabbit.apache.org/oak/
  http://jackrabbit.apache.org/oak/docs/
  https://svn.apache.org/repos/asf/jackrabbit/oak/trunk/
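
Sketch: revisions, content diff and a validating hook

The MVCC, content diff and commit hook slides can be illustrated with a small toy model. This is a minimal sketch and not the Oak API: the names ToyNode, ToyDiff and ToyHooks, the string form of the changes, and the "reject removals" rule are all hypothetical and chosen only for illustration. It assumes that each revision is an immutable tree sharing unchanged subtrees with its predecessor, so a diff can skip any subtree that is the same object in both revisions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical toy model (not the Oak API): an immutable node with named children.
final class ToyNode {
    final Map<String, ToyNode> children;

    private ToyNode(Map<String, ToyNode> children) {
        this.children = Collections.unmodifiableMap(new TreeMap<>(children));
    }

    static ToyNode leaf() {
        return new ToyNode(Collections.emptyMap());
    }

    // Returns a NEW node; this node and all untouched children are shared, not copied.
    ToyNode with(String name, ToyNode child) {
        Map<String, ToyNode> copy = new TreeMap<>(children);
        copy.put(name, child);
        return new ToyNode(copy);
    }

    ToyNode without(String name) {
        Map<String, ToyNode> copy = new TreeMap<>(children);
        copy.remove(name);
        return new ToyNode(copy);
    }
}

final class ToyDiff {
    // Lists the changes from 'before' to 'after' as plain strings.
    // Assumption: "modified" is detected by object identity, which relies on
    // structural sharing between revisions.
    static List<String> diff(String path, ToyNode before, ToyNode after) {
        List<String> changes = new ArrayList<>();
        if (before == after) {
            return changes; // shared subtree, nothing to compare
        }
        for (Map.Entry<String, ToyNode> e : after.children.entrySet()) {
            String childPath = path + "/" + e.getKey();
            ToyNode old = before.children.get(e.getKey());
            if (old == null) {
                changes.add("added " + childPath);
            } else if (old != e.getValue()) {
                changes.add("modified " + childPath);
                changes.addAll(diff(childPath, old, e.getValue()));
            }
        }
        for (String name : before.children.keySet()) {
            if (!after.children.containsKey(name)) {
                changes.add("removed " + path + "/" + name);
            }
        }
        return changes;
    }
}

final class ToyHooks {
    // A validator-style hook: inspects the diff and passes (true) or fails (false)
    // the commit; it never modifies it. The rule itself is made up for this sketch.
    static boolean rejectRemovals(List<String> changes) {
        for (String change : changes) {
            if (change.startsWith("removed ")) {
                return false;
            }
        }
        return true;
    }
}

final class ToyDemo {
    public static void main(String[] args) {
        ToyNode a = ToyNode.leaf().with("b", ToyNode.leaf());
        ToyNode d = ToyNode.leaf().with("c", ToyNode.leaf());
        ToyNode r1 = ToyNode.leaf().with("a", a).with("d", d);

        ToyNode r2a = r1.with("a", a.without("b"));              // "b" removed under "a"
        ToyNode r2b = r1.with("d", d.with("x", ToyNode.leaf())); // "x" added under "d"

        System.out.println(ToyDiff.diff("", r1, r2a)); // [modified /a, removed /a/b]
        System.out.println(ToyDiff.diff("", r1, r2b)); // [modified /d, added /d/x]
        System.out.println(ToyHooks.rejectRemovals(ToyDiff.diff("", r1, r2a))); // false
        System.out.println(ToyHooks.rejectRemovals(ToyDiff.diff("", r1, r2b))); // true
    }
}
```

The two diffs mirror the merging example from the slides: r2a modifies "a" and removes "b", while r2b modifies "d" and adds "x". In r2b the subtree under "a" is the same object as in r1, which is why the diff never descends into it.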