UC3 Merritt

Curation Micro-Services
“It’s a Series of Tubes”

The Unix philosophy
“Make each program do one thing well”
“To do a new job, build afresh rather than complicate old programs by adding new features”
“Expect the output of every program to become the input to another, as yet unknown, program”
“Design and build software … to be tried early”
“Don't hesitate to throw away the clumsy parts and rebuild them”
M. D. McIlroy et al., “UNIX Time-Sharing System: Foreword,” Bell System Technical Journal 57:6, part 2 (1978): 1902

The micro-services “philosophy”
CC: http://www.flickr.com/photos/oskay/265899811/ and http://www.flickr.com/photos/elsie/8229790/

Curation micro-services
Metaphors: Pipeline; Lego bricks; Stewardship is a relay
Assumptions: Safety through redundancy; Meaning through context; Utility through service; Value through use (and reuse)
Principles: Modularity; Granularity; Orthogonality; Emergence; Evolution; Parsimony
Preferences: The small and simple over the large and complex; The minimally sufficient over the feature laden; The configurable over the prescribed; The proven over the (merely) novel
Practices: Focus on outcomes, not means; Complexity through composition, not addition; Policy neutral, platform and protocol independent; Approach sufficiency through incrementally necessary steps; Early prototyping, frequent refactoring; Code to interfaces

Curation micro-services
Mode Focus Value Accretion Value Utility Annotation Visibility Notification Accessibility Access Derivation Transformation Selectivity Search Actionability Index Stewardship Ingest Epistemology Characterization Context Ontology Inventory Reliability Replication Fixity Fixity Preservation State Valence Stability Storage Identity Identity Visibility Interoperation UI / Access control / Message queuing Curation Service User-facing Application Interpretation Provider-facing Protection

Design goals
Principle of least
surprise
Multiple interface modalities
– RESTful HTTP
– Command line
– Procedural (Java, Perl, Ruby, …)
Linked data
Stable URL references
http://example-store/state/default/1234/3/xyz
– Storage service: http://example-store/
– State or content: state
– Storage node: default
– Object: 1234
– Version: 3
– File: xyz
The file system is the database

“You say micro, I say macro…”
Service: Access, EZID, GhOST, Ingest, Inventory, N2T, RUU, Storage
Tool: bagit.pl, CAN, checkm.pl, namaste.pl, Noid
Convention: ANVL, ARK, BagIt, Checkm, Dflat, ERC, LockIt, Namaste, Pairtree, ReDD

Development roadmap
First wave: Identity; Storage
Second wave: Inventory; Ingest / Access
Third wave: Index; Fixity
Fourth wave: Search; Replication
Fifth wave: Notification; Characterization
Sixth wave: Annotation; Transformation
IDm / Authn / Authz
Metadata standards
Object / collection modeling
Semantic interoperability
Policy / business model development

Ingest process flow
Diagram labels: Create identifier; Identity; Identifier; Submitting user agent; Submit; Ingest; Node; Add version; Notification; Version metadata; Get version; Add version metadata; Storage; Notification; Node; Version metadata; Get version metadata; Inventory; Version metadata; Node

Ingest implementation
Submitter: servlet, implicitly multi-threaded
Queue: Zookeeper dæmon
Consumer: dæmon, explicitly multi-threaded
Ingester: servlet, implicitly multi-threaded
Storage
Diagram labels: Submitting user agent; Ingest notification; HTML form; Batch or single object; Submission notification; Job metadata; Job payload

Questions?
silverpipes.jog / firstpresmacomb.org

More information
UC Curation Center (UC3): http://www.cdlib.org/uc3
Micro-service specifications: https://confluence.ucop.edu/display/Curation
Digital curation group: http://groups.google.com/group/digital-curation

UC3
Stephen Abrams, Erik Hetzner, Patricia Cruse, Greg Janée, Scott Fisher, John Kunze, Margaret Low, Mark Reyes, Perry Willett, David Loy, Tracy Seneca, Isaac Rabinovitch, Marisa Strong
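
Sketch: reading a stable storage URL
The “Design goals” slide gives http://example-store/state/default/1234/3/xyz as an example of a stable URL reference. As a rough illustration only, the Python sketch below splits such a URL into the parts the slide labels (storage service, state or content, storage node, object, version, file). The names `StorageRef` and `parse_storage_url` are hypothetical and not part of any Merritt API, and the assumption that the path always has exactly five segments is inferred from the single example on the slide.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class StorageRef:
    """Hypothetical container for the parts of a storage URL (not a Merritt API)."""
    service: str    # base URL of the storage service
    method: str     # "state" or "content", per the slide
    node: str       # storage node
    object_id: str  # object identifier
    version: str    # version identifier
    file: str       # file name within the version


def parse_storage_url(url: str) -> StorageRef:
    """Split a URL shaped like the slide's example into its labelled parts.

    Assumption: the path is exactly method/node/object/version/file.
    """
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) != 5:
        raise ValueError(f"expected 5 path segments, got {len(segments)}")
    method, node, object_id, version, file = segments
    if method not in ("state", "content"):
        raise ValueError(f"unexpected method segment: {method!r}")
    service = f"{parts.scheme}://{parts.netloc}/"
    return StorageRef(service, method, node, object_id, version, file)


if __name__ == "__main__":
    ref = parse_storage_url("http://example-store/state/default/1234/3/xyz")
    print(ref)
```

Because every part of the reference is carried in the URL path, a client under these assumptions needs no lookup table to tell which node, object, version, and file a link points at.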