Design Principles for Digital Preservation Systems

Stephen Abrams
University of California Curation Center
California Digital Library
www.cdlib.org/uc3

Washington May 22-24, 2013

What’s the problem we’re trying to solve?
  Connecting people with digital content in meaningful ways across barriers of space and time
  Preservation is the end, systems are just the means
  Image: Robbi Verte, Gesture of hand holding a flash disk, www.123rf.com/
  Image: Pietro Izzo, Open hand, www.flickr.com/photos/pietroizzo/482812880

Think outside the (system) box « Integration »
  Worry about designing your overall preservation program before considering the systems that will implement parts of it
  A preservation program should provide effective control over managed content in several key areas
    Technical control
      Managing bits, descriptions of bits, relationships between bits, etc.
      Analysis, planning, monitoring, intervention, etc.
    Intellectual (or curatorial) control
      Creation, selection/acquisition, arrangement, cataloging, etc.
  Image: Luxmart, Working together team puzzle concept, http://commons.wikimedia.org/wiki/File:Working_Together_Teamwork_Puzzle_Concept.jpg
  Cf.
  Kenney & McGovern (2003), “Five organizational stages of digital preservation,” Digital Libraries: A Vision for the 21st Century (Ann Arbor: MPublishing), hdl:2027/spo.bbv9812.0001.001

Good design is good design
  Any principles for preservation system design should be informed by general principles for system design and engineering design
    Beginning with a clearly defined need
    Leading to a creative design in response to the need
    Resulting in a system fully meeting the need
  Cf. Royal Academy of Engineering (1999), Principles of Engineering Design, www.raeng.org.uk/education/vps/principles/pdf/armstrong_keynote.pdf
  Image: Jason DeRusha, Crying baby shot, www.flickr.com/photos/derusha/1465953800
  Image: Izumi Mitatami, The hamburger, www.flickr.com/photos/marvin_izumi/3881467402
  Image: Sean Dreilinger, Big hamburger, little kid, www.flickr.com/photos/seandreilinger/3002176844

Design principles for preservation systems
  Integration
  Least surprise
  Definition
  Elegance
  Generality
  Community
  Parsimony
  Modularity
  Granularity
  Orthogonality
  Emergence
  Redundancy
  Evolution
  Transience
  Yes, some of these may sound, or even be, somewhat inconsistent or contradictory

Know what you (really) need « Definition »
  What you need, not just what you want
  Tie requirements to specific use cases
    Actual, expected, anticipated, or probable
    Hypothetical?
  Acceptance criteria
    Did you get what you asked for?
  Image: Guardian, Five reasons why waiters don’t write down your order, www.guardian.co.uk/lifeandstyle/2011/aug/14/waiters-dont-write-orders-down

Know what you don’t need « Parsimony »
  Necessity vs. sufficiency vs.
  superfluity
  “A scientific theory should be as simple as possible, but no simpler” – Einstein
  “It is futile to do with more things that which can be done with fewer” – William of Ockham
  “Not too big, not too small, just right” – Goldilocks
  Image: Wikimedia Commons, Goldilocks 1912, commons.wikimedia.org/wiki/File:Goldilocks_1912.jpg

Inclusive applicability « Generality »
  Solve a general problem today, to avoid having to solve a specific problem tomorrow
  Support for re-configuration or self-configuration
  Facilitate (re)use, potentially in novel ways
  Cf. Yourdon and Constantine (1979), Structured Design: Fundamentals of a Discipline of Computer Program and Systems Design (Prentice-Hall), www.win.tue.nl/~wstomv/quotes/structured-design.html#19
  Image: Brian Snelson, Lovely new metric spanners and torque wrench, www.flickr.com/photos/exfordy/353771860
  Image: LoggerHead Tools, Bionic wrench, loggerheadtools.com

The Unix philosophy « Modularity / Granularity / Orthogonality / Emergence »
  Make each program do one thing well
  To do a new job, build afresh rather than complicate old programs by adding new features
  Expect the output of every program to become the input to another, as yet unknown, program
  Don't hesitate to throw away the clumsy parts and rebuild them
  Cf. McIlroy et al. (1978), “Unix time-sharing system: Foreword,” Bell System Technical Journal 57(6): 1899–1904, www3.alcatel-lucent.com/bstj/vol57-1978/articles/bstj57-6-1899.pdf
  Image: Windell Oskay, Inside-out Lego brick, www.flickr.com/photos/oskay/265899811

Just in case, just in case « Redundancy »
  Plan for failure
  Replication to avoid single points of failure
  Decorrelation to avoid cascade failure
  Cf.
  Rosenthal (2010), “LOCKSS: lots of copies keeps stuff safe,” US Workshop on Roadmap for Digital Preservation Interoperability Framework, NIST, Gaithersburg, MD, lockss.org/locksswiki/files/NIST2010.pdf
  Image: NCinDC, Life is one big balancing act, www.flickr.com/photos/ncindc/3229050640
  Image: Energy Press, PWC: Το δάνειο στο ΛΑΓΗΕ δεν αρκεί για να μην καταρρεύσει η αγορά [PwC: The loan to LAGIE is not enough to keep the market from collapsing], www.energypress.gr/news/lianikh-reymatos/PWC:-To-daneio-sto-LAGHE-den-arkei-gia-na-mhn-katarreysei-h-agora

First make it work, then make it work better « Evolution »
  Configuration
  Customization
  Iterative enhancement
  Cf. May and Zimmer (1996), “Evolutionary development model for software,” HP Journal (August): 39-45, www.hpl.hp.com/hpjournal/96aug/aug96a4.pdf
  Image: Wikimedia Commons, commons.wikimedia.org/wiki/File:Human_evolution_scheme.svg

Easy come, easy go « Transience »
  Preservation systems are inherently ephemeral and expendable; the content managed in them is not
  Avoid system lock-in
    Standardized content representation
    Standardized APIs
    Smooth migration paths
  Your (aging) system’s DIP should be a replacement system’s SIP
  Preferably, change at a time and place of your choosing
  Cf. Janée (2009), “Relay-supporting archives: Requirements and progress,” International Journal of Digital Curation 4(1), www.ijdc.net/index.php/ijdc/article/view/102

Keep the customer satisfied « Least surprise »
  Default system behaviors should conform to implicit user expectations
    Know the communities you are seeking to serve
  Consistency
    Treat like things alike
  Cf.
  Raymond (2003), “Applying the rule of least surprise,” Art of Unix Programming (Addison-Wesley), http://www.faqs.org/docs/artu/ch11s01.html
  Image: Narufag, Naruto in the scream, narufag.deviantart.com/art/naruto-in-the-scream-267479366

Commodity, firmness, and delight « Elegance »
  “Well building hath three conditions: firmness, commodity, and delight” – Vitruvius, De architectura [trans. Wotton, 1624]
  The analogous conditions for computer, rather than structural, architecture are…
    Utility
    Resilience
    Elegance
  Cf. Madni (2012), “Elegant systems design: Creative fusion of simplicity and power,” Systems Engineering 15(3): 347-54, doi:10.1002/sys.21209
  Image: Dominic 2007, Pantheon Dome, www.flickr.com/photos/9556741@N03/3157684854

Beg, borrow, or steal « Community »
  Learn from the solutions and experience of the community
  Support the community and contribute back
  Cf. Anderson (2011), “National Digital Stewardship Alliance: Community, content, commitment,” CENDI Principals and Alternatives, Washington, DC, www.cendi.gov/presentations/03_06_11_Anderson_Martha_NDSA.pdf
  Image: Duncan Hall, Attention aux PickPockets, www.flickr.com/photos/dullhunk/4575707721
  Image: Enrique Martinez Bermejo, Community-manager, www.flickr.com/photos/kikemb/5428414543

Principles in action: micro-services
  Decomposition of infrastructure function into a granular set of independent, but highly interoperable services
  Cf.
  Abrams, Cruse, Kunze, and Minor (2011), “Curation micro-services: A pipeline metaphor for repositories,” Journal of Digital Information 12(2), journals.tdl.org/jodi/article/view/1605

  Value           Service
  Accretion       Annotation
  Visibility      Notification
  Accessibility   Access
  Derivation      Transformation
  Selectivity     Search
  Actionability   Index
  Stewardship     Ingest
  Epistemology    Characterization
  Ontology        Inventory
  Reliability     Replication
  Fixity          Fixity
  Stability       Storage
  Identity        Identity

  [Other labels on the table, in extraction order; their Mode and Focus row assignments are not recoverable: Valence, Value, Utility, Interoperation, Context, Preservation, State, Curation, Visibility, User-facing, Application, Interpretation, Provider-facing, Protection]
  UI / Access control / Message queuing

  “Curation Architecture Prototype Services (CAPS), is built on the micro-services approach to digital curation” http://www.libraries.psu.edu/
  “The micro-services approach … seemed similar to the SDR 2.0 principle of making services more modular” http://library.stanford.edu/
  “[SDB] provides a viable solution to the challenges of long term digital preservation by delivering a flexible, extensible set of micro-services” http://www.tessella.com/
  “Archivematica implements a micro-service approach to digital preservation” http://www.archivematica.org/
  “The University of North Texas (UNT) has implemented a robust architecture for digital library initiatives utilizing the Curation Micro Services methodology for building repository infrastructure” http://www.library.unt.edu/

Principles in action: Merritt repository
  [Architecture diagram: user agent; load balancers; UI/API instances, each with LDAP and RDBMS; DataONE coord’ing node and DataONE member node; Ingest; Inventory; Fixity; Storage broker with storage nodes on DAS, SAN, and Cloud; Message queue; No-SQL; RDBMS; IDF; DataCite; EZID]
  http://www.cdlib.org/uc3/merritt
  http://merritt.cdlib.org/

Design principles for preservation systems
  Integration
  Least surprise
  Definition
  Elegance
  Generality
  Community
  Parsimony
  Modularity
  Granularity
  Orthogonality
  Emergence
  Redundancy
  Evolution
  Transience
  Note that these are principles, not rules
  Their applicability will depend on local needs, conditions, expertise, resources, etc.
  Rely on your intuition and experience

UC Curation Center
www.cdlib.org/uc3
uc3@ucop.edu
www.slideshare.net/UC3/pasig-2013abramsdesignprinciplesforpreservationsystems

Stephen Abrams, Patricia Cruse, Shirin Faenza, Scott Fisher, Erik Hetzner, Joshua Hubbard, Greg Janée, John Kunze, Rosalie Lack, David Loy, Mark Reyes, Joan Starr, Carly Strasser, Marisa Strong, Adrian Turner, Bhavitavya Vedula, Kenneth Weiss, Perry Willet