Digital Documents, Formats, Standards and Standardization: Curation as a Constructive Social Process
  Stephen Hockema
  Faculty of Information
  KMDI, IDI / ATRC
  June 16, 2010

Overview
  What am I doing here?

Digitality!
  Digital Curation
  Digital Preservation
  Digital Repositories
  Digital Archives
  Digital Humanities
  Digital Mediation
  Digital Texts
  Digital Books
  Digital Art
  Digital …

Curatorship
  Content Specialization, Organization and Interpretation
  Exhibition, Display and Publication

Digital Curatorship
  Content Specialization, Organization and Interpretation
  Exhibition, Display and Publication
  "Digital curation is the selection, preservation, maintenance, collection and archiving of digital assets." -- Wikipedia
  Includes acquisition, documentation and care
  “Guardianship” (Curators as Caretakers)

Tie-ins
  Who is the digital curator?
  When is digital curation?
  Where does it happen?
  What is being curated? (digital “objects”?)
  Why?
  Curation is about authority, access and participation too.

Digital Documents
  As objects / processes / contexts / a form of sociality and work
  Formats (via standardization in conjunction with tools) “determine” what documents are.
  …and Metadata, and Record, …
  …and associated Processes, Work Flows.

Some motivations

Standards vs. Formats vs. Applications
  Applications                              (Default) File Format   Standard
  Microsoft Word                            *.docx, *.doc           OOXML (ISO)
  Microsoft Excel                           *.xlsx, *.xls           OOXML (ISO)
  Microsoft PowerPoint                      *.pptx, *.ppt           OOXML (ISO)
  Adobe Acrobat                             *.pdf                   PDF (ISO)
  OpenOffice                                *.odt                   ODF (ISO)
  DreamWeaver, IE, Firefox, Safari, etc.    *.html, *.css           (X)HTML + … (W3C)
  TextEdit, etc.                            *.rtf
  Edit, vim, emacs, etc.                    *.txt
  <oXygen/>, …                              *.xml                   XML (W3C)

iAnnotate PDF

iAnnotate
  Designing tools, workflows and processes simultaneously
  Constrained by formats and standards
  And Infrastructure(s). (and Apple!)

Some Issues
  Conformance, Flexibility, What is “legal” (supported, optimized for, …)
  Genre
  What is metadata (vs. document)
  “Mark Up”?
  Document Identity (synchronization)
  Versioning (“discreteness”)
  Stability

Some Issues (continued)
  “Addressing” and Attachment Points
  (Digital) Rights Management
  Affordances of Documents
  Representation and Storage
  Digital / Material Interfaces
  “Where the Document Ends!”
  Not just about the Web!
  And yes, Archiving and Curating

“My” Archiving Issues
  Archiving to do what later?
  Authenticity and Context
  Rethinking storage vs. “in use” distinction
  Notion of Stability
  Notion of Medium vs. Format
  “Paperless!”

Document Formats
  PDF Standard (ISO/IEC 32000-1:2008)
    Current version (of six): 1310-page PDF (~50 fonts!)
    Roots in PostScript
  OOXML (ISO/IEC 29500:2008)
    Over 6000 pages
    Roots in .doc, .ppt, .xls
  ODF (ISO/IEC 26300:2006)
    Roots in OpenOffice.org and Sun
  {X}HTML

Document Formats (Lexel)

Vancouver: A Digitally “Open” City
  (In May of 2009, after a bunch of WHEREAS clauses…)
  THEREFORE BE IT RESOLVED THAT the City of Vancouver endorses the principles of:
    Open and Accessible Data - the City of Vancouver will freely share with citizens, businesses and other jurisdictions the greatest amount of data possible while respecting privacy and security concerns;
    Open Standards - the City of Vancouver will move as quickly as possible to adopt prevailing open standards for data, documents, maps, and other formats of media;
    Open Source Software - the City of Vancouver, when replacing existing software or considering new applications, will place open source software on an equal footing with commercial systems during procurement cycles; and …
  BE IT FURTHER RESOLVED THAT in pursuit of open data the City of Vancouver will:
    Identify immediate opportunities to distribute more of its data;
    Index, publish and syndicate its data to the internet using prevailing open standards, interfaces and formats;
    Develop appropriate agreements to share its data with the Integrated Cadastral Information Society (ICIS) and encourage the ICIS to in turn share its data with the public at large;
    Develop a plan to digitize and freely distribute suitable archival data to the public;
    Ensure that data supplied to the City by third parties (developers, contractors, consultants) are unlicensed, in a prevailing open standard format, and not copyrighted except if otherwise prevented by legal considerations;
    License any software applications developed by the City of Vancouver such that they may be used by other municipalities, businesses, and the public without restriction.
  BE IT FINALLY RESOLVED THAT the City Manager be tasked with developing an action plan for implementation of the above.

Infrastructure for Openness
  This is also Digital Curation!
  Enables things like (DataMob)
  Participation

Infrastructure for Openness (continued)
  Need for Inclusive Design / Accessibility
  Not just “future” accessibility
  Participation / Access loop
  Case study / Example from OOXML

OOXML (In)Accessibility
  No consultation with disabilities groups, assistive technology developers / vendors, accessibility community, etc.
  Relative change rates and stabilities
  Contingencies (lock in)

Curation as Inclusive Design
  Integrate {preservation} into workflow, yes, but workflow can be “designed”
  …via the Standardization process, e.g.
  So not just about “sheer curation at the source” or “digital preservation”.
  Curation as a social design challenge.

Ethics of Digital Curation
  DCI training for participation in standardization processes for:
    Formats
    Metadata
    Best Practices
  Support from professional societies.

Curation as Design
  Beyond:
    Repositories and Preservation
    Users as Curators
    Subject / Object divide
  Meta-curation? No. Digital curation!

Questions, Comments?
  steve.hockema@utoronto.ca