Toward a Digital-Based Information Management Practice
Presentation to CNI Task Force
December 6, 2005
Avra Michelson and Michael Olson
Approved for Public Release; Distribution Unlimited 05-1431
© 2005 The MITRE Corporation. All rights reserved

About MITRE
• Not-for-profit Federally-Funded Research and Development Center (FFRDC) chartered by Congress to work in the public interest
• Performs high-end systems engineering addressing the nation’s hardest problems
• Independent honest broker that works only for government – prohibited from manufacturing products, competing with industry, or working for commercial companies
• Founded in 1958 with several hundred employees from MIT’s Lincoln Laboratory
• Today has 5,700 staff, with headquarters in Bedford, MA and McLean, VA and 60 additional locations around the world

Overview
• Information Management is Changing
• What is the Nature of the Challenge?
• Digital Information Management
  – Definition
  – Framework
  – Working Hypotheses
• MITRE Focus for FY06
Our work most closely aligns with CNI’s Institutional Repositories / Digital Libraries initiatives

Traditional Approach to Information Management
[Diagram: life cycle of Collect / Store, Represent / Disseminate, and Compliance / Archive]
Life Cycle Processes Performed With Each Discrete Application

Information Management Practices Change Over Time

What’s Driving the Changes?
• Predominance of digital as the medium for storage, management, and retrieval
• Compelling need to share within organizations and across boundaries
• Skyrocketing volume along with time-sensitive need
• Massive heterogeneity of technical environments and content types
• Shift from intermediary management of information to consumer / technology

Information Management (IM) Challenge is Changing
Traditional Approach to IM
• Collect / Store, Represent / Disseminate, and Compliance / Archive
• Life cycle performed with each discrete application
Expanded Dimensions of IM
• ENTERPRISE: Provide access to content – separate from applications – across data holdings, regardless of boundaries
• USER/TEAM: Aggregate and manage distributed information in a personal space in coordination with collaborators

Lots of Technology, Lots of Practice, but Lots of Unanswered Questions
How is information managed, by the ENTERPRISE and by the USER/TEAM …
  – Across ill-defined, heterogeneous boundaries?
  – That is not a single collection, but a collection of collections?
  – In conformance with enterprise policy?
  – In a time-sensitive manner, scaling to high volume?
  – Taking advantage of all that technology could make possible?
A coherent digital Information Management practice has yet to be defined

Framing the Digital Information Management Challenge

Enterprise Perspective: Emergence of the “Data Layer” Concept
Data is embedded in autonomous mission applications
[Diagram: Programs ‘A’, ‘B’, and ‘C’, each with its own requirements, PMO, and application (App ‘A’, ‘B’, ‘C’); with new technologies available, Data ‘A’, ‘B’, and ‘C’ sit in a separate Data Layer, for which a new program office (DL PMO) is required]
Is the “data layer” the answer to sharing?

Enterprise Perspective: Challenges
• How to implement this concept in a timely manner while serving the needs of…
  – multiple program offices
  – that are all working different problems
  – on different schedules
  – to meet different user requirements?
• How would an “Enterprise IM” Program Office operate?
  – What level of coordination is feasible/desirable across applications that are driven by different problems, schedules, and requirements?
  – How do you resolve questions related to…
    Policies governing the enterprise collection and their enforcement
    Access controls, copyright, intellectual property rights, and other data permissions
    Data retention for operations and compliance
    Metadata needs for managing and using the collection
Little in the way of vision and methodologies

User Perspective: Personal Information Management
[Diagram labels: Enterprise Applications (Search, Email, Repositories, etc.) and their data; Shared File System; Applications “A”, “B”, … “N”; External Application and External Shared Data; Social Applications (Social Bookmarking, Collaboration, Blogs, Wiki, Discussion Threads); Internal Network and External Network; Individual, Loosely Coupled Teams, and Communities of Interest, each member with Desktop Search, Bookmarks, Email, and Personal Library]
What do I know about a subject regardless of where information is stored?

User Perspective: Challenges
• Few capabilities available to help users manage a personal information space
  – Browser-based bookmarking
  – Private or shared spaces for maintaining personal collections
  – Desktop search
• No comprehensive vision and few cross-application tools or methods for managing at a personal level
• New class of social applications geared towards peer-to-peer exchange of information emerging
  – Social bookmarking, wikis, blogging
• Blurring lines between the personal and team/group environments adding complexity
Little in the way of vision, tools and methods

What Is the Information Management Challenge Going Forward?
[Diagram: App ‘A’ and App ‘B’ with Data ‘A’ and Data ‘B’]
• To share information embedded in applications across the enterprise and organizational boundaries
• To define the tools and methods for managing a personal / team information space
• To harmonize these efforts into an enterprise architecture and information management practice

What is the Role of the Digital Information Curator?
• …in every subset of government, there is a realization that legacy IM practices are falling short
  – Budgets for traditional services down more than 40% from 2003
  – Staffing levels have declined for the second year in a row, including contractors
• Emerging need to manage digital objects, through their lifecycle, in harmony across applications, the enterprise, and the personal domain
Source: The Changing Roles of Content Management Functions: View from the Government, 2004

Digital Information Management Framework
Life Cycle Stages: Collection Development; Capture and Create; Collection Management; Find, Present, and Deliver; Compliance and Archive

Issues
Collection Development
• Who is the audience and what are their information needs?
• What information do I have?
• What do I need to acquire?
• What is the acquisition plan?
Capture and Create
• What are the means of acquiring information?
• What are the means of creation?
• What is the workflow associated with creation?
Collection Management
• What policies govern the collection?
• How will the policies be enforced?
• What metadata is needed to manage the collection’s content?
Find, Present, and Deliver
• What are the means for supporting search and discovery?
• What are the means for supporting presentation and dissemination?
• How are objects found over time?
Compliance and Archive
• What is the duration of active life of the content?
• How long is the collection required to be retained for compliance purposes?
• What are the means of enabling retention?

Functions/Methods (in life cycle stage order)
• Information needs assessment
• Usage analysis
• Content inventory
• Source identification
• Gap analysis
• Requirements definition
• Source exploitation strategy
• Task analysis and implementation
• Content management strategy
• Source assessment and characterization
• Access control policy and strategy
• Metadata collection strategy
• Intellectual property rights and copyright usage policies
• Resource identifier strategy
• Information architecture
• Search strategy & improvement methodology
• Dissemination strategy
• Interoperability standards
• Archival strategy
• Records scheduling
• Refresh and migration strategy

Technology Sample
Collection Development
• Usage analysis tools
• Requirements management tools
Capture and Create
• Document imaging
• Content/document management systems
• Digital Asset Management tools
• Authoring & editing tools
• Language tools
• Ingest technologies
Collection Management
• Authentication & access control technologies
• Content/document management systems
• Digital Asset Management tools
• Digital Rights Management technologies
Find, Present, and Deliver
• Search & discovery tools
• Dissemination technologies
• Content/document management systems
• Digital Asset Management tools
• Automated extraction tools
• Object persistence services
• Language tools
Compliance and Archive
• Content/document management systems
• Digital Asset Management tools
• Records management tools
• Hierarchical Storage Management tools
• File format migration and conversion technologies

Related Computing Domains
[Diagram: Digital Information Management at the center, surrounded by Data Management, Procurement, Usability Engineering, Collaboration, Language Technology, Storage, Analytic Tools, and Security]

Digital IM Working Hypotheses
• Distributed resources, but centralized access
  – Goal of unified views, not centralized repositories
• Manage heterogeneity rather than strive for common standards
  – Reliance on technology to perform the necessary integration and transformations, rather than on common vocabularies, etc.
• Automated methods for establishing the findability of digital objects
  – Topical metadata of diminishing value
• Engineering that places the user, as opposed to the data repository, at the center of the system
  – Prevailing use of service-oriented designs that allow users to subscribe to capabilities and information as desired
• Mission information managed at higher levels of service than records retained for compliance purposes
  – To avoid investing more than needed to manage less critical information, or overburdening applications designed for mission-critical information

How Do We Get There?
• Challenges very large
• Need broad investigation of issues
• MITRE supporting several initiatives in FY06

MITRE Focus for FY06
Issue: Proof-of-concept for managing a personal information space
  – What is the vision?
  – What are the tools?
  – What are the “touch points” with enterprise IM?
Issue: Information management practice for multi-modal materials
  – What are IM dimensions?
  – Where do multi-modal materials fit within the broader IM workflow?
Issue: Metadata strategies
  – Distinguishing finding information from managing it
  – Transitioning sponsors to automated capture and extraction
Issue: Establishing baseline practices for cross-application data sharing
  – Alternatives, strengths, weaknesses
  – Where does records management fit?
Issue: Digital IM Framework
  – Life cycle processes, issues, methods, and technologies for managing digital content

Contributors
Rachael Bradley
Clif Bridgers
Ray D’Amore
Richard Games
Meredith Goodnight
Julie Gravallese
Soohee Kim
Aaron Lesser
Dr. Frank Linton
Dr. Joan Lippincott, CNI
Dr. Clifford Lynch, CNI
Betsi McGrath
Howard Markham
Dr. Mark Maybury
Victor Perez-Nunez
Arnie Rosenthal
Dr. Len Seligman
Ted Sienknecht
Cynthia Small
Kerry Zimmerman

Bibliography
Gartner Research
– Allega, Phillip J., Architecture Framework Debates are Irrelevant, June 7, 2005
– Allen, Nick, et al., Vendor Rating Update: IBM Storage is Promising, but its Software Still Needs Improvement, April 1, 2005
– Austin, Tom, et al., 2005, Client Issues In the High Performance Workplace, April 29, 2005
– Bell, Toby and Ames Lundy, Content-Centric Communications Can Revolutionize Customer Service, May 24, 2005
– Burton, Betsy, and D.M. Smith, Client Issues 2005: How to Approach, Encourage, and Support Collaborative Work, Gartner Research, April 29, 2005
– Caldwell, F., Apply Governance Principles to Improve Content Management, February 7, 2005
– Chuba, Mike, Five Storage Vendor Ratings, April 5, 2005
– DiCenzo, Carolyn, and K. Chin, Magic Quadrant for Email Active-Archiving Market, 2005, April 21, 2005
– Di Maio, Andrea, Strike a Balance Between Centralization and Decentralization of Government IT Management, June 3, 2005
– Dixon, Don, eCopy Looks to Set Document Imaging and Distribution Standards for MFPs, May 23, 2005
– Gassman, Bill, How to Choose an Advanced Solution for Web Site Analytics, April 1, 2005
– Harris, Kathy, et al., Knowledge Management Client Issues for 2005 and Beyond, April 25, 2005
– Kleinberg, K., and D. Logan, Digital Preservation in Healthcare: Long-Term Accessibility, January 7, 2002
– Knox, R., A. White, and T. Eid, Companies Should Align Their Structured And Unstructured Data, February 2, 2005
– Kolsky, Esteban, Management Update: Debunk Self-Service Myths to Reap Self-Service Benefits, May 25, 2005
– Kolsky, Esteban, Self-Service Gets Functional, March 16, 2005
– Krischer, Josh, Consider Data Consistency When Planning Disaster Recovery, March 8, 2005
– Leskela, Lane, et al., Client Issues 2005: How to Achieve Regulatory Compliance and ERM, March 29, 2005
– Leskela, Lane, and French Caldwell, 2005 Compliance Focus is on Best Practices and IT Support, March 4, 2005
– Logan, Debra, et al., Court’s Ruling Should Relieve Document Retention Burden, June 3, 2005
– Lundy, James, et al., Client Issues for Enterprise Content Management, 2005, May 3, 2005
– Lundy, James, Kenneth Chin, and Karen Shegda, Management Update: Who Will Own the Enterprise Content Management Market?, May 18, 2005
– Paquet, Raymond, Poll Confirms Companies Aren’t Ready for ILM, April 27, 2005
– Phifer, Gene, Ray Valdes, and David Gootzit, CIO Update: Client Issues for Enterprise Portals and Portal Technologies, 2005
– Strauss, Herbert, Information Management Challenges CIOs and Mission Manager in the National Security Domain, June 30, 2005
– Valdes, Ray, and Whit Andrews, Design Web Applications for Standards, Not for Browsers, March 2, 2005
– White, Andrew, Enterprise Information Management Is Key to Enabling Portals, August 2, 2005
– White, Andrew, and Brian Zrimsek, Enterprise Information Management Represents the Future of Data, February 8, 2005

Additional Market Research
– A Delphi Group Flash Survey: Content Security, Delphi Group
– North, Bill, et al., HP Refreshes Its ILM Strategy, IDC Insight, December 2004
– McDonough, Brian, Robert P. Mahowald, Joshua Duhl, and Alison Crawford, The Enterprise Workplace: How it will change the way we work, IDC, February 2005
– Gray, Robert C., and Richard L. Villara, Users Not Racing to Merge SAN Islands, IDC Opinion, February 2005
– Gray, Robert C., Eric Sheppard, and Dave Reinsel, Why We Haven’t Bought a SAN Yet, IDC Opinion, February 2005

Studies
– A Global Imperative: The Report of the 21st Century Literacy Study, The New Media Consortium, 2005
– Changing Roles of Content Management Functions: View from the Corporate Sector, Outsell, Vol. 7, August 20, 2004
– Changing Roles of Content Management Functions: View from the Government, Outsell, Vol. 7, September 17, 2004
– Long-Lived Digital Data Collections: Enabling Research and Education in the 21st Century, Report of the National Science Board, May 23, 2005
– Lyman, Peter and Hal R. Varian, How Much Information, 2003, School of Information Management and Systems, University of California at Berkeley. Retrieved from http://www.sims.berkeley.edu/how-much-info-2003
– Printing in the Age of the Web and Beyond: How Society Will Communicate in the 21st Century, The Electronic Document Systems Foundation, 2001
– Revolutionizing Science and Engineering Through Cyberinfrastructure: Report of the NSF Blue-Ribbon Advisory Panel on Cyberinfrastructure, Daniel Atkins, Chair, January 2003
– Strouse, Roger, The changing face of content users and the impact on information providers: the old paradigms of how users interact with, and think about, information has changed, Online, Sept. 1, 2004

Corporate Executive Board
– CIO Executive Board, Organizational Structures for Information Integration, January 2005
– Working Council for Chief Information Officers, Digital Archiving Strategies, April 2002

Miscellaneous
– Awre, Chris, How Do Users Search? Examining User Behavior and Testing Innovative Possibilities within the CREE Project, D-Lib Magazine, Vol. 11, Number 4, April 2005
– Connor, Deni, EMC Deal Highlights Storage Evolution, Network World, Oct. 20, 2003
– Del Rosso, Michael, The State of Storage, Computer Technology Review, Feb. 2003
– Djorgovski, S.G., Virtual Observatory, Cyber-Science, and the Rebirth of Libraries, slides, October 2004
– Earnshaw, R. A., The Challenges of Digital Media: Research Issues and Future Directions, IEEE, 2000
– Farber, Miriam and S. Shoham, Users, End-Users, and End-User Searchers of Online Information: a Historical Overview, Online Information Review, Vol. 26, Number 2, 2002, pp. 92-100
– Hammond, Tony, et al., Social Bookmarking Tools, D-Lib Magazine, April 2005
– How Do You Define Excellence?, Montague Institute Review, May 2005
– Klischewski, R., and M. Jeenicke, Semantic Web Technologies for Information Management within e-Government Services, Proceedings of the 37th Hawaii International Conference on System Sciences, 2004
– KM Collaboration within law firms, Montague Institute Review, March 2005
– Lynch, Clifford, Reflections Towards the Development of a “Post-DL” Research Agenda, June 10, 2003
– Lyon, Liz, Realising the Scholarly Knowledge Cycle: The Experience of eBank UK, CNI Task Force Meeting, Spring 2004, Alexandria, VA
– Marcum, Deanna B. and Gerald George, Who Uses What: Report on a National Survey of Information Users in Colleges and Universities, D-Lib Magazine, October 2003
– Mearian, Lucas, EMC Warms Up to Tape, Signs Resale Agreement, Computerworld (June 14, 2004) 38, 24
– Mulroy, Kevin, Review of Looking for Information: A Survey of Research on Information Seeking, Needs, and Behavior by Donald O. Case, Portal: Reviews
– Reiner, D., et al., Information Lifecycle Management: The EMC Perspective, Proceedings of the 20th International Conference on Data Engineering (ICDE’04), 2004
– Notess, Greg R., The Changing Information Cycle, Online, Sept./Oct. 2004; 28, 5
– Reddick, Christopher G., Citizen interaction with e-government: From the streets to servers?, Government Information Quarterly 22 (2005), 38-57
– Savolainen, Reijo, Placing the Internet in Information Source Horizons. A Study of Information Seeking by Internet Users in the Context of Self-Development, Library and Information Science Research 26 (2004), 415-433
– Schottlaender, Brian E. C., E-Research and Supporting Cyberinfrastructure: Next Steps within Our Institutions, ARL/CNI Forum, October 15, 2004
– Stephens, David O., Digital Preservation: A Global Information Management Problem, Information Management Journal (July 2000), pp. 68-71
– Van de Sompel, Herbert, Untitled I, Challenges Ahead, Presented at Olybris 2005, Greece, April 18, 2005
– Wiggins, Richard, Digital Preservation, Paradox and Promise, Library Journal (Spring 2001), pp. 12-15
– US Government Printing Office, Concept of Operations for the Future Digital System, October 1, 2004

Background Slides

Nature of the Change
• Information shift from the physical to the virtual resulting in…
  – Ease of publishing, sharing and replicating
  – Ability to directly extract information from the content
  – Enormous growth of content and sources both in personal and enterprise libraries
  – Changes in the concept of information persistence
• Information management shift from the intermediary to the consumer resulting in…
  – Personal responsibility for information management to augment the enterprise
  – Shift of responsibility to the end-user for source evaluation, content lineage and research
  – Loss of control at the enterprise level and shift to individual responsibility to organize the information space
  – Introduction of a new class of social software applications* to exchange knowledge, such as blogging, P2P applications, and wikis
* D-Lib Magazine, Social Bookmarking Tools (I), Volume 11 Number 4, April 2005

What will continue to change…
• Continued advances in information extraction and semantic understanding resulting in…
  – Greater use of technology in gathering, assimilating and understanding data
  – Further reduction in the reliance on expert intermediaries to research and manage information
• Continued advances in computing and communications resulting in…
  – Improved ability to process large volumes of data with complex algorithms dynamically
  – Improved ability to process remote information and exchange vast volumes of information
  – Expansion of multimedia and language formats in the presentation and consumption of information
Greater reliance on technology to perform traditional roles of the Information Manager

Proof-of-Concept for Managing a Personal Information Space
Analysts cite the inability to manage a personal corpus as their chief mission challenge
• Expand understanding of the business need
• Identify related research
• Establish alternate visions
• Explore explosion of new technology within context of those visions
• Demonstrate an information management practice that operates in harmony with enterprise IM

Information Management Practice for Foreign Language Materials
Work on foreign language materials typically focuses on advances in automatic translation capabilities, with little attention to information management
• Identify foreign language information management issues
  – Document types and how best to organize them
  – User types and their information management needs
  – Chief alternatives / trade-offs
• Investigate efficacy of language-independent workflow
  – Integration of foreign language and native content into a unified information management environment
  – Position the sponsor for a more integrated future

Metadata Strategies
Tools to perform automated entity extraction, indexing, and categorization of text (and increasingly multimedia) are maturing, diminishing the need for topical metadata. However, there is no roadmap for transitioning sponsors to more automated means of performing search and discovery, while continuing to apply/extract metadata for establishing source context
• Identify state of the art for managing and finding digital objects
• Define aspects that can be automated and aspects that require curation
• Identify viable models across the IC, academia, and commercial environments for transitioning a work force to new search methods
• Provide strategic guidance for evolving to next generation practice

Establishing Baseline Practices for Cross-Application Data Sharing
Making information available across an enterprise and organizational boundaries encompasses non-trivial challenges, but there has been little evaluation of past efforts or identification of state-of-the-art practices
• What are the alternatives for achieving cross-application sharing?
• To what degree have they worked?
• Where are the challenges?
• What are industry best practices in this area?
• What are the lessons learned?

Digital Information Management Framework
• Address recognized deficiencies in framework
  – Social / organizational issues
  – Layers of IM: personal information management, enterprise, application, etc.
• Solicit comment internally/externally
• Assess its usefulness in sponsor work
• Revise along the way