Next Generation Information Management

Toward A Digital-Based
Information Management Practice
Presentation to CNI Task Force
December 6, 2005
Avra Michelson and Michael Olson
Approved for Public Release; Distribution Unlimited 05-1431
© 2005 The MITRE Corporation. All rights reserved
About MITRE
• Not-for-profit Federally Funded Research and Development Center (FFRDC) chartered by Congress to work in the public interest
• Performs high-end systems engineering addressing the nation's hardest problems
• Independent honest broker that works only for government; prohibited from manufacturing products, competing with industry, or working for commercial companies
• Founded in 1958 with several hundred employees from MIT's Lincoln Laboratory
• Today has 5,700 staff, with headquarters in Bedford, MA and McLean, VA, as well as 60 additional locations around the world
Overview
• Information Management is Changing
• What is the Nature of the Challenge?
• Digital Information Management
– Definition
– Framework
– Working Hypotheses
• MITRE Focus for FY06

Our work most closely aligns with CNI's Institutional Repositories / Digital Libraries initiatives
Traditional Approach to Information Management
[Diagram: life cycle of Collect / Store, Represent / Disseminate, and Compliance / Archive]
Life Cycle Processes Performed With Each Discrete Application
Information Management Practices
Change Over Time
What's Driving the Changes?
• Predominance of digital as the medium for storage, management, & retrieval
• Compelling need to share within organizations and across boundaries
• Skyrocketing volume along with time-sensitive need
• Massive heterogeneity of technical environments and content types
• Shift from intermediary management of information to consumer / technology
Information Management (IM) Challenge is Changing
Traditional Approach to IM
[Diagram: life cycle of Collect / Store, Represent / Disseminate, and Compliance / Archive]
Life Cycle Performed With Each Discrete Application
Expanded Dimensions of IM
• ENTERPRISE: Provide access to content, separate from applications, across data holdings, regardless of boundaries
• USER/TEAM: Aggregate and manage distributed information in a personal space in coordination with collaborators
Lots of Technology, Lots of Practice, but Lots of Unanswered Questions
How is information managed, at the ENTERPRISE and USER/TEAM levels …
– Across ill-defined, heterogeneous boundaries?
– That is not a single collection, but a collection of collections?
– In conformance with enterprise policy?
– In a time-sensitive manner, scaling to high volume?
– Taking advantage of all that technology could make possible?
A coherent digital Information Management practice has yet to be defined
Framing the Digital
Information Management
Challenge
Enterprise Perspective: Emergence of the "Data Layer" Concept
Data is embedded in autonomous mission applications
[Diagram: Programs 'A', 'B', and 'C' each have their own requirements, program office (PMO 'A', 'B', 'C'), and application (App 'A', 'B', 'C'). New technologies become available, and Data 'A', 'B', and 'C' are separated from the applications into "The Data Layer," for which a new program office (DL PMO) is required.]
Is the "data layer" the answer to sharing?
Enterprise Perspective: Challenges
• How to implement this concept in a timely manner while serving the needs of…
– multiple program offices that are all
– working different problems on
– different schedules to meet
– different user requirements?
• How would an "Enterprise IM" Program Office operate?
– What level of coordination is feasible/desirable across applications that are driven by different problems, schedules, and requirements?
– How do you resolve questions related to…
  • Policies governing the enterprise collection and enforcement
  • Access controls, copyright, intellectual rights and other data permissions
  • Data retention for operations and compliance
  • Metadata needs for managing and using the collection
Little in the way of vision and methodologies
User Perspective: Personal Information Management
[Diagram: each individual maintains a personal information space (Email, Bookmarks, Desktop Search, Personal Library). Individuals are linked through Loosely Coupled Teams and Communities of Interest. They draw on Enterprise Applications (Search, Email, Repositories, etc.) and their data, a Shared File System, Applications "A", "B", … "N", and External Applications with External Shared Data. They also use Social Applications (Social Bookmarking, Collaboration, Blogs, Wiki, Discussion Threads, …) on the Internal Network and the External Network.]
What do I know about a subject regardless of where information is stored?
User Perspective: Challenges
• Few capabilities available to help users manage a personal information space
– Browser-based bookmarking
– Private or shared spaces for maintaining personal collections
– Desktop search
• No comprehensive vision and few cross-application tools or methods for managing at a personal level
• New class of social applications geared towards peer-to-peer exchange of information emerging
– Social bookmarking, Wikis, Blogging
• Blurring lines between the personal and team/group environments adding complexity
Little in the way of vision, tools and methods
What Is The Information Management Challenge Going Forward?
• To share information embedded in applications (App 'A' with Data 'A', App 'B' with Data 'B') across the enterprise and organizational boundaries
• To define the tools and methods for managing a personal / team information space
• To harmonize these efforts into an enterprise architecture and information management practice
What is the Role of the Digital Information Curator?
…in every subset of government, there is a realization that legacy IM practices are falling short
– Budgets for traditional services down more than 40% from 2003
– Staffing levels have declined for second year in a row, including contractors
The Changing Roles of Content Management Functions: View from the Government, 2004
Emerging need to manage digital objects, through their lifecycle, in harmony across applications, the enterprise, and personal domain
Digital Information Management Framework
Life Cycle Stages
• Collection Development
• Capture and Create
• Collection Management
• Find, Present, and Deliver
• Compliance and Archive
Issues
• Who is the audience and what are their information needs?
• What information do I have?
• What do I need to acquire?
• What is the acquisition plan?
• What are the means of acquiring information?
• What are the means of creation?
• What is the workflow associated with creation?
• What policies govern the collection?
• How will the policies be enforced?
• What metadata is needed to manage the collection's content?
• What are the means for supporting search and discovery?
• What are the means for supporting presentation and dissemination?
• How are objects found over time?
• What is the duration of active life of the content?
• How long is the collection required to be retained for compliance purposes?
• What are means of enabling retention?
Functions/Methods
• Information needs assessment
• Usage analysis
• Content inventory
• Source identification
• Gap analysis
• Requirements definition
• Source exploitation strategy
• Task analysis and implementation
• Content management strategy
• Source assessment and characterization
• Access control policy and strategy
• Metadata collection strategy
• Intellectual property rights and copyright usage policies
• Resource identifier strategy
• Information architecture
• Search strategy & improvement methodology
• Dissemination strategy
• Interoperability standards
• Archival strategy
• Records scheduling
• Refresh and migration strategy
Technology Sample
• Usage analysis tools
• Requirements management tools
• Document imaging
• Content / Document management systems
• Digital Asset Management tools
• Authoring & editing tools
• Language tools
• Ingest technologies
• Authentication & access control technologies
• Content/document management systems
• Digital Asset Management tools
• Digital Rights Management technologies
• Search & discovery tools
• Dissemination technologies
• Content/document management systems
• Digital Asset Management tools
• Automated extraction tools
• Object persistence services
• Language tools
• Content/document management systems
• Digital Asset Management tools
• Records management tools
• Hierarchical Storage Management tools
• File format migration and conversion technologies
Related Computing Domains
[Diagram: Digital Information Management at the center, surrounded by related domains: Data Management, Procurement, Usability Engineering, Collaboration, Language Technology, Storage, Analytic Tools, Security]
Digital IM Working Hypotheses
• Distributed resources, but centralized access
– Goal of unified views, not centralized repositories
• Manage heterogeneity rather than strive for common standards
– Reliance on technology to perform the necessary integration and transformations, rather than common vocabularies, etc.
• Automated methods for establishing the findability of digital objects
– Topical metadata of diminishing value
• Engineering that places the user at the center of the system as opposed to the data repository
– Prevailing use of service-oriented designs that allow users to subscribe to capabilities and information as desired
• Mission information managed at higher levels of service than records retained for compliance purposes
– To avoid investing more than needed to manage less critical information or overburdening applications designed for mission-critical information
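
The first two hypotheses (unified views over distributed resources, and technology rather than common standards absorbing heterogeneity) can be pictured with a minimal sketch. Everything named here is an assumption made for illustration: the `Record` shape, the two in-memory holdings, and the `ADAPTERS` table are hypothetical and do not describe any MITRE or sponsor system.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class Record:
    """Common shape that the unified view presents to the user (hypothetical)."""
    source: str
    title: str
    body: str


# Hypothetical holdings; each keeps its own native field names.
EMAIL_STORE = [{"subject": "Budget review", "text": "FY06 budget figures attached"}]
WIKI_STORE = [{"page": "Metadata strategy", "content": "Notes on automated extraction"}]

# One adapter per source maps native records into the common shape at query
# time, so the sources never have to agree on a shared vocabulary.
ADAPTERS: Dict[str, Callable[[], Iterable[Record]]] = {
    "email": lambda: (Record("email", m["subject"], m["text"]) for m in EMAIL_STORE),
    "wiki": lambda: (Record("wiki", p["page"], p["content"]) for p in WIKI_STORE),
}


def unified_search(term: str) -> List[Record]:
    """Search every source in place; nothing is copied into a central repository."""
    needle = term.lower()
    return [
        rec
        for load in ADAPTERS.values()
        for rec in load()
        if needle in rec.title.lower() or needle in rec.body.lower()
    ]
```

Calling `unified_search("budget")` scans both holdings through their adapters and returns matches in the common shape. Supporting another source means adding one adapter, not migrating its data.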
How Do We Get There?
• Challenges very large
• Need broad investigation of issues
• MITRE supporting several initiatives in FY 06
MITRE Focus for FY 06
Issue: Proof-of-concept for managing a personal information space
Details:
• What is the vision?
• What are the tools?
• What are the "touch points" with enterprise IM?
Issue: Information management practice for multi-modal materials
Details:
• What are IM dimensions?
• Where do multi-modal materials fit within the broader IM workflow?
Issue: Metadata strategies
Details:
• Distinguishing finding information from managing it
• Transitioning sponsors to automated capture and extraction
Issue: Establishing baseline practices for cross-application data sharing
Details:
• Alternatives, strengths, weaknesses
• Where does records management fit?
Issue: Digital IM Framework
Details:
• Life cycle processes, issues, methods, and technologies for managing digital content
Contributors
• Rachael Bradley
• Clif Bridgers
• Ray D'Amore
• Richard Games
• Meredith Goodnight
• Julie Gravallese
• Soohee Kim
• Aaron Lesser
• Dr. Frank Linton
• Dr. Joan Lippincott, CNI
• Dr. Clifford Lynch, CNI
• Betsi McGrath
• Howard Markham
• Dr. Mark Maybury
• Victor Perez-Nunez
• Arnie Rosenthal
• Dr. Len Seligman
• Ted Sienknecht
• Cynthia Small
• Kerry Zimmerman
Bibliography
• Gartner Research
– Allega, Phillip J., Architecture Framework Debates are Irrelevant, June 7, 2005
– Allen, Nick, et al., Vendor Rating Update: IBM Storage is Promising, but its Software Still Needs Improvement (1 April 2005)
– Austin, Tom, et al., 2005, Client Issues In the High Performance Workplace, April 29, 2005
– Bell, Toby and Ames Lundy, Content-Centric Communications Can Revolutionize Customer Service, May 24, 2005
– Burton, Betsy, and D.M. Smith, Client Issues 2005: How to Approach, Encourage, and Support Collaborative Work, Gartner Research, April 29, 2005
– Caldwell, F., Apply Governance Principles to Improve Content Management, 7 February 2005
– Chuba, Mike, Five Storage Vendor Ratings (5 April 2005)
– DiCenzo, Carolyn, K. Chin, Magic Quadrant for Email Active-Archiving Market, 2005, April 21, 2005
– Di Maio, Andrea, Strike a Balance Between Centralization and Decentralization of Government IT Management, June 3, 2005
– Dixon, Don, eCopy Looks to Set Document Imaging and Distribution Standards for MFPs, May 23, 2005
– Gassman, Bill, How to Choose an Advanced Solution for Web Site Analytics, 1 April 2005
Bibliography
• Gartner Research (cont.)
– Harris, Kathy, et al., Knowledge Management Client Issues for 2005 and Beyond, 25 April 2005
– Kleinberg, K., D. Logan, Digital Preservation in Healthcare: Long-Term Accessibility, 7 January 2002
– Knox, R., White, A., Eid, T., Companies Should Align Their Structured And Unstructured Data, 2 February 2005
– Kolsky, Esteban, Management Update: Debunk Self-Service Myths to Reap Self-Service Benefits, May 25, 2005
– Kolsky, Esteban, Self-Service Gets Functional, March 16, 2005
– Krischer, Josh, Consider Data Consistency When Planning Disaster Recovery, 8 March 2005
– Leskela, Lane, et al., Client Issues 2005: How to Achieve Regulatory Compliance and ERM, March 29, 2005
– Leskela, Lane, French Caldwell, 2005 Compliance Focus is on Best Practices and IT Support, 4 March 2005
– Logan, Debra, et al., Court's Ruling Should Relieve Document Retention Burden, June 3, 2005
– Lundy, James, et al., Client Issues for Enterprise Content Management, 2005, May 3, 2005
Bibliography
• Gartner Research (cont.)
– Lundy, James, Kenneth Chin, Karen Shegda, Management Update: Who Will Own the Enterprise Content Management Market? May 18, 2005
– Paquet, Raymond, Poll Confirms Companies Aren't Ready for ILM, 27 April 2005
– Phifer, Gene, Ray Valdes, David Gootzit, CIO Update: Client Issues for Enterprise Portals and Portal Technologies, 2005
– Strauss, Herbert, Information Management Challenges CIOs and Mission Manager in the National Security Domain, 30 June 2005
– Valdes, Ray & Whit Andrews, Design Web Applications for Standards, Not for Browsers, 2 March 2005
– White, Andrew, Enterprise Information Management Is Key to Enabling Portals, 2 August 2005
– White, Andrew and Zrimsek, Brian, Enterprise Information Management Represents the Future of Data, 8 February 2005
• Additional Market Research
– A Delphi Group Flash Survey: Content Security, Delphi Group
– North, Bill, et al., HP Refreshes Its ILM Strategy, IDC Insight, December 2004
– McDonough, Brian, Robert P. Mahowald, Joshua Duhl, and Alison Crawford, The Enterprise Workplace: How it will change the way we work, IDC, February 2005
Bibliography
• Additional Market Research (cont.)
– Gray, Robert C., Richard L. Villara, Users Not Racing to Merge SAN Islands, IDC Opinion, February 2005
– Gray, Robert C., Eric Sheppard, Dave Reinsel, Why We Haven't Bought a SAN Yet, IDC Opinion, February 2005
• Studies
– A Global Imperative: The Report of the 21st Century Literacy Study, The New Media Consortium, 2005
– Changing Roles of Content Management Functions: View from the Corporate Sector, Outsell, Vol. 7, August 20, 2004
– Changing Roles of Content Management Functions: View from the Government, Outsell, Vol. 7, Sept. 17, 2004
– Long-Lived Digital Data Collections: Enabling Research and Education in the 21st Century, Report of the National Science Board, May 23, 2005
– Lyman, Peter and Hal R. Varian, How Much Information, 2003. Retrieved from http://www.sims.berkeley.edu/how-much-info-2003, School of Information Management and Systems, University of California at Berkeley (2003)
– Printing in the Age of the Web and Beyond: How Society Will Communicate in the 21st Century, The Electronic Document Systems Foundation (2001)
Bibliography
• Studies (cont.)
– Revolutionizing Science and Engineering Through Cyberinfrastructure: Report of the NSF Blue-Ribbon Advisory Panel on Cyberinfrastructure, Daniel Atkins, Chair, January 2003
– Strouse, Roger, The changing face of content users and the impact on information providers: the old paradigms of how users interact with, and think about, information has changed, Online, Sept. 1, 2004
• Corporate Executive Board
– CIO Executive Board, Organizational Structures for Information Integration, January 2005
– Working Council for Chief Information Officers, Digital Archiving Strategies, April 2002
• Miscellaneous
– Awre, Chris, How Do Users Search? Examining User Behavior and Testing Innovative Possibilities within the CREE Project, D-Lib Magazine, Vol. 11, Number 4, April 2005
– Connor, Deni, EMC Deal Highlights Storage Evolution, Network World, Oct. 20, 2003
Bibliography
• Miscellaneous (cont.)
– Del Rosso, Michael, The State of Storage, Computer Technology Review (Feb. 2003)
– Djorgovski, S.G., Virtual Observatory, Cyber-Science, and the Rebirth of Libraries, slides, October 2004
– Earnshaw, R. A., The Challenges of Digital Media: Research Issues and Future Directions, IEEE 2000
– Farber, Miriam and S. Shoham, Users, End-Users, and End-User Searchers of Online Information: a Historical Overview, Online Information Review, Vol. 26, Number 2, 2002, pp. 92-100
– Hammond, Tony, et al., Social Bookmarking Tools, D-Lib Magazine, April 2005
– How Do You Define Excellence? Montague Institute Review, May 2005
– Klischewski, R., and Jeenicke, M., Semantic Web Technologies for Information Management within e-Government Services, Proceedings of the 37th Hawaii International Conference on System Sciences, 2004
– KM Collaboration within law firms, Montague Institute Review, March 2005
– Lynch, Clifford, Reflections Towards the Development of a "Post-DL" Research Agenda, June 10, 2003
– Lyon, Liz, Realising the Scholarly Knowledge Cycle: The Experience of eBank UK, CNI Task Force Meeting Spring 2004, Alexandria, VA.
Bibliography
• Miscellaneous (cont.)
– Marcum, Deanna B. and Gerald George, Who Uses What: Report on a National Survey of Information Users in Colleges and Universities, D-Lib Magazine, October 2003
– Mearian, Lucas, EMC Warms Up to Tape, Signs Resale Agreement, Computerworld (June 14, 2004) 38, 24
– Mulroy, Kevin, Review of Looking for Information: A Survey of Research on Information Seeking, Needs, and Behavior by Donald O. Case, Portal: Reviews
– Reiner, D., et al., Information Lifecycle Management: The EMC Perspective, Proceedings of the 20th International Conference on Data Engineering (ICDE'04) 2004
– Notess, Greg R., The Changing Information Cycle, Online, Sept./Oct. 2004; 28, 5
– Reddick, Christopher G., Citizen interaction with e-government: From the streets to servers?, Government Information Quarterly 22 (2005), 38-57.
– Savolainen, Reijo, Placing the Internet in Information Source Horizons. A Study of Information Seeking by Internet Users in the Context of Self-Development, Library and Information Science Research 26 (2004) 415-433.
– Schottlaender, Brian E. C., E-Research and Supporting Cyberinfrastructure: Next Steps within Our Institutions, ARL/CNI Forum (15 October 2004)
Bibliography
• Miscellaneous (cont.)
– Stephens, David O., Digital Preservation: A Global Information Management Problem, Information Management Journal (July 2000), pp. 68-71
– Van de Sompel, Herbert, Untitled I, Challenges Ahead, Presented at Olybris 2005, Greece, April 18, 2005
– Wiggins, Richard, Digital Preservation, Paradox and Promise, Library Journal (Spring 2001), pp. 12-15
– US Government Printing Office, Concept of Operations for the Future Digital System, October 1, 2004
Background Slides
Nature of the Change
• Information shift from the physical to the virtual resulting in…
– Ease of publishing, sharing and replicating
– Ability to directly extract information from the content
– Enormous growth of content and sources both in personal and enterprise libraries
– Changes in the concept of information persistence
• Information management shift from the intermediary to the consumer resulting in…
– Personal responsibility for information management to augment the enterprise
– Shift of responsibility to the end-user for source evaluation, content lineage and research
– Loss of control at the enterprise level and shift to individual responsibility to organize the information space
– Introduction of new class of *social software applications to exchange knowledge such as Blogging, P2P applications and Wikis
* D-Lib Magazine, Social Bookmarking Tools (I), Volume 11 Number 4, April 2005
What will continue to change…
• Continued advances in information extraction and semantic understanding resulting in…
– Greater use of technology in gathering, assimilating and understanding data
– Further reduction in the reliance on expert intermediaries to research and manage information
• Continued advances in computing and communications resulting in…
– Improved ability to process large volumes of data with complex algorithms dynamically
– Improved ability to process remote information and exchange vast volumes of information
– Expansion of multimedia and language formats in the presentation and consumption of information
Greater reliance on technology to perform traditional roles of the Information Manager
Proof-of-Concept for Managing a Personal Information Space
Analysts cite the inability to manage a personal corpus as their chief mission challenge
• Expand understanding of the business need
• Identify related research
• Establish alternate visions
• Explore explosion of new technology within context of those visions
• Demonstrate an information management practice that operates in harmony with enterprise IM
Information Management Practice for Foreign Language Materials
Work on foreign language materials typically focuses on advances in automatic translation capabilities; little attention is paid to information management
• Identify foreign language information management issues
– Document types and how best to organize them
– User types and their information management needs
– Chief alternatives / trade-offs
• Investigate efficacy of language-independent workflow
– Integration of foreign language and native content into a unified information management environment
– Position the sponsor for a more integrated future
Metadata Strategies
Tools to perform automated entity extraction, indexing, and categorization of text (and increasingly multimedia) are maturing, diminishing the need for topical metadata.
However, there is no roadmap for transitioning sponsors to more automated means of performing search and discovery, while continuing to apply/extract metadata for establishing source context.
• Identify state of the art for managing and finding digital objects
• Define aspects that can be automated and aspects that require curation
• Identify viable models across the IC, academia, and commercial environments for transitioning a work force to new search methods
• Provide strategic guidance for evolving to next generation practice
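
The split described above, where topical findability is derived automatically and source-context metadata is still applied by people, can be illustrated with a minimal sketch. The `DigitalObject` fields, the tokenizer, and both function names are assumptions made for illustration; real extraction tools do far more than split words.

```python
import re
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class DigitalObject:
    object_id: str
    text: str
    source: str   # curated source-context metadata (hypothetical field)
    created: str  # curated source-context metadata, ISO date (hypothetical field)


def build_index(objects: List[DigitalObject]) -> Dict[str, Set[str]]:
    """Derive topical findability automatically: every word in the content
    becomes a lookup key, with no hand-assigned subject terms."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for obj in objects:
        for token in re.findall(r"[a-z0-9]+", obj.text.lower()):
            index[token].add(obj.object_id)
    return dict(index)


def find(index: Dict[str, Set[str]], objects: List[DigitalObject],
         term: str, source: Optional[str] = None) -> List[str]:
    """Find by automatically indexed term, optionally narrowed by the
    curated source-context field."""
    by_id = {o.object_id: o for o in objects}
    hits = index.get(term.lower(), set())
    return sorted(i for i in hits if source is None or by_id[i].source == source)
```

In this sketch no one assigns subject terms to an object, yet a user can still restrict results by where the object came from, which is the part that remains curated.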
Establishing Baseline Practices For Cross-Application Data Sharing
Making information available across an enterprise and organizational boundaries encompasses non-trivial challenges, but there has been little evaluation of past efforts or identification of state-of-the-art practices
• What are the alternatives for achieving cross-application sharing?
• To what degree have they worked? Where are the challenges?
• What are industry best practices in this area?
• What are the lessons learned?
Digital Information Management Framework
• Address recognized deficiencies in framework
– Social / organizational issues
– Layers of IM: Personal information management, enterprise, application, etc.
• Solicit comment internally/externally
• Assess its usefulness in sponsor work
• Revise along the way