Electronic Records Management - A Review of the Work of a Decade and a Reflection on Future Directions

INTRODUCTION

The decade of the 1990s will undoubtedly be remembered as a period that witnessed an incredible diffusion of information technology through a massive and unanticipated spread in the use of personal computers and local area networks, the maturing of the Internet, and the development of the World Wide Web and its enabling browser interface software. It was a decade that saw the emergence of networking and the widespread sharing of information, of the transformation from personal to work group computing, and of enterprise architecture and integrated systems. In short, the 1990s was a time when the power of computing and document creation passed out of the hands of traditional centralized providers of data and into the hands of individual workers. 1

Two of the more important consequences of these truly revolutionary changes were the transformations in how businesses functioned and individuals worked, and in how institutions and workers communicated. Among the most prominent changes in these areas were the emergence of less centralized communication patterns, of more horizontal communication outside of the traditional bureaucratic channels, and of collaborative team projects and the concept of “virtual shared work space.” The resultant transformations in the flow of inter- and intra-organizational information and in workflow and business processes dramatically and irrevocably altered the workplace. 2

1 For descriptions of these technology changes, see Martin Campbell-Kelly and William Aspray, Computer: A History of the Information Machine (New York: Basic Books, 1996), pp. 157-300; Joel Shurkin, Engines of the Mind: The Evolution of the Computer from Mainframes to Microprocessors (New York: W.W. Norton, 1996), pp. 248-337; and Don Tapscott and Art Caston, Paradigm Shift. The New Promise of Information Technology (New York: McGraw-Hill, Inc., 1993), pp. 121-164, 231-255.

2 For descriptions of the changing work environment, see Thomas H. Davenport, Information Ecology (New York: Oxford University Press, 1997), pp. 3-28; Thomas H. Davenport, Process Innovation. Reengineering Work through Information Technology (Boston: Harvard Business School Press, 1993), pp. 71-93; Michael Hammer and James Champy, Reengineering the Corporation. A Manifesto for Business Revolution (New York: Harper Business, 1994), pp. 65-101; James Martin, Cybercorp. The New Business Revolution (New York: AMACOM, 1996), pp. 3-58; Don Tapscott and Art Caston, Paradigm Shift. The New Promise of Information Technology, pp. 1-27, 185-230.

Significant changes were also occurring in the products of this communication – the business record. Rapid transformations in the form of the record – the emergence of hypermedia documents, dynamic documents, e-mail – prompted technologists, but especially records professionals, i.e., archivists and records managers, to increasingly ask: what is this electronic (digital) record? How is it different from traditional analog forms, such as those preserved on paper, microfilm and on audio and videotapes? How will we manage and preserve this record? Eventually, this debate gave rise to broader reflections and a host of questions and issues related to the role of the records management professional. What do archivists and records managers contribute to society? What is their relationship to other information management professionals? Do archivists and records managers possess the knowledge and skills required to manage digital records? What theories, principles and techniques will continue to guide records professionals in their work? 3

Before we can address these questions and begin to ascertain why changes in technology have had such an impact on archivists and records managers, we need to review briefly the evolution of automation and recordkeeping.
Brief Review of Computing and Recordkeeping

The early days of computing, from the 1950s to the 1970s, were dominated by small business computers and massive mainframe computers (used primarily for scientific applications), which managed data inputted from punched cards, produced massive amounts of paper printouts, and supported an attached network of a few local and remote terminals. The emphasis was on inputting data found in traditional paper forms and on automating computing intensive business transactions, such as accounting and payroll. The outputs of these systems were automated versions of traditional paper documents, such as bills, paychecks and orders, or video screen displays often formatted to resemble a familiar document. Most employees had little or no direct access to the systems or to the data; they were largely dependent on programmers and systems analysts to interpret their data needs. Requests for data or information in the form of summaries or reports were submitted to the computer center and the results, processed in batches overnight or in the course of a week, were returned in the form of paper printouts.

3 For a discussion on the changing form of records, see Charles Dollar, Archival Theory and Information Technologies: The Impact of Information Technologies on Archival Principles and Methods (Macerata: University of Macerata Press, 1992), pp. 36-40.

Similarly, archivists and records managers of this period relied heavily upon conversion of computer data to paper documentation to do their work. The prevailing recordkeeping methodology of the time was to generate printouts of computer files - the so-called "data dumps" - as a means of appraising the value of the data. For records with primary value to the institution, it was common practice to print to paper and store the record in established filing systems, and to summarize the data and produce various standard reference reports (the annual budget, the biweekly payroll, etc.).
For records with secondary values, either evidential or informational, the general rule was to retain the files on computer tapes in tape libraries and develop descriptive finding aids to facilitate access to the tapes. Overall, recordkeeping practices in the early decades of automation were not radically different from techniques employed for paper records, and to some degree this was justified. In a system where the basic strategy was to convert paper forms to an automated environment, where file management systems predominated, and where systems were characterized by functional units creating and managing their own files in isolation from other applications, it was possible to devise a records management strategy based on capturing screen views or forms and converting them to paper documents. In this environment methodologies designed for the management of paper systems still had relevance. 4

The 1980s and 1990s witnessed dramatic and frequent changes in technology, featuring most prominently the emergence of the personal computer and of the Internet, and the development of database management systems, client-server architectures, distributed computing and enterprise-wide applications. All of these developments and more have had the effect of dramatically changing the way data, information and records were created and managed. Perhaps the most dramatic transformations were in document or record creation and in the resultant changing form of documents. To better understand this issue, let us first review how the most prevalent systems in use by businesses, Transaction Processing Systems (TPS), manage data and records.

Transaction Processing Systems Employing DBMS Software

The most basic business system and the heart of most organizations is the Transaction Processing System (TPS).
A transaction processing system "is a computerized system that performs and records the daily routine transactions necessary to the conduct of business." 5 The primary goal of these systems is to automate computing intensive business transactions, such as those undertaken in the financial and human resource functional areas. The emphasis is on processing data (sorting, listing, updating, merging), on reducing clerical costs, and on outputting documents required to do business, such as bills, paychecks and orders. The guiding principles of these systems are to create data that is current, accurate, and consistent. To achieve these goals, these systems employ traditional Database Management System (DBMS) or modern Enterprise Resource Planning (ERP) software.

4 For a description of recordkeeping practices in the early days of computing, see Terry Cook, “Easy to Byte, Harder to Chew: The Second Generation of Electronic Records Archives,” Archivaria 33 (Winter, 1991-92): 202-216.

Unlike traditional file management systems, data elements in a database management system (DBMS) are integrated and shared among different tables and databases. Consequently, one of the primary advantages of DBMS is its ability to limit and control redundant data in multiple systems. Instead of the same data field being repeated in different tables, the information appears just once, often in separate tables or databases, and computer software reconnects the bits of data when needed. Another advantage of DBMS is that it improves data integrity. Updates are made only once, and all changes are made for that data element no matter where it appears. 6 For database managers, this is a much more efficient system, which minimizes data redundancy and maximizes data integrity.
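The design described above can be illustrated with a minimal sketch in Python, using the standard-library sqlite3 module. The two-table schema (customer and invoice), the column names and the sample values are hypothetical and chosen only for illustration; they are not drawn from any system discussed in this article. The customer's address is stored once, a join reconnects it to each invoice when needed, and a single update is reflected wherever that data element appears.

```python
import sqlite3


def build_db():
    """Create an in-memory database with a hypothetical normalized schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, address TEXT);
        CREATE TABLE invoice (id INTEGER PRIMARY KEY,
                              customer_id INTEGER REFERENCES customer(id),
                              amount REAL);
        INSERT INTO customer VALUES (1, 'Acme Co.', '12 Main St');
        INSERT INTO invoice VALUES (100, 1, 250.0);
        INSERT INTO invoice VALUES (101, 1, 75.5);
    """)
    return conn


def invoice_views(conn):
    """Reassemble each invoice 'document' by joining the two tables."""
    return conn.execute("""
        SELECT i.id, c.name, c.address, i.amount
        FROM invoice AS i
        JOIN customer AS c ON c.id = i.customer_id
        ORDER BY i.id
    """).fetchall()


def change_address(conn, customer_id, new_address):
    """Update the address in the one place where it is stored."""
    conn.execute("UPDATE customer SET address = ? WHERE id = ?",
                 (new_address, customer_id))
    conn.commit()


conn = build_db()
before = invoice_views(conn)           # both invoices show '12 Main St'
change_address(conn, 1, "99 Oak Ave")  # one update, made once
after = invoice_views(conn)            # both invoices now show '99 Oak Ave'
```

The address is never copied into the invoice table, so there is no redundant copy to fall out of step. The same property means that the invoice as a document exists only as the result of a query, which is the point taken up in the discussion that follows.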
Without question, TPS are very good at supporting current business needs for information, minimizing the amount of data stored in the system, improving overall efficiency of the system, removing obsolete data and providing an organizational resource to current data. But are they good recordkeeping systems? The answer, with few exceptions, is a resounding no, because these systems were never designed and structured for the purpose of capturing and maintaining business records.

5 Kenneth C. Laudon and Jane P. Laudon, Essentials of Management Information Systems (Englewood Cliffs, N.J.: Prentice Hall, 1999), p. 42.

6 For descriptions of how these systems function, see Judith Gordon and Steven Gordon, Information Systems. A Management Approach (Fort Worth, Texas: Dryden Press, 1999), pp. 192-233, 364-400; Kenneth C. Laudon and Jane P. Laudon, Essentials of Management Information Systems, pp. 41-43; Ralph Stair, Principles of Information Systems. A Managerial Approach (Boston, MA: Boyd & Fraser Publishing Co., 1992), pp. 152-164, 238-258.

In a typical transaction processing system, business records are not stored as stable, finite, physical entities. Rather, these systems create records by combining and reusing data stored in discrete units organized into tables. Once created, a record of a business process may not, and indeed likely will not, be captured as a physical entity. Not only will the record not be captured at the time of creation, it may be impossible to recreate at some later date. Databases are dynamic, volatile systems, in a state of continual change. Data updates occur frequently, and with DBMS software managing the system, these revisions are made in every file containing that revised data element. Moreover, databases typically maintain only the current value for any given data element. As a result, in a typical transaction processing system, inviolate business records are difficult, if not impossible, to locate and retrieve.
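A minimal sketch of this problem, again in Python with the standard-library sqlite3 module, is given below. The employee table, the salary_notice function and all values are hypothetical and serve only to illustrate the argument. A notice is generated from current data, the underlying value is then updated in place, and the notice as originally issued can no longer be regenerated from the database.

```python
import sqlite3

# Hypothetical single-table system that stores only current values.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER);
    INSERT INTO employee VALUES (7, 'J. Smith', 40000);
""")


def salary_notice(conn, employee_id):
    """Assemble a notice from whatever values are current at call time."""
    name, salary = conn.execute(
        "SELECT name, salary FROM employee WHERE id = ?", (employee_id,)
    ).fetchone()
    return f"{name}: annual salary {salary}"


# The notice as issued at the time of the business event. It exists only
# in this variable; nothing in the database preserves it.
issued = salary_notice(conn, 7)

# A later update overwrites the old value in place.
conn.execute("UPDATE employee SET salary = 45000 WHERE id = 7")
conn.commit()

# Regenerating the notice now yields a different document.
regenerated = salary_notice(conn, 7)

# No row retains the earlier value, so the original cannot be rebuilt.
old_values_left = conn.execute(
    "SELECT COUNT(*) FROM employee WHERE salary = 40000"
).fetchone()[0]
```

In this sketch the only surviving copy of the notice as issued is the one held outside the database, which is the situation the paragraph above describes: unless the record is deliberately captured at the time of the transaction, the system retains no inviolate version of it.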
There are a few transaction processing systems, however, where the objective is to create and maintain records of business processes. Prominent examples include systems maintaining general financial ledgers and those that manage academic records and transcripts. In systems managing financial ledgers, data documenting actual business events, such as updating the ledger as a result of a transaction, is captured and maintained as an inviolate record stored as a row of data in a sequential table. These inviolate records represent a cumulative and historical account fixed in time of specified business events. As such, they meet in many respects the definition of a record as articulated by archivists. However, even these systems fail to meet all the requirements of a recordkeeping system. They often do not capture and retain all the metadata necessary to create complete, authentic and reliable records. In addition, these systems often summarize business processes, resulting in a set of records that do not contain sufficient detail to document all relevant business events.

To summarize, automated systems do only what they are designed to do, and for most transaction processing systems, recordkeeping is not the primary objective. Consequently, TPS fail to meet most of the basic requirements of a recordkeeping system. 7 While TPS do routinely bring together data from various sources to form a logical view of a record at the time of making a decision, they typically do not physically create and preserve a record of that transaction. 8 Even systems that do capture and store business records often summarize business processes, and consequently do not document all pertinent business events. Typically, transaction processing systems do not capture and retain complete documentation about business events, particularly as it relates to the context of creation. TPS typically retain only current data, and consequently do a poor job of tracking the history of changes to data values. Finally, because data about a business transaction is typically stored in separate tables or databases, key content data or critical metadata about a business transaction can become disconnected over time, or may be preserved or discarded according to different timetables.

7 The same can be said of data warehouses. These systems were never designed to function as recordkeeping systems, i.e., systems that capture and manage records documenting business events. The primary functions of data warehouses are to assist in reporting, in understanding historical trends, and in creating summaries. To this end, selected data from operational systems is extracted, often standardized and normalized, and moved into the warehouse. Although business records may be found in a warehouse, managing records is not the primary objective of the system. However, one can certainly use the example of a data warehouse in making a case for developing a recordkeeping system. As with a data warehouse, one would create a recordkeeping system by capturing data from an operational system and moving it to another automated environment. The precedent for extracting and transferring data to another system has been established, and some of the technological issues have been resolved. The key difference is in the type of digital object one captures and moves.

8 For discussions of these concepts, see David Bearman, “The Electronic Office,” in Electronic Evidence: Strategies for Managing Records in Contemporary Organizations (Pittsburgh: Archives and Museum Informatics, 1994), pp. 157-168; and Clifford Lynch, “The Integrity of Digital Information: Mechanics and Definitional Issue,” Journal for the Society of Information Science 45 (December 1994): 337-344.

For archivists and records managers this new architecture presented many new and difficult challenges for capturing, accessing and describing records. With the emergence of database views and dynamic and virtual documents, the differences in the way paper and electronic records were created and managed were accentuated and could no longer be ignored.

The widespread use of personal computers had an equally destabilizing effect on the management of records. By creating a less structured, less centralized environment for record creation and use, in which records were frequently not integrated into the normal business processes, PCs made the capture and management of the work products much more difficult.

Eventually, archivists came to recognize that they were dealing with systems that would support the transactions of a functional area, but would not routinely and systematically capture and maintain the records or evidence of those business transactions. With this recognition came the realization that archival and records management principles and practices needed to be reviewed and perhaps revised. 9

9 For analyses of the impact of automation on archival concepts and theories, see Richard Barry, “The Changing Workplace and the Nature of the Record” at http://www.rbarry.com/aca-pv16/ACA-PV16.html; David Bearman, “Diplomatics, Weberian Bureaucracy, and the Management of Electronic Records in Europe and America,” in Electronic Evidence: Strategies for Managing Records in Contemporary Organizations (Pittsburgh: Archives and Museum Informatics, 1994), pp. 254-277; David Bearman and Margaret Hedstrom, “Reinventing Archives for Electronic Records: Alternative Service Delivery Options,” Electronic Records Management Program Strategies, ed. Margaret Hedstrom (Pittsburgh, PA: Archives and Museum Informatics, 1993), pp. 82-98; Terry Cook, “Electronic Records, Paper Minds: The Revolution in Information Management and Archives in the Post-Custodial and Post-Modernist Era,” Archives and Manuscripts 22 (November 1994): 300-328; Elizabeth Yakel, “The Way Things Work: Procedures, Processes, and Institutional Records,” American Archivist 59, No. 4 (Fall 1996): 454-464.

The emergence of this new generation of technology prompted the archival profession to reexamine some of its most basic archival theories and concepts, such as provenance, original order, the nature of a record and the life cycle concept. It also resulted in a spirited debate about whether traditional methodologies and procedures developed for paper records would be effective in the world of electronic records, and about what changes in traditional concepts and practices might need to be made. In short, throughout the 1990s, archivists have been asking themselves the question, what are the principles and criteria that will guide the development of international, national, and organizational strategies, policies, and standards for the long-term preservation of authentic and reliable electronic records?

As might be expected, responses to this question have differed widely. Some archivists have argued that traditional archival concepts and methods do not easily lend themselves to the world of electronic records, and that archival theories and concepts require a new theoretical basis and justification if they are to remain valid. These archivists suggest that a “new archival paradigm” is required. 10 Other archivists have argued that traditional concepts and methods still have great value in managing electronic records, and that traditional archival concepts "continue to have resonance and, in fact, provide a powerful and internally consistent methodology for preserving the integrity of electronic records." 11
10 For extended discussions of this concept of a new archival paradigm, see Philip Bantin, “Strategies for Managing Electronic Records: A New Archival Paradigm? An Affirmation of Our Archival Traditions?” Archival Issues. Journal of the Midwest Archives Conference, Vol. 23, No. 1 (1998): 17-34; Terry Cook, “Electronic Records, Paper Minds: The Revolution in Information Management and Archives in the Post-Custodial and Post-Modernist Era,” Archives and Manuscripts, pp. 300-328; Terry Cook, “What is Past is Prologue: A History of Archival Ideas Since 1898, and the Future Paradigm Shift,” Archivaria 43 (Spring 1997): 17-63; Greg O’Shea and David Roberts, “Living in a Digital World: Reorganizing the Electronic and Post-Custodial Realities,” Archives and Manuscripts, Vol. 24, No. 2 (November 1996): 286-311; and Frank Upward and Sue McKemmish, “Somewhere Beyond Custody,” Archives and Manuscripts, Vol. 22, No. 1 (May 1994): 136-149.

11 Luciana Duranti and Heather MacNeil, “The Protection of the Integrity of Electronic Records: An Overview of the UBC-MAS Research Project,” Archivaria, Vol. 42 (Fall 1996): 64.

Objectives of Article

As yet no one overall strategy or methodology for electronic records management has emerged, largely because few of these concepts or ideas have been properly implemented and tested. At this point, however, one can safely assert that there is overall agreement among archivists on the major issues or problems. The issues or questions most frequently articulated in the archival literature include:

1) What is a record in an automated environment?
2) How will archivists identify and appraise records?
3) What documentation must be present to create a reliable and authentic record?
4) What is a recordkeeping system in an automated environment? How will the system manage these records?
5) How will archivists and records managers preserve inviolate electronic records for as long as necessary? How do we keep records alive in an automated environment?
6) How will access and physical custody of electronic records be managed?
7) What is the overall role of the archivist/records manager in the information system development process and in the overall information technology environment?

For the remainder of this article, these issues will be reviewed with the goals being to: 1) define and describe the nature of the problem or challenge; 2) identify how various archivists have sought to address the issue; and 3) identify commonalities among the theories or strategies, and articulate where a consensus on how to solve the problem may be emerging.

Please be advised that the goals of this paper are to examine the broad issues and to provide a type of roadmap to prominent management strategies for electronic records, particularly for archivists who are just beginning this journey. In the process, however, readers should recognize that often complex arguments are somewhat simplified and reduced, but hopefully not distorted or taken out of context. For those readers who seek to construct a fuller, more textured picture of the issues or strategies under review, numerous footnotes containing notes and citations are provided.

Finally, it must be acknowledged at the start that the definitions of problems and issues and the articulation of potential solutions expressed in this article reflect the debate and discussion emanating from the archival community and literature. This essay does not necessarily reflect the content of the debate presently occurring in the literature and at the conferences of records managers or technologists. The author does not pretend to speak for all professionals who manage digital objects. This article focuses on definitions of the problems and descriptions of solutions as articulated primarily by the archival communities in North America, Europe and Australia. 12

WHAT IS A RECORD?

Why do we need to ask this question? After all, the profession has been comfortable with the definition of the record for many decades.
As some archivists might say, I know a record when I see it. Well, in fact, seeing or viewing the record is part of the problem, and it is why some archivists are suggesting the profession needs a more precise definition of a record. As discussed earlier, the creation and retention of complete and inviolate records documenting business events are not the primary objectives of most transaction processing systems. In an environment where records often exist as logical and not physical entities, and where data documenting a business event is incomplete, volatile, and reflects primarily current or near-current data values, archivists are attempting to construct a conceptual model of a record that includes enough detail to permit one to describe and identify a record even though it cannot be viewed or accurately and completely represented as a physical object. The ultimate objective is to define a record with enough precision to inform systems designers when records are created and what kind of data needs to be captured.

In addition, archivists recognize that they need to differentiate the concept of a record from the numerous other forms of documentation, and to distinguish the mission of the archivist/records manager from that of other information and data professionals. Archivists increasingly are aware that they must be able to articulate to administrators, information technologists and other potential partners how records differ from other digital objects, and why it is important to capture and manage records.

12 Overall, the roles of archivists and records managers within an electronic environment and the interaction between the two professions are still not clearly defined or understood. Some recent records management theories, such as the Records Continuum concept, envision a blending of responsibilities of archivists and records managers along the records continuum. However, it is still not at all clear how this interaction will occur and at what point in the continuum. Nonetheless, no matter how one defines or redefines roles and responsibilities, the common bond between the two professions has been the management of the record, whether that record is the evidence of present and ongoing business processes, or the archival record providing evidence of longer-term administrative, legal or historical requirements. In essence, it is the record and its management over time that has defined the primary missions of both professions. So, it is particularly distressing for this author to witness a trend within the records management community to redefine their primary objectives in the digital world. For archivists, the ultimate goal continues to be the management of digital or electronic records, which represent a particular type of digital object. For some records managers, however, managing records, as defined by archivists, does not appear to be the primary and certainly not the sole objective. They propose that the primary goal must be to manage all types of digital documents and systems, from document management to information to knowledge systems. This basic and fundamental difference in what the two professions hope to capture and manage has caused archivists and records managers to begin taking very different paths in their search for answers and viable solutions. For a good discussion of the role of archivists and records managers along the records continuum, see Dan Zelenyj, "Archivy Ad Portas: The Archives-Records Management Paradigm Revisited in the Electronic Information Age," Archivaria, Vol. 47 (Spring 1999): 66-84; and Charles Dollar, “Archivists and Records Managers in the Information Age,” Archivaria 36 (Autumn 1993): 37-52.

Definition of a Record

So, what are records?
How are records different from other types of recorded documentation, such as data, information, documents and knowledge? Organizations collect, create, and use a wide variety of recorded documentation. There is data or the “raw facts about the organization and its business transactions.” 13 There is information, defined as "data that has been refined and organized by processing and purposeful intelligence." 14 There are documents or "a grouping of formatted information objects that can be accessed and used by a person." 15 More recently there is knowledge, which is defined as something more than information because it includes the expertise, logic and reasoning developed by accomplished experts in a specific field to solve problems and make decisions. 16

13 Jeffrey L. Whitten and Lonnie D. Bentley, Systems Analysis and Design Methods, 4th ed. (Boston: McGraw-Hill, 1998), p. 37.

14 Ibid, p. 38. For additional descriptions and definitions of information and information management systems, see Judith Gordon and Steven Gordon, Information Systems. A Management Approach; Kenneth C. Laudon and Jane P. Laudon, Essentials of Management Information Systems; Ralph Stair, Principles of Information Systems. A Managerial Approach.

15 Michael J.D. Sutton, Document Management for the Enterprise. Principles, Techniques, and Applications (New York: John Wiley & Sons, Inc., 1996), p. 343. For additional descriptions and definitions of documents and document management systems, see Larry Bielawski & Mim Boyle, Electronic Document Management Systems (Upper Saddle River, NJ: Prentice Hall PTR, 1997).

16 For definitions and descriptions of knowledge and expert systems, see Efraim Turban, Decision Support and Expert Systems: Management Support Systems (New York, NY: Macmillan Publishing Company, 1993), pp. 465-552; Kenneth C. Laudon and Jane P. Laudon, Essentials of Management Information Systems, pp. 370-399; Ralph Stair, Principles of Information Systems. A Managerial Approach, pp. 356-379.

Archivists argue that a record is a specific and unique type of information quite different in its creation and purpose from any of these other types of recorded documentation. Archivists have identified two distinguishing characteristics of records. First of all, records reflect business processes or individual activities; a record is not just a collection of data, but is the consequence or product of an event. Of course, this is not a new concept; older definitions identify records with a process or an activity. What is new is the emphasis on defining more precisely and conceptually when the record is created by the business event or personal activity. The other part of the definition of a record stresses that records provide evidence of these transactions or activities. In other words, recorded documentation cannot qualify as a record unless certain evidence about the content and structure of the document and the context of its creation is present and available. Now again, this is not exactly a new concept. However, these newer definitions provide much more detail than ever before on the type and exact nature of this evidence. This topic will be explored in more detail in the section on metadata. 17

Within the profession, there is a growing consensus around the definition of a record as:

Recorded information in any form created or received and maintained by an organization, person or system in the transaction of business or the conduct of affairs and kept in a widely accessible form as evidence of such activity. 18

This definition, however, must be recognized as only the starting point for a complete and useful definition.
To be meaningful, it must be accompanied by a detailed set of definitions that identify when a record is created and what type of evidence is required to 17 For discussions of the evolution of the concept of the record and redefinitions of the term, see Richard Cox, “The Record: Is it Evolving?” The Records and Retrieval Report 10, No. 3 (1994): 1-16; Richard Cox, “The Record in the Information Age: A Progress Report on Research,” The Records and Retrieval Report, No. 1 (January 1996): 1-16; David Roberts, “Defining Electronic Records, Documents and Data,” Archives and Manuscripts 22, No. 1 (May 1994): 14-26; Glenda Ackland, “Managing the Record Rather Than the Relic,” Archives and Manuscripts 20, No. 1 (1992): 57-63; Sue McKemmish, “Are Records Ever Actual?” in The Records Continuum, Ian Maclean and Australian Archives First Fifty Years, Sue McKemmish and Michael Piggott, eds. (Clayton, Vic: Ancora Press, 1994), pp. 187-203; David Bearman, “Managing Electronic Mail,” in Electronic Evidence, pp. 188-91; David Bearman, “Item Level Control and Electronic Recordkeeping, Archives and Museum Informatics, Vol. 10, No. 3 (1996): 211-14; Charles Dollar, Archival Theory and Information Technologies. The Impact of Information Technologies on Archival Principles and Methods, pp. 45-48; National Archives of Australia, “Managing Electronic Records: A Shared Responsibility” at http://www.naa.gov.au/recordkeeping/er/manage_er/summary.html 18 This definition is taken almost verbatim from a draft International Standard on Records Management (ISO/DIS 15489), which the author found reproduced in Charles Dollar, Authentic Electronic Records: Strategies for Long-Term Access, (Chicago, IL: Cohasset Associates, Inc., 2000), p. 23. 11 create reliable and authentic records. 19 In addition, archivists are recognizing that this definition needs to articulate the cultural, historical and heritage dimensions of archives. 
The dialogue on this issue is often presently framed in terms of describing “archives as evidence” and “archives as memory.” 20

19 Most archivists agree that the nature and amount of metadata required will be dependent on several factors, including the business context, accountability required, and the risk of not having a complete and authentic record available. David Bearman describes risks as including “failure to locate evidence that an organization did something it was supposed to have done under contract or according to regulation; inability to find information that is critical for current decision making; loss of proof of ownership, obligations owed and due, or liabilities; failure to document whether it behaved according to its own policies or in adherence to law; inability to locate in the proper context information which would be incriminating in one context but innocent in another.” David Bearman, “Archival Data Management to Achieve Organizational Accountability for Electronic Records” in Electronic Evidence. Strategies for Managing Records in Contemporary Organizations, p. 24. Helen Samuels and Tim McGovern at MIT have also developed an electronic records management strategy based on risk assessment. In a paper on the topic, they wrote that “Risks are particularly great when employees in the organization do not recognize that records are, or should be created, as a consequence of transactions.” Helen Samuels and Tim McGovern, “Managing Electronic Evidence: A Risk Management Perspective,” 1996, an unpublished paper.

20 The archival discourse on the possible shortcomings of the “records as evidence” concept has only recently surfaced. To date the most public airings of the topic have occurred at an Australian Society of Archivists Conference in Melbourne in August, 2000, and at the International Council on Archives Congress in Seville in September, 2000. At the Australian meeting, Terry Cook, Canadian archivist and educator, asserted that “the archival profession is threatened, at least in the English-speaking world, with serious schism,” between those archivists who champion the importance of records as evidence for organizational accountability and those who emphasize the importance of records as sources of cultural memory. Cook claimed that what is needed is a “renewed balancing of the two concepts” of evidence and memory. At the ICA meeting, Verne Harris of the National Archives of South Africa stated that the “records as evidence” definition “excludes the possibility that people (individuals, organizations, societies) generate and keep records for reasons other than ‘evidence of process.’ It excludes the possibility that qualities, or attributes, or dynamics, other than ‘evidence’ enjoy equally legitimate claims on the concept of ‘record’ – for instance, remembering, forgetting, imaging, falsifying, constructing, translating, fictionalizing, narrating.” (The texts of Cook’s and Harris’ presentations are available at http://www.archivists.org.au/whatsnew.html).

HOW WILL ARCHIVISTS IDENTIFY AND APPRAISE RECORDS?

If physically reviewing records or browsing automated systems is not a realistic strategy, how will archivists identify and appraise a record? Most archivists working with electronic records would respond by asserting that the answer is derived from the definition of the record and involves tracing the record back to the process that created it. This again is not a new revelation. Archivists have been writing about this concept for well over a decade, most notably in the context of an appraisal theory and methodology based on functions and activities.
21 As any archivist knows, traditional appraisal theory in North America focuses on finding value in records, with these values commonly expressed as primary and secondary, and with secondary values being divided into evidential and informational values. This methodology, most closely identified with the writings of Theodore Schellenberg, placed special emphasis on the archivist’s responsibility for appraising records to identify secondary, research values, as his definition of archives makes clear: “Those records of any public or private institution which are adjudged worthy of permanent preservation for reference and secondary purposes.” 22 For many archivists, the search for research value remains at the heart of the appraisal process.

Increasingly, however, critics of this appraisal methodology have argued that by defining appraisal primarily in terms of secondary research value based largely on content analysis, the Schellenberg model does not provide a proper answer for why we appraise records. Critics of Schellenberg have put forward four arguments to support this judgment. In the first place, they argue that predicting or anticipating research needs or trends is not a realistic goal, and at best will mean the archivist will remain “nothing more than a weathervane moving by the changing winds of historiography.” 23 Secondly, critics assert that content-oriented appraisal cannot give a true or, even, representative image of society. 24 Thirdly, archivists who support Hilary Jenkinson’s theory on the nature of archives assert that selection by content to support research is in direct conflict with basic archival theory and the very nature of archives. 25 Finally, critics of traditional appraisal methodology assert that in the modern world of high volume documentation and of electronic records that exist as logical and not physical entities, archivists can no longer hope to focus on the record and appraisal by content. 26

21 For descriptions of the functional appraisal model see Terry Cook, “What is Past is Prologue,” pp. 30-40; Helen Willa Samuels, Varsity Letters. Documenting Modern Colleges and Universities (Metuchen, N.J.: The Society of American Archivists and Scarecrow Press, 1992); Helen Samuels, “Improving Our Disposition: Documentation Strategy,” Archivaria 33 (Winter 1991-92): 125-140; Terry Cook, “Electronic Records, Paper Minds,” pp. 300-328; Margaret Hedstrom, “Electronic Archives: Integrity and Access in the Network Environment,” American Archivist 58, No. 3 (Summer 1995): 312-324; Hans Booms, “Uberlieferungsbildung: Keeping Archives as a Social and Political Activity,” Archivaria 33 (Winter 1991-92): 25-33; Greg O’Shea, “The Medium is NOT the Message: Appraisal of Electronic Records by the Australian Archives,” Archives and Manuscripts, Vol. 22, No. 1 (May 1994): 68-93; David Bearman, “Diplomatics, Weberian Bureaucracy, and the Management of Electronic Records in Europe and America,” in David Bearman, Electronic Evidence. Strategies for Managing Records in Contemporary Organizations, pp. 261-266; Charles Dollar, Archival Theory and Information Technologies, pp. 55-60, 76-77. For applications of the functional model, see Catherine Bailey, “From the Top Down: The Practice of Macro-Appraisal,” Archivaria 43 (Spring 1997): 89-128; Brian P.N. Beaven, "Macro-Appraisal: From Theory to Practice," Archivaria, No. 48 (Fall, 1999): 154-198; and Jim Suderman, “Appraising Records of the Expenditure Management Function: An Exercise in Functional Analysis,” Archivaria 43 (Spring 1997): 129-142.

22 T.R. Schellenberg, Modern Archives. Principles and Techniques (Chicago: University of Chicago Press, Midway reprint, 1975), p. 16. For a detailed discussion of Schellenberg’s appraisal methodology, see Modern Archives. Principles and Techniques, pp. 133-160.

23 F. Gerald Ham, “The Archival Edge,” The American Archivist, Vol. 38, No. 1 (January 1975): 8.

So, what have archivists offered in its place?
Although specific appraisal theories and methodologies abound, almost all major commentators agree that a principal objective or aim of archival appraisal must be the preservation of evidence 27 documenting the functions, processes, activities, and transactions 28 undertaken and completed by the institution or individual. In the words of two prominent commentators on appraisal: “Archivists are Servants of Evidence,” 29 and “Evidence is an aim…of archival appraisal.” 30 And where is this evidence to be found? Most archivists writing on this topic, particularly as it relates to the appraisal of electronic records, have advocated a functional appraisal model. Proponents of functional appraisal assert that in the search for evidence and value, the most accurate and complete documentation will be provided by examining the function, activity, and transaction that generated the record, rather than the record itself. In short, supporters of functional appraisal argue that the context and not the content of the record must be the starting point in the search for evidence and hence value. 31

24 Angelika Menne-Haritz, “Appraisal or Documentation: Can we Appraise Archives by Selecting Content?” American Archivist 57, No. 3 (Summer 1994): 528-542.

25 Luciana Duranti, “The Concept of Appraisal and Archival Theory,” American Archivist 57, No. 2 (Spring 1994): 328-344. Also see Hilary Jenkinson, “Reflections of an Archivist,” in A Modern Archives Reader (Washington, D.C.: National Archives and Records Service, 1984), pp. 15-21.

26 Terry Cook, “Mind over Matter,” pp. 38-52; Terry Cook, “What is Past is Prologue,” pp. 40-49; Helen Samuels, “Improving Our Disposition: Documentation Strategy,” Archivaria, pp. 125-139.

27 Perhaps the most widely quoted definition of evidence is provided by David Bearman. For a good discussion of evidence, see David Bearman, “Archival Principles and the Electronic Office,” in Electronic Evidence. Strategies for Managing Records in Contemporary Organizations, pp. 147-149. It is important to recognize that evidence in this context refers, in the terms of Hilary Jenkinson, to those impartial, authentic, and interrelated records that are created “naturally” in the process of conducting business or undertaking activities. It does not refer to Schellenberg’s concept of evidential value or information that is gathered, largely by examining the content of records, for the purpose of answering questions about the history, mission, and activities of the subject under review. In short, evidence is the actual record made or received in the course of undertaking and completing the activity; it is not the pieces of information or bits of data selected to document the event. For a discussion of Jenkinson’s concept of evidence, see Hilary Jenkinson, A Manual of Archive Administration (London: Percy Lund, Humphries & Co. LTD, 1966).

28 Slowly, the archival profession is working towards providing more precise definitions of these concepts. In particular, see the working definitions of functions and transactions created for the Indiana University Electronic Records Project and described in the article by Philip Bantin, “The Indiana University Electronic Records Project Revisited,” The American Archivist, Vol. 62, No. 1 (Spring 1999): 153-163; see also Chris Hurley, “What, If Anything, Is a Function,” Archives and Manuscripts 21, no. 2 (1993): 208-220.

29 Terry Eastwood, “Toward a Social Theory of Appraisal,” in The Archival Imagination: Essays in Honour of Hugh A. Taylor (Ottawa: Association of Canadian Archivists, 1992), p. 74.

30 Angelika Menne-Haritz, “Appraisal or Documentation,” p. 541.

31 Beyond ensuring the preservation of evidence, do archivists have additional duties as an interpreter and a documenter of society?
It is in response to this question that disagreements about the objectives of archival appraisal have occurred. At one end of the spectrum, that represented originally by Hilary Jenkinson and in the modern era by Luciana Duranti and reflected in the theoretical framework and methodology of the University of British Columbia electronic records project, is the belief that evidence itself is the aim of appraisal. In other words, the archivist’s goal is not to interpret this evidence, attribute external values to the records or to the creators or functions generating the records, or create a representative image of society. Rather, in this view, the goal is to retain intact “the internal functionality of the documents, and the documents aggregations, with respect to one another, so that compact, meaningful, economical and impartial societal experience can be preserved for the next generations.” (Luciana Duranti, “The Concept of Appraisal and Archival Theory,” American Archivist 57, No. 2 (Spring 1994): 34). In other words, the archivist’s primary contributions are to preserve authentic and impartial records and by so doing provide researchers with the evidence that will permit them to interpret events in their own way. Consequently, within this theoretical framework the role of the archivist in the appraisal process is very limited – archivists are not judges or interpreters; they are custodians and preservers. On the other end of the spectrum are those archivists who support an appraisal model that advocates a more active role for the archivist in shaping the documentary record. Two prominent strategies in this category are those that locate value 1) in the provenance of the records and 2) in the assessment of use of the records. 
Supporters of the provenance-based appraisal model argue that the essence of appraisal is the “articulation of the most important societal structures, functions, record creators, and records creating processes, and their interaction, which together form a comprehensive reflection of human experience.” (Terry Cook, “Mind over Matter: Towards a New Theory of Archival Appraisal,” in Barbara L. Craig, ed., The Archival Imagination: Essays in Honour of Hugh A. Taylor, Ottawa: Association of Canadian Archivists, 1992: 41). Terry Cook has labeled this strategy “macro-appraisal,” which he defines as an approach “that focuses research instead on records creators rather than directly on society, on the assumption that those creators, and those citizens and organizations with whom they interact, indirectly represent the collective functioning of society.” It is an appraisal methodology, Cook writes, that is built on “a context-based, provenance-centred framework rather than in a content-based, historical-documentalist one.” (Terry Cook, “What is Past is Prologue,” p. 31). The other appraisal model which advocates a more active role for archivists identifies “the means of documenting the precise form and substance of past interactions between and among people in society” in the “analysis of the use to which they [records] are put by the society that created them, all along the continuum of their existence.” (Terry Eastwood, “Toward a Social Theory of Appraisal,” in The Archival Imagination, pp. 80, 83). In other words, in this model appraisal decisions mirror or reflect the values a wide variety of users assigned to the records, resulting in the selection of archival records that are most cherished or frequently consulted by the society that created and used the records.

Business Process Modeling

Clearly, a theoretical basis for identifying records based on function is in place.
What has been slow to develop is a methodology for actually undertaking and completing a functional analysis of business processes or personal activities. However, there have been some interesting and promising beginnings. Most promising are the arguments suggesting that guidance on how to gain this knowledge of processes and activities can be found in the writings of the discipline known as systems analysis. Systems analysis has been defined as “the study of the problems and needs of a business to determine how the business systems and information technology can best solve the problem and accomplish improvements for the business. The products of this activity may be improved business processes, improved information systems, or new or improved computer applications.” 32 Clearly emphasized in this definition is a focus on understanding and analyzing business processes as a means to improving the system, whether that system is defined as the business system or the information system. Without question, archivists and systems analysts have something in common; both regard an understanding of business requirements as critical to the design of systems.

The recognition that archivists and systems analysts share a common concern in the identification of business requirements has led some archivists to emulate the methodology and techniques analysts employ in modeling system processes. What they have discovered is that these methods provide useful tools in the quest to identify records. One such methodology is a popular and widely practiced technique known as “modern structured analysis.” 33 This form of analysis has been defined as “a process-centered technique that is used to model business requirements for a system. The models are structured pictures that illustrate the processes, inputs, outputs, and files required to respond to business events.” 34

32 Jeffrey L. Whitten and Lonnie D. Bentley, Systems Analysis and Design Methods, 4th ed., p. 8. The other major type of systems-related analysis is known as systems design, which is “the specification or construction of a technical, computer-based solution for the business requirements identified in a system analysis.” Whitten and Bentley, Systems Analysis and Design Methods, 4th ed., p. 7. Clearly, the work of archivists has much more in common with systems analysts than systems designers.

33 For descriptions of the technique known as “Modern Structured Analysis,” consult the works of Tom DeMarco, Structured Analysis and System Specification (Englewood Cliffs, N.J.: Prentice-Hall, 1978); Stephen McMenamin and John Palmer, Essential Systems Analysis (Englewood Cliffs, N.J.: Prentice-Hall, 1984); James and Suzanne Robertson, Complete Systems Analysis (New York: Dorset House Publishing, 1994); Jeffrey Whitten and Lonnie Bentley, Systems Analysis and Design Methods, 4th ed.; Edward Yourdon, Modern Structured Analysis (Englewood Cliffs, N.J.: Yourdon Press, 1989); and Jeffrey Hoffer, Joey George, and Joseph Valacich, Modern Systems Analysis and Design, 2nd ed. (Reading, MA: Addison-Wesley, 1999).

The products of this analysis, business process models, depict the business functions and transactions and the inputs and outputs required to respond to business events. Business process models can further be broken down into business function decomposition diagrams, business events diagrams, and business process data flow models. The value of business models for archivists is that they can depict precisely when, where and how record creation occurs.
They provide the archivist with a conceptual model, based on a depiction of real-life activities, of the context for creation, and consequently provide the information needed to describe and define precisely for system designers what pieces of data need to be captured as evidence of the business transaction.

It is not illogical or too much of a jump to arrive at an overall strategy that views conceptual model building as a methodology that will allow archivists to deal with many or most of the issues the profession faces in attempting to manage records in automated environments. For example, some archivists are suggesting that rather than physically reviewing records and systems to conduct basic activities such as appraisal and description, archivists should be creating and employing conceptual models designed to analyze and document record systems. Thus appraisal of records could still be undertaken by employing traditional appraisal values, but the analysis would be based on conceptual models of the processes and records rather than on a physical review of data content. Evidential values could be derived from business process and metadata models, and informational values from reviewing data and metadata models. In documenting records, some archivists are suggesting that a complete, authentic and reliable record could be captured not by physically reviewing the record but by analyzing metadata and data models and comparing the results to an established set of metadata recordkeeping requirements. 35

34 Whitten and Bentley, Systems Analysis and Design Methods, 4th ed., p. 122.

35 This approach is inherent in the methodology advocated by the University of Pittsburgh Electronic Records project. A conceptual approach is the one presently being adopted and tested by the Indiana University Electronic Records project.

WHAT DOCUMENTATION MUST BE PRESENT TO CREATE A RELIABLE AND AUTHENTIC RECORD?
As indicated earlier, the concept of evidence is a very critical element within the definition of a record. Without sufficient documentation describing the content of the record and the context of its creation, the record loses its value as evidence and in some cases ceases to be a record at all. Now again, the need for supporting documentation is not a new requirement created by electronic records. The emergence of electronic records, however, has created some new problems and challenges for archivists attempting to preserve evidence.

Challenges and Issues

The primary challenge is associated with the basic but extremely important recognition that unlike paper documents, electronic records are logically constructed and often “virtual” entities. Consequently, electronic documents cannot be viewed in the same way as paper records, where so much of the content, context and structural metadata is embedded in or is part of the record. In automated systems, the vital metadata, if it exists at all, may or may not be physically associated with the content data. Vital links between metadata and the record content data may exist only in computer software programs. In some cases, the metadata may not be a part of the automated system at all, but may exist only as a paper document totally dissociated from the records it is describing.

Archivists also discovered that system metadata as typically defined by systems designers and technologists is often not as complete as necessary to describe a record. Transaction logs maintained in typical transaction processing systems (TPS) do contain some critical data on updates and revisions, but on the whole, archivists generally agree that these logs do not provide sufficient evidence. Of particular concern is the relative lack of metadata related to the context of creation and use, that is, metadata that addresses the questions of why the record was created, who were the users of the record, and who had custody of the record.
The availability of this contextual metadata, archivists argue, could make the difference between a useful and a useless record, particularly when viewed over longer periods of time. Another deficiency from a recordkeeping perspective of typical system metadata is the absence of some critical documentation on the structure of the record. Of particular importance is structural metadata describing how to open and read a record as it was originally created and viewed. 36 Taken as a whole, the absence of critical metadata has meant, as one archivist has noted, that "most collections of electronic data, electronic documents, or information are not records because they cannot qualify as evidence." 37

The recognition that critical documentation may never have been created or may not be available with the content of the record has caused archivists to begin rethinking strategies for documenting records. Specifically, three strategies have been prominently featured.

Identification of Recordkeeping Metadata

The challenge receiving the most attention from archivists is the determination of which types of metadata are needed to meet requirements for recordkeeping. Archivists quickly recognized that before they could properly describe and identify records (comprising content data and the evidence or metadata documenting context, content, and structure), they needed first to precisely define what types or categories of metadata must be captured. The first research project designed to identify key recordkeeping metadata was the electronic records project undertaken in the period from 1993-1996 at the University of Pittsburgh with funding by the National Historical Publications and Records Commission. The primary objective of the Pittsburgh project was to develop a statement of requirements for ensuring the preservation of evidence in recordkeeping.
One of the products of this project was a set of metadata specifications "designed to satisfy the functional requirements for evidence," and to "guarantee that the data object will be usable over time, be accessible by its creator, and have properties required to be fully trustworthy as evidence and for purposes of executing business." Pitt project personnel identified sixty-seven metadata items organized into six categories or layers. 38

36 For discussions on the need for metadata documenting content, context and structure, see David Bearman, “Item Level Control and Electronic Recordkeeping,” Archives and Museum Informatics, 211-14; David Bearman, “Documenting Documentation” in Electronic Evidence, pp. 222-252; David Wallace, “Managing the Present: Metadata as Archival Description,” Archivaria 39 (Spring 1995): 11-21; and Margaret Hedstrom, “Descriptive Standards for Electronic Records: Deciding What is Essential and Imagining What is Possible,” Archivaria 36 (Autumn 1993): 53-63.

37 David Bearman, Electronic Evidence, “Introduction. Constructing a Methodology for Evidence,” p. 2.

38 The University of Pittsburgh Electronic Records Project, Metadata Specifications can be found at http://www.lis.pitt.edu/~nhprc/meta96.html

Since the emergence of the Pittsburgh metadata specifications, several other institutions or projects have put forward their own sets of recordkeeping metadata. Among the most prominent are those proposed by the National Archives of Australia and Canada, the State Archives of Victoria (Australia) and New South Wales, the United States Department of Defense, the University of British Columbia School of Library and Information Science, the Indiana University Archives, and most recently by personnel associated with the SPIRT Project and the InterPARES Project.
39 Most of these lists of recordkeeping metadata differ noticeably in the way they are organized, in the amount of description they provide on the specifications, and, most importantly, in the specific items they list as essential or mandatory. At present, there is no real consensus on a core set of metadata specifications or a set of minimum metadata standards; as yet, there is nothing for recordkeeping that has been accepted in the way that, say, the Dublin Core Metadata 40 has been embraced by the library community. One can discern, however, some growing consensus among archivists about certain key issues relating to metadata. For example, there is general agreement among archivists that records require their own unique, particular kind of metadata that goes beyond what is required in the Dublin Core standard. More specifically, archivists stress that records require more metadata documenting the context of creation if they are to be understood and interpreted, particularly over long periods of time.

39 See the following Web sites for details on these projects: National Archives of Australia, Recordkeeping Metadata Standard at http://www.naa.gov.au/recordkeeping/control/rkms/summary.htm; SPIRT, Recordkeeping Metadata Project at http://www.sims.monash.edu.au/rcrg/research/spirt/index.html; see also the article on the SPIRT project, Sue McKemmish and Glenda Acland, “Accessing Essential Evidence on the Web: Towards an Australian Recordkeeping Metadata Standard" (1999) at http://ausweb.scu.edu.au/aw99/papers/mckemmish/paper.html; Proposed New South Wales Recordkeeping Metadata Standard: New South Wales, Australia at http://www.records.nsw.gov.au/publicsector/erk/metadata/NRKMSexplan.htm; United States, Department of Defense, “Records Management Application (RMA) Design Criteria Standard” and “Standard Revision” and “Certification Test and Evaluation Process and Procedures” at http://jitc.fhu.disa.mil/recmgt/; Indiana University Electronic Records Project at http://www.indiana.edu/~libarch/phase2.html; University of British Columbia Project, “The Preservation of the Integrity of Electronic Records” at http://www.slais.ubc.ca/users/duranti/; on the British Columbia Project also review the article by Luciana Duranti and Heather MacNeil, “The Protection of the Integrity of Electronic Records: An Overview of the UBC-MAS Research Project,” Archivaria 42 (Fall 1996): 46-67; International Research on Permanent Authentic Records in Electronic Systems (INTERPARES) Project at http://www.interpares.org/; Victorian Electronic Records Strategy at http://home.vicnet.net.au/~provic/vers/; also review the article on the Victorian Electronic Records Project by Justine Heazlewood, et al., “Electronic Records: Problem Solved?” in Archives and Manuscripts, Vol. 27, No. 1 (May 1999): 96-113; “Record Keeping Metadata Requirements for the Government of Canada” at http://www.imforumgi.gc.ca/new_docs/metadata1_e.html

40 The Dublin Core Metadata specifications can be found at http://purl.oclc.org/dc/

There is also agreement about the basic categories of metadata that systems should capture and retain. For example, most record metadata lists include various pieces of documentation describing the context of creation. This contextual metadata typically includes information on the agents involved in creating, receiving, and transmitting the record; the date of receipt; and the relationship of the record to the specific business processes and to related records. There is also general agreement that the metadata model include some documentation on terms and conditions for access and use, and that the system document use history. Most lists of metadata specifications also include data on the disposition of the record, such as disposal authorization and date, and a disposal action history. Predictably, most lists also include metadata describing the record content, such as information on title of the record, date of creation, and subject.
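To make these categories concrete, one might picture them as a simple data structure that can be checked for gaps. The following is only a minimal sketch in Python and is not drawn from any of the projects cited above; the category names, the element names, and the sample record are hypothetical illustrations of the context, access and use, disposition, and content categories just described.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical category and element names, loosely following the categories
# named in the text; no published metadata standard is reproduced here.
REQUIRED_ELEMENTS: Dict[str, List[str]] = {
    "context": ["agents", "date_received", "business_process", "related_records"],
    "terms_of_use": ["access_conditions", "use_history"],
    "disposition": ["disposal_authorization", "disposal_date", "disposal_history"],
    "content": ["title", "date_created", "subject"],
}


@dataclass
class RecordMetadata:
    """Metadata captured for one record, grouped by category."""
    elements: Dict[str, Dict[str, object]] = field(default_factory=dict)

    def missing(self) -> List[str]:
        """Return 'category.element' names that are absent or empty."""
        gaps = []
        for category, names in REQUIRED_ELEMENTS.items():
            captured = self.elements.get(category, {})
            for name in names:
                if captured.get(name) in (None, "", [], {}):
                    gaps.append(f"{category}.{name}")
        return gaps


# Invented sample record: content is fully described, context only partly.
memo = RecordMetadata(elements={
    "content": {"title": "Budget memo", "date_created": "1999-03-01", "subject": "Budget"},
    "context": {"agents": ["Office of the Registrar"], "business_process": "Budget approval"},
})
print(memo.missing())
```

A check of this kind only reports which elements were never captured; deciding which elements are truly mandatory remains the unresolved question the metadata lists above answer differently.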
Finally, the majority of record metadata lists include information on the structure of the record, most notably documentation on how the record is encoded, how the record can be rendered, and how the content of the record is structured. In short, most metadata specifications include documentation in varying degrees of detail on the content and structure of the record and the context of its creation.

Timing of Archival Input

A second issue relating to the documentation of electronic records involves the determination of when archivists should become actively involved in the process. Many archivists have come to the conclusion that the profession must be more proactive and be involved at the systems design stage. Proponents of this position argue that documentation of business processes cannot be postponed until the point when records become inactive; to be effective, description must take place over the life of the record. Only in this way, it is argued, can archivists hope to document business transactions throughout their life cycle. Advocates of this position warn that if procedures for early identification and maintenance are not established, records, and particularly electronic records, may never survive or even be created. 41

41 For discussions of this issue see David A. Wallace, “Managing the Present: Metadata as Archival Description,” Archivaria, pp. 11-31; and Charles Dollar, Archival Theory and Information Technologies, pp. 60-62, 77-78.

Other archivists, however, have warned that by introducing metadata requirements designed to satisfy the needs of future users, archivists compromise the impartiality of the records.
And if “the impartiality of the metadata is compromised, their value as evidence will be compromised, which means, ultimately, that the underlying objective of metadata strategies – the preservation of evidence – will be defeated.” 42 In short, advocates of this position argue that, “archival participation in the design and maintenance of metadata systems must be driven by the need to preserve them as archival documents, that is, as evidence of actions and transactions, not as descriptive tools.” 43 Still, this must be regarded as the minority position at this point. Most archivists involved in studying this problem have agreed that early intervention, preferably at the systems design stage, is the only viable documentation strategy.

Value of Traditional Finding Aids

Finally, archivists are debating whether traditional methods for describing archival records (descriptive inventories, guides, and other finding aids created after the records are transferred to the archives) are adequate and useful tools for documenting electronic records. Critics of traditional strategies for describing electronic records identify three major reasons for adopting other methods. In the first place, critics claim that traditional descriptive methodologies that depend upon physically reviewing records, files and series to identify content and context are not viable in the world of electronic records. In addition, they argue that traditional prose narratives and descriptions of data structures cannot possibly describe the multitude of record linkages or reflect the relationships between and among transactions in automated systems. To properly describe these complex record systems, they recommend that much more dynamic and interactive documentation strategies be employed. Finally, proponents of this position of change argue that a viable system of documenting business processes already exists in the form of record system metadata.
Systems designers and programmers routinely generate documentation on the content and structure of the systems and programs they create. Why not, it is suggested, make this metadata/metatag system the basis for describing electronic records? Why not consider a shift from creating descriptive information to capturing, managing, and adding value to system metadata? 44

42 Heather MacNeil, “Metadata Strategies and Archival Description: Comparing Apples to Oranges,” Archivaria, No. 36 (Spring 1995): p. 28. See also Luciana Duranti and Heather MacNeil, “The Protection of the Integrity of Electronic Records: An Overview of the UBC-MAS Research Project,” Archivaria, Vol. 42 (Fall 1996): 57.

43 MacNeil, “Metadata Strategies and Archival Description,” p. 30.

Naturally, not all archivists agree with the strategy described above. Their arguments focus on the themes of the authenticating role and the unique and vital contributions of traditional archival description. For example, Luciana Duranti argues that the “verification of the authenticity of electronic records over the long term will have to rely on one thing and one thing only: their archival description.” 45 Traditional arrangement and description verify authenticity, according to Duranti, by preserving the network of administrative and documentary relationships. “Administrative relationships are revealed and preserved through the writing of the administrative history of the archival fonds and its parts, including the preservation and custodial history. Documentary relationships are revealed and preserved through the identification of the levels of arrangement of the fonds and their representation in structured descriptions.” 46

Another argument put forward in defense of traditional archival description is that it performs a vital function that system metadata cannot.
Advocates of this position argue that because the scope and context of system metadata are “comparatively narrow, metadata circumscribe and atomize these various contexts. Archival description, on the other hand, enlarges and integrates them. In so doing it reveals continuities and discontinuities in the matrix of function, structure, and record-keeping over time.” 47

WHAT IS A RECORDKEEPING SYSTEM?

Traditional records management methodology focuses on managing and controlling records, usually as part of a record series. Newer, revised definitions of the objectives of records management, however, focus on evaluating the processes creating records and the systems for managing them. For example, one prominent definition identifies the goals of records management as the identification and capture of records generated in the context of business processes, and the creation of systems that manage and preserve these records. 48 In essence, the new definition is concerned less with managing records and more with defining and assisting in the management of recordkeeping systems.

44 For excellent discussions of this position, see Wallace, “Managing the Present: Metadata as Archival Description,” pp. 11-31; and David Bearman, “Archival Strategies,” pp. 384-85.

45 Duranti, “The Protection of the Integrity of Electronic Records,” p. 57.

46 Ibid., p. 57.

47 Heather MacNeil, “Metadata Strategies and Archival Description: Comparing Apples to Oranges,” p. 25.

Identification of Recordkeeping Requirements

What is a recordkeeping system, and how is it different from other types of systems, such as transaction processing, information management, and document management systems? In this context, the term “system” is used in its broadest sense to depict the organizational mission, business processes, policies, procedures, practices, and human and automated mechanisms to bring about desired ends, which in this case is trustworthy recordkeeping.
49 To address these questions, archivists have designated the identification of a set of requirements for a recordkeeping system as one of the profession's critical, initial tasks. The University of Pittsburgh School of Information Science conducted the first systematic research on this topic. The Pitt project established a set of functional requirements for recordkeeping that addressed three levels of requirements: the organizational level, the recordkeeping system level, and the record level. Within these levels, they established five categories – Conscientious Organization, Accountable Recordkeeping System, Captured Records, Maintained Records and Useable Records – and within these categories twenty requirements, which they claimed “are identified in law, regulation, and best practices throughout society as the fundamental properties" of evidential records. 50 Since the creation of the Pittsburgh document, numerous other projects have produced lists of requirements for recordkeeping systems. Among the most prominent requirements are those created by the United States Department of Defense; the National Archives of Australia and Canada; the State Archives of Victoria (Australia), New York, Delaware, and Kansas; and the University of British Columbia and Indiana University. 51

48 For detailed descriptions of this electronic records management strategy, see David Bearman, Electronic Evidence, “Recordkeeping Systems,” pp. 34-70, and “Electronic Records Guidelines,” pp. 72-116; and David Bearman and Margaret Hedstrom “Reinventing Archives for Electronic Records: Alternate Service Delivery Options,” Electronic Records Management Strategies, ed. Margaret Hedstrom (Pittsburgh, PA: Archives and Museum Informatics, 1993): pp. 82-98.

49 Rick Barry provided this definition in a memo to the author dated September 11, 2000.

50 The Pitt Project Functional Requirements can be viewed at http://www.lis.pitt.edu/~nhprc/prog1.html

51 These lists of functional requirements are available at the following Web sites: Department of Defense standard can be found at http://jitc.fhu.disa.mil/recmgt/#standard; The National Archives of Australia, “Designing and Implementing Recordkeeping Systems,” at http://www.naa.gov.au/recordkeeping/dirks/summary.html; State Archives of Victoria (Australia), “System Requirements for Archiving Electronic Records” at http://www.prov.vic.gov.au/vers/standard/997-1toc.htm; Canadian State Archives, “Recordkeeping in the Electronic Work Environment” at http://www.archives.ca/06/0603_e.html; Delaware State Archives, “Model Guidelines for Electronic Records” at http://www.archives.lib.de.us/recman/g-lines.htm; New York State Archives, “Functional Requirements to Ensure the Creation, Maintenance, and Preservation of Electronic Records” at http://www.ctg.albany.edu/resources/abstract/mfa-4.html; Kansas State Historical Society, “Kansas Electronic Records Management Guidelines” at http://www.kshs.org/archives/recmgt.htm; University of British Columbia, “The Preservation of the Integrity of Electronic Records” at http://www.slais.ubc.ca/users/duranti/; Indiana University Electronic Records Project, “Functional Requirements for Recordkeeping Systems” at http://www.indiana.edu/~libarch/funcreqs.html

As with the creation of metadata specifications, the various lists of recordkeeping requirements differ, in some cases significantly. There is general agreement and a growing consensus, however, on several critical points. For example, the majority of archivists agree that "not all information systems are recordkeeping systems," and that "recordkeeping systems are a special kind of information system" (in this instance, “system” is used at the software application level). 52

52 David Bearman, “Recordkeeping Systems,” pp. 34-35.

Most of the lists of recordkeeping requirements also agree on the basic types or categories of functionality a recordkeeping system must possess. These typically include requirements that the system be compliant by meeting legal and administrative requirements, national and international standards, and best practices for recordkeeping. Many lists of recordkeeping requirements also specify that the system be accountable and reliable. Specific requirements included in this category are that system policies and procedures be well documented, that system hardware and software be regularly tested to ensure that consistent and accurate business records are created, and that system audit trails be maintained for all business processes. All lists of requirements specify that the system capture all business records and all essential metadata related to that business process. Similarly, all lists of recordkeeping system requirements mandate that the system maintain and manage the business record. Typical requirements in this category include the specification that the system maintain inviolate records protected from accidental or intentional deletion or alteration; that the system ensure that all components of a record, including relevant metadata, notes, attachments, etc., can be accessed, displayed and managed as a unit or complete record of a business process; and that the system include an authorized disposition plan that is implemented as needed. Finally, all sets of requirements specify that the recordkeeping system ensure the future usability of the business records. As part of this requirement, systems must be capable of recreating the content of records and any relevant metadata within a new system without loss of any vital information.
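Several of the requirements just described (capturing a record together with its metadata, protecting it from alteration, and recreating it in a new system without loss) can be illustrated with a minimal sketch. It is not drawn from any of the requirement lists cited here: the `BusinessRecord` class, its field names, the JSON transfer package, and the use of a SHA-256 digest as the integrity check are all assumptions made purely for illustration.

```python
import hashlib
import json
from dataclasses import dataclass, field, asdict


@dataclass(frozen=True)
class BusinessRecord:
    """Hypothetical record: content plus the metadata captured with it."""
    record_id: str
    content: str
    metadata: dict = field(default_factory=dict)

    def digest(self) -> str:
        # Hash content and metadata together so the record is checked as one unit.
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


def export_record(record: BusinessRecord) -> str:
    """Serialize the record and its digest for transfer to another system."""
    package = {"record": asdict(record), "sha256": record.digest()}
    return json.dumps(package, sort_keys=True)


def import_record(serialized: str) -> BusinessRecord:
    """Rebuild the record in a 'new system'; reject it if anything changed."""
    package = json.loads(serialized)
    record = BusinessRecord(**package["record"])
    if record.digest() != package["sha256"]:
        raise ValueError("record or metadata altered since capture")
    return record
```

In this sketch, `frozen=True` only blocks reassignment of the record's fields; it is a stand-in for the far stronger controls (audit trails, access restrictions, disposition plans) that the requirement lists call for. The digest comparison shows one way a receiving system might confirm that content and metadata arrived together and unchanged.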
Relationship of Recordkeeping to Other Types of Systems How will these recordkeeping systems function in relation to other data and information systems, like TPS, DBMS and Management Information Systems (MIS)? In other words, will recordkeeping functionality be built into the active transactions processing system, or will records be managed in a completely separate system or environment, or might there be a combination of these two approaches? At present there is no consensus on this issue, largely because there have been no significant tests of the costs and effectiveness of building recordkeeping systems in a variety of automated environments. Conceptually, some archivists argue that it may likely be easier to manage records in their own separate environment, much in the same way that Management Information Systems manage information and decision support data. To populate decision support systems and data warehouses, data is extracted from the TPS and moved to a separate automated system, which is typically managed by a separate staff operating with its own set of policies and procedures. Some archivists argue that this same strategy could be applied to create recordkeeping systems. As records are created in the TPS, they would be captured and moved to a separate but linked environment managed according to its own set of requirements by a staff of records managers. The proponents of the view consider it extremely important that records are maintained by an independent organization with no special interests in the records and by a staff trained in archives and records management. 53 In the final estimation, however, the strategy employed for building recordkeeping functionality may well be determined largely on the basis of the nature and requirements of the specific system environment under review. 
As one colleague stated to this author: “in less structured environments, such as those where e-mail and electronic documents are exchanged without the benefit of defined work flow or structured work processes, the need for a separate, well defined recordkeeping environment may be essential to the capture and preservation of records. In other systems defined by structured business processes, however, the design parameters might be such that recordkeeping could be incorporated inside the overall design of the existing system.” 54 In other words, every environment is different and will demand different approaches. Consequently a “one size fits all” strategy for designing recordkeeping systems will likely not be effective.

53 Overall this is the strategy that is most favored by the Indiana University Electronic Records Project staff.

PRESERVATION OF RECORDS

What is the best strategy for preserving digital objects over time? This has proven to be a very difficult question to answer. To date, archivists, librarians and technologists have identified the challenges, but have been far less successful in creating viable strategies for addressing them.

Definitions and Issues

Professionals working on long-term preservation of digital objects have generally agreed on a definition of the overall goal of digital preservation as the ability to ensure readability and intelligibility in order to facilitate data exchange over time. In this context, readability means that digital objects or composite objects can be processed on a computer system or device other than the one that initially created them or on which they are currently stored. Intelligibility can be defined as the requirement that the digital information be comprehensible to a human being.
55 Archivists intent on preserving records have stressed that any strategy must also preserve the authenticity and integrity of records, which translates into requirements for preserving formal document structure (structural characteristics) and descriptive metadata. Consequently, as Charles Dollar has written, archival preservation demands that records be more than readable and intelligible; records preserved for future use must also be identifiable, encapsulated, retrievable, reconstructable, understandable and authentic. 56

54 A colleague, John McDonald, wrote this in an electronic message to this author dated August 1, 2000.

55 Charles Dollar, Authentic Electronic Records: Strategies for Long-Term Access, (Chicago, IL: Cohasset Associates, Inc., 2000), pp. 47-50.

56 Charles Dollar, Authentic Electronic Records, pp. 50-57.

Professionals working on preservation generally also agree on the problems or challenges. They are typically described under three categories: hardware obsolescence, software dependence, and storage medium deterioration. While all three are eventually lethal to the long-term survival of digital objects, most experts agree that it is software dependence or "the fact that digital documents are in general dependent on application software to make them accessible and meaningful" that presents the greatest challenge. 57

Moving from the identification of goals and issues to the formulation of specific strategies to address these challenges, there is far less uniformity of opinion. Over the last two decades, quite a number of credible and not so credible strategies for the long-term preservation of digital documentation have been proposed. The digital preservation strategies most prominently discussed in the literature include creating computer museums, copying to paper or microfilm, converting to standard formats or into software independent modes, the "emulation" strategy, and the conversion or migration of records.
Let us now look at these strategies in more detail.

Computer Museums

One preservation strategy recommends that society create museums of obsolescent hardware and software, as a means of maintaining continuing access to digital materials. On the whole, most experts have dismissed this strategy as unrealistic and too expensive. One critic of this strategy, Jeff Rothenberg, observes that this strategy ignores the fact that data will have to be transferred to new media that did not exist when the document's original computer was current. "The museum approach would therefore require building unique new device interfaces between every new medium and obsolete computer." 58 Another critic observes, “the likelihood of keeping any piece of machinery running for many decades is simply not very high, since replacement parts, chips, and software could not be easily reproduced. A computer system is far more complex than a steam locomotive or shuttle loom.” 59 Finally, critics have observed that this strategy is not compatible with the widely held notion that data and information in legacy systems should be easily accessible over time and integrated with current information and technology architectures. 60

57 Jeff Rothenberg, “Avoiding Technological Quicksand: Finding a Viable Technological Foundation for Digital Preservation” Section 5. “Technical Dimensions of the Problem” (Washington, D.C.: Council on Library and Information Resources, 1999) at http://www.clir.org/cpa/reports/rothenberg/contents.html; See also John Garrett and Donald Waters, Preserving Digital Information: Report of the Task Force on Archiving Digital Information, Section on “The Challenge of Archiving Digital Information” (Washington, D.C.: Commission on Preservation and Access and Research Libraries Group, 1996) at http://lyra.rlg.org/ArchTF/tfadi.index.htm ; and Gregory S. Hunter, Preserving Digital Information (New York: Neal-Schuman Publishers, Inc., 2000), pp. 5-10.

58 Jeff Rothenberg, “Avoiding Technological Quicksand: Finding a Viable Technological Foundation for Digital Preservation” at http://www.clir.org/cpa/reports/rothenberg/contents.html; David Bearman, "Collecting Software: A New Challenge for Archives & Museums", Archival Informatics Technical Reports vol. 1, #2, Summer 1987; and David Bearman, “Reality and Chimeras in the Preservation of Electronic Records” D-Lib Magazine, Vol. 5, No. 4 (April 1999) at http://www.dlib.org/dlib/april99/bearman/04bearman.html

59 Terry Cook, “It’s 10 O’Clock: Do You Know Where Your Data Are?” Technology Review (January 1995) on the Web at http://www.techreview.com/articles/dec94/cook.html

60 A colleague, Richard Barry, made this point to the author in a memo dated September 11, 2000.

COPYING TO PAPER OR MICROFILM

Another preservation strategy endorsed and practiced by some is to create a paper or microfilm copy of the digital object. Paper and microfilm are more chemically stable than digital media, and no special hardware or software is required to retrieve information from them. Most archivists, however, view this strategy as a short-term fix with only limited applications. Archivists and technologists argue that copying to paper might be a solution when the information exists in a "software independent" format such as ASCII or as flat files with simple, uniform structures. It is not a viable strategy, they argue, for preserving complex data objects in complex systems. Jeff Rothenberg expresses the opinion of many when he writes: "Printing any but the simplest, traditional documents results in the loss of their unique functionality (such as dynamic interaction, nonlinearity, and integration), and printing any document makes it no longer truly machine readable, which in turn destroys its core digital attributes (perfect copying, access, distribution, and so forth). Beyond this loss of functionality, printing digital documents sacrifices their original form, which may be of unique historical, contextual or evidential interest." 61

61 Jeff Rothenberg, “Avoiding Technological Quicksand: Finding a Viable Technological Foundation for Digital Preservation” – “3. Preservation in the Digital Age” Section at http://www.clir.org/cpa/reports/rothenberg/contents.html

CONVERTING TO STANDARD FORMATS OR INTO SOFTWARE INDEPENDENT MODES

Proponents of employing current technical standards to preserve digital objects argue that converting these objects to current standard forms, and migrating to new standards if necessary, is the surest way to ensure that the document survives. The strategy is based on the assumption that "standards initiatives that address business needs for the secure and reliable exchange of digital information among the current generation of systems will impose standardization and normalization of data that ultimately will facilitate migrations to new generations of technology." 62 At present the preferred formats for textual records are Standard Generalized Markup Language (SGML), Extensible Markup Language (XML) or Rich Text Format.

Perhaps the most promising digital preservation project employing a standards approach is the San Diego Supercomputer Center (SDSC) Project.
SDSC personnel define the challenge of preserving digital objects as "the ability to discover, access, and display digital objects that are stored within an archives, while the technology used to manage the archives evolves.” The goal “is to store the digital objects comprising the collection and the collection context in an archive a single time.” 63 To achieve this, the SDSC solution or strategy creates infrastructure independent representations of digital objects in XML and develops migration strategies for upgrading any infrastructure component of the system. 64

Like all preservation strategies, converting to standard formats has its detractors. The most common criticism of standard formats as a long-term solution is their relatively short life span. David Bearman expresses this sentiment when he writes, “no computer technical standards have yet shown any likelihood of lasting forever -- indeed most have become completely obsolete within a couple of software generations.” 65 As for migrating to new standards, Jeff Rothenberg claims this is “analogous to translating Homer into modern English by way of every intervening language that has existed during the past 2,500 years. The fact that scholars do not do this (but instead find the earliest original they can, which they then translate directly into the current vernacular) is indicative of the fact that something is always lost in translation. Rarely is it possible to recover the original by retranslating the translated version back into the original language.” 66 Finally, like most other preservation strategies, conversion to standard formats presents certain risks in regard to maintaining the authenticity of the record. 67 In sum, critics of converting digital objects to standard forms and migrating to new standards if necessary would agree with Jeff Rothenberg’s judgment that this “may be a useful interim approach while a true long-term solution is being developed.” 68

The strategy of transferring records to software independent formats, such as “plain” ASCII text or, for hierarchical and relational database records, a flat table structure, has the advantage of moving records out of a software dependent mode, thus ensuring the accessibility of the records for longer periods of time. Most archivists agree, however, that in many cases this advantage is achieved at a great cost, i.e., the loss of instructions or code used in representing or formatting the record. As a result, “the authenticity of the electronic records as ‘imitative copies’ that replicate the structure, content, and context of the original records could no longer [be] supported.” 69 In other words, the evidence required to understand and interpret a record may no longer be present.

62 Margaret Hedstrom, “Digital Preservation: a Time Bomb for Digital Libraries,” - Section on “Current Preservation Strategies and Their Limitations” at http://www.uky.edu/~kiernan/DL/hedstrom.html

63 Reagan Moore, et al., “Collection-Based Persistent Digital Archives: Part I,” D-Lib Magazine, Vol. 6, No. 3 (March 2000) – “Introduction” Section and “Managing Persistence” Section – available at http://www.dlib.org/dlib/march00/moore/03moore-pt1.html

64 Ibid., also see Reagan Moore, et al., “Collection-Based Persistent Digital Archives: Part II,” D-Lib Magazine, Vol. 6, No. 4 (April 2000) at http://www.dlib.org/dlib/april00/moore/04moore-pt2.html

65 David Bearman, “Reality and Chimeras in the Preservation of Electronic Records” at http://www.dlib.org/dlib/april99/bearman/04bearman.html; see also Jeff Rothenberg, “Avoiding Technological Quicksand: Finding a Viable Technological Foundation for Digital Preservation” at http://www.clir.org/cpa/reports/rothenberg/contents.html; in this publication Rothenberg writes that “even the best standards are often bypassed and made irrelevant by the inevitable paradigm shifts that characterize information science -- and will continue to do so.”

66 Jeff Rothenberg, “Avoiding Technological Quicksand: Finding a Viable Technological Foundation for Digital Preservation” – “6.2 Reliance on Standards” Section at http://www.clir.org/cpa/reports/rothenberg/contents.html

67 Charles Dollar, Authentic Electronic Records, pp. 68-69.

68 Jeff Rothenberg, “Avoiding Technological Quicksand: Finding a Viable Technological Foundation for Digital Preservation” – “6.2 Reliance on Standards” Section at http://www.clir.org/cpa/reports/rothenberg/contents.html

69 Charles Dollar, Authentic Electronic Records, p. 67; see also Margaret Hedstrom, “Digital Preservation: a Time Bomb for Digital Libraries” at http://www.uky.edu/~kiernan/DL/hedstrom.html

Emulation

Jeff Rothenberg advocates another digital preservation strategy, which he calls "emulation." Rothenberg argues that other proposed solutions “are short-sighted, labor-intensive, and ultimately incapable of preserving digital documents in their original forms.” 70 The only reliable way to recreate a digital object's original functionality, he argues, is "to run the original software under emulation on future computers. This is the only reliable way to recreate a digital document's original functionality, look and feel."
71 According to Rothenberg, implementation of the emulation approach involves "1) developing generalizable techniques for specifying emulators that will run on unknown future computers and that capture all of those attributes required to recreate the behavior of current and future digital documents; 2) developing techniques for saving - in human readable form - the metadata needed to find, access and recreate digital documents, so that emulation techniques can be used for preservation; 3) developing techniques for encapsulating documents, their attendant metadata, software, and emulator specifications in ways that ensure their cohesion and prevent their corruption." 72

Some archivists, such as David Bearman, argue forcefully that emulation is an impractical and ineffective strategy for preserving records. Most critically, Bearman argues, it is a fundamentally flawed process from a recordkeeping perspective, because the strategy is "trying to preserve the wrong thing by preserving information systems functionality rather than records. As a consequence, the emulation solution would not preserve electronic records as evidence.” 73 Of all preservation strategies presently under review, emulation is the most untested and experimental. 74 Because of the need to create emulators and to encapsulate a great deal of data, emulation is also potentially the most expensive preservation strategy.

70 Jeff Rothenberg, “Avoiding Technological Quicksand: Finding a Viable Technological Foundation for Digital Preservation,” - “1. Introduction” Section at http://www.clir.org/cpa/reports/rothenberg/contents.html

71 Jeff Rothenberg, “Avoiding Technological Quicksand: Finding a Viable Technological Foundation for Digital Preservation,” - “Executive Summary” Section at http://www.clir.org/cpa/reports/rothenberg/contents.html.

72 Jeff Rothenberg, “Avoiding Technological Quicksand: Finding a Viable Technological Foundation for Digital Preservation,” – “8. The Emulation Solution” Section at http://www.clir.org/cpa/reports/rothenberg/contents.html

73 David Bearman, “Reality and Chimeras in the Preservation of Electronic Records” at http://www.dlib.org/dlib/april99/bearman/04bearman.html

74 Emulation is presently being researched at the University of Michigan’s School of Information. For information on this research, see ??

Migration/Conversion

There is no question that one of the most popular preservation strategies at present is the set of activities known as migration. Despite its popularity, however, the definition of migration is still very much debated and unsettled. A popular definition provided by the Task Force on Archiving Digital Information describes migration as "the periodic transfer of digital materials from one hardware/software configuration to another, or from one generation of computer technology to a subsequent generation. The purpose of migration is to preserve the integrity of digital objects and to retain the ability for clients to retrieve, display, and otherwise use them in the face of constantly changing technology." 75 Proponents of this definition emphasize that unlike the older strategy known as "refreshing," the process of copying digital information onto new media, migration addresses the obsolescence of both the storage media and the hardware/software controlling and managing the digital documents. As such, migration is a broader and richer concept than refreshing. Other archivists and technologists, most notably Charles Dollar, find this definition of migration too broad and inclusive, and propose a set of definitions that clearly distinguish routine conversion of records from more complex migration strategies.
Dollar and Gregory Hunter define conversion “as the automatic transfer of authentic electronic records from one application environment to a new application environment with little or no loss in structure and no loss of content or context even though the underlying bit stream is altered.” 76 A typical example of converting electronic records is moving them from one software environment or application to another, such as converting a file from WordPerfect to Microsoft Word. Dollar “limits the migration of electronic records to narrow circumstances in which neither backward compatibility nor export/import gateways exist between the legacy system that contains the records and the new application system.” 77 In Dollar’s view, the primary difference between migration and other digital preservation strategies is “that migration involves proprietary legacy systems that lack export software functionality and the only way now known to migrate the records along with essential software functionality to an open system is to write special purpose code or programs.” 78 Consistent with this definition, Dollar views migration as the most complex and costly of the digital preservation strategies.

75 Task Force on Archiving Digital Information, Preserving Digital Information (The Commission on Preservation and Access and the Research Libraries Group, Inc., 1996), p. 6 at http://www.rlg.org/ArchTF/

76 Charles Dollar, Authentic Electronic Records, p. 65; see also Gregory S. Hunter, Preserving Digital Information (New York: Neal-Schuman Publishers, Inc., 2000), pp. 57-58.

77 Charles Dollar, Authentic Electronic Records, p. 69.

Because it is a complex preservation strategy, migration presents serious challenges to the records’ professionals attempting to preserve authentic records. Of particular note are the potential loss of structure and functionality resulting in the inability to faithfully represent, use and interpret the record.
79 Because of these issues, opponents of migration, like Jeff Rothenberg, argue that "migration is essentially an approach based on wishful thinking." He argues that experience with migrating digital documents has clearly demonstrated the process to be "labor-intensive, time-consuming, expensive, error-prone and fraught with the danger of losing or corrupting information." 80 At this point in time, however, Rothenberg and other critics of migration remain in the minority. The most prevalent view is that migration is a legitimate digital preservation strategy, and, along with converting to standard formats, offers the best hope for the future. Yet, even the most vociferous advocates of migration recognize that much additional research involving a variety of different types of systems and digital objects is needed to test the technical feasibility, establish best practices and identify costs. To date there simply has not been enough research to accurately "predict when migration will be necessary, how much reformatting will be needed, and how much migration will cost." 81 At this point in time, it is still fair to state that migration as a strategy for maintaining access to complex digital objects over time remains largely experimental and untested.

78 Ibid., p. 31.

79 Ibid., pp. 31-32.

80 Jeff Rothenberg, “Avoiding Technological Quicksand: Finding a Viable Technological Foundation for Digital Preservation,” - “6.4 – Reliance on Migration” Section at http://www.clir.org/cpa/reports/rothenberg/contents.html

81 Margaret Hedstrom, “Digital Preservation: a Time Bomb for Digital Libraries,” - Section on “Migration” at http://www.uky.edu/~kiernan/DL/hedstrom.html

CUSTODY

Where are electronic records to be physically housed, and who will service them? In response to these questions, archivists have put forward two possible strategies: 1) Centralized Archival Custody Approach - “Archives as a Place”; and 2) Non-Custody, “Post Custody”, or “Distributed Custody” Approach.
Centralized Custody Approach

Supporters of the centralized custody model argue that the authenticity over time of inactive records can be ensured only when their custody is entrusted to professional archivists. In the words of one advocate: “The life cycle of the managerial activity directed to the preservation of the integrity of electronic records may be divided into two phases: one aimed at the control of the creation of reliable records and to the maintenance of authentic active and semi-active records, and the other aimed at the preservation of authentic inactive records.” 82 The position of the proponents of this argument can be characterized as a centralized archival custody approach, or “Archives as a Place,” where there must exist an “archival threshold” or “space beyond which no alteration or permutation is possible, and where every written act can be treated as evidence and memory.” 83 More specifically, proponents of this position identify five reasons why inactive records should be transferred to an archival repository and not left in the custody of the record creators.

1) Mission - Competencies: Safeguarding the authenticity of non-current, archival records is not part of the mission of the creating agency, nor does its staff possess the necessary skills.
2) Ability to Monitor Compliance: There are not enough trained archivists available to monitor or audit records in a distributed custody environment.
3) Cost to Monitor Compliance: The costs of managing records in a distributed environment are as yet unknown and untested, but monitoring recordkeeping practices may well be more costly than assuming custody of the records.
4) Changes in Work Environment: Changes in staffing and in departmental priorities can place records left with creating offices at great risk.
5) Vested Interests: Inactive records must be taken from those who have a vested interest in either corrupting or neglecting the records. 84

For all these reasons, supporters of the “Archives as a Place” argument conclude “that the routine transfer of records to a neutral third party, that is, to a competent archival body…is an essential requirement for ensuring their authenticity over time.” 85

82 Luciana Duranti, “Archives as a Place,” Archives and Manuscripts, Vol. 24, No. 2 (November 1996): 252.
83 Ibid, p. 252.
84 For articulation of these arguments see Duranti and MacNeil, “The Protection of the Integrity of Electronic Records,” pp. 46-67; Duranti, “Archives as a Place,” pp. 242-255; Terry Eastwood, “Should Creating Agencies Keep Electronic Records Indefinitely?,” Archives and Manuscripts, Vol. 24, No. 2 (November 1996): 256-267; Ken Thibodeau, “To Be Or Not To Be: Archive Methods for Electronic Records” in Archival Management of Electronic Records, ed. by David Bearman, Archives and Museum Informatics Technical Report, No. 13 (Pittsburgh, PA: Archives and Museum Informatics, 1991): 1-13.
85 Luciana Duranti and Heather MacNeil, “The Protection of the Integrity of Electronic Records,” p. 60.

Distributed Custody Approach

As opposed to the “Archives as a Place” position, archivists who support a less centralized custody model describe their strategy regarding custody and use as a “Post-Custody” or “Distributed Custody” approach. In this strategy, the transfer of inactive records to an archives may be delayed or deferred for much longer periods than in the past; in some cases, the records may remain indefinitely in the custody of the originating office. The basic premise supporting this position is that in the electronic environment archival institutions can fulfill their responsibilities without assuming physical custody of the records. To achieve these goals, however, archivists must develop new methodologies and techniques for managing records in a distributed custody environment. Proponents of this strategy offer four arguments to support their position of distributed custody and access.

1) Costs: It would be enormously expensive and a massive waste of resources to attempt to duplicate within the archival setting the technological environments already in place within the creating offices.
2) Changes in Technology: Rapid technological change and the reluctance of manufacturers to support old hardware make it extremely difficult for a centralized repository to manage an institution’s electronic records.
3) Skills Required: It would be difficult, if not impossible, for an archives staff to learn the skills and provide the expertise needed to access and preserve the wide variety of technologies and formats in use.
4) Loss of Records: Insisting on custody will in some cases leave important records outside the recordkeeping boundary. 86

In the words of one advocate of this position, “archivists cannot afford – politically, professionally, economically, or culturally – to acquire records except as a last resort… Indeed, the evidence indicates that acquisition of records and the maintenance of the archives as a repository, gets in the way of achieving archival objectives and that this dysfunction will increase dramatically with the spread of electronic communications.” 87

As some archivists have argued, however, the primary issue may not be custody, but rather ensuring that a viable and widely accepted system for managing electronic records is in place. This means establishing policies and procedures that ensure that, no matter where the records are housed, they will be managed according to well-established standards. More specifically, a distributed strategy for custody requires legally binding agreements with offices, reliable means of auditing records, an extensive network of training programs, and other mechanisms designed to ensure that custodians of records understand their responsibilities and are living up to those expectations.
An Australian archivist sums up this position when he writes: “The real issue is not custody, but the control of records and the archivist’s role in this…What archivists should have been talking to their clients about is not custody, but good recordkeeping practices which make it possible for archivists to exercise the necessary control.” 88

86 For descriptions of the “Distributed Custody” approach and articulation of arguments for implementing this strategy see David Bearman, “An Indefensible Bastion: Archives Repositories in the Electronic Age,” in Archival Management of Electronic Records, ed. by David Bearman, Archives and Museum Informatics Technical Report, No. 13 (Pittsburgh, PA: Archives and Museum Informatics, 1991): 14-24; Greg O’Shea and David Roberts, “Living in a Digital World,” pp. 286-311; Adrian Cunningham, “Journey to the End of the Night: Custody and the Dawning of a New Era on the Archival Threshold,” Archives and Manuscripts, Vol. 24, No. 2 (November 1996): 312-321; Charles Dollar, The Impact of Information Technologies on Archival Principles and Methods (Macerata, Italy: University of Macerata, 1992): pp. 53-55, 75.
87 David Bearman, “An Indefensible Bastion: Archives Repositories in the Electronic Age,” in Archival Management of Electronic Records, p. 14.
88 Adrian Cunningham, “Ensuring Essential Evidence,” paper for the National Library of Australia News, November 1996, on-line version located at http://www.nla.gov.au/nla/staffpaper/acunning5.html

OVERALL MODELS FOR MANAGING ELECTRONIC RECORDS

Strategies for managing electronic records have been described and depicted within two basic records management models or theoretical frameworks: the records life cycle model and the records continuum.

Life Cycle Model

The life cycle model for managing records, as articulated by Theodore Schellenberg and others, has been the predominant model for North American archivists and records managers since at least the 1960s.
However, the question being asked recently is: does the model provide a viable strategy for managing electronic records? Before we examine archivists’ responses to this question, let us briefly review the basic characteristics of the life cycle model.

This model portrays the life of a record as passing sequentially through various stages or periods, much like a living organism. In stage one, the record is created, presumably for a legitimate reason and according to certain standards. In the second stage, the record goes through an active period when it has maximum primary value and is used or referred to frequently by the creating office and others involved in the decision-making process. During this time the record is stored on-site in the active or current files of the creating office. At the end of stage two the record is reviewed. It may be determined to have no further value, at which point it is destroyed, or it may enter stage three, where it is relegated to a semi-active status, which means it still has value but is not needed for day-to-day decision-making. Because the record need not be consulted regularly, it is often stored in an off-site storage center. At the end of stage three, another review occurs, at which point a determination is made to destroy the record or send it to stage four, which is reserved for inactive records with long-term, indefinite, archival value. This small percentage of records (normally estimated at approximately five per cent of the total documentation) is sent to an archival repository, where specific activities are undertaken to preserve and describe the records.

The life cycle model not only describes what will happen to a record, it also defines who will manage the record during each stage. During the creation and active periods, the record creators have primary responsibility for managing the record, although records managers may well be involved to various degrees.
In the semi-active stage, it is the records manager who takes center stage and assumes major responsibility for managing the records. Finally, in the inactive stage, the archivist takes the lead in preserving, describing, and providing access to the archival record. 89

To summarize, the life cycle model has contributed, particularly in North America, to the creation of a fairly strict demarcation of responsibilities between the archives and records management professions. Among archivists it has resulted in a tendency to view the life of a record in terms of pre-archival and archival, and active and inactive, and to regard the archivist as intervening sometime towards the end of the life cycle, when the record becomes inactive and archival.

The chief supporters of the life cycle model as it pertains to electronic records have come from the electronic records research project team at the Master of Archival Science Program at the University of British Columbia. The directors of this project, Luciana Duranti and Heather MacNeil, write that what makes the life cycle model and its division of responsibilities so valuable is that it “ensures the authenticity of inactive records and makes them the impartial sources that society needs.” 90 According to UBC personnel, the intellectual methods required to guarantee the integrity of active records are quite different from those required for inactive records.
Hence, it is argued, there must exist a two-phase life cycle approach to the management of records, the creating body “with primary responsibility for their reliability and authenticity while they are needed for business purposes, and the preserving body with responsibility for their authenticity over the long term.” 91

89 For a summary of the Life Cycle concept see Ira A. Penn, Gail Pennix and Jim Coulson, Records Management Handbook (Hampshire, England: Gower Publishing Limited, 2nd Edition, 1994), pp. 12-17.
90 Duranti and MacNeil, “The Protection of the Integrity of Electronic Records,” p. 62.
91 Ibid, p. 60; for an extensive discussion of the concepts of reliability and authenticity, see Duranti’s article “Reliability and Authenticity: The Concepts and Their Implications,” Archivaria 39 (Spring 1995): 5-10.

Records Continuum Model

Criticisms of the life cycle model as a means of managing records have surfaced at times in the past, but it has been the emergence of electronic records that has initiated a very spirited debate. This dialogue has resulted not only in a critique of the model but also in the definition of an alternate model or framework. This alternate model has come to be most commonly referred to as the “Records Continuum Model.” What is this continuum model, why did it emerge, and how does it differ from the life cycle model?

Discussions of strategies for better integrating the activities of archivists and records managers date back at least several decades. 92 It was not until the 1990s, however, that a more formally constructed model emerged for viewing records management as a continuous process from the moment of creation, in which archivists and records managers are actively involved at all points in the continuum.
The primary motivation in formulating and supporting this model was a concern that, without a strategy for active and early intervention by the archivist in the records management process, electronic records documenting vital transactions may never be created, may never be fully documented, or may never survive. 93

Perhaps the most basic difference between the continuum model and the life cycle approach is that while the life cycle model proposes a strict separation of records management responsibilities, the continuum model is based upon an integration of the responsibilities and accountabilities associated with the management of records. The new Australian records management standard, which has adopted the continuum model, defines the integrated nature of the record continuum in the following terms: the record continuum is “the whole extent of a record’s existence.” It “refers to a consistent and coherent regime of management processes from the time of creation of records (and before creation, in the design of recordkeeping systems) through to the preservation and use of records as archives.” 94 A noted Australian archivist describes the differences between the life cycle and continuum models in the following manner: “The life cycle relates to records and information…records have a life cycle…The continuum is not about records. It is about a regime for recordkeeping. The continuum is a model of management that relates to the recordkeeping regime,” which is “continuous, dynamic and ongoing without any distinct breaks or phases.” 95

A direct result of viewing records management as a continuum is to undercut and destroy the distinction between active and inactive, and archival and non-archival records, and to blur or wipe out the defined set of responsibilities associated with managing records at each stage. One of the consequences of this viewpoint is to propel archivists and archival functions forward in the records management process. In other words, according to the continuum model, strategies and methodologies for appraising, describing, and preserving records are implemented early in the records management process, preferably at the design stage, and not at the end of the life cycle. 96

92 For discussions of a records continuum theory that pre-dates the archival dialogue on electronic records see Frank Upward, “In Search of the Continuum: Ian Maclean’s ‘Australian Experience’ Essays on Recordkeeping” in The Records Continuum. Ian Maclean and Australian Archives First Fifty Years, Sue McKemmish and Michael Piggott, eds. (Clayton, Victoria: Ancora Press, 1994): 110-130; and Jay Atherton, “From Life Cycle to Continuum: Some Thoughts on the Records Management-Archives Relationship,” Archivaria, Vol. 21 (Winter 1985-1986): 43-51.
93 The primary proponents of the continuum model have been archivists in the Australian archival community. The research project that most embodies the premises of the continuum model is the University of Pittsburgh Functional Requirements project. For descriptions of the records continuum model see Frank Upward, “Structuring the Records Continuum. Part One, Post-custodial Principles and Properties,” Archives and Manuscripts, Vol. 24, No. 2 (November 1996): 268-285; Frank Upward, “Structuring the Records Continuum. Part Two: Structuration Theory and Recordkeeping,” Archives and Manuscripts, Vol. 25, No. 1 (May 1997): 10-35; Adrian Cunningham, “Journey to the End of the Night: Custody and the Dawning of a New Era on the Archival Threshold - A Commentary,” pp. 312-321; and David Bearman, “Item Level Control and Electronic Recordkeeping,” Archives and Museum Informatics. Cultural Heritage Informatics Quarterly, pp. 242-245.
94 AS 4390.1-1996F: General, Clause 4.6
95 Ann Pederson in an e-mail message to the Australian Archivists listserv, 17 February 1999.
96 Another model or framework for conceptualizing electronic records management has come to be known as the “Steering Rather Than Rowing Approach” to managing archives. The main features of this strategy are a greater emphasis on archival monitoring and oversight activities, on empowering others to solve their record problems, and finally, on developing a decentralized or distributed approach to archival management. It is a strategy that has much in common with the Records Continuum Model. The “Steering Rather Than Rowing” strategy for archives was introduced by David Bearman and Margaret Hedstrom in “Reinventing Archives for Electronic Records: Alternate Service Delivery Options,” in Electronic Records Management Program Strategies, ed. Margaret Hedstrom (Pittsburgh, PA: Archives and Museum Informatics, 1993), pp. 82-98.

CONCLUSION

Reviewing the work of the decade in electronic records management, it is easy to be pessimistic and to overlook the achievements. Even though the profession still lacks consensus on a number of issues, there has been remarkable progress on many fronts.

In the identification and capture of electronic records, there is widespread recognition that automated environments present new challenges requiring different methodologies and techniques. In general, archivists working with electronic systems understand that transaction processing systems will not consistently and systematically produce records. To ensure that records are identified and captured, archivists have been promoting the creation of conceptual models, which identify when and where records are generated. What has been slow to develop, however, is a methodology for undertaking and creating these models.
Moreover, for many archivists, moving from a methodology for identifying records based on physically reviewing objects to one based largely on analyzing conceptual models of record creation continues to be a very difficult transition.

Theories on the appraisal of electronic records have clearly tended to focus on functions and business processes as the keys to understanding the context and value of records. The goal of preserving and making accessible evidence as found in the transactions or activities that generated the record is repeated over and over again in the literature on electronic records. Functional appraisal, of course, is not a new concept, but electronic records management has elevated the model to new heights and to a level of popularity previously unknown. In reaction to this development, some archivists are now claiming that the profession has gone too far in its emphasis on evidence, and that archivists are in danger of building an appraisal methodology that fails to properly identify the secondary values of records, particularly informational values. Certainly, one of the tasks for the next decade will be to create appraisal theories for the modern age of records that satisfy all requirements for record value, and that are capable of helping “society remember its past, its roots, its history, which by definition combines recorded evidence of both the private and the public, the institutional and the personal.” 97

In the area of documenting records, there is universal agreement that archivists need to define the categories and types of metadata that must be present to preserve a reliable and authentic record. Consequently, numerous lists of metadata specifications have been created during the last five years. Increasingly there is a consensus among archivists concerning the basic categories of metadata that systems must capture and retain with record content.
Most of the metadata lists include documentation, in varying degrees of detail, on the content and structure of the record and the context of its creation. What has not yet been developed or accepted is a core or minimum set of metadata standards.

97 Terry Cook, “Beyond the Screen: The Records Continuum and Archival Cultural Heritage,” page 11; paper delivered at the Australian Society of Archivists Conference, Melbourne, 18 August 2000, available at http://www.archivists.org.au/sem/conf2000/terrycook.pdf

As to how and when these documentation activities will be undertaken, the prominent, but certainly not universal, view is that traditional methodologies for documenting records will have to change. Critics of employing traditional methodologies to describe electronic records argue that methods based on direct observation and review will not work, and that the finding aids produced will not adequately describe these complex systems. As an alternative strategy they recommend a shift to the management of system metadata, but they caution that this strategy will only work if archivists define and articulate the required metadata elements and are involved at or near the beginning of the design process. Supporters of traditional descriptive practice for electronic records assert that because of the unique and vital role of archival description in maintaining authenticity and in describing the context of records over time, metadata systems cannot replace traditional archival description. The answers, they claim, will be found by following “the dictates of archival science,” and by building strategies “on the foundation of descriptive principles and practices that have already been established.” An unresolved question is whether there might be a better, more effective approach that would accommodate both views.
As with metadata, archivists have universally agreed that the profession must develop a precise description of how recordkeeping systems must capture, manage and preserve records. Again, as in the case of metadata, this consensus has resulted in the creation of numerous lists of recordkeeping requirements. Among the most encouraging developments in this area is a growing recognition by software vendors and creators of records of the importance of incorporating recordkeeping functionality into systems design. The growing prominence of recordkeeping is demonstrated by the fact that recordkeeping models at the national government level, most notably those developed by the U.S. Department of Defense and by the National Archives of Australia, have emerged as standards not only for government agencies but also for software vendors. 98

98 Recently, this author participated in a search for enterprise-wide document management software. It was encouraging to see that all the vendors that were interviewed had made plans to incorporate recordkeeping functionality into their systems. Usually this meant, as in the case of IBM and FileNet, partnering with some smaller vendor specializing in records management, such as Tower or Provenance.

The question of how best to ensure the long-term preservation of authentic records remains largely unresolved. Several viable strategies exist, but each has its own set of risks and liabilities as a complete and long-term solution. In the last few years, some important research on long-term preservation has been undertaken, in particular the work dealing with conversion to standard formats and migration to new standards being undertaken at the San Diego Supercomputer Center and other institutions. Meanwhile, of course, institutions are moving ahead to develop preservation strategies and to address current needs as best they can.
This has prompted some experts, like Charles Dollar, to state that “too much attention has been devoted to ensuring access to electronic records fifty or one hundred years from when we have no way of forecasting what kinds of technology will be available then.” Dollar goes on to say that a more productive, or at least parallel, line of research to long-term access is to “focus on a much shorter time frame, perhaps on the order of ten to twenty years or so, during which time information technologies are likely to be relatively stable.” 99 While research on long-term solutions to the preservation of digital objects will certainly continue, it is likely that for the foreseeable future most professionals in the field will be working on establishing best practices and guidelines designed to address current and ongoing preservation needs and requirements.

Custody of electronic records has been perhaps the most contentious issue to date. Proponents of distributed custody feel strongly that the archival record will not survive unless their strategy is adopted, and proponents of centralized custody feel just as strongly that records will be destroyed or altered if they remain with the creators. Archivists who see merit in both these arguments are arguing for the adoption, or at least the testing, of a compromise position, which is becoming known as the “Semi-Custodial” strategy. The problem is that there is still not nearly enough evidence to justify adopting any of these positions, and many more field tests and applications will be required to document which of these strategies, alone or in combination, will prove most effective.

Finally, when one looks for an overall framework or model to guide electronic archives management, it is clear that most archivists favor a model that advocates a much more active role for archivists in the management process. Increasingly, archivists are recommending active involvement in all phases of the recordkeeping regime.
However, much research and testing remains to be done to determine just how this strategy will be implemented and how archivists will interact with other records management partners.

In conclusion, in the last decade archivists have made significant strides in the quest to develop strategies for managing electronic records. Perhaps the most important advances have been in the areas of identifying issues and developing a variety of theoretical frameworks or models for addressing these challenges. Most archivists would agree that the profession is going in the right direction. What they cannot yet predict is precisely where they will end up or exactly how they will get to their destination. In other words, while the decade has witnessed the creation of many significant and potentially useful management models or strategies, the profession still lacks examples of concrete applications or field tests demonstrating the value of these concepts. In the words of one prominent archival educator: “What we lack is an evaluation of the usefulness of these findings from the perspective of organizations that are responsible in some way for preserving and providing access to electronic records. We need assessments from the administrators of archival and records management programs about the feasibility of putting the proposed policies and models into practice.” 100

In short, archivists will likely characterize the 1990s as a decade that witnessed the emergence of many new and creative theories, concepts, and strategies for managing electronic records. Hopefully, the first decade of the 21st century will be equally well remembered as a period when archivists tested and evaluated these various theories and began to implement proven and realistic policies, methodologies and techniques for managing electronic records.

99 Charles Dollar, Authentic Electronic Records, p. 5.
100 Margaret Hedstrom, Electronic Records Research and Development. Final Report of the 1996 Ann Arbor Conference (Ann Arbor, MI), p. 37.