IST 547: Electronic Records Management
University at Albany, College of Computing and Information
Department of Information Studies
Spring 2016
M 7:15-10:05pm

Instructor: Catherine Stollar Peters

Contact Information:
  Email: cstollarpeters@gmail.com
  Cell phone (to be used only in emergencies): (512) 573-0081
  Office location: TBD

Office Hours: By appointment as arranged by student and instructor. I am usually available just before and just after class. If there is not sufficient time to discuss your questions during those times, it is very easy to set up a time to meet with me during the week or on the weekend. Either talk to me after class or email me to set up a meeting.

Course Overview: This course is an introduction to issues in record keeping in the digital age. In addition to covering issues related to electronic records management, we will discuss digital curation, web archiving, personal information management, and managing electronic records in manuscript repositories.

Course Objectives: At the end of this class, students should understand:
  Structures of electronic records and levels of representation
  Models for understanding records creation, use, disposal, and curation
  Implications of authenticity, integrity, reliability, and usability for records in electronic systems
  Issues related to long- and short-term retention of electronic records and strategies for mitigating those issues
  Record keeping strategies for electronic records in a variety of environments, including corporations, governments, cultural heritage institutions, and personal archives

Course Expectations: Students are expected to attend almost every class, participate in class discussions, complete assignments on time, and complete a final class project. This course will largely depend on student participation in discussions. In order to contribute adequately to class discussions, students will need to read materials before class.
10% of each student’s grade comes from in-class participation, and a failure to contribute to in-class discussions will lower the participation portion of your final grade.

Academic Integrity: Please consult the undergraduate bulletin for guidance regarding plagiarism and academic integrity at www.albany.edu/undergraduate_bulletin/regulations.html. The excerpt below identifies the importance of academic integrity in the pursuit of learning that we value in this course.

“As a community of scholars, the University at Albany has a special responsibility to integrity and truth. By testing, analyzing, and scrutinizing ideas and assumptions, scholarly inquiry produces the timely and valuable bodies of knowledge that guide and inform important and significant decisions, policies, and choices. Our duty to be honest, methodical and careful in the attribution of data and ideas to their sources establishes the foundations of our work. Misrepresenting or falsifying scholarship undermines the essential trust on which our community depends. Every member of the community, including both faculty and students, shares an interest in maintaining academic integrity.”

Textbook: There is one required textbook for this course:
  Franks, Patricia C. (2013). Records and Information Management. Chicago, IL: American Library Association.
This book is available at the ALA store (http://www.alastore.ala.org/detail.aspx?ID=4244) and on Amazon.com.

Assignments:
  Class Participation (10%)
  File Listing Exercise (5%)
  File Identification Exercise (5%)
  XML Email File Creation (10%)
  Midterm (30%)
  Final Project (40%)

File listing exercise (5%) due February 15
You will create a list of electronic files that you have on either removable media (such as a flash drive) or on your personal computer. We will discuss methods of creating a file listing in class.

File identification exercise (5%) due February 22
I will assign a few files for which you will need to identify the file format.
You will need to use a variety of approaches discussed in class to complete this assignment.

XML homework (10%) due March 7
You will create an XML file written in the Email Account Schema from an example email provided. In addition to the XML file, you will write a 1-2 page response describing the difference between an email encoded in the Email Account Schema and an email viewed in software (such as a web-based interface, Outlook, Thunderbird, etc.).

Midterm (30%) due March 21
The midterm for this course will be composed of multiple choice and short answer questions. Content of the midterm will come from course readings with an emphasis on concepts discussed in class.

Final project (40%) due May 2
You will have a few options for completing the final project for this course. You must be present on the final class day to present and to watch your classmates present, or you will lose 20% of your earned points for this project.

Option A: Personal Inventory and Preservation Plan
1. Inventory your records. Create a file listing for electronic records. Make sure you incorporate your paper records into your inventory. How will you survey your paper records? Is your paper records inventory like your electronic records inventory? How and why?
2. Categorize your records into general series.
3. Using your records categories, find a schedule that would apply to your records. Would you implement that schedule? Why or why not? Use references from the course readings to clarify and support your opinions. You may have to adapt a records retention schedule to meet the needs of personal records.
4. Identify preservation strategies for your records by format.
5. What impact would loss of “permanent” records have on you or your family?
6. How will someone access your personal archive?
7. Now consider traces of your digital self on the Internet. Who controls those traces? Are they records? What retention policies apply to those traces?
   Use references from the course readings to clarify and support your opinions.

Turn in to me by Monday, May 2, 2016:
1. Your detailed answers to the questions above.
2. Your inventory documents. (File listings do not count towards page limits.)
3. Your records retention schedule (or a reference to it so I can look at it).
Total submission (minus any computer-generated file listings and records retention schedule) should be 15-25 pages.

Option B: Archive your email. (Minimum of 100 emails)
1. Create an archive of your email. If the original format is proprietary, create an .mbox copy of your archive.
2. Determine retention periods for your email.
3. Determine if you will implement a records retention schedule/retention periods on your email archive. Why or why not? Include references from the readings to justify your position.
4. Visualize your email using one of the tools we discussed in class. How might you use a visualization tool to learn about your email corpus? As an appraisal tool? An access tool? Other reasons?
5. Create an XML file for your email archive. Use the CERP parser, mail2xml, Xena, or the PeDALS extractor. Explain your process. Do you think this is the most effective way to preserve your email corpus? Why or why not?
6. Redact a file in the email archive (the redaction can be in the body of the email or in an attachment). How did you find the information that needed to be redacted? Will you keep an original version of the redacted file? Is your process scalable to an email archive of 50,000 redactions?
7. Are there other ways to archive your email?

Turn in to me by Monday, May 2, 2016:
1. A paper answering the questions above and documenting your process.
2. A screenshot of your visualization(s).
3. Part of the XML file you created (2 pages or so). (You can submit this in Blackboard.)
4. The redacted email file or attachment.
Total submission (minus the XML file) should be 15-25 pages.
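
To make the relationship between an .mbox archive (Option B, step 1) and an XML representation (step 5) more concrete, here is a minimal Python sketch using only the standard library. It is an illustration under stated assumptions, not a substitute for the CERP parser, mail2xml, Xena, or the PeDALS extractor. The element names (EmailArchive, Message, From, To, Date, Subject) and the function name mbox_to_xml are hypothetical and do not follow the Email Account Schema; message bodies and attachments are deliberately left out.

```python
import mailbox
import xml.etree.ElementTree as ET


def mbox_to_xml(mbox_path):
    """Summarize the headers of each message in an mbox file as XML.

    Element names here are hypothetical placeholders chosen for
    illustration; they are NOT the Email Account Schema.
    """
    root = ET.Element("EmailArchive")
    # create=False: raise an error instead of silently creating a new file.
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for index, message in enumerate(box, start=1):
            item = ET.SubElement(root, "Message", id=str(index))
            for header in ("From", "To", "Date", "Subject"):
                # A missing header becomes an empty element, not an error.
                ET.SubElement(item, header).text = str(message.get(header, ""))
    finally:
        box.close()
    return root


def write_xml(root, out_path):
    """Write the element tree to disk as UTF-8 with an XML declaration."""
    ET.ElementTree(root).write(out_path, encoding="utf-8", xml_declaration=True)
```

For example, write_xml(mbox_to_xml("archive.mbox"), "archive.xml") would produce one Message element per email, where both file names are placeholders. Comparing this header-only output with what the tools named in step 5 produce may be a useful starting point for explaining your process.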

Option C: Literature Review
Delve further into a topic discussed in class, including website archiving, digital forensics in archives, or archiving and big data. Your paper should be a substantial review of the literature on your topic and include areas for future research. I expect you to use 10-15 significant resources in your paper.
Turn in to me by Monday, May 2, 2016. Total submission should be 15-20 pages.

Assignment weights:

  Assignment                     Weight
  Class participation            10%
  File listing exercise          5%
  File identification exercise   5%
  XML homework                   10%
  Midterm                        30%
  Final project                  40%

Grading scale:

  Grade   Percent
  A       >=93%
  A-      >=90% AND <93%
  B+      >=87% AND <90%
  B       >=83% AND <87%
  B-      >=80% AND <83%
  C+      >=77% AND <80%
  C       >=73% AND <77%
  C-      >=70% AND <73%
  D+      >=67% AND <70%
  D       >=63% AND <67%
  D-      >=60% AND <63%
  F       <60%

Readings and Assignments (subject to change):

Week 1 (1/25): Introduction to course
  No readings

Week 2 (2/1): Overview of issues in electronic records management
  Chapters 1 and 3 of Records and Information Management
  An, Xiaomi. (2003). An Integrated approach to records management. Information Management Journal July/August (2003) pp. 24–30. (you can skim this article)

Week 3 (2/8): Designing and implementing record keeping systems
  Assigned: File listing exercise (5%) (Due 2/15)
  Chapter 2 and Chapter 6 of Records and Information Management
  Duranti, L. Reliability and authenticity: The Concepts and their Implications, Archivaria 39:1-10.
  ISO 15489 (an overview) http://www.dcc.ac.uk/resources/briefing-papers/standards-watch-papers/iso-15489

Week 4 (2/15): Inventories and file formats
  Assigned: File identification exercise (5%) (Due 2/22)
  Smallwood, Robert. (2013). Managing Electronic Records: Methods, Best Practices, and Technologies. Hoboken, NJ: John Wiley & Sons, Inc. Chapter 5: Inventorying E-Records
  Underwood, W. et al. (2009). Advanced decision support for archival processing of presidential electronic records: Final scientific and technical report. Georgia Tech. pg. 27-30.
  Review the following file format registries and tools and compare them:
    Registries: PRONOM (http://www.nationalarchives.gov.uk/PRONOM/Default.aspx) and UDFR (http://www.udfr.org/)
    Tools: JHOVE (http://jhove.sourceforge.net/) and DROID (http://www.nationalarchives.gov.uk/information-management/projects-and-work/droid.htm)
  File format identification tools and COPTR project http://coptr.digipres.org/Category:File_Format_Identification

Week 5 (2/22): Inventory, appraisal, records retention, scheduling and disposal
  Chapters 4 and 5 in Records and Information Management
  Bailey, Steve. (2008). Managing the Crowd: Rethinking Records Management for the Web 2.0 World. London: Facet Publishing. Chapter 9: Appraisal, Retention and Destruction.

Week 6 (2/29): Unstructured data I: Email, text messages and documents
  Assigned: XML email homework (10%) (Due 3/7)
  Prom, Christopher. (2011). Preserving email. DPC Technology Watch Report 11-01, ISSN 2048-7916, Digital Preservation Coalition 2011. (read to page 21 at minimum.)
  Rubens, Paul. (2010). The Importance of Managing Unstructured Data. Server Watch, October 29, 2010 http://www.serverwatch.com/trends/article.php/3910671/The-Importance-ofManaging-Unstructured-Data.htm
  Find an article on backing up text messages (such as one of these):
    Leswing, Kif. (2013). Total Recall: How to Back Up All the Text Messages on Your iPhone. Wired, November 5, 2013 http://www.wired.com/gadgetlab/2013/11/backup-sms-iphone/
    Ashenfelder, Mike. (2012). Archiving Cell Phone Text Messages. The Signal: Digital Preservation. Library of Congress.
    April 27, 2012 http://blogs.loc.gov/digitalpreservation/2012/04/archiving-cell-phone-text-messages/
  Optional:
    CERP http://siarchives.si.edu/cerp/index.htm (read the CERP overview at CERP_project_summary_122008_CC.pdf)
    Preservation of Electronic Mail Collaboration Initiative (read overview) EMCAP (http://www.history.ncdcr.gov/SHRAB/ar/emailpreservation/default.htm)
    ePADD (Stanford) https://library.stanford.edu/projects/epadd
    EAS (Harvard) http://hul.harvard.edu/ois/systems/eas/

Week 7 (3/7): Unstructured data II: Web 2.0, websites, art and literature
  Chapter 7 in Records and Information Management
  Bailey, Steve. (2008). Managing the Crowd: Rethinking Records Management for the Web 2.0 World. London: Facet Publishing. Chapter 7: The Centralized command and control ethos.
  Garfinkel, Simson. (2009). Finding and archiving the internet footprint, invited paper, presented at the First Digital Lives Research Conference: Personal Digital Archives for the 21st Century, London, England, 9-11 February 2009. (http://www.simson.net/webprint.pdf)
  Optional: Marshall, C. and Shipman, F. Attitudes about institutional archiving of social media, in Proceedings of Archiving 2011, Society for Imaging Science and Technology, May 2011 at http://research.microsoft.com/apps/pubs/default.aspx?id=147623
  Rinehart, R. (2000). The Straw that broke the museum's back? Collecting and preserving digital media art works for the next century. Switch [Online]. http://switch.sjsu.edu/web/v6n1/article_a.htm
  Optional (For local flavor): Lewis, C. Archiving the ephemeral: Processing and preservation problems associated with the iEAR archives. Museum Computer Network 33rd Annual Meeting. November 2-5, 2005.

No class (3/14)

Week 8 (3/21): Structured data
  Midterm
  Witt, M., Carlson, J. and Brandt, D. S. (2009). Constructing data curation profiles. International Journal of Digital Curation. 4(3) pp. 93-103. http://ijdc.net/index.php/ijdc/article/view/137/165
  R. Arovelius et al. (2010).
    Management and preservation of scientific records and data. ICA. (skim this)
  Gingrich, L. and Morris, B. (2006). Retention and disposition of structured data: the next frontier for records managers. The Information Management Journal. March/April pp. 30-39

Week 9 (3/28): Digital curation strategies
  Chapter 10 in Records and Information Management
  Rothenberg, J. Ensuring the Longevity of Digital Information. Available at: http://www.clir.org/pubs/archives/ensuring.pdf
  Ross, Seamus. (2006). Approaching digital preservation holistically. In Record Keeping in a Hybrid Environment: Managing the creation, use, preservation and disposal of unpublished information objects in context. Ed. Alistair Tough and Michael Moss.

Week 10 (4/4): Personal record keeping
  Marshall, C. (2008). Rethinking personal digital archiving, Part 1: Four challenges from the field, in D-Lib Magazine, vol. 14, no. 3/4. http://www.dlib.org/dlib/march08/marshall/03marshallpt1.html
  Marshall, C. (2008). Rethinking personal digital archiving, Part 2: Implications for services, applications, and institutions, in D-Lib Magazine, vol. 14, no. 3/4. http://www.dlib.org/dlib/march08/marshall/03marshall-pt2.html
  Lee, C. A. (2011). And Now the twain shall meet: Exploring the connections between PIM and archives. I, Digital: Personal Collections in the Digital Era. Chicago, Society of American Archivists.

Week 11 (4/11): Record keeping in small archives and manuscript archives
  Erway, Ricky. (2012). You’ve Got to Walk Before You Can Run: First Steps for Managing Born-Digital Content Received on Physical Media. Dublin, Ohio: OCLC Research. http://www.oclc.org/research/publications/library/2012/2012-06.pdf.
  Cook, Terry. Byte-ing off what you can chew: Electronic records strategies for small archival institutions: http://www.aranz.org.nz/Site/publications/papers_online/terry_cook_paper.aspx
  AIMS. (2012). Born-digital collections: An Inter-institutional model for stewardship. White paper. (Read report without appendixes.
    Read 1 processing plan in Appendix E).
  Optional: PARADIGM: Workbook on digital private papers http://www.paradigm.ac.uk/workbook/

Week 12 (4/18): Digital Forensics
  Lee, Christopher A. (2012). Archival Application of Digital Forensics Methods for Authenticity, Description and Access Provision. In Proceedings of the International Council on Archives Congress, Brisbane, Australia, August 20-24, 2012.
  Kirschenbaum, M., Ovenden, R. and Redwine, G. (2010). Digital forensics and born-digital content in cultural heritage collections. Washington, DC: Council on Library and Information Resources http://clir.org/pubs/reports/pub149/pub149.pdf

Week 13 (4/25): Work day
  Tool evaluation

Week 14 (5/2): Final project presentations