Digital Planning and Implementation Team Primary Goals & Assumptions October 2008

The University Library is responsible for the long-term management and survival of an
increasing amount of digital content that represents the University’s investments in collections
and other materials that support the institution’s educational and research mission. While the
IDEALS repository supports the deposit by faculty and other researchers of scholarship produced
under the auspices of Illinois, the Library must put in place a similar set of technologies and
services for the digital content that represents its collections and other materials. The
development of a digital library repository is an integral component of the Library’s ongoing
stewardship responsibility to the University. In order to plan for a centralized repository
function, the University Library intends to charge a team to assess program and technical needs,
within the Library and across stakeholder units and programs.
The Repository Planning Team will lead the general planning and brainstorming of a
“centralized repository system” (CRS) for the University Library. Although the singular term
“repository” is used, this CRS may consist of one or more disparate systems which work together
to manage and preserve the Library’s digital content.
The Repository Planning Team is charged with assessing current Library needs, establishing
general system requirements and performing an environmental scan to identify possible
solutions. The primary goal is to provide the Repository Implementation Team with a staged
plan (and intermediate goals) for prototyping and establishing a centralized repository system.
The Repository Planning Team prepares updates, meets periodically with the AULs, and
submits a report with recommendations to the AULs and CAPT. The AUL for IT is responsible
for coordinating the budget and program statement in preparation for the implementation phase.
Repository Planning Assumptions
1. Despite its name, the “centralized repository system” (CRS) will likely consist of more
than one software solution. Therefore, we are not looking to identify a single
software candidate as the best and only system. We recognize that different types of
content will likely require different types of systems.
2. The term “repository” in CRS is used generically as a “place to store/manage digital
content.” The software candidates for the CRS should not be limited to common digital
repository software (e.g. DSpace, Fedora, etc.). Simpler solutions (e.g.
filesystems, databases) should also be investigated.
3. The CRS is primarily concerned with the management, curation and preservation of
finalized Library digital content (i.e. content that is ready for dissemination and is
generally stable or not changing frequently). Although it may be worth brief discussion
of the needs of “living” digital content, this content will likely not be accepted until
much later.
4. At a higher level, the CRS will be modeled after the OAIS Reference Model, and attempt
to follow guidelines laid out by the Trustworthy Repositories Audit & Certification
checklist.
5. Although there may be a public web-access interface, the primary role of the CRS is to
be a backend management system, not another access system.
6. Digital content residing in an existing access system (e.g. IDEALS, Content DM, etc.) will
likely still remain in that location. However, a preservation copy of that content will
likely be provided to the CRS by these access systems.
7. The CRS will provide a secure log-in (likely based on NetID).
8. Digital content stored in the CRS may have different levels of permissions. Some
content may be publicly accessible, and others may be restricted to smaller groups of
users.
9. Some digital content stored in the CRS may not be important to preserve for the long
term. The CRS will likely need a notion of an “expiration date” that could
be assigned for temporary content.
10. The CRS must be able to maintain identifiers (e.g. “handles” assigned by LSDWG or
IDEALS).
11. The CRS must be able to store relationships between files (even if the files exist in
different systems). For example, it should be able to note that a PDF in IDEALS is another
representation of a set of JPEG2000 images in the CRS.
12. The CRS need not store every important digital file of the Library. However, it’s
recommended that the CRS be able to store identifiers to important digital content
stored elsewhere (e.g. content deemed appropriate for the Hathi Trust may not need to
be duplicated in the CRS, but we may wish to store the identifiers to that content).
13. The CRS needs to have a public API which would allow other (Library or non-Library)
services to access any publicly available content.
14. The CRS will be implemented in many stages. It is important that we move quickly and
take smaller steps, rather than wait to find a solution that will immediately meet all our
needs. The initial implementation will likely be a “bare bones” solution, which may not
meet the needs of all types of content. However, we should plan that “bare bones”
solution such that we can extend it for the future.
15. In order to move quickly, we should concentrate first on Library needs, but with
input from stakeholders outside of the Library so that we can extend the CRS as needed.
16. After each stage of implementation, the Repository Planning Team (and others in the
Library) will have an opportunity to assess the CRS and suggest implementation or
directional changes.
17. Even though the CRS will require a large amount of storage (i.e. disk space), planning
should concentrate on access and preservation needs of digital content, and an
environmental scan of solutions which could meet those needs.
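Assumptions 8 through 12 together imply a small amount of metadata per item. The following is only a rough sketch of what such a record might look like, not a proposed design: the class names, field names, relationship vocabulary, and handle values are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional, Set


@dataclass
class Relationship:
    # Hypothetical relationship vocabulary, e.g. "is_representation_of".
    kind: str
    # Identifier of the related object, which may live in another system.
    target_id: str


@dataclass
class ContentRecord:
    # Persistent identifier such as a handle (assumption 10).
    identifier: str
    # Where the bytes live: "CRS" or an external system (assumption 12).
    location: str
    # Groups allowed to read the content (assumption 8); "public" means open.
    access_groups: Set[str] = field(default_factory=lambda: {"public"})
    # Optional expiration date for temporary content (assumption 9).
    expires_on: Optional[date] = None
    # Links to other records, possibly in other systems (assumption 11).
    relationships: List[Relationship] = field(default_factory=list)

    def is_public(self) -> bool:
        return "public" in self.access_groups

    def is_expired(self, today: date) -> bool:
        # Content with no expiration date is kept indefinitely.
        return self.expires_on is not None and today >= self.expires_on


# Illustrative records; every identifier below is made up.
images = ContentRecord("hdl:0000/example-images", "CRS",
                       access_groups={"library-staff"})
pdf = ContentRecord("hdl:0000/example-pdf", "IDEALS")
pdf.relationships.append(
    Relationship("is_representation_of", images.identifier))
scratch = ContentRecord("hdl:0000/example-temp", "CRS",
                        expires_on=date(2009, 6, 30))
```

In this sketch the PDF held in IDEALS is recorded only by identifier and location, while the relationship entry notes that it is another representation of the JPEG2000 image set held in the CRS.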
Key Milestones
To be decided…
(6 months after first meeting)
Initial report/recommendations of Planning Team
Issues under Discussion
1. Storage Space. Library digital content and the CRS may or may not reside on Library
servers. Based on decisions of the Planning Team, it’s possible some content may reside
in the Hathi Trust repository. Depending on the space required by remaining content, it
may also be worth a discussion with CITES or NCSA about server space, etc.
2. Ongoing Staffing/Support. Based on determinations of the Planning Team, there will
need to be a broader recommendation on how best to provide ongoing support for the
CRS and the content within the CRS.
3. Purchased Digital Content. Is the CRS the most appropriate place for E-books and
electronic journals that the Library has purchased?
4. Intellectual Property. How will the CRS deal with content that may have IP/copyright
issues? Will we need some sort of license agreement for content being disseminated by
the CRS?
Primary Goals of Planning Team

Assess the Library’s needs for a CRS

Work with “Content/User Stakeholders” (see below) to determine the scope of content
and usage needs for the CRS. Generate a list of common questions to ask these
stakeholders (build off of Purdue’s Data Curation questions?)

Assess the types of digital content to be placed in the CRS, and decide scope of initial
stage.

Establish higher level needs for preserving these distinct digital content types (Video,
Audio, Images, etc.)

Establish a list of general requirements for a CRS (in terms of access needs, preservation
needs, API needs, etc.)

Perform an environmental scan of possible solutions or software available.

Scope implementation into a series of higher level stages. Help establish higher level
success criteria for each implementation stage.

Reconvene and assess the success of the CRS after each stage. Provide suggestions for
implementation or directional changes.
Deliverables

Report of CRS requirements and needs, based on types of content stored. This includes
a list of all known content to be placed in the CRS. This also may include a higher level
diagram of how content will be obtained or disseminated via CRS.

Report of environmental scan. Recommendations regarding which software solution(s)
we should be prototyping or investigating in further detail.

Recommendations for staged implementation of CRS. How many stages, general
timelines, proposed staffing of Implementation Team, etc.

A written assessment of the initial prototype, and recommendations on how to move
forward with full implementation.
Planning Team Membership

Tim Donohue, chair

Tom Habing

Joanne Kaczmarek

Emma Lincoln

Bill Mischo

Sarah Shreeves

Tom Teper

John Weible

Beth Sandore (Administrative Liaison)
Content / User Stakeholders
Although they may not be official members of the Repository Planning Team, these
stakeholders would be worth interviewing about their content or usage needs for
a Library CRS. (Tentatively, Beth Sandore has suggested we may be able to obtain a part-time
Grad Hourly to help conduct these interviews.)

Allen Renear and Amit Kumar (GSLIS / MONK project)

Lisa Hinchliffe (Learning Objects)

Betsy Kruger (Digital Content Creation / Digitization)

Chris Prom (Archives / Archon System)

Mary Stuart (History, Philosophy & Newspaper Librarian)

Miranda Remnek (Slavic Library)

Scott Wilson (Materials Chemistry Lab / Scientific content, esp. chemistry/crystals)

Michelle Wander (Morrow Plot)

Charlie Kline or Bob Booth (CITES – potential storage partners?)

Michael Grady, CIO’s Office—cyberinfrastructure research needs

Representative from the Provost’s office—Vice Provost or her designee

Jim Myers and Michael Welge, NCSA (IACAT programs)

Representative from I3

Rebecca Bryant, Assistant Dean, Graduate College