Erin M Sladen
Digital Libraries - e553
Tefko Saracevic
Final Term Project
Database Proposal and Mockup
Instructional Cooking Database
mock-up website:
http://eden.rutgers.edu/~ems295/553/term/home.html
Abstract
The goal of this paper is to present a proposal for a database of instructional cooking
materials. This topic was investigated because preparing and eating food is
a universal need for people of all ages, genders, occupations, and ethnic and
socioeconomic backgrounds. My hope is that a database such as the one proposed here
would allow users to grow their understanding of food preparation, nutrition, and
kitchen tool usage by bringing instructional cooking materials from the
internet, video, and print together into one searchable database. Because the
proposed database would be a large undertaking that gathers many
varied sources, legal and licensing fees may be high and
access may be restricted to paying members or institutions. This paper walks through the
steps of setting up this database, including a mockup of the proposed
database's site navigation, design, and content; the metadata to include; the digitization
standards to adopt; and how to provide access to the database.
Keywords: cooking, instructional, access, permissions, licensing, evaluation.
Table of Contents
Abstract
1. Purpose
a. Mission Statement
2. Objectives
3. Content
a. Categories
b. Images
c. Videos
d. Courses
4. Design and Mockup
5. Metadata
6. Content Management and Searching
7. Legal
8. Digitization
a. Standards
b. Costs
9. Preservation
10. Access
11. Evaluation
12. Conclusion
13. Works Cited
1. Purpose
The purpose of this database, Instructional Cooking Database or ICD, is
simple: To provide, in one centralized place, a library of instructional cooking media.
Cooking is something that nearly everyone does on a regular basis. ICD will bring
together many of the varied resources available in different media so that users can
search and access materials on any topic of their choice. This database is not meant
to be a place for recipes, although it may contain them; rather, the hope is that this
database will be a learning and instructional tool for users to consult about cooking
methods, gain experience in various areas including nutrition, watch videos from
specific instructors, and generate ideas, enabling the ordinary cook to learn and
grow.
a. Mission Statement
The purpose of ICD will be to increase access to diverse cooking instructional
materials to help grow the community's relationship with food.
2. Objectives
The objectives for this proposal are threefold:
a. To plan the basics of a digital library database, including legal, access,
searching, and other considerations,
b. To create a mock-up webpage design of the site, and
c. To learn in depth all of the steps that go into creating and
sustaining a digital library, with hopes of one day making this database a
reality.
3. Content
figure 3.1
Content will be divided in many ways to make the home page easy to browse.
Figure 3.1 shows a screenshot of the home page. A simple logo is at the top of the
page indicating that this is the Instructional Cooking Database, or ICD, directly
followed by a straightforward navigation system. Aside from the home page, there
will be four other pages directing the user towards content: a. Categories, b. Images,
c. Videos, and d. Courses. A "you-are-here" indicator tells the user which page they
are on with a bolded heading and turquoise text; in Figure 3.1, the user is on the
"Home" page.
These four pages, described in further detail below, will have options to
browse to find information. This is similar to, and loosely modeled on, the "Browse
the Collections" section of the Perseus Digital Library. Within the ICD, users can
browse the collections by selecting a topic from the main navigation menu, then
selecting further sub-topics until they find one of interest.
All of what is described here is also illustrated on the mock-up web page
for this digital library database proposal (the web page is viewable here:
http://eden.rutgers.edu/~ems295/553/term/home.html). The web page does not,
however, have the browsing capabilities I am proposing for this website. It is meant
purely as a detailed illustration of my design intentions, alongside this written
digital library proposal.
If the user does not wish to browse but would prefer to search, a search
option will also be available. Please see Section 5: Metadata and Section 6:
Content Management and Searching for more information.
a. Categories
figure 3.2
The first topic, Categories, is divided into multiple sub-topics with a
drop-down menu, as shown in figure 3.2. Clicking on a sub-topic in the drop-down
menu will take the user to a description of that sub-topic. The sub-topics are
described in more detail both below and on the Categories page.
Within each sub-topic will be clickable links that will take the user to digital
collections. These collections will be both browseable and searchable, and each
entry will contain a difficulty rating. For example, the sub-topic "Health Concerns"
will lead the user to multiple categories, including "Gluten Free." By clicking on the
link to "Gluten Free" the user will be able to access all related content within the
database that is tagged with this category. For more information on the tagging
system, please see Section 5: Metadata. This is a proposed concept that is not
available on the mock-up.
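As a minimal sketch of how this tag-based browsing might work, the Python below filters entries by a category tag and lists the easiest items first. The Entry class, its field names, the 1-to-5 difficulty scale, and the sample entries are all assumptions made for illustration; none of them exist on the mock-up.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One database entry; every field name here is hypothetical."""
    title: str
    difficulty: int                      # assumed scale: 1 (easy) to 5 (hard)
    tags: set[str] = field(default_factory=set)


def browse(entries: list[Entry], tag: str) -> list[Entry]:
    """Return every entry carrying the given category tag, easiest first."""
    matches = [entry for entry in entries if tag in entry.tags]
    return sorted(matches, key=lambda entry: entry.difficulty)


# Placeholder entries, invented purely for illustration.
catalog = [
    Entry("Sample video A", 3, {"Gluten Free", "Video"}),
    Entry("Sample pamphlet B", 1, {"Gluten Free", "Image"}),
    Entry("Sample video C", 2, {"Vegetarian", "Video"}),
]

gluten_free_titles = [entry.title for entry in browse(catalog, "Gluten Free")]
```

Clicking the "Gluten Free" link would correspond to calling browse with that tag; sorting by difficulty is only one possible ordering.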
This page and the sub-topics on it should be well-maintained throughout the
life of the database by updating lists, creating new sub-topics, and removing
outdated information. As such, the current lists of topics and sub-topics are not yet
complete, and will never truly be complete. For more on maintenance please see
Section 9: Preservation.
Some examples of the sub-topics that will be available to browse are:
i. Type
This category will consist of the different formats in which information is
available, such as video or images.
ii. Topic
A breakdown of categories for the many different topics that are covered
within this database. Included are holiday cooking, how to use tools (e.g., how
to sharpen a knife or season a cast iron pan), instructions for different
meals, and more.
iii. Instructor
Various different instructors whose work is available for viewing within the
database.
iv. Affiliation
The "Affiliation" category groups information by its original publisher or
presenter. Some examples may be videos originally shown on The Food
Network or PBS, information originally published in Cook's Illustrated
magazine, or information coming directly from blogs or websites such
as YouTube.
v. Ingredient
A breakdown of topics relating to various ingredients. As has previously been
stated, this database is not meant to be a place for recipes. While this
category will house many ingredient-related recipes, it will also be a place for
explaining things like how to cut apart a whole chicken or how to create
stock from scratch.
vi. Health Concerns
Information related to various health concerns, such as a gluten-free diet,
food allergies, or vegetarian needs.
b. Images and Documents
The content linked to in this section will provide static images or sets of images,
such as illustrations or photographs, that display cooking instructions alongside
written directions. These will be both browseable and searchable. This information
may come from a variety of sources, including cookbooks, blogs, or pamphlets.
These images will be arranged by categories similar to those listed above in the
Categories section. Each image will be its own entry in the database and be
searchable.
c. Videos
The content linked to in this section will provide instructional cooking videos
on various topics. Just like the Images, these videos will be searchable by some of
the same topics listed in the Categories section.
Two different forms of videos will be available from this page:
i. Videos that are uploaded to the database and watched while
within the database.
ii. Videos that are linked to on the database but exist externally.
Uploaded videos will either be created specifically for this database or come from
other sources and be displayed through a licensing agreement. For more
on this, see Section 7: Legal. Each individual video will have its own entry in the
database and be searchable.
d. Courses
In addition to corralling instructional cooking media that exists around the
web, I propose creating some content specifically for this database. This model is
based on Lynda Campus, an organization which creates video demos and courses to
explain technological and software-related concepts. I imagine this model being
adapted to fit my digital library of cooking techniques by either using
licensed, already created videos or by creating videos specifically for ICD. These
courses would include multiple videos on specific cooking and food-related
tasks, ranging from in-depth courses on the basics of a type of
international cuisine to quick tips and 5-minute lessons.
4. Design and Mock-up
View the Mock-up here: http://eden.rutgers.edu/~ems295/553/term/home.html
A mock-up was created for the purposes of allowing the reader to better
visualize my intentions and conceptualizations regarding the proposed ICD digital
library. The mock-up was created using HTML and CSS and has a workable
navigation structure with multiple pages to guide the user around the site. Visiting it
will give the reader an accurate depiction of my initial design plans.
figure 4.1
A logo, figure 4.1, was created for the database. This logo will also
guide the color scheme for the web page: a white background with black text and
embellishments in turquoise and orange. A simple navigation system guides the
user from one topic to the next, including links to jump directly to sub-topics.
There is still a lot that needs to be added to the mock-up in
order for this digital library to become a reality. There is not currently any content
available for viewing on the web page. For more information on adding content,
please see Section 6: Content Management and Searching. The mock-up was
necessary to best present the design and structure of the proposed digital library.
Even without content, the website is functional and easy to comprehend and navigate,
and the finalized digital library will follow suit.
5. Metadata
Metadata is information that describes an item, such as its title or
creator. Tagging items with metadata not only makes them easy to find later
but also ensures that this descriptive information is not lost. For this database I
have chosen to use the Dublin Core metadata standard, which consists of 15
elements:
i. Title
ii. Creator
iii. Subject
iv. Description
v. Publisher
vi. Contributors
vii. Date
viii. Type
ix. Format
x. Identifier
xi. Source
xii. Language
xiii. Relation
xiv. Coverage
xv. Rights Management
Dublin Core was chosen because of its simplicity and straightforward manner. For
each piece of information entered into the database, as many of these 15 metadata
elements as possible will be used to describe the entries included in ICD.
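A minimal sketch of how a partially filled record might be represented is shown below. The lowercase element names follow the Dublin Core element set; the make_record and missing_elements helpers are hypothetical, and the sample values are my own illustrative reading of the Julia Child episode discussed in this section rather than the contents of figure 5.1.

```python
from typing import Optional

# The 15 Dublin Core elements, in the order listed above.
DUBLIN_CORE_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def make_record(**known: str) -> dict[str, Optional[str]]:
    """Build a 15-element record, leaving unknown elements as None."""
    unexpected = set(known) - set(DUBLIN_CORE_ELEMENTS)
    if unexpected:
        raise ValueError(f"not Dublin Core elements: {sorted(unexpected)}")
    return {name: known.get(name) for name in DUBLIN_CORE_ELEMENTS}


def missing_elements(record: dict[str, Optional[str]]) -> list[str]:
    """List the elements that still need to be researched or left blank."""
    return [name for name in DUBLIN_CORE_ELEMENTS if record[name] is None]


# Illustrative values only; a real entry would be filled in by a cataloger.
record = make_record(
    title="The Good Loaf",
    creator="Julia Child",
    publisher="WGBH Boston",
    relation="The French Chef",
    type="MovingImage",  # a term from the DCMI Type Vocabulary
)
gaps = missing_elements(record)
```

Keeping every element present, even when empty, makes it easy to report which fields remain open for an item.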
In many instances not all 15 elements will be available for metadata
inclusion. For example, figure 5.1 shows an example of metadata added for a specific
video, the episode "The Good Loaf" from Julia Child's television show The French
Chef. This video is one example of many types of instructional media that will exist
in ICD. I found the episode through a DVD of the show, though it has at times been
available on the PBS website, and is currently streaming on YouTube. The metadata
for this item is listed as fully as possible, though there is some overlap, especially
between Identifier and Source. Because this is a 40-year-old television show, many
gaps exist, and the metadata is applied to the best of the cataloger's ability.
figure 5.1
Adding metadata is not a complex process, but it is a long and laborious
one that must usually be done by a human. Collecting content and creating
metadata for that content will be one of the biggest, most expensive, and most
time-consuming parts of making this digital library fully operational. It is also a
critical part of creating the database because it ensures that the
information added to ICD will not only be stored safely but also be findable by its users.
6. Content Management and Searching
Content that is placed in this database will need to be stored and managed,
which can be accomplished with a PHP-enabled web server and an SQL database such as MySQL.
For the purposes of this proposal, this database storage was not built into the
mock-up site.
Each entry in the database will have multiple fields describing it. In most
cases these fields will be similar or identical to the metadata elements described in
Section 5: Metadata. The metadata will therefore serve two purposes: it will
record information about an item, and it will allow a
user to find that item. When searching, the site's search engine can look through the
metadata fields to find information relevant to the query.
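The idea of searching over metadata fields can be sketched as follows. This example uses Python's built-in sqlite3 module as a stand-in for the PHP and MySQL setup proposed here, since it runs without a server; the table layout, column names, and sample rows are assumptions for illustration only.

```python
import sqlite3

# An in-memory SQLite database stands in for the proposed MySQL database.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE entries (
        id          INTEGER PRIMARY KEY,
        title       TEXT NOT NULL,
        creator     TEXT,
        subject     TEXT,
        description TEXT,
        type        TEXT
    )
    """
)
# Placeholder rows, invented purely for illustration.
conn.executemany(
    "INSERT INTO entries (title, creator, subject, description, type) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        ("Sample bread video", "Instructor A", "bread; baking",
         "Placeholder description", "MovingImage"),
        ("Sample knife pamphlet", "Publisher B", "knives; tools",
         "How to sharpen a knife", "Text"),
    ],
)


def search(query: str) -> list[str]:
    """Return titles whose metadata fields contain the query text."""
    pattern = f"%{query}%"
    rows = conn.execute(
        "SELECT title FROM entries "
        "WHERE title LIKE ? OR creator LIKE ? OR subject LIKE ? "
        "OR description LIKE ? ORDER BY title",
        (pattern, pattern, pattern, pattern),
    )
    return [title for (title,) in rows]


knife_results = search("knife")
```

The query text is passed as a parameter rather than pasted into the SQL string, which avoids SQL injection. A production system would likely want full-text indexing rather than LIKE matching.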
While there is no cost to use MySQL, setting up and maintaining the database is a
time-consuming process, and the expense of that time must be taken into account.
7. Legal
Because of the wide scope of this proposal, obtaining licensing agreements
from the owners of the videos, images, and other content I wish to use will
be one of the biggest challenges for this database. Many companies with a large
volume of instructional cooking materials impose heavy copyright restrictions or charge fees
to view, use, and replicate the information. In order to get a feel for the
legal and copyright issues surrounding this proposal, I have closely
examined The Food Network's terms of use.
The terms of use state that while the website may be accessed and used by
users at no cost, the content is the intellectual property of the Food
Network and cannot be reproduced or copied elsewhere. However, the terms of use
do state that linking to the website is allowed. The terms do not specify what types
of links are permitted, and the extensive linking planned for this database
will most likely not be allowed. A licensing agreement between the Food Network
Corporation and my ICD digital library would most likely need to be reached. I
attempted to contact the Food Network about the possibilities of licensing, but have
not received a response.
I will assume that for many current publications, licensing agreements will
need to be in place before I can use their materials. Because of this, it may be
most beneficial for the ICD digital library to focus first and foremost on public
domain items: cookbooks and magazines, pamphlets, utensil and cookware
instructions, and personal narrative accounts. If ICD begins by digitizing these
objects and properly citing their authors, it can build its database and begin
to be functional before requesting licensing agreements with large corporations.
Another option is for ICD to pull related information from the Rutgers
Libraries databases. Rutgers already has licensing agreements in place with
many information providers, which would mean that ICD would not have to negotiate
these licenses itself. In order for users to access this content through the ICD
database, they would have to be members of the Rutgers community and enter their
credentials before viewing material.
8. Digitization
Digitizing materials is an important part of every digital library or database.
It entails taking an object from its original form and converting it to
the formats required by the intended digital library. A lot of the content I intend to use
has either already been digitized or was born digital. Nevertheless, a breakdown of
digitization standards and costs is still appropriate. The highest possible standards
should be used when digitizing items. At the same time, costs should be appropriate to the
scope of the digitization.
a. Standards
The standards that I propose using will vary based on what type of object is
being digitized. Generally, any static item (text, photographs, illustrations and
artwork, maps, and more) will be scanned at a high resolution, with a minimum bit
depth of 8 bits for greyscale objects and 24 bits for color objects.
The University of Colorado's Digital Library standards are a useful guideline
for these minimum requirements. Items that may be digitized in this
way include old public domain or licensed cookbooks, magazine articles, and
pamphlets.
figure 8.1 (University of Colorado Digital Libraries, 2009, p. 4)
As Figure 8.1 shows, in addition to resolution and bit depth standards, TIF is the
suggested file format for these objects. These standards will ensure that all items
that are converted will be readable and/or viewable on a typical computer
monitor.
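The minimums described for static items could be checked automatically before a scan is accepted into the collection. The sketch below is one hypothetical way to do so; the check_scan function and its parameters are my own invention, and it tests only the bit depth and file format named in this section, since no specific resolution figure is given here.

```python
# Minimum bit depths stated in this section for static items.
MIN_BIT_DEPTH = {"greyscale": 8, "color": 24}
# File extensions accepted as TIF masters (assumed spellings).
MASTER_FORMATS = {"TIF", "TIFF"}


def check_scan(mode: str, bit_depth: int, file_format: str) -> list[str]:
    """Return a list of problems; an empty list means the scan passes."""
    if mode not in MIN_BIT_DEPTH:
        raise ValueError(f"unknown mode: {mode!r}")
    problems = []
    if bit_depth < MIN_BIT_DEPTH[mode]:
        problems.append(
            f"{mode} scans need at least {MIN_BIT_DEPTH[mode]}-bit depth, "
            f"got {bit_depth}-bit"
        )
    if file_format.upper() not in MASTER_FORMATS:
        problems.append(f"expected a TIF file, got {file_format}")
    return problems


good_scan = check_scan("color", 24, "tif")
bad_scan = check_scan("greyscale", 4, "jpg")
```

Returning a list of problems, rather than a simple pass or fail, lets the person digitizing see everything that needs correcting in one pass.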
Similarly, the University of Colorado's guidelines for digitizing
audio objects suggest, as shown in Figure 8.2, a minimum sample rate of 44.1 kHz
and a bit depth of 16 bits. For audio files, WAV or AIF file formats should be used.
figure 8.2 – Minimal Requirements for Digitizing Audio (University of Colorado Digital
Libraries, 2009, p. 7)
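To give a sense of what these audio minimums mean for storage, the following sketch estimates the size of uncompressed audio at 44.1 kHz and 16-bit depth. The function name is hypothetical, and the stereo (two-channel) default is my assumption, since the channel count is not stated here. File header overhead is ignored.

```python
def uncompressed_audio_bytes(
    seconds: int,
    sample_rate_hz: int = 44_100,   # 44.1 kHz minimum sample rate
    bit_depth: int = 16,            # 16-bit minimum depth
    channels: int = 2,              # assumption: stereo recording
) -> int:
    """Estimate the size of uncompressed PCM audio, ignoring file headers."""
    bytes_per_sample = bit_depth // 8
    return sample_rate_hz * bytes_per_sample * channels * seconds


# 44,100 samples/s x 2 bytes x 2 channels x 60 s = 10,584,000 bytes,
# or roughly 10.6 MB for each minute of stereo audio.
one_minute_stereo = uncompressed_audio_bytes(60)
```

At this rate an hour of stereo audio comes to roughly 635 MB, which is worth factoring into the storage and backup costs discussed later.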
Lastly, for standards regarding video I suggest following the New York
University Libraries standards (De Stefano et al., 2013, pp. 6-8):
i. Different File Types
1. A long-term "preservation file," which will be the master
file and will not be touched or altered.
    MOV file extension
    uncompressed, 10-bit 4:2:2 video stream
    48 kHz audio stream
2. A "mezzanine file" to serve as a surrogate for the master
file, upon which changes can be made.
    MOV file extension
    DV50 video stream
    48 kHz audio stream
3. An "access file" to serve as the general use copy for
users to view.
    WMV file extension
    Windows Media at 700 kbps video stream
    44.1 kHz audio stream
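The three file types above can be written down as a small table of profiles, and the access file's 700 kbps video rate allows a rough size estimate. The sketch below is illustrative only: the Derivative class and function names are hypothetical, 1 kbps is taken to mean 1,000 bits per second, and the estimate covers the video stream alone because no audio bit rate is given for the access file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Derivative:
    """One of the three files kept for each video (hypothetical structure)."""
    role: str
    container: str
    video: str
    audio_khz: float


DERIVATIVES = (
    Derivative("preservation", "MOV", "uncompressed 10-bit 4:2:2", 48.0),
    Derivative("mezzanine", "MOV", "DV50", 48.0),
    Derivative("access", "WMV", "Windows Media at 700 kbps", 44.1),
)

# Assumption: 1 kbps = 1,000 bits per second.
ACCESS_VIDEO_BITS_PER_SECOND = 700_000


def access_video_bytes(duration_seconds: int) -> int:
    """Estimate the access file's video stream size (audio not included)."""
    return ACCESS_VIDEO_BITS_PER_SECOND * duration_seconds // 8


# 700,000 bits/s x 1,800 s / 8 = 157,500,000 bytes, about 157.5 MB.
half_hour_episode = access_video_bytes(30 * 60)
```

The preservation and mezzanine files will be far larger than this, so the access copy is the only one that is practical to stream to users.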
b. Costs
Quality scanners and other equipment would be required to convert static images
into usable digital formats, and they are one example of the cost of digitizing for this
collection. For a digital camera, flatbed scanner, slide and film scanner, and
document scanner, the ICD digital library could potentially pay $3000 or more for
equipment alone. Additionally, there would be costs to employ workers to
digitize items, create metadata, and upload files to the proper portions of the database.
The cost of digitization may be high for these reasons, in addition to licensing
fees, but this is an essential part of database creation that cannot be overlooked.
9. Preservation
Preservation of the resources in this database will be critical to the long term
survival and use of the database. Preservation will include site management and
updates to be sure that the pages within the site remain relevant in terms of both
content and HTML/CSS and browser requirements.
Preservation will also include backing up the data stored in the digital
library. This will be done physically, by keeping copies of the data in multiple places, such as
on several different hard drives. Preservation will also be accomplished by saving
that data in multiple formats to guard against format obsolescence.
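Backing up to multiple places is only useful if the copies stay intact. One common way to confirm this, which I am adding here as an assumption rather than as part of the plan above, is to compare checksums of each copy against the master file. The sketch below uses temporary folders to stand in for separate hard drives; the folder and file names are placeholders.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute a file's SHA-256 checksum, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(master: Path, destinations: list[Path]) -> list[Path]:
    """Copy the master file into each destination folder."""
    copies = []
    for folder in destinations:
        folder.mkdir(parents=True, exist_ok=True)
        copy = folder / master.name
        shutil.copy2(master, copy)
        copies.append(copy)
    return copies


def damaged_copies(master: Path, copies: list[Path]) -> list[Path]:
    """Return the copies whose checksum no longer matches the master."""
    expected = sha256_of(master)
    return [copy for copy in copies if sha256_of(copy) != expected]


# Temporary folders stand in for separate hard drives.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    master = root / "master.tif"
    master.write_bytes(b"placeholder image data")
    copies = back_up(master, [root / "drive_a", root / "drive_b"])
    all_intact = damaged_copies(master, copies) == []
    copies[1].write_bytes(b"corrupted")  # simulate damage on one drive
    damaged_names = [c.parent.name for c in damaged_copies(master, copies)]
```

Running a check like this on a schedule would flag a failing drive while good copies still exist elsewhere.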
Like metadata and content management, preservation is a costly and time-consuming
process, but it should not be overlooked and is necessary for the long-term
survival of this and any digital library.
10. Access
The ICD digital library would be accessible through a user-fee model
similar to that of other databases. However, because this database is
intended for use by the general public, I believe it is important to consider methods
that would make this fee as low and reasonable as possible.
One option would be to restrict access to only part of the database. Content
that is pulled from the public domain, or merely cataloged in the database and then linked to
at its original source, as is discussed in Section 7: Legal, should be available without
cost to anyone. Other content that is under a licensing agreement will carry a one-time fee
per item or an umbrella yearly membership fee, depending on whether the user is
an institution or an individual.
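The access rules in this paragraph can be expressed as a short decision function. This is only a sketch of the logic: the Rights categories, the can_view function, and its parameters are hypothetical names, and no actual fee amounts are modeled.

```python
from enum import Enum


class Rights(Enum):
    """How an item entered the database (hypothetical categories)."""
    PUBLIC_DOMAIN = "public domain"
    EXTERNAL_LINK = "cataloged and linked to its original source"
    LICENSED = "under a licensing agreement"


def can_view(rights: Rights, has_membership: bool, purchased_item: bool) -> bool:
    """Decide whether a user may view an item under the proposed model."""
    if rights in (Rights.PUBLIC_DOMAIN, Rights.EXTERNAL_LINK):
        return True  # free to anyone
    # Licensed content needs a yearly membership or a one-time item fee.
    return has_membership or purchased_item


visitor_sees_public = can_view(Rights.PUBLIC_DOMAIN, False, False)
visitor_sees_licensed = can_view(Rights.LICENSED, False, False)
```

Storing the rights category with each entry, for example in the Dublin Core rights element, would let the site apply this check automatically.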
For institutions that pay a yearly fee, the database will be accessible through
a link on their web page, similar to how many databases are accessible via the
Rutgers Library website, requiring only that the user sign in with his or her
credentials. For users who are accessing the database without an institution, the
mock-up web page will be turned into a landing page for users to sign in to their
membership and access the content that way.
11. Evaluation
Evaluation of a resource such as a digital library is an important part of its
development and maintenance. In order to be sure that the ICD digital library is up
to standards, presents its information in an appropriate way, and uses its funds in a
way that correlates to what its users require from it, I have chosen six evaluation
criteria:
i. Content:
The content in ICD should be a good representation of the available
literature and information. It should be organized in a way that is obvious
and easy to understand to the user. It should be presented in an honest,
straightforward fashion. My plans to organize and present the information,
as detailed above in Section 3: Content, work together with my plans for
metadata and searching, see Section 5, to create easily findable and usable
content. Other things that should be evaluated are relevancy and accuracy for
each item and for the database as a whole.
ii. Technology
Do the hardware and software work for the purposes of this library? One
example is the SQL/PHP setup for adding content to the digital library. This
may be a successful means of adding content, but proper evaluation will
ensure that its performance remains up to par. Technology evaluation will
also focus on cost-effectiveness and on how easily the
technology is accessed on both the user's and creator's sides.
iii. Interface
The interface of the digital library should be usable, meaning it should
support user interaction. Users should be able to use the site easily.
Accessibility should be high, the error rate should be low, and the site's
organization should support the interface. The interface should be visually
appealing and consistent throughout the library.
iv. Process and Service
A wide range of services should be offered to the user through the site:
straightforward navigation, searching enabled, low error rate, easy browsing,
and straightforward means of obtaining a desired resource within the
database.
v. User
The use of the ICD digital library should positively affect users. Evaluation of
this criterion should ensure that users are receiving information, in the areas
they are searching, that makes a real difference in their lives. The content and
information received should positively affect their cooking and nutritional
tasks going forward.
vi. Context
The idea of an instructional cooking digital library makes sense in the current
world as presented: cooking is something that most of us do on a daily basis.
As long as this is true, the database should focus on bringing together all of
the relevant cooking methods, information about food and tool usage, and
more that fit into the context. Adding and removing appropriate content will
be important to retaining this context.
While the plans for evaluation listed here are straightforward, they are not set in
stone and should be managed and updated frequently, as the need presents itself. It
may also be necessary to evaluate costs of salaries, equipment, and other fees. The
key to successful evaluation is consistent evaluation.
Some of my proposed means of evaluating this digital library in the future
are:
a. User surveys
b. Statistical reports of downloads and download times
c. Reports of searching: When did a search lead to a download? When was it
repeated multiple times without a satisfactory result? How often was it
abandoned?
d. Personal observations and reactions
e. Focus Groups
f. User Interviews
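The search reports described in item c could be produced from a simple log of search events. The sketch below shows one possible set of definitions: a search is "repeated" when the same session issues the same query more than once without a download, and a session is "abandoned" when it ends with no download at all. The SearchEvent structure, these definitions, and the sample log are assumptions for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class SearchEvent:
    """One logged search (hypothetical log format)."""
    session: str
    query: str
    downloaded: bool   # did this search lead to a download?


def search_report(events: list[SearchEvent]) -> dict[str, int]:
    """Summarize downloads, repeated searches, and abandoned sessions."""
    sessions = {e.session for e in events}
    converted = {e.session for e in events if e.downloaded}
    attempts = Counter((e.session, e.query.lower()) for e in events)
    successes = {(e.session, e.query.lower()) for e in events if e.downloaded}
    repeated = sum(
        1 for key, count in attempts.items()
        if count > 1 and key not in successes
    )
    return {
        "searches": len(events),
        "searches_with_download": sum(e.downloaded for e in events),
        "repeated_without_result": repeated,
        "abandoned_sessions": len(sessions - converted),
    }


# Placeholder log, invented purely for illustration.
log = [
    SearchEvent("s1", "sharpen knife", True),
    SearchEvent("s2", "gluten free bread", False),
    SearchEvent("s2", "gluten free bread", False),
    SearchEvent("s3", "season cast iron", False),
    SearchEvent("s3", "cast iron pan", True),
]
report = search_report(log)
```

Queries that are often repeated without a result point to gaps in either the content or its metadata, which ties this report back to the Content criterion above.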
12. Conclusion
I proposed this project because cooking is something that I do daily. I find a
lot of joy in cooking, but I am also frustrated when I want to know how to do
something but cannot easily learn how. A centralized place of knowledge for all
cooking instructions would be exceptionally helpful to me and, I believe, to many
other people.
The steps outlined here are vast, laborious, and complicated. I have done the
first big step by outlining my ideas for this database, creating the HTML mock-up to
showcase my proposed design, describing various content, and explaining strategies
for the database, such as metadata, licensing, and preservation, among others.
Making the database as described here a reality would clearly be an incredible
undertaking and is, at this moment, well beyond the scope of what I am capable
of.
However, I do believe that I would be able to begin to materialize a functional
database by taking small steps and not attempting to fulfill all the sections listed
here at once. If I were to go forward, my first step would be to create a
database-driven website using PHP and SQL. From there, I would introduce content individually, starting
solely with links to Rutgers' licensed materials. This way, anyone who is a Rutgers
member would be able to access the materials. This strategy would mean I wouldn't
be stuck immediately trying to figure out handling access or licensing issues which,
because of the various high profile media available and desirable for use, would be a
monumental and exceptionally expensive task that I could not take on by myself.
This is something I sincerely hope to work on in the future,
beyond the scope of this class. I have already learned so much just by setting up a
proposal and thinking deeply about how I would carry out my plans. I know that I
could learn a lot more by moving the proposal forward into the next phase.
13. Works Cited
Bibliographical Center for Research (BCR). (2008). BCR's CDP digital imaging best
practices, Version 2.0. Retrieved from http://mwdl.org/docs/digitalimaging-bp_2.0.pdf
Bridges. (2007). 15 Dublin core element attributes. Minnesota Metadata Guidelines –
Dublin Core. Retrieved from http://mn.gov/bridges/dcore.html
Crane, G. R., ed. Browse the collections. Perseus Digital Library. Retrieved from
http://www.perseus.tufts.edu/hopper/collections
De Stefano, P., et al. (2013). Digitizing video for long-term preservation: An RFP
guide and template. New York University Libraries. Retrieved from
http://library.nyu.edu/preservation/VARRFP.pdf
The Food Network. (2014). Terms of use. Retrieved from
http://www.scrippsnetworksinteractive.com/terms-of-use/
Gilliland, A. J. (2008). Setting the stage. Introduction to Metadata: Pathways to digital
information. Online edition. Version 3.0. Los Angeles, CA: The J. Paul Getty
Trust. Retrieved from
http://www.getty.edu/research/publications/electronic_publications/intrometadata/setting.html
Hillmann, D. (2001). Generic examples. Metadata Dublin Core Usage Guide. Retrieved
from
http://dublincore.org/documents/2001/04/12/usageguide/generic.shtml
Ickes, M., & Gambescia, S. (2011). Abstract art: how to write competitive conference
and journal abstracts. Health Promotion Practice, 12(4), 493-496.
International Federation of Library Associations and Institutions (IFLA). (2002).
Guidelines for digitization projects for collections and holdings in the public
domain, particularly those held by libraries and archives. Retrieved from
http://www.ifla.org/VII/s19/pubs/digit-guide.pdf
Lynda.com. Software training and tutorials. Lynda Campus. Retrieved from
http://www.lynda.com
Metatags.org. (2014). How to use Dublin core metadata set. Dublin Core Metadata
Initiative. Retrieved from
http://www.metatags.org/dublin_core_metadata_element_set
PBS. (2013). Site terms of use. Retrieved from
http://www.pbs.org/about/policies/terms-of-use/
Quam, E. (2002). Minnesota metadata guidelines for Dublin core metadata: Training
manual. St. Paul, MN: Minnesota Department of Natural Resources. Retrieved
from http://mn.gov/bridges/bestprac/training.pdf
Reese, W. (2014). Digital libraries term project: The North American Saxophone
Alliance digital library proposal.
Saracevic, T. (2014). PowerPoint lecture: Evaluation in digital libraries. Retrieved
from
http://comminfo.rutgers.edu/%7Etefko/Courses/e553/Lectures/Lecture09_Evaluation1.ppt
Scott, A. (2008). Planning for successful digital imaging projects. Thinking Outside
the Borders (151-156). Urbana-Champaign, IL: Mortenson Center for
International Library Programs at the University of Illinois. Retrieved from
http://www.library.illinois.edu/mortenson/book/20_digitalimaging.pdf
University of Colorado Digital Library. (2009). Digitization best practices. Retrieved
from https://www.cu.edu/digitallibrary/cudldigitizationbp.pdf
w3schools.com. RDF Dublin core metadata initiative. Retrieved from
http://www.w3schools.com/webservices/ws_rdf_dublin.asp
WGBH Boston. (1972). The French Chef.