Chaos or control: a game of metadata content standards
In which you do get to be the supreme ruler.

Content standards, such as the Anglo-American Cataloguing Rules, second edition (AACR2), for library catalog records, are one means to increase the reliability and quality of metadata. Following AACR2 makes it more likely, for example, that two catalogers will record a book’s title with the same words, punctuation, capitalization, and language.

Creating such standards is a challenging task. While a basic guideline might fit most cases, it is also necessary to identify and describe foreseeable variations. In AACR2, for example, a book’s primary identifier is the “title proper.” The title proper is taken from the book’s title page. But what if the book doesn’t have a title page? What if the title on the cover and the title on the title page are different? What if the title on the title page has a spelling error? All these possibilities are described in the cataloging rules.

Your Mission

In this activity, you will, in a group of 3-4 people, draft a set of content standards to describe the ingredients of recipes. You are creating a metadata catalog to describe a recipe library, and one of the catalog’s goals is to enable users to find recipes that include one or more specific ingredients. Your metadata schema, therefore, includes a field for ingredients. But how are these ingredients to be recorded in the catalog? You need to develop guidelines that enable some level of reliability in the Ingredients field.

Here are some questions that you might consider:
What is the source of information for the ingredients? (The Ingredients portion of the recipe, if there is one? The main text of the recipe?)
Should the terms for each ingredient be standardized?
Should the cataloger describe variations? In what cases?
Is all text in an Ingredients area of a recipe part of the “ingredients”?
Should measurements and amounts be standardized? (Or should these be included at all?)
Should numbers be standardized?
Should the language be standardized?
Should punctuation and capitalization be standardized?
Should the order of elements be standardized (that is, ingredients and amounts, if amounts are included)?

To help you create these standards for ingredient description, examine the included examples of recipe texts and note the potential variation in representation for each document. For example, some recipes list water in the ingredients; others don’t. How should water (and perhaps salt and pepper!) be handled? What should a cataloger do for optional ingredients, or for recipes where potential substitutions are enumerated (as in fresh or frozen spinach, but not both)? While all the recipes are in English, some ingredients are described in other languages; what should a cataloger do about that?

INF 384C, Fall 2009

Deliverables

On the following page, create a set of written guidelines that explains how a cataloger should describe recipe ingredients. Remember that the objective motivating this description is to enable users of a recipe library to find recipes where one or more specific ingredients are used.

Oh man, you mean we have to write down our guidelines? What a pain!

Yes, you do have to write them down. Part of the process for creating standards is not just to establish guidelines but to determine how to communicate them so that others can interpret them consistently. Structure your guidelines so that the most basic rules are first; then consider what should be done in more complicated cases.

As you create your guidelines, think about your motivating rationale: Is it to save the time of catalogers? To better facilitate user needs? To enable easier sharing of data between multiple recipe libraries? (Probably it will be some combination of these, but you should consider how some goals might be facilitated and others hindered by your potential guidelines.)

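
If it helps to make these trade-offs concrete, here is a minimal Python sketch of one possible guideline in action: "record each ingredient under a single preferred term." It is only an illustration, not a recommended solution. The three term mappings come from the Glossary in this handout; the table name, the function names, and the two catalog records are all invented for the example.

```python
# Hypothetical controlled vocabulary: variant term -> preferred term.
# The three mappings follow the handout's Glossary; a real table would be
# whatever your group's guidelines decide.
PREFERRED_TERMS = {
    "palak": "spinach",
    "jeera": "cumin seed",
    "hing": "asafoetida",
}


def normalize(term):
    """Lowercase, collapse whitespace, and map a variant to its preferred term."""
    cleaned = " ".join(term.lower().split())
    return PREFERRED_TERMS.get(cleaned, cleaned)


def find_recipes(catalog, wanted):
    """Return titles of recipes whose Ingredients field contains every wanted term."""
    wanted_terms = {normalize(w) for w in wanted}
    return [
        title
        for title, ingredients in catalog.items()
        if wanted_terms <= {normalize(i) for i in ingredients}
    ]


# Two invented records, entered the way two catalogers might without guidelines.
catalog = {
    "Palak dal": ["Palak", "moong dal", "jeera", "ghee"],
    "Spinach soup": ["spinach", "Cumin seed", "water"],
}

print(find_recipes(catalog, ["spinach", "cumin seed"]))
```

With the normalization step, a search for spinach and cumin seed finds both records; comparing the raw strings instead would miss "Palak dal" entirely. Notice the cost on the other side, though: somebody has to build and maintain the table of preferred terms, which helps users and data sharing at the expense of catalogers' time.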
After your guidelines have been drafted, we’ll pass them around from group to group so that you can see the solutions designed by your classmates as well.

Glossary

To help with the recipe examples, here are some translations of terms that you might encounter:
Dal: generic term for a split dried bean, often a form of lentil
Garam masala: a spice mixture that may include cinnamon, clove, cardamom, cumin, coriander, ginger, and black pepper
Ghee: clarified butter
Gram: a legume (this is actually English!)
Hing: asafoetida
Jeera: cumin seed
Masoor dal: a red (or pink) lentil
Moong dal: split mung beans; looks like a yellow lentil when the skin is removed (the skin is green)
Palak: spinach
Roti: basic Indian flatbread, cooked on a griddle
Sambar powder: a spice mixture that may include red chili, coriander, fenugreek, and several dals (dried and ground)
Toor dal: another yellow lentil

Also note that British volume measures (cups, pints, etc.) are different from American ones, even though they have the same names.

Content guidelines for the Ingredients field of a recipe catalog