Requirements for DC V2.0

Hi Stu,

Here are my notes from the breakout session on V2 requirements. This breakout group met Tuesday afternoon.

Before the meeting I had talked briefly with Cliff and Cecilia. The two basic requirements we thought we had going into the meeting were:

1) Ease of migration (aka backward compatibility with DC 1.x).
2) Avoid second-system syndrome (more of a goal than a requirement, but I digress).

But when the group met and started considering the basic questions of "what problem are we trying to solve" and "what are the constraints on the solution", we quickly found that we do not have a crisply defined problem to solve. This lack of specificity in the problem description is the first order of business that needs to be addressed in the 2.0 effort.

The initial purpose of the Dublin Core was "simple resource discovery". We have noted strong push-back on all three of these words. Some of the discovery scenarios that have been postulated are anything but simple. We see a desire to describe 'resources' that are a long way from the initial conception of resources as "document-like objects". Finally, purposes such as administration and rights-clearance are being added to the original purpose of 'discovery'.

At the same time, the group was not able to articulate a crisp boundary between what was and was not in the scope of the 2.0 effort. The core is very general and can be used to describe a wide range of things. Using it to describe people is one example of pushing it to an extreme.

So, "effectiveness" is an important notion. The DC is most effective at describing web resources that are 'document-like objects' (DLOs) that were 'born digital'. Describing digital renditions of physical objects, or describing objects that are not like documents, is a less effective use of the DC. We can imagine a bell curve centered over "DLOs" and tailing off as we describe things that are not like documents.

The abstract vs. concrete nature of a resource is another axis for effectiveness.
If we go with the notion of abstract works like "Hamlet", the DC can describe them, but less effectively than if we were describing Project Gutenberg's particular instance of Shakespeare's playscript. Events, People, Places, Services, Collections, ... are some of the things people may want to describe as part of DC 2.0 records, and they are outside the most effective use of DC 1.0.

Some of the problems that we are trying to solve were vaguely articulated:

1) interoperability between different communities
2) lessen the ambiguity of the 1.x definitions and usages
3) sounder logical model than 1.x
4) retain purpose of simple resource discovery

Listing the constraints that 2.0 would have to live under seemed to be an easier job. We very quickly listed:

1) Compatibility with 1.x descriptions (where "compatibility" is only loosely defined).
2) Compatibility with the Warwick Framework. (RDF and XML namespaces appear to be a reasonable implementation path that makes this feasible to meet.)

Random notes:

One question that arose was just what we meant when we said "interoperable" descriptions. Does that mean anything more than machine-translatable field identifiers?

It may be very important to reaffirm the "optional" and "repeatable" principles of the DC, or to explicitly repudiate them.

We were also able to come up with a negative requirement: Perfectly reversible transformations between 1.0 and 2.0 records are explicitly NOT a requirement.

Explicit Conceptual Model:

A fair amount of our time was spent talking about explicit conceptual models. "David's green boxes" is one proposal for a conceptual model behind the Dublin Core. A lot of current thinking (e.g. IFLA, Delsey) about bibliographic description is based on such conceptual models. This seems to be inspired in very great measure by techniques from the computer science community such as Entity-Relationship modeling and Object-Oriented modeling.
Basing DC 2.0 on an explicit conceptual model may present a problem with preserving backward compatibility. 2.0 would be considered NOT compatible with 1.x if there was any one element in 1.0 that would be split into two or more elements in 2.0 and there was no algorithmic procedure for deciding the proper 2.0 element to use. As an example, assume that 2.0 has an explicit notion of "abstract work". DC 1.0 "Creator" would encompass both creators of abstract works, such as William Shakespeare for Hamlet, and creators of concrete instances of the work, such as Kenneth Branagh for one of the many movies of Hamlet. Nothing in a 1.0 record says which kind of creator is meant, so a converter could not choose between the two automatically. (Now, whether this is a practical problem is another story. If the percentage of DC 1.0 records that describe abstract works is low, then this is not a real problem.)

Summary:

The major conclusion of the breakout group was that the DC 2.0 effort should begin by stating the purpose of the change, defining the problem that we are trying to solve, and taking a look at the COSTS and the BENEFITS of the change from 1.x to 2.0.

This analysis might begin by considering the uses of the Dublin Core. A couple of examples:

1) We expect most DC descriptions to be obtained from more detailed information. For example, a down-translation of information from a MARC catalog.
2) We expect users to start queries with DC fields. Then, if and when their search brings them into a particular domain, their search tools will start showing more of the domain-specific information that does not have a natural fit to the core.

A second conclusion was that we should not introduce feature creep into the core to address problems that are handled by the Warwick Framework.

Ron Daniel Jr.
DATAFUSION, Inc.
139 Townsend Street, Ste. 100
San Francisco, CA 94107
415.222.0100
fax 415.222.0150
rdaniel@datafusion.net
http://www.datafusion.net
