October2012_10_25_C... - Data Documentation Initiative

October 2012-10-25
Arofan, Olaf, Dennis, Dan S., Ornulf, Larry
Group 1 and 3 combined
Brigitte: basic objects in core as abstracts, along with relationships as abstract
Risk of complication, might blow up
How does model affect bindings?
Nicer to have a controlled vocabulary of named relationships
Would be nice to have a hierarchy of relationships, able to navigate up the tree.
Alternative ???? allow relationships to restrict type after the fact
Dan S.: identification doesn’t include type, so it is not easy to validate in Schema
Brigitte: look at XLink; give URI for SOA
Dan: remember the archival nature; model only with identity, not resolution of URI. How do you validate?
B.: pull back; not relationship as a class
Dan G.: the concept region of GSIM is more complete
Achim: should be incorporated into DDI, not directly related to this modeling exercise
Group 3
Dan S. principle use least obtuse (obscure) features of UML Schema and
Term: Functional modules
Key question: model things that exist; relationships aren’t objects
Validation is a big problem: can’t validate identification
1 Support for named relationships (property names in DDI3) within functional modules
2 Support for named relationships across functional modules
3 Support for named relationships across functional modules - wild (completely new relationship)
2 and 3 create complications for application development
Name of relationship as property
Example surveys and datasets
One binding for datasets
One binding for surveys
One binding for the combination
Is there a payoff for complexity – does the additional expressiveness add
Treating relationship as abstract class doesn’t show direction?
Lots of abstract objects makes RDF more difficult, makes reasoning difficult
Principle: adopt best practices for consuming technology
As application developer – would not resolve predicate URI
Design Principle – By default metadata should be modeled as items. Items are reusable
Design Principle: Bound artefacts should be designed to support “graceful degradation”. This stops us from building corner cases.
Principle: Generic functions should directly import the core
XML Document type for each combination of modules, plus a doctype which can
What is a reasonable number of bound document types?
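As a rough illustration of the question above, the following is a minimal sketch that assumes one bound document type per non-empty combination of functional modules. The module names and the "Document" suffix are illustrative assumptions, not part of the agreed model. With two modules (surveys and datasets) it yields the three bindings listed in the example earlier in these notes.

```python
from itertools import combinations


def doctype_combinations(modules):
    """Return every non-empty combination of modules (one bound doctype each)."""
    combos = []
    for size in range(1, len(modules) + 1):
        combos.extend(combinations(modules, size))
    return combos


# Hypothetical module names, used only for illustration.
modules = ["Survey", "DataSet", "Preservation"]
combos = doctype_combinations(modules)

for combo in combos:
    # Hypothetical naming scheme: joined module names plus "Document".
    print("".join(combo) + "Document")

# n modules give 2**n - 1 combinations, so the count grows quickly.
print(len(combos))
```

The count doubles (plus one) with every module added, which is one way to frame the concern about how many bound document types are reasonable.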
Model proposal
1) Unclear what the benefit is
Does not achieve type safety in XML schema easily – cannot be enforced
Why have a hierarchy of relationships?
2) Model becomes much harder to understand
3) Model becomes much harder to bind
Abstract objects in code (one for each non-abstract object)
4) Destroys the benefit of a modular design (core changes with each new module)
5) XLink is XML-specific and poorly supported. We also need to support other identification systems (ISO/IEC 11179 etc.)
6) How is this better than named properties?
We have created a FunctionalGroup object, named DataDescriptionGroup, in its own module. It points to the DataDescription object in the Simple Data Description module. We are trying to support the function of describing a file of observational data in a basic fashion (enough metadata to produce an SPSS file, for example). The DataDescription object is the entry point, which you learn from processing the DataDescriptionGroup object. The Core namespace is implicitly included.
1) Generate a root element (doctype) based on the name of the function (“______Document”), which is a subclass of the abstract core:Document object. (This extension is reflected in the XML schema.)
2) Because DataDescription is identifiable but not an item, I declare it as an element in the schema, <DataDescription>. Each of its properties is expressed as an XML attribute, a contained element, or a Reference (if it is a non-composition relation to an identifiable object).
3) For each referenced item type, we create an artificial element “______Collection” to hold the metadata.
4) Depending on how the dependencies cross modules, there could be an approach as follows: reference objects for modules outside of those being used with external references, indicated using fixed attributes (isExternal="true"). This is to be avoided if possible.
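The binding steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Variable` item type, the `id` and `ref` attribute names, the `...Reference` element naming, and the use of "DataDescription" as the function name are all hypothetical, and no namespaces or schema validation are shown.

```python
import xml.etree.ElementTree as ET


def build_document(function_name, entry_point, properties, references):
    """Build a bound document for one functional group.

    properties: simple values, written as XML attributes on the entry point.
    references: {item_type: [item_id, ...]} for non-composition relations.
    """
    # Step 1: root element (doctype) named after the function.
    root = ET.Element(function_name + "Document")

    # Step 2: the identifiable entry-point object becomes an element.
    entry = ET.SubElement(root, entry_point)
    for name, value in properties.items():
        entry.set(name, value)

    # Step 3: one artificial "...Collection" element per referenced item type.
    for item_type, item_ids in references.items():
        collection = ET.SubElement(root, item_type + "Collection")
        for item_id in item_ids:
            # Reference from the entry point (hypothetical element name).
            ET.SubElement(entry, item_type + "Reference", {"ref": item_id})
            # The item metadata itself is held in the collection.
            ET.SubElement(collection, item_type, {"id": item_id})
    return root


doc = build_document(
    "DataDescription",            # assumed function name
    "DataDescription",
    {"id": "dd1"},
    {"Variable": ["v1", "v2"]},   # hypothetical referenced item type
)
print(ET.tostring(doc, encoding="unicode"))
```

An external reference as in step 4 would add a fixed attribute such as `isExternal="true"` to the reference element and omit the item from the collection, which is exactly what makes single-file exchange harder.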
ISSUES –
1) how do you tell from the model that related modules are related? A single item for an entry point needs to exist in the model
2) These external references in 4 above represent problems for instances for single file data exchange or preservation
[Diagram labels: Preservation; SurveyDescriptionGroup, DataDescriptionGroup, PreservationDescriptionGroup; Core; Simple Survey, Advanced Survey, Simple DS, Advanced DS; Data set; Survey]
Model
Core: real objects used everywhere; when in doubt, leave it out
Base: abstract
RDF Bindings
[use case as for XML] (generating turtle)
For each object in UML model, declare an OWL class and assign it a URL.
Named Graphs Best Practice
Where we have a doctype in the XML, we would use named graphs in the RDF
We might want to also use named graphs for item graphs (an item and all referenced items)
Graphs could have the same URI as their entry-point objects
ISSUE: how to deal with language translations in labels?
Do named graphs export to doctypes in XML?
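A minimal sketch of the RDF binding ideas above, written with plain string formatting rather than an RDF library. The `http://example.org/ddi/` namespace and the object names are hypothetical. Note that Turtle itself has no syntax for named graphs, so the graph block below uses TriG, which extends Turtle with them.

```python
# Hypothetical base namespace for the generated URIs.
BASE = "http://example.org/ddi/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def owl_class_declarations(class_names):
    """Declare one OWL class, with its own URI, per object in the UML model."""
    lines = ["@prefix owl: <http://www.w3.org/2002/07/owl#> .", ""]
    for name in class_names:
        lines.append(f"<{BASE}{name}> a owl:Class .")
    return "\n".join(lines)


def named_graph(entry_point_uri, triples):
    """Wrap triples in a TriG graph block that shares its entry point's URI."""
    body = "\n".join(f"  <{s}> <{p}> <{o}> ." for s, p, o in triples)
    return f"<{entry_point_uri}> {{\n{body}\n}}"


print(owl_class_declarations(["DataDescription", "Variable"]))
print()

entry = BASE + "instances/dd1"  # hypothetical entry-point object
print(named_graph(entry, [(entry, RDF_TYPE, BASE + "DataDescription")]))
```

Giving the graph the same URI as its entry-point object, as suggested above, keeps the mapping between an XML doctype instance and its RDF counterpart easy to follow, though whether that mapping round-trips is the open question noted in these notes.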
The purple problem
Artefact that just has one object
Outcomes – to do:
1 respond to modeling, discuss core
2 finish XML binding discussion
Simple as possible
3 discuss RDF binding
4 discuss SOA
Ornulf: properties, relationships and behavior; behavior is not our purview (only queries, no puts and deletes)
5 document everything
6 RDB binding