Investigations on coupledResource

Introduction

The following sections summarize the different approaches for expressing the relationship between services and their resources. Besides the latest ISO 19139, CSW AP ISO and INSPIRE TG specifications, one further specification has to be taken into consideration: ISO 19118. ISO 19118 provides the semantic definitions for object references and is the conceptual basis for ISO 19139 gco:ObjectReference. ISO 19139 uses “uuidref” and “xlink:href” as defined by ISO 19118. The INSPIRE TG shall take this into account.

Definitions by CSW AP ISO

The (informative) Annex F of the CSW AP ISO specification recommends:

To link a service metadata instance with a dataset metadata instance, the value of

    MD_DataIdentification.citation.CI_Citation.identifier.MD_Identifier.code

should be equal to either

    SV_ServiceIdentification.operatesOn@uuidref (by reference)

or

    SV_ServiceIdentification.operatesOn.MD_DataIdentification.citation.CI_Citation.identifier.MD_Identifier.code (by instance)

By providing the appropriate values either by reference or by instance, the relationship between a service and a dataset is modelled sufficiently.

A number of inconsistencies arise from this approach:

1. Semantically, it is not correct to use the attribute @uuidref that way. As per ISO 19118, A.5.5.2, “[…] The “uuidref” attribute shall be used to refer to an object within the universe of an application domain. […]”. For example, this means that an element defined as

    <first id="i05" uuid="dce:F6A120B3"> … </first>

can be referenced elsewhere by a new element like this:

    <second uuidref="dce:F6A120B3"/>

Hence, the approach given above matches a “uuidref” attribute against the element value of an identifier class instead of the uuid attribute of the target element, which is not correct.

2. It is not clear how to deal with instances of MD_DataIdentification that only provide an RS_Identifier (which is possible in theory). How shall the codespace value be considered for the coupling?
The CSW specification is not clear at that point.

3. The connections between the metadata documents are not self-contained. This means that one will always need a CSW service to find the datasets coupled with a service and vice versa. For that reason the CSW AP ISO spec defines additional queryables:
   a. OperatesOn
   b. OperatesOnIdentifier
   c. OperatesOnName

4. From an infrastructural perspective, one crucial point is that subsequent queries must always be executed within the same search scope. Example: a user finds a specific service metadata document using a distributed query. The service document indicates some coupled datasets, identified by appropriate “operatesOn” attribute values. To search for these datasets, the user must execute any subsequent query as a distributed query as well. If the query is processed only at the local level, it is likely that the coupled datasets will not be found, since they are hosted elsewhere.

Definitions by INSPIRE TG/IR

The INSPIRE Metadata TG recommends implementing the requirement defined in the INSPIRE metadata IR as follows:

    <srv:operatesOn xlink:href="http://vapxgeodev.jrc.ec.europa.eu/geonetwork/srv/eng/csw?SERVICE=CSW&amp;VERSION=2.0.2&amp;REQUEST=GetRecordById&amp;ID=f9ee6623-cf4c-11e19100017085a97ab&amp;OUTPUTSCHEMA=http://www.isotc211.org/2005/gmd&amp;ELEMENTSETNAME=full#lakes"/>

The implementation is “by reference”, using the xlink:href attribute and a GetRecordById request against a CSW instance hosting the dataset record(s). As an alternative, a list of unique resource identifiers can be given, but no example is provided for that alternative.

I see the following implications of this approach:

1. The idea of providing a “by reference” connection is to link an XML element with another (external) XML element. In our case this means: link an SV_ServiceIdentification instance with (one or more) MD_DataIdentification instances.
However, the given approach links an SV_ServiceIdentification instance with a GetRecordByIdResponse instance, which is the response to the CSW request.

2. Using the above encoding, the SV_ServiceIdentification instance is tightly coupled with the host providing the CSW service. What happens if the host name changes? This problem has already been addressed by the linked data community, however, and might be solved by convention (see http://www.w3.org/TR/ld-bp/#HTTP-URIS and http://www.w3.org/TR/webarch/#URI-persistence).

3. If we interpret it very strictly, the approach is not compliant with the INSPIRE IR. From a conceptual point of view, the INSPIRE metadata IR states: “If the resource is a spatial data service, this metadata element identifies, where relevant, the target spatial data set(s) of the service through their unique resource identifiers (URI). The value domain of this metadata element is a mandatory character string code, generally assigned by the data owner, and a character string namespace uniquely identifying the context of the identifier code (for example, the data owner).” The IR requirement is to identify “the target spatial data set(s) of the service through their unique resource identifiers”. The approach given above uses the fileIdentifier of the MD_Metadata instance containing the MD_DataIdentification instance, not the unique resource identifiers of the datasets.

Conclusions

According to the IR text, all that is required for compliance is to provide a list of unique resource identifiers, defined as code/namespace tuples, along with the spatial data service metadata. That can be done using plain CSW AP ISO techniques. But what further alternatives do we have (bearing in mind all the shortcomings listed above)?

Option one: use plain CSW approach

Each dataset metadata document has to provide a unique resource identifier, which can be referenced by a service metadata document.
Example:

    MD_Identifier.code = urn:inspire:dataset:…:abcdefg:4.1.2
    SV_ServiceIdentification.operatesOn@uuidref = urn:inspire:dataset:…:abcdefg:4.1.2

CSW queryables are used to resolve this relationship and help to search for connected documents.

Pros:
    In line with the OGC CSW AP ISO spec
Cons:
    Not self-contained: you need a CSW to find related records
    Not fully compliant with ISO 19118 semantics, since the uuidref attribute is not supposed to reference an element value, but rather the uuid attribute of the target element

Option two: use xlink:href as is

The INSPIRE metadata TG will recommend using the “xlink:href” approach with the GetRecordById request, as proposed by the latest version. The alternative, using a list of unique resource identifiers, will be deleted.

Pros:
    Best practice at the moment
Cons:
    Not strictly compliant with INSPIRE IR
    The reference to the dataset is directly given but contained in a GetRecordById response
    Bound to conventions regarding stable URLs

Option three: use xlink:href with GetRecords

To take into account the unique resource identifiers as required by the INSPIRE metadata IR, we could use a GetRecords request with the ResourceIdentifier queryable as defined in “7.2.4 Additional search properties” of the CSW AP ISO spec.

Pros:
    More in line with INSPIRE IR, since the unique resource identifiers are referenced and used
Cons:
    GetRecords is only optional via HTTP GET. This means that the INSPIRE Discovery Service TG must be extended to make the GET binding mandatory for INSPIRE.
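
To make option three more tangible, the following Python sketch builds a GetRecords request via HTTP GET that filters on the ResourceIdentifier queryable and wraps the resulting URL in an srv:operatesOn element. It is only an illustration under assumptions: the endpoint https://example.org/csw, the identifier urn:example:dataset:lakes and both function names are hypothetical, and the KVP parameter names and the NAMESPACE syntax reflect my reading of the CSW 2.0.2 GET encoding, so they should be checked against the specification and the target catalogue (some catalogues may expect a prefixed queryable name).

```python
from urllib.parse import urlencode
from xml.sax.saxutils import quoteattr

GMD_NS = "http://www.isotc211.org/2005/gmd"


def build_getrecords_url(endpoint: str, resource_identifier: str) -> str:
    """Build a GetRecords GET URL filtering on the ResourceIdentifier queryable.

    Parameter names are assumed from the CSW 2.0.2 KVP encoding.
    """
    # Double any single quote so the identifier stays a valid CQL string literal.
    literal = resource_identifier.replace("'", "''")
    params = {
        "SERVICE": "CSW",
        "VERSION": "2.0.2",
        "REQUEST": "GetRecords",
        "TYPENAMES": "gmd:MD_Metadata",
        "NAMESPACE": f"xmlns(gmd={GMD_NS})",  # assumed syntax for prefix binding
        "RESULTTYPE": "results",
        "OUTPUTSCHEMA": GMD_NS,
        "ELEMENTSETNAME": "full",
        "CONSTRAINTLANGUAGE": "CQL_TEXT",
        "CONSTRAINT_LANGUAGE_VERSION": "1.1.0",
        "CONSTRAINT": f"ResourceIdentifier = '{literal}'",
    }
    # urlencode percent-encodes the values and joins the pairs with "&".
    return f"{endpoint}?{urlencode(params)}"


def operates_on_element(url: str) -> str:
    """Wrap the URL in srv:operatesOn; quoteattr escapes "&" as "&amp;"."""
    return f"<srv:operatesOn xlink:href={quoteattr(url)}/>"


if __name__ == "__main__":
    # Hypothetical endpoint and identifier, for illustration only.
    url = build_getrecords_url("https://example.org/csw", "urn:example:dataset:lakes")
    print(operates_on_element(url))
```

Unlike the GetRecordById variant, the reference built this way carries the unique resource identifier of the dataset rather than the fileIdentifier of its metadata record. It still depends on a stable CSW host name and on the catalogue supporting GetRecords via GET.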