PID Collections WG Case Statement

Comments on PID Collections WG Case Statement
Adam Farquhar
Note: the OAB requested that I provide feedback reflecting the OAB perspective on the PID Collections WG Case
Statement. I will try to focus on issues related to potential impact and adoptability in an organisational context rather than
on technical issues, although some overlap is inevitable, as some technical aspects may present barriers to adoption
and re-use.
The comments below may seem a bit harsh, but they are meant to be constructive. There is no widely adopted approach in
this space (for data) and there does appear to be some demand for one. There is definitely room for RDA to contribute.
Comments:
1. The value proposition could be strengthened substantially. The ‘adoption plan’ text suggests several clear
examples of community need that are easy to understand and compelling. They could be distilled and brought
forward to provide a crisper value proposition.
2. The examples in the adoption plan are largely within single communities. I would recommend
highlighting/evidencing (a) that some communities have strong established practices supporting
compound/aggregation objects; (b) that some communities have valued tools and services built around these
aggregated objects; and (c) that some communities would further benefit from aggregated objects that cross
boundaries but are problematic under the existing aggregation approaches. Points (a) and (b) suggest that there
is value in addressing the problem; (c) suggests that a generalised solution would be of value.
3. There are unevidenced claims made about benefits. It would be good to at least point to community
members or studies calling for unification, or to a vendor/service provider who perceives market advantage
(and to explain why they do not pursue it on their own). I agree, particularly for cross-community interactions. However, we
could assume that anything that promotes common practices, particularly within single communities, will be a
benefit.
4. The engagement with existing work seems weak. While the text expresses an intent not to propose an alternative, it
then goes on to state it will propose an API and implementation that strongly overlaps with existing deployed APIs,
systems, and services. I would recommend that the co-chairs actively look at engagement with existing efforts and
service providers (especially ones with cross-disciplinary and multi-national services) – who may be interested in
better supporting the use-cases that emerge through the WG.
5. The connection to fragment identifier schemes seems weak. At one level, it seems difficult to distinguish between
‘collection identifier’ and ‘fragment identifier’ schemes. They both refer to a whole and have methods for
referencing the parts. To make this case statement compelling, it should provide some strong differentiation with
existing schemes.
6. In order to be broadly applicable, it is essential for a data collection/aggregation approach to deal with issues
related to conditions of use, privacy, data protection, and related topics. While there is plenty of great open data,
there is also a lot of data that is not available on such terms. This includes, for example, information subject to
data protection regulations. While it may not be the focus of work in the first period, any framework that does not
extend to cover these sometimes complex issues will have limited applicability.
7. The sustainability and adoption plan is weak. I would recommend explicit activity within the WG to engage with
service providers.
More comments below
1. The distinction between the concrete objectives/outcomes/outputs and the desired, ‘bonus’ outcomes/outputs
could be made clearer. For example, under the heading “Goals and Work Plan”, it states “…..With respect for
such a unifying API and the community use case, added-value tools should be discussed that offer direct benefits
to community end-users…..” Will these tools be highlighted or made available along with the API and
demonstrator?
Also, the distinction between the processes and the outcomes/outputs could be made clearer. Are the proposed
outcomes as follows?
- A report summarising the collection models, helping communities to understand and refine their
  collection usage scenarios
- API and demonstrator
- An easily adoptable model for identifying and managing collections of data objects via the combination of
  use case scenarios, an API, and a reference implementation of that API
Are the following processes only?
- Fragment identifier issues will be addressed (as part of the process to identify other methods to relate
  objects to each other in general object or identifier graphs).
- Added value tools that offer direct benefits to community end-users will be identified/discussed (as part
  of the process to develop a unifying API).
- Implement and deploy the API (as part of the adoption plan process).
- Register (in a Type Registry) the most essential typing mechanisms that can be used to implement
  collections.
2. I agree with the concerns over sustainability: the adoption phase may need to be staggered, starting work with single
communities first, and then with multi-disciplinary communities at a later stage. Also, if the API and demonstrator
are going to be delivered in Month 18, the adoption of these outputs will take place after the WG
ends. Perhaps an exception could be made so that this WG runs for 2 years? This WG might benefit
from an additional 6 months to gather feedback from the adopters (and perhaps provide technical support to
adopters).
3. Is there a difference (technical and/or social) between “collections” and “aggregations”? As mentioned in Adam’s
point 2(a) above, some communities have strong established practices supporting compound/aggregation
objects. I’m not sure about the term “collection”: should the title of the WG be “PID Aggregations”?