Comments on PID Collections WG Case Statement

Adam Farquhar

Note: the OAB requested that I provide feedback reflecting the OAB perspective on the PID Collections WG Case Statement. I will try to focus on issues related to potential impact and adoptability in an organisational context, rather than a technical perspective, although there is some inevitable overlap as some technical aspects may present barriers to adoption and re-use.

The comments below may seem a bit harsh, but they are meant to be constructive. There is no widely adopted approach in this space (for data) and there does appear to be some demand for one. There is definitely room for RDA to contribute.

Comments:

1. The value proposition could be strengthened substantially. The ‘adoption plan’ text suggests several clear examples of community need that are easy to understand and compelling. They could be distilled and brought forward to provide a crisper value proposition.

2. The examples in the adoption plan are largely within single communities. I would recommend highlighting/evidencing (a) that some communities have strong established practices supporting compound/aggregation objects; (b) that some communities have valued tools and services built around these aggregated objects; and (c) that some communities would further benefit from aggregated objects that cross boundaries and are problematic due to the existing aggregation approaches. Points (a) and (b) suggest that there is value in addressing the problem; (c) suggests that a generalised solution would be of value.

3. There are un-evidenced claims made about benefits. It would be good to at least point to some community member(s) or studies who are asking for unification or a vendor/service provider who perceives market advantage (and why they do not pursue it on their own).

I agree - particularly for cross-community interactions.
However, we could assume that anything that promotes common practices, particularly within single communities, will be a benefit.

4. The engagement with existing work seems weak. While the text expresses an intent not to propose an alternative, it then goes on to state that it will propose an API and implementation that strongly overlap with existing deployed APIs, systems, and services. I would recommend that the co-chairs actively look at engagement with existing efforts and service providers (especially ones with cross-disciplinary and multi-national services) who may be interested in better supporting the use cases that emerge through the WG.

5. The connection to fragment identifier schemes seems weak. At one level, it seems difficult to distinguish between ‘collection identifier’ and ‘fragment identifier’ schemes. They both refer to a whole and have methods for referencing the parts. To make this case statement compelling, it should provide some strong differentiation from existing schemes.

6. In order to be broadly applicable, it is essential for a data collection/aggregation approach to deal with issues related to conditions of use, privacy, data protection, and related topics. While there is a lot of great open data, there is also a lot of data that is not available on such terms. This includes, for example, information subject to data protection regulations. While it may not be the focus of work in the first period, any framework that does not extend to cover these sometimes complex issues will have limited applicability.

7. The sustainability and adoption plan is weak. I would recommend explicit activity within the WG to engage with service providers.

More comments below

1. The distinction between the concrete objectives/outcomes/outputs and the desired, ‘bonus’ outcomes/outputs could be made clearer.
For example, under the heading “Goals and Work Plan”, it states “… With respect for such a unifying API and the community use case, added-value tools should be discussed that offer direct benefits to community end-users …” Will these tools be highlighted or made available along with the API and demonstrator? Also, the distinction between the processes and the outcomes/outputs could be made clearer.

Are the proposed outcomes as follows?

- A report summarising the collection models, helping communities to understand and refine their collection usage scenarios
- API and demonstrator
- An easily adoptable model for identifying and managing collections of data objects via the combination of use case scenarios, an API, and a reference implementation of that API

Are the following processes only?

- Fragment identifier issues will be addressed (as part of the process to identify other methods to relate objects to each other in general object or identifier graphs).
- Added-value tools that offer direct benefits to community end-users will be identified/discussed (as part of the process to develop a unifying API).
- Implement and deploy the API (as part of the adoption plan process).
- Register (in a Type Registry) the most essential typing mechanisms that can be used to implement collections.

2. I agree with concerns over sustainability. The adoption phase may need to be staggered, starting work with single communities first and then with multi-disciplinary communities at a later stage. Also, if the API and demonstrator are going to be delivered in Month 18, the adoption of these outputs will take place after the WG ends. Perhaps an exception could be made for this WG, so that it runs for 2 years? This WG might benefit from an additional 6 months to gather feedback from the adopters (and perhaps provide technical support to adopters).

3. Is there a difference (technical and/or social) between “collections” and “aggregations”? As mentioned in Adam’s point 2.a. above, some communities have strong established practices supporting compound/aggregation objects. I’m not sure about the term “collection”. Should the title of the WG be “PID Aggregations”?