Altmetrics Public Comments Received Through July 21

Public Comments on NISO Altmetrics White Paper received from June 9-July 18, 2014
These comments are also available via
http://www.niso.org/apps/group_public/document.php?document_id=13295&wg_abbrev=altmetrics
510
Cynthia Hodgson
chodgson@niso.org NISO
6/10/14 6:26 Editorial
References
There is currently no reference section in the white paper.
Following the public comment period, we plan to add a references section.
511
Gregor McDonagh
grmc@nerc.ac.uk
NERC
6/10/14 9:41 N/A
personal feedback via PowerPoint
Picking up on issues with traditional citation analysis and scope of usage, and options going forward to address cultural resistance.
512
Laurel Haak l.haak@orcid.org
6/12/14 5:46
Substantive
General Comments and use of PIDs
ORCID
Overall, this paper is a balanced summary of the state of the art of altmetrics. Well done!
A few comments, by section:
Research Outputs: conference presentations and papers are absolutely critical for researchers in
computer sciences, maths, engineering...
Data Quality: The emphasis on resolution of the source document is important and
fundamental. This is not just about a digital document identifier, however. It also must encompass
identifiers for the contributor and related organization. Also, it is important to provide guidelines on how
to cite things that are not journal articles. figshare does this well for items deposited there; we can do better as a community with citation guidelines for blogs and other communication products.
Groups: One plea here, please use the parlance "ORCID record" (not profile). More
substantively, it seems this section would benefit from an action item/next step in the area of persistent
identifiers, namely how PIDs can be a major assist in grouping documents/people/orgs.
513
Pat Loria
ploria@csu.edu.au
Charles Sturt University Library
6/17/14 0:49 N/A
Feedback on Altmetrics White Paper
Firstly, I would like to applaud NISO for attempting to develop standards for such a nascent, yet
significant field. In my opinion, altmetrics would be better described as "alternative impact metrics" rather than "alternative assessment metrics". The latter implies quality assessment, while the former
more accurately describes what they in fact measure - impact. To attempt to call them something else in
order to engender wider acceptance and adoption risks the creation of competing terms, as altmetrics is
what they have become known as by their developers and users. And I believe it is quite helpful to have a
reference built into the name that distinguishes them from their traditional counterparts, yet highlights their
complementarity with other metrics. Furthermore, the nature of altmetrics suggests they should be defined
in as inclusive a manner as possible, and not in a prescriptive way. This is one of the defining features of
altmetrics, compared with the narrower citation metrics. In essence, altmetrics measure the engagement of
online communities with a wide variety of digital research outputs. In other words, they should not be
merely defined as "alternative to the established citation counts" (page 4 of white paper). The types of
research outputs they measure and the types of metrics they employ should also not be prescriptive.
Altmetrics measure both scholarly and social impact. Scholarly impact is sometimes quantitatively
defined as the number of citations received in scholarly books and journals. However, there is no end to the
number of ways that social impact may be defined or counted, and therefore output types and metric types
should be open and inclusive to reflect the wide diversity of possibilities for communicating and engaging
with research.
I am not a fan of aggregated metrics, such as the h-index or the journal impact factor, with all of
their inherent faults, not the least of which is their inability to reveal contextual time-related artifact-level
impact data. A defining feature of altmetrics is that they gather artifact-level metrics, which users can use
and value as they wish, without being subjected to imposed value weightings for various outputs or
metrics, thus maximizing the number of user-defined applications. Specifying use cases may inadvertently
serve to undermine the value of potential applications. According to page 8 of the white paper, critics of
altmetrics argue that altmetrics correlate poorly with scholarly citations. If we accept this critique as true,
then it is equally true that scholarly citations correlate poorly with altmetrics. While altmetrics include
scholarly citation counts (via APIs with Scopus and other scholarly citation applications), what this
reveals is that altmetrics measure a much wider range of outputs and impacts, including social impact,
which scholarly citations do not measure. This is why altmetrics adherents highlight the complementarity
of the new metrics to the more traditional ones. They reveal different flavors of impact, or impact
occurring in different sectors beyond the academy, which can include government, industry and public
impact.
Promoting the use of persistent identifiers may be problematic, due to the difficulties associated
with imposing public compliance with industry standards. The public will engage with and make
reference to research outputs in all manner of ways, and it would be best to develop standards and
processes that are able to effectively capture this diversity of engagement. The proposed stakeholder
groups do not include third-party systems developers, such as institutional repository developers,
research information management developers and researcher profiling developers.
Due to the early uptake and popularity of altmetrics in Australia, another potential partner for
NISO in the standardization process is Standards Australia.
Thank you for the opportunity to comment on the standardization process of this significant
development in the field of bibliometrics.
Pat Loria
514
Mustapha Mokrane mustapha.mokrane@icsu-wds.org
6/18/14 20:31 Substantive
Datasets are traditional research outputs
ICSU WDS
I was rather surprised to read in the project description "this project will explore potential assessment
criteria for non-traditional research outputs, such as data sets..." in opposition to the "traditional" articles
and books! In the White Paper itself, however, you sometimes mention "New forms of scholarly outputs, such as datasets posted in repositories" and that "There seems to be consensus that research datasets and scientific software should be included in the list of valuable research outputs".
This shows in my opinion the perpetuation of an unfortunate and artificial divide created between
papers and datasets when they used to be one unit! This has certainly resulted in the devaluation of
datasets as research output, as they became "supplementary material" to articles. Datasets are research
outputs and are as valuable as articles for the simple reason that the article has no value without the
underlying datasets.
515
Peter Kraker pkraker@know-center.at
Know-Center
7/1/14 7:31
Substantive
Data Quality
The importance of openness and transparency
This white paper is a concise yet comprehensive summary of the various issues surrounding altmetrics.
The members of this working group have obviously gone to great lengths to prepare this report and I
would like to congratulate them on their effort.
In my opinion there is, however, one important issue that has not been raised in this report: the inherent
biases of altmetrics. Altmetrics are usually created as a by-product of the activity of a certain community,
e.g. the members of a social reference management system, the Twitter users in a certain discipline etc.
These communities are usually not representative of the underlying population of a discipline/research
community. Therefore, the altmetrics created from this activity carry an inherent bias, e.g. towards a
certain age group, geographic region etc. This has also been reported in the literature, see [1] and [2].
Of course, biases affect all scientometric analyses. In citation analysis, the criteria for the inclusion of
authors and papers in the analysis have an impact on the result. Therefore, the question is not how to
avoid biases, but how to make these biases visible and reproducible. In my opinion, the only way to deal
with this issue properly is to make the underlying data openly available. This way, the properties of the
sample are intersubjectively reproducible. Open data would also give more context to the aggregated
measures discussed in the report. Furthermore, open datasets would make it easier to uncover gaming
(and therefore possibly less appealing).
In my opinion, openness and transparency should therefore be strongly considered for altmetrics
standards.
[1] Bollen, J., & Van de Sompel, H. (2008). Usage Impact Factor: The Effects of Sample Characteristics on Usage-Based Impact Metrics. Journal of the American Society for Information Science and Technology, 59(1), 136-149.
[2] Kraker, P. (2013). Visualizing Research Fields based on Scholarly Communication on the Web.
University of Graz. Available from http://media.obvsg.at/p-AC11312305-2001
517
Marcus Banks mab992@yahoo.com Independent consultant
7/3/14 14:41 Substantive
Grouping and Aggregation; Context
Pages 10-12
Nomenclature, contributorship/context
Thank you for preparing this excellent and comprehensive white paper. It definitely captures the tenor of
conversations at the NISO altmetrics meeting held in San Francisco last fall.
Two observations and one request.
The request first--please use the same numbering of action items in the body of the text as in the
beginning of the text, rather than starting over with # 1 in every new set of action items. Consecutive
numbering throughout please! This would make it easier to see at a glance which action items are
associated with which broad category of recommendations. Thanks for considering this.
On to the observations:
1. Nomenclature: I agree that "altmetrics" is not an apt term anymore, as we've moved past the "alt" stage.
How about "digital scholarship metrics?" This is less catchy but more descriptive and more current.
2. Contributorship/context: The need for a typology of contributorship roles, so that contributors at
various levels of intensity get proper credit, is pressing. Likewise, contextual clues--not all references to
other work are positive--are vital. That said, this information would necessarily be at a level of granularity
that cuts against the desire for one simple number that explains everything. The forest vs. the trees.
Proposed solution below.
To resolve the tension between the desire for granularity regarding contributorship roles and context, and
the simultaneous desire for a simple number "to rule them all," I propose that NISO develop standards
that facilitate analysis at multiple levels. The standards should allow for both broad/high level and
deep/granular exploration.
518
Paola De Castro
paola.decastro@iss.it Istituto Superiore di Sanità /EASE Council
7/4/14 6:58
Substantive
Stakeholders perspectives
12
Editors as missing stakeholders
Editors, as gatekeepers of the information published in scholarly journal articles, are missing from the stakeholders identified in this draft ("researchers, institutions, funders, publishers and general public").
Editors strive to guarantee the quality of information to be published in their journals, which then will be
evaluated through different metrics.
Therefore we suggest also considering the role and perspectives of editors and editors' associations (like
the European Association of Science Editors, EASE), striving for quality in scientific publications and
supporting editorial freedom, research integrity and ethical principles in publications. We believe that the
issue of quality cannot be disregarded when considering any form of alternative metrics.
Recognizing the need to improve the ways in which the output of scientific research is evaluated, many
journal editors and editors’ associations (including EASE) signed The San Francisco Declaration on
Research Assessment (DORA), a worldwide initiative covering all disciplines, mentioned in the NISO
draft. The DORA includes a set of recommendations to improve the ways in which the output of scientific
research is evaluated by funding agencies, academic institutions, and other parties. It is noteworthy that DORA was preceded by the EASE Statement on Inappropriate Use of Impact Factors, published by EASE as early as 2007 to alert the scientific community, with drastic examples, to the inappropriate use of the IF. EASE was one of the initial signatory organizations of DORA.
519
Judy Luther judy.luther@informedstrategies.com Informed Strategies
7/6/14 12:21 N/A
Scientists or researchers?
Throughout the document, the terminology refers to scientists, which typically does not include the arts and humanities, although there are references to books and to performances. "Researcher" is a more inclusive term.
520
Lorrie Johnson johnsonl@osti.gov / U.S. Department of Energy / Office of Sci & Tech Info.
7/11/14 8:04 N/A
Funding agency perspective
Thank you for an informative white paper, and for the opportunity to comment. The Department of
Energy funds over $10 billion per year in energy-related research, and we are pleased to see that
“Funders” have been identified as an important stakeholder in the development of altmetrics
measures. As a funding agency, the ability to assess the impact and influence of the Department’s
research programs is important, for traditional text-based scientific and technical information, as well as
for new and emerging forms, such as multimedia, software tools, and datasets. This white paper and the potential action items defined therein address many aspects of interest to DOE, including the determination of differences in practices between various disciplinary fields; the recognition of contextual and qualitative facets, such as how research is spreading into other disciplines; and the assessment of long-term economic impact and benefit to taxpayers. We look forward to further development of
standards and best practices in this area.
521
Ferdinando Pucci
pucci.ferdinando@mgh.harvard.edu Massachusetts General Hospital
7/16/14 16:24 N/A
Fork Factor: an index of research impact based on re-use
I am very happy to know about the NISO effort to improve and standardize the current ways to assess
scholarship. As a postdoctoral fellow, I believe there is an urgent need to increase reproducibility of
science, especially in the biomedical field.
The intent of this comment is to propose to the NISO Alternative Assessment Metrics Project committee
a novel index to measure the impact of research, i.e. the fork factor (FF). I recently developed the ideas
behind the FF, and you can find a blog post describing them
at: https://www.authorea.com/users/6973/articles/8213/_show_article
Briefly, the main advantages of the FF are:
- it can distinguish positive from negative citations (research evaluation issue, pages 7-8)
- it is a metric for nanopublications (research output, page 6) as well as for journal articles
- it is based on a solid, well-established versioning infrastructure (GitHub)
- it promotes the publication of negative results (journalization of science issue, page 6)
- it increases reproducibility by immediately spotting bad science (journalization of science issue, page 6)
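[Editor's note: for concreteness, a minimal sketch of how a fork-based count might be collected, assuming the research object lives in a public GitHub repository. The raw fork count here is only an illustrative stand-in for the FF as described in the linked blog post, and the repository name in the usage comment is hypothetical.]

    import requests

    def github_fork_count(owner: str, repo: str) -> int:
        """Return the public fork count for a GitHub repository.

        Uses the public GitHub REST API (unauthenticated, so rate-limited);
        a production metric would need authentication and caching.
        """
        resp = requests.get(
            f"https://api.github.com/repos/{owner}/{repo}",
            headers={"Accept": "application/vnd.github+json"},
            timeout=10,
        )
        resp.raise_for_status()
        return resp.json()["forks_count"]

    # Hypothetical usage: a raw fork count as a naive stand-in for the FF.
    # print(github_fork_count("some-lab", "some-research-object"))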
In addition, I believe that the FF will allow a smooth transition from journal articles to nanopublication,
which will considerably increase the speed of scientific discovery (considering that a biomedical paper in
high impact journals can take 4 to 6 years to be published, and that researchers prune away all the
negative results and dead end investigations).
Nonetheless, journal articles may still be considered for later publication, after the research story line
shapes up, which may allow researchers to also target a broader/lay audience.
In conclusion, I would love to contribute as working group member.
Thank you
523
Richard O'Beirne
richard.obeirne@oup.com
Oxford University Press
7/17/14 9:47 N/A
General comments
Comments are in a personal capacity from a publisher's point of view.
[p. 2] Potential Action Items. Some of these should be regarded as best practice that any organization considering itself an academic publisher is expected to support -- regardless of whether it is in the context of altmetrics.
I'd say the following fall into this category:
4,5 - agreeing then supporting a taxonomy of academic Research Output types
10 - Promote and use persistent identifiers
12 (possibly) - data normalization (e.g. COUNTER rules)
13 - standardized APIs
18 - define and use contributorship roles
[p. 9] I think we need to get used to the idea that altmetrics - or at least some of the components of what
make up altmetrics - are inherently messy, change rapidly, are unreliable and to some degree should be
considered transient/disposable. Maybe 'indicator' rather than 'metric' is more appropriate.
[p. 10] Grouping and aggregation is indeed complex, but I would focus NISO's efforts on modularization
of specific metrics almost as a 'raw material', leaving aggregation questions to the altmetrics provider.
[p.14] Publishers also want to (have an obligation to) demonstrate to their authors the reach of their
research in terms of readership etc.
[p. 15] Fully agree with the three bullet points re prioritization (and I'd add the 'best practice' items
above.)
Many thanks for the work NISO has coordinated and put into the white paper. In the increasingly interconnected academic publishing world, standards, and engagement between standards organizations and all service providers, are essential. Keep up the good work!
524
Anna Maria Rossi
annamaria.rossi@iss.it Italian National Institute of Health-Publishing Unit
7/17/14 10:58 Editorial
Bioresource Research Impact Factor (BRIF)
The BRIF Journal editors subgroup, including researchers and experts with editorial competencies working at the Istituto Superiore di Sanità (ISS-Italian National Institute of Health) and at the Institut National de la Santé et de la Recherche Médicale (INSERM-French National Institute of Health and Medical Research), recommends considering the development of a new metric, the Bioresource Research Impact Factor (BRIF), in the NISO Alternative Assessment Metrics (Altmetrics) Project.
The BRIF is an ongoing international initiative that aims to develop a framework to facilitate accurate acknowledgement of resource use in scientific publications and grant applications via unique resource identifiers, and to measure the impact of such resources through relevant metrics (an algorithm).
Bioresources include both biological samples and their derivatives (e.g. blood, tissues, cells, RNA, DNA)
and/or related data (associated clinical and research data) stored in biobanks or databases.
An increasing proportion of biomedical research relies on biosamples and much of our medical
knowledge is acquired with the aid of bioresources collections. Sharing bioresources has been recognised
as an important tool for the advancement of biomedical research. A major obstacle to sharing bioresources is the lack of acknowledgement of the efforts directed at establishing and maintaining such resources. Thus, the main objective of the BRIF is to promote the sharing of bioresources by creating a link between their initiators or implementers and their impact on scientific research.
A BRIF would make it possible to trace the quantitative use of a bioresource, the kind of research using it, and the people and institutions involved. The idea is to construct a quantitative parameter, similar to the well-known journal Impact Factor (IF), that will recognise the most influential bioresources for the biomedical scientific community and measure their impact on research production through relevant metrics.
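[Editor's note: for reference, the two-year journal Impact Factor on which this analogy rests is computed as below; a BRIF-style indicator would presumably substitute citations or documented uses of a uniquely identified bioresource, though the actual BRIF algorithm remains under development and is not specified here.]

    \mathrm{IF}_{Y} \;=\; \frac{\text{citations received in year } Y \text{ by items published in years } Y-1 \text{ and } Y-2}{\text{number of citable items published in years } Y-1 \text{ and } Y-2}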
The BRIF Journal editors subgroup works with science journal editors and is leading the development of a bioresource citation guideline for the scientific literature. A proposal for a specific
guideline was posted in the Reporting Guidelines under development section of the EQUATOR Network
(October 2013).
Please consult:
www.gen2phen.org/groups/brif-bio-resource-impact-factor
EQUATOR Network [http://www.equator-network.org/library/reporting-guidelines-underdevelopment/#19]
Mabile L, Dalgleish R, Thorisson GA, Deschênes M, Hewitt R, Carpenter J, Bravo E, Filocamo M, Gourraud PA, Harris JR, Hofman P, Kauffmann F, Muñoz-Fernàndez MA, Pasterk M, Cambon-Thomsen A; BRIF working group: Quantifying the use of bioresources for promoting their sharing in scientific research. GigaScience 2013, 2(1):7.
De Castro P, Calzolari A, Napolitani F, Rossi AM, Mabile L, Cambon-Thomsen A, Bravo E.
Open Data Sharing in the Context of Bioresources. Acta Inform Med 2013, 21(4):291-292.
Bravo E, Cambon-Thomsen A, De Castro P, Mabile L, Napolitani F, Napolitano M, Rossi AM:
Citation of bioresources in biomedical journals: moving towards standardization for an impact evaluation.
European Science Editing 2013, 39:36-38.
Cambon-Thomsen A, De Castro P, Napolitani F, Rossi AM, Calzolari A, Mabile L, Bravo E:
Standardizing Bioresources Citation in Scientific Publications. International Congress on Peer Review
and Biomedical Publication: 8-10 September 2013; Chicago, USA
Anna Maria Rossi
on behalf of the BRIF Journal editors subgroup:
Paola De Castro (1), Elena Bravo (2), Anne Cambon-Thomsen (2), Alessia Calzolari (1), Laurence Mabile (2), Federica Napolitani (1), Anna Maria Rossi (1)
(1) Istituto Superiore di Sanità, Rome, Italy
(2) UMR U 1027, Inserm, Université Toulouse III - Paul Sabatier, Toulouse, France
525
Andrew Sandland
andysandland@yahoo.co.uk n/a
7/18/14 7:15 Editorial
General comments - (Publisher point of view, though all comments personal)
P.7 you point to the conflation of discovery and evaluation use cases, and I think the paper misses a chance to distinguish 'article level metrics' (ALMs) from 'alternative metrics'. The former can be somewhat traditional measures and metrics, but they are non-aggregated measures that perform much of the discovery functions described.
Discovery enhancement via ALMs (including altmetrics) is essential to the long-term outlook for high-volume publishing outlets and some of the purer elements of the Open Access ethos -- the idea that journal brand and aggregated metrics are less valuable than publishing and letting the market -- via appropriate filters (rather than the editors that put value behind the journal brand) -- show the way to the most 'valuable' articles. The paper misses this stake for volume publishers in the stakeholder perspective (many different publishers are now volume publishers/technical review publishers via at least a few products).
P.5: The term 'social media metrics' is inherently limiting, and excludes some of the more targeted or bespoke measures of impact that would be more relevant to stakeholders such as funders. Similarly, 'altmetrics' is nascent terminology that will pre-date the period when these simply become 'metrics', and the idea of 'alternativeness' should probably be dropped.
P16: The outreach section doesn't seem to be within the remit of the NISO project.
P8. Use of these metrics in tenure and promotion decisions -- I’m not a researcher, but I have heard much
about the use and misuse of impact factor and the unforeseen consequences that this has had on academic
culture. Altmetrics are still metrics, and though a more diverse base for judgement is perhaps a good
thing, pouring on ‘more metrics’ to this process is perhaps a case of not learning our lessons from prior
experience. While it is nice for alternative research outputs to be formally recognised, do the majority of
academics across the majority of disciplines really want to have their work put into these checks and
balances as they have the IF?
General: The IF developed as the product of a single commercial outlet that now holds a monopoly on this metric. Many of the inputs to altmetrics, and some of the providers themselves, are private, for-profit enterprises (measures of Facebook, Twitter, Mendeley, and providers like Digital Science's Altmetric.com). Inclusivity in these metrics must be fluid and barrier-less, and the mechanism by which they are implemented must be open to inputs from potentially competing commercial interests. Otherwise they stand in danger of becoming walled gardens, following the direction of the single-provider status of the IF.
P6: The argument regarding personality types -- and the idea that scientists/researchers should be expected to have a Twitter profile. Notwithstanding dedication to single commercial entities like Twitter, there is a degree to which funding agencies can and could dictate these activities. If they wish to demonstrate impact and social engagement via these statistics, I can envisage it being a requirement -- part of the job -- to publicise as well as conduct the research.
526
Micah Altman escience@mit.edu
MIT 7/18/14 13:10
Substantive
- Scientific Basis of Altmetric Construction
Scholarly metrics should be broadly understood as measurement constructs applied to the domain of scholarly/research (broadly, any form of rigorous enquiry) outputs, actors, impacts (i.e. broader consequences), and the relationships among them. Most traditional formal scholarly metrics, such as the H-Index, Journal Impact Factor, and citation count, are relatively simple summary statistics applied to the
attributes of a corpus of bibliographic citations extracted from a selection of peer-reviewed journals. The
Altmetrics movement aims to develop more sophisticated measures, based on a broader set of attributes,
and covering a deeper corpus of outputs.
As the Draft aptly notes, in general our current scholarly metrics, and the decision systems around them
are far from rigorous: "Unfortunately, the scientific rigor applied to using these numbers for evaluation is often far below the rigor scholars use in their own scholarship." [1]
The Draft takes a step towards a more rigorous understanding of altmetrics. Its primary contribution is to suggest a set of potential action items to increase clarity and understanding.
However, the Draft does not yet identify the key elements of a rigorous (or systematic) foundation for defining scholarly metrics, their properties, and their quality; nor does it identify key research in evaluation and measurement that provides a potential foundation. The aim of these comments is to start to fill this structural gap.
Informally speaking, good scholarly metrics are fit for use in a scholarly incentive system. More formally,
most scholarly metrics are parts of larger evaluation and incentive systems, where the metric is used to
support descriptive and predictive/causal inference, in support of some decision.
Defining metrics formally in this way also helps to clarify what characteristics of metrics are important
for determining their quality and usefulness.
- Characteristics supporting any inference. Classical test theory is well developed in this area. [2] A useful metric supports some form of inference, and reliable inference requires reliability. [3] Informally, good metrics should yield similar results across repeated measurements of the same purported phenomenon.
- Characteristics supporting descriptive inference. Since an objective of most incentive systems is
descriptive, good measures must have appropriate measurement validity. [4] In informal terms, all
measures should be internally consistent; and the metric should be related to the concept being measured.
- Characteristics supporting prediction or intervention. Since the objective of most incentive systems is both descriptive and predictive/causal inference, good measures must aid accurate and unbiased inference. [5] In informal terms, the metric should demonstrably be able to increase the accuracy of predicting something relevant to scholarly evaluation.
- Characteristics supporting decisions. Decision theory is well developed in this area [6]: the usefulness of metrics depends on the cost of computing the metric and the value of the information that the metric produces. The value of the information depends on the expected value of the optimal decisions that would be produced with and without that information. In informal terms, good metrics provide information that helps one avoid costly mistakes, and good metrics cost less than the expected cost of the mistakes one avoids by using them (a toy calculation is sketched after this list).
- Characteristics supporting evaluation systems. This is a more complex area, but the fields of game theory and mechanism design are most relevant. [7] Measures that are used in a strategic context must be resistant to manipulation -- either (a) requiring extensive resources to manipulate, (b) requiring extensive coordination across independent actors to manipulate, or (c) incentivizing truthful revelation. Trust engineering is another relevant area -- characteristics such as transparency, monitoring, and punishment of bad behavior, among other systems factors, may have substantial effects. [8]
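[Editor's note: as an illustration of the decision-theoretic point above, the following toy calculation (all numbers invented, written in Python only for concreteness) computes the value of the information a hypothetical metric adds to a fund/skip decision and compares it to the cost of computing the metric.]

    # Toy value-of-information calculation for a hypothetical metric used in a
    # fund / skip decision. All numbers are invented for illustration.

    p_high = 0.3  # prior probability that the work is "high impact"
    payoff = {("fund", True): 10, ("fund", False): -4,
              ("skip", True): 0, ("skip", False): 0}

    def expected_utility(decision, p):
        # Expected payoff of a decision given probability p of high impact.
        return p * payoff[(decision, True)] + (1 - p) * payoff[(decision, False)]

    # Best achievable expected utility using only the prior (no metric).
    eu_without = max(expected_utility(d, p_high) for d in ("fund", "skip"))

    # The metric reports "positive" or "negative" with imperfect reliability.
    sensitivity, specificity = 0.8, 0.7
    p_pos = sensitivity * p_high + (1 - specificity) * (1 - p_high)
    p_high_given_pos = sensitivity * p_high / p_pos
    p_high_given_neg = (1 - sensitivity) * p_high / (1 - p_pos)

    # Best achievable expected utility when the decision can depend on the signal.
    eu_with = (
        p_pos * max(expected_utility(d, p_high_given_pos) for d in ("fund", "skip"))
        + (1 - p_pos) * max(expected_utility(d, p_high_given_neg) for d in ("fund", "skip"))
    )

    value_of_information = eu_with - eu_without  # about 1.36 with these numbers
    cost_of_metric = 0.5                         # arbitrary cost of computing/auditing
    print(value_of_information, value_of_information > cost_of_metric)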
The above characteristics comprise a large part of the scientific basis for assessing the quality and usefulness of scholarly metrics. They are necessarily abstract, but closely related to the categories of action items already in the report, in particular Definitions, Research Evaluation, Data Quality, and Grouping. Specifically, we recommend adding the following action items respectively:
- [Definitions] Develop specific definitions of altmetrics that are consistent with best practice in the social-science field on the development of measures.
- [Research Evaluation] Promote evaluation of the construct and predictive validity of individual scholarly metrics, compared to the best available evaluations of scholarly impact.
- [Data Quality and Gaming] Promote the evaluation and documentation of the reliability of measures, their predictive validity, cost of computing, potential value of information, and susceptibility to manipulation based on the resources available, incentives, or collaboration among parties.
[1] NISO Altmetrics Standards Project White Paper, Draft 4, June 6 2014; page 8
[2] See chapters 5-7 in Raykov, Tenko, and George A. Marcoulides. Introduction to psychometric theory.
Taylor & Francis, 2010.
[3] See chapter 6 in Raykov, Tenko, and George A. Marcoulides. Introduction to psychometric theory.
Taylor & Francis, 2010.
[4] See chapter 7 in Raykov, Tenko, and George A. Marcoulides. Introduction to psychometric theory.
Taylor & Francis, 2010.
[5] See Morgan, Stephen L., and Christopher Winship. Counterfactuals and causal inference: Methods
and principles for social research. Cambridge University Press, 2007.
[6] See Pratt, John Winsor, Howard Raiffa, and Robert Schlaifer. Introduction to statistical decision theory. MIT Press, 1995.
[7] See chapter 7 in Fudenberg, Drew, and Jean Tirole. Game theory. MIT Press, 1991.
[8] Schneier, Bruce. Liars and outliers: enabling the trust that society needs to thrive. John Wiley & Sons,
2012.
527/8
Cameron Neylon
cneylong@plos.org
PLOS 7/21/14
Substantive
- PLOS Response to NISO Altmetrics White Paper
*Executive Summary*
PLOS welcomes the White Paper which offers a coherent statement of the issues and challenges facing
the development, deployment, adoption and effective use of new forms of research impact indicators.
PLOS has an interest in pursuing several of the suggested actions, particularly in three areas:
1. Actions to increase the availability, diversity and quality of new and emerging indicators of research use and impact
2. Community building to create shared, public and open infrastructures that will enhance trust in, critique of, and access to useful information
3. Expansion of existing data sources and tools to enhance the availability of indicators on the use and impact of research data and other under-served research output types
PLOS proposes that the Potential Action Items be prioritised by the community as well as ordered in
terms of dependencies and timelines. We propose some prioritisation below. In terms of timelines we
suggest that a useful categorisation would be to divide actions into "near-term", i.e. the next 6-12 months,
"within-project" and "beyond-project".
NISO should focus its efforts on those actions that are a good fit for the organisation's scope and remit.
These will focus on tractable best practice developments alongside coordinating community developments
in this space. The best fit actions for NISO are:
1. Develop specific definitions for alternative assessment metrics
4. Identify research output types that are applicable for the use of metrics
14. Develop strategies to increase trust, e.g. openly available data, audits or a clearinghouse
17. Identify best practices for grouping and aggregation by journal, author, institution, and funder
One aspect missing from the report is a discussion of data licensing. This is directly coupled to the issue
of availability but also underpins many aspects of our comments. Downstream service provision, as well as wider critique and research, will be best supported by clear, consistent, and open terms of use for the data.
Identifying how to achieve this will be an ongoing community challenge.
*Categorising and Prioritising the Potential Action Items*
The following builds on our submission to the HEFCE Enquiry on Metrics in Research Assessment. Our
fundamental position is that these new indicators are currently useful as evidence to support narratives.
They may be useful in the mid-term as comparators within specific closely related sets of work but even
in the long term it is unlikely that any form of global analysis across the research enterprise could rely
credibly on mechanistic analysis.
This position is built on our understanding of, and expertise in, what currently available indicators are useful for and what further work is required for their sensible use in a range of use cases. Currently the underlying data is not generally available, is not of sufficiently high consistency and quality, and has not been subject to sufficient critical analysis to support quantitative comparisons across research outputs in general.
Enabling deep critical analysis of what the data can tell us requires access to a larger corpus of consistent
and coherent data. This data should be available for analysis by all interested parties to build trust and
enable critique from a wide range of perspectives. Therefore our priorities are: first to make more data
available; second to drive an increase in the consistency and comparability of that data through
comparison, definition development and ultimately agreement on standards; third to create the
environment in which interested parties can compare, analyse, aggregate and critique both the underlying
data, emerging standards, aggregation analysis processes, and the uses to which indicators and aggregate
measures are put.
Our experience is that widespread adoption can only be built on community trust in frameworks, systems
and uses of these new sources of information. That community trust can only be built in turn by using
reliable and widely available data to support conversations on how that information might be used.
Throughout all of this a powerful mechanism for ensuring an even playing field for innovative service
providers, enabling scholarly critique and ensuring transparency is to ensure that the underlying data that
supports the development of metrics is openly available.
*Suggestions on prioritisation*
We propose priorities at the level of the categories used in the White Paper.
The Potential Actions listed under Data Quality and Gaming speak most directly to the question of data
availability and consistency. In parallel with this, initial actions under Discovery and Evaluation will define the use cases needed for the next stage of analysis. Many of these actions are also possible
immediately and several are underway.
We suggest that those actions listed under Research Outputs, Grouping and Aggregation, and Context form the next tier of priorities, not because these are less important but because
they will depend on wider data availability to be properly informed. They will also build on each other
and therefore should be taken forward together.
Work on Definitions, Stakeholder Perspectives and Adoption is necessary and elements of this should be
explored. Success in these areas will be best achieved through ensuring stakeholder engagement in the
development activities above. Trust and wider engagement will be best achieved through adopting open
approaches to the technical developments which includes an active outreach component.
PLOS is keen to engage in concrete actions that will help to deliver community infrastructures and wider
data availability. We are already engaged in projects that are investigating shared community spaces and infrastructure (the Crossref DOI Event Tracker Working Group), exploring new data sources and forms of indicator, as well as ongoing work on community building and advocacy (European Research Monitoring Workshop - report forthcoming).
*Timing*
Several potential actions are feasible to undertake immediately or are already underway. Engaging with
those initiatives already promoting unique identifiers and building shared infrastructure will be valuable.
The Crossref Working Group should be engaged as a locus for data availability and experimentation. The
Data Citation Interest Group hosted by FORCE11 is another point of contact which is already engaged
with the Research Data Alliance.
In the medium term a stable working group locus should be established to provide a centre for ongoing
conversations. This could be based within a NISO context or could take advantage of the FORCE11
infrastructure. There should be coordination with the European Commission Study on 'Science 2.0' which
proposes an observatory of changing practices in the research enterprise. Within the scope of the NISO
project such a group could seek to develop Best Practice statements in this space using the Data Citation
Principles as a template. Any such group should be broad based and include a range of stakeholders
including those beyond the technical development community.
We regard the development of "capital S Standards" as feasible only for a small subset of indicators within the 2014-16 timeframe. It may be possible to define Standards development for some data collection
practices and definition of data sources/API usage such as Mendeley reader counts or Facebook counts.
However, as all such efforts take time, it is important to lay the groundwork for such Standards to be
developed as and when they can. The working group should therefore focus on producing Best Practice
and "small s standards" as and when possible so as to build the wider conversation towards wide adoption
of consistent community practice that can ultimately be codified.