World Benchmark Calculations for IntERAct

Citations per paper for world benchmarking
We can use the Scopus Custom Dataset to determine year-specific, FoR-specific world benchmark
citations per paper. These benchmarks need to be calculated for each publication year 2005-2010
and for each relevant 4-digit FoR code (designated cppFoR,y).
For a particular 4-digit FoR code, and a given year in the ERA window, we need to identify the
outputs in the Scopus Custom Dataset that satisfy the following eligibility criteria:
1. have the particular publication year (based on element publicationyear in the Scopus
Custom Dataset).
2. are in journals with that 4-digit FoR permitted on the 2012 ERA journal list (based on ISSN
and/or title matching).
3. are indexed with a document type of journal article, conference paper, or review (based on
the element citation type having a value of ar, cp or re in the Scopus Custom Dataset – these
are the Scopus article types the ARC used in ERA 2010 to determine the world benchmarks).
If c is the sum of citation counts for the publications that satisfy 1, 2 and 3 above, and n is the
number of publications that satisfy 1, 2 and 3 above, then the benchmark for the particular FoR for
that publication year (cppFoR,y) would be:
cppFoR,y = c/n
That is, the citations per paper benchmark in a particular year and FoR is the sum of citations to the
publications in that FoR published in that year divided by the total number of publications.
We need to do the above for each publication year 2005, 2006, 2007, 2008, 2009, 2010 for each FoR
in which citation analysis is being used.
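The calculation above can be sketched in Python as follows. This is a minimal illustration, not the ARC's implementation: the record keys ('issn', 'year', 'doc_type', 'citations'), the ISSN-to-FoR mapping and the sample data are all hypothetical, and it assumes that a paper in a journal with several permitted 4-digit FoR codes counts in full towards each of them.

```python
from collections import defaultdict

# Scopus document types counted toward the benchmark: article, conference paper, review.
ELIGIBLE_TYPES = {"ar", "cp", "re"}


def world_benchmarks(records, journal_fors, years=range(2005, 2011)):
    """Return {(for_code, year): citations per paper}.

    records      -- iterable of dicts with the hypothetical keys
                    'issn', 'year', 'doc_type' and 'citations'
    journal_fors -- hypothetical mapping of ISSN -> set of 4-digit FoR codes
                    permitted for that journal on the ERA journal list
    """
    cites = defaultdict(int)   # c: summed citation counts per (FoR, year)
    papers = defaultdict(int)  # n: number of eligible publications per (FoR, year)
    for rec in records:
        if rec["year"] not in years:               # criterion 1
            continue
        if rec["doc_type"] not in ELIGIBLE_TYPES:  # criterion 3
            continue
        # Criterion 2: a journal that is not on the list contributes nothing.
        for code in journal_fors.get(rec["issn"], ()):
            cites[(code, rec["year"])] += rec["citations"]
            papers[(code, rec["year"])] += 1
    return {key: cites[key] / papers[key] for key in papers}


# Made-up records, for illustration only.
sample_records = [
    {"issn": "1111-1111", "year": 2005, "doc_type": "ar", "citations": 4},
    {"issn": "1111-1111", "year": 2005, "doc_type": "re", "citations": 2},
    {"issn": "1111-1111", "year": 2005, "doc_type": "ed", "citations": 9},  # wrong type
    {"issn": "2222-2222", "year": 2006, "doc_type": "cp", "citations": 5},
    {"issn": "9999-9999", "year": 2006, "doc_type": "ar", "citations": 7},  # not listed
]
sample_journal_fors = {"1111-1111": {"0204"}, "2222-2222": {"0204", "0912"}}

benchmarks = world_benchmarks(sample_records, sample_journal_fors)
```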
Relative Citation Impact (RCI)
RCI for an individual paper
For an individual indexed journal article published in year y with a citation count of cja as at the date
we capture static citation counts and calculate benchmarks, the Relative Citation Impact for that
article in a particular FoR (i.e. the RCIja,FoR) would be:
RCIja,FoR = cja/cppFoR,y
If the apportionment of the journal article to a particular FoR is a, then the apportioned Relative
Citation Impact for that FoR (aRCIja,FoR) is:
aRCIja,FoR = RCIja,FoR * a
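As a sketch of the two formulas above, the function below returns one (FoR, RCI, aRCI) row per FoR to which an article is assigned. The function name, argument layout and example benchmark values are assumptions made for illustration.

```python
def article_rcis(citations, year, apportionments, benchmarks):
    """Return a list of (for_code, RCI, aRCI) tuples for one article.

    citations      -- static citation count of the article (cja)
    year           -- publication year of the article (y)
    apportionments -- {for_code: apportionment a} for the article
    benchmarks     -- {(for_code, year): cppFoR,y}
    """
    rows = []
    for for_code, a in apportionments.items():
        cpp = benchmarks[(for_code, year)]
        rci = citations / cpp                   # RCIja,FoR = cja / cppFoR,y
        rows.append((for_code, rci, rci * a))   # aRCIja,FoR = RCIja,FoR * a
    return rows


# Hypothetical benchmarks and a hypothetical article split 80/20 across two FoRs.
example_benchmarks = {("0302", 2007): 3.8, ("0904", 2007): 5.0}
rows = article_rcis(6, 2007, {"0302": 0.8, "0904": 0.2}, example_benchmarks)
for for_code, rci, arci in rows:
    print(f"{for_code}, {rci:.2f}, {arci:.2f}")
# 0302, 1.58, 1.26
# 0904, 1.20, 0.24
```

The same citation count gives a different RCI in each FoR because each FoR has its own year-specific benchmark.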
Average Apportioned RCI(world) for an FoR
The Average Apportioned RCI for an FoR is the average of the apportioned RCIs for each paper in a
given FoR (i.e. the sum of all of the apportioned RCIs divided by the total apportioned count of
indexed articles in the FoR). In the ERA 2010 Citation Benchmarking Methodology document, this
was referred to as the Average RCI (world). In this current document, it is referred to as the Average
Apportioned RCI, AaRCI. (This is to differentiate it from the average of the RCIs, calculated without
taking into account apportionment, designated ARCI).
Table 1 shows each of these calculations for a hypothetical FoR to which 10 papers are assigned.
(The number of outputs is too small to be statistically meaningful; it is for illustrative purposes only.)
Paper  Static citation   Apportionment    Publication      World benchmark   RCIja,FoR =    Apportioned RCI =  RCI
       count cja         a                year y           cppFoR,y          cja/cppFoR,y   aRCIja,FoR         Class
       (from Scopus      (from IntERAct)  (from IntERAct)  (calculated from                 = RCIja,FoR * a
       Custom Dataset)                                     Scopus Custom
                                                           Dataset)
1      2                 0.8              2005             2.3               0.869565       0.695652           II
2      3                 0.5              2005             2.3               1.304348       0.652174           III
3      3                 1.0              2006             3.1               0.967742       0.967742           II
4      4                 0.3              2006             3.1               1.290323       0.387097           III
5      3                 1.0              2007             3.8               0.789474       0.789474           I
6      5                 0.7              2007             3.8               1.315789       0.921053           III
7      6                 0.8              2007             3.8               1.578947       1.263158           III
8      4                 0.2              2008             4.2               0.952381       0.190476           II
9      5                 1.0              2008             4.2               1.190476       1.190476           II
10     6                 1.0              2009             4.2               1.428571       1.428571           III

Total apportionment: 7.3
Total number of articles: 10
Total RCIFoR: 11.68762; ARCIFoR: 1.17
Total aRCIFoR: 8.485873; AaRCIFoR (world): 1.16
Table 1: RCI calculations for a hypothetical FoR with 10 publications assigned with various
apportionments.
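The totals in Table 1 can be reproduced with a few lines of Python. This is only a check of the arithmetic; the variable names are illustrative.

```python
# (static citation count cja, apportionment a, world benchmark cppFoR,y) for papers 1-10
table1 = [
    (2, 0.8, 2.3), (3, 0.5, 2.3), (3, 1.0, 3.1), (4, 0.3, 3.1), (3, 1.0, 3.8),
    (5, 0.7, 3.8), (6, 0.8, 3.8), (4, 0.2, 4.2), (5, 1.0, 4.2), (6, 1.0, 4.2),
]

rcis = [c / cpp for c, _, cpp in table1]                    # RCIja,FoR per paper
arcis = [rci * a for rci, (_, a, _) in zip(rcis, table1)]   # aRCIja,FoR per paper

total_apportionment = sum(a for _, a, _ in table1)          # 7.3

# ARCI: plain average of the RCIs, ignoring apportionment.
arci = sum(rcis) / len(table1)
# AaRCI: sum of apportioned RCIs over the total apportioned count.
aarci = sum(arcis) / total_apportionment

print(f"ARCI  = {arci:.2f}")   # ARCI  = 1.17
print(f"AaRCI = {aarci:.2f}")  # AaRCI = 1.16
```

Note that the two averages use different denominators: the number of articles (10) for ARCI and the total apportionment (7.3) for AaRCI.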
RCI Classes
We can use the RCI calculated for each article (RCIja,FoR) and assign a RCI Class based on the seven
classes of RCIs used by the ARC in ERA 2010. The range of RCIs matching each RCI Class is shown in
the first 2 columns of Table 2.
An RCI Class Profile can then be compiled by counting the number of apportioned articles within
each RCI Class. As an example, the RCI classes for each of the 10 papers in the hypothetical FoR are
shown in the last column of Table 1. (Note: the RCI class assignment for an individual publication is
based on the article’s RCI not the apportioned RCI. However, as described, the profile is done by
counting the apportioned articles within each RCI class.)
Based on the information in Table 1, the RCI class profile for this FoR would be as shown in Table 2.
Class  RCI range     Apportioned no. of   % of apportioned
                     indexed articles     indexed articles
0      0             0                    0%
I      0.01 – 0.79   1.0                  13.7%
II     0.80 – 1.19   3.0                  41.1%
III    1.20 – 1.99   3.3                  45.2%
IV     2.00 – 3.99   0                    0%
V      4.00 – 7.99   0                    0%
VI     >=8           0                    0%
Table 2: RCI Classes and RCI Class profile for the hypothetical FoR with publications described in
Table 1.
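A minimal sketch of the class assignment and the apportioned class profile follows. The class ranges are published to two decimal places, so the handling of values that fall between ranges (for example 0.795) is an assumption here: each class is treated as starting at its lower bound and running up to the next class's lower bound, and any positive RCI below 0.80 is placed in Class I. The function names are illustrative.

```python
# Lower bounds of Classes VI down to II, checked from the highest class downwards.
LOWER_BOUNDS = [(8.0, "VI"), (4.0, "V"), (2.0, "IV"), (1.2, "III"), (0.8, "II")]
CLASSES = ["0", "I", "II", "III", "IV", "V", "VI"]


def rci_class(rci):
    """Assign an RCI Class from the article's RCI (not its apportioned RCI)."""
    if rci == 0:
        return "0"
    for lower, label in LOWER_BOUNDS:
        if rci >= lower:
            return label
    return "I"  # assumption: any positive RCI below 0.80


def class_profile(papers):
    """papers: iterable of (rci, apportionment) pairs.

    Returns {class: (apportioned count, share of total apportioned count)}.
    """
    counts = {label: 0.0 for label in CLASSES}
    for rci, a in papers:
        counts[rci_class(rci)] += a  # count the apportioned article, not 1
    total = sum(counts.values())
    return {label: (n, n / total if total else 0.0) for label, n in counts.items()}


# (RCIja,FoR, apportionment a) for the ten papers in Table 1.
table1_papers = [
    (0.869565, 0.8), (1.304348, 0.5), (0.967742, 1.0), (1.290323, 0.3),
    (0.789474, 1.0), (1.315789, 0.7), (1.578947, 0.8), (0.952381, 0.2),
    (1.190476, 1.0), (1.428571, 1.0),
]
profile = class_profile(table1_papers)
```

Each share is the class's apportioned count divided by the total apportioned count (7.3 for the papers in Table 1).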
Notes on the proposed methodology for world benchmark calculations
We are assuming that benchmarking will be based on outputs indexed by Scopus as articles, reviews
or conference papers as per the 2010 benchmarking methodology.
The proposed methodology for determining the year-specific benchmarks does not use any data
from MD or 2-digit-coded journals. In ERA 2010, the ARC actually used the data submitted to identify
MD and 2-digit journals that behaved more like 4-digit journals. If ≥ 25% of all articles submitted to
ERA 2010 in a particular MD or 2-digit coded journal were assigned a particular 4-digit FoR code AND
this constituted more than 50 apportioned items, that MD or 2-digit journal contributed to the
relevant FoR code world benchmark. The ARC applied this methodology to 157 journals in ERA 2010.
The way in which these journals were coded for the purpose of the benchmark calculations is
presented in Appendix 2 of the ERA 2010 Benchmarking Methodology document. As an example, the
journal Applied Physics Letters was given 02, 09 and 01 on the ERA 2010 RJL, but contributed to the
benchmarks in FoR 0204. (I wondered whether the ARC used this to refine the FoR codes provided
for these journals on the 2012 list, but "Applied Physics Letters" still has 09 and 02 in the 2012 list.)
Obviously, we don't have access to this data for the 2012 submission, since it will be based on the
actual FoR code allocations institutions use in the 2012 submission. Further, the 66% rule may be a
complicating factor: for example, if journal articles in a particular journal are consistently
reassigned to a particular non-listed FoR using the 66% rule, then the ARC might choose to use that
journal in the benchmark calculations for the non-listed FoR.
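For reference, the ERA 2010 rule for MD and 2-digit journals described above can be sketched as below. We cannot run this for 2012 because it needs the submission data; the sketch only makes the two thresholds concrete. It assumes that the 25% share is measured on apportioned counts (the description above does not say whether apportioned or whole-article counts were used), and the function name and journal data are hypothetical.

```python
def benchmark_for_codes(apportioned_by_for, min_share=0.25, min_items=50):
    """Return the 4-digit FoR codes whose benchmark a journal would contribute to.

    apportioned_by_for -- {4-digit FoR code: apportioned count of the journal's
                          submitted articles assigned to that code}
    """
    total = sum(apportioned_by_for.values())
    if total == 0:
        return []
    return sorted(
        code
        for code, n in apportioned_by_for.items()
        # at least 25% of the journal's articles AND more than 50 apportioned items
        if n / total >= min_share and n > min_items
    )


# Hypothetical MD journal with 240 apportioned items submitted.
large_journal = {"0204": 120.5, "0912": 60.0, "0299": 40.0, "0102": 19.5}
# Hypothetical small journal: 75% in one FoR, but only 30 apportioned items.
small_journal = {"0204": 30.0, "0912": 10.0}

print(benchmark_for_codes(large_journal))  # ['0204', '0912']
print(benchmark_for_codes(small_journal))  # []
```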
In short, whilst we can calculate reliable benchmarks based on the Scopus Custom Dataset they will
not be exact matches of the benchmarks used by the ARC. This is because of the MD and 2-digit
journal issue above, as well as the fact that the "capture date" of the citation counts we use to
determine benchmarks will be some months earlier than the 1 March 2012 date used by the ARC.
For the Average Apportioned RCI calculations we have assumed that apportionment will be used in
the same way in ERA 2012 as in ERA 2010. The ARC has not made any statement regarding the
citation benchmark methodology to be adopted in ERA 2012. Hence it is proposed that we present
both the Average RCIs and Average Apportioned RCIs in IntERAct.
Proposed Stages for Benchmark Calculations for IntERAct
Stage 1
a. Match the ERA 2012 Journal List to the Scopus Custom Dataset
A critical first step is to match the Scopus Custom Dataset to the ERA 2012 Journal List. This can be
done on the basis of ISSNs as well as journal title, but will require some manual effort as well.
For example, the British Medical Journal, indexed by Scopus as “BMJ”, is on the ERA Journal List
as “BMJ: British Medical Journal”. This would not match on title. Further, the ISSNs on the ERA
2012 Journal list do not include the ISSN in the Scopus journal record for BMJ, so ISSN-matching
would also fail. These would need to be manually matched.
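One way the matching step could be organised is sketched below: try ISSN first, fall back to a normalised title, and collect whatever is left for manual matching. The record layout ('title', 'issns'), the journal names other than BMJ, and all ISSNs are placeholders invented for illustration; they are not real ISSNs.

```python
import re


def normalise_issn(issn):
    """Strip hyphens and spaces so '1234-567x' and '1234567X' compare equal."""
    return re.sub(r"[^0-9Xx]", "", issn).upper()


def normalise_title(title):
    """Lower-case and collapse punctuation and whitespace to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def match_journals(scopus_journals, era_journals):
    """Return (matches, unmatched).

    matches   -- list of (scopus title, ERA title, 'issn' or 'title')
    unmatched -- Scopus journals left over for manual matching
    """
    by_issn, by_title = {}, {}
    for era in era_journals:
        for issn in era["issns"]:
            by_issn[normalise_issn(issn)] = era
        by_title[normalise_title(era["title"])] = era

    matches, unmatched = [], []
    for journal in scopus_journals:
        hit, method = None, None
        for issn in journal["issns"]:
            hit = by_issn.get(normalise_issn(issn))
            if hit is not None:
                method = "issn"
                break
        if hit is None:
            hit = by_title.get(normalise_title(journal["title"]))
            method = "title"
        if hit is None:
            unmatched.append(journal)
        else:
            matches.append((journal["title"], hit["title"], method))
    return matches, unmatched


# Placeholder data; the ISSNs are not real.
scopus = [
    {"title": "Journal of Hypothetical Physics", "issns": ["0000-0003"]},
    {"title": "Journal Of Worked Examples", "issns": ["0000-0004"]},
    {"title": "BMJ", "issns": ["0000-0001"]},
]
era = [
    {"title": "Journal of hypothetical physics", "issns": ["00000003"]},
    {"title": "Journal of Worked Examples", "issns": ["0000-0005"]},
    {"title": "BMJ: British Medical Journal", "issns": ["0000-0002"]},
]
matches, unmatched = match_journals(scopus, era)
```

With this placeholder data the first journal matches on ISSN, the second on title, and BMJ falls through to the manual list, mirroring the case described above.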
b. Calculate world benchmarks for 4-digit FoRs and extract static citation counts for UQ
indexed papers
• Calculate year-specific, FoR-specific world benchmark citations per paper (cppFoR,y) at the
4-digit FoR level.
• Extract static citation counts from the Scopus Custom Dataset for each publication in
eSpace with an EID.
• Calculate RCIs and aRCIs for each indexed article and present in IntERAct.
• Calculate Average RCIs (ARCI) and Average Apportioned RCIs (AaRCI) for each 4-digit FoR
where citation analysis is used.
• Make data available in IntERAct.
IntERAct: Screenshots in Appendix
Stage 2
a. MD and 2-digit journal considerations
Include the journals listed in Appendix 2 of the 2010 benchmarking methodology
document: review the 2010 MD and 2-digit list (Appendix 2), work out which journals are
still MD/2-digit in 2012, assign them as per the re-mapped FoR codes used in the 2010
exercise, and re-calculate the year-specific, FoR-specific world benchmark citations per
paper (cppFoR,y) at the 4-digit FoR level. Identify which 4-digit benchmarks change
because of the inclusion of these MD/2-digit journals, and decide whether to update the
4-digit benchmarks on this basis.
b. 2-digit benchmark calculations
Calculate year-specific, FoR-specific world benchmark citations per paper (cppFoR,y) at the
2-digit FoR level.
Where the number of journal articles indexed in Scopus for a given FoR across
all ERA years was <800, a 2-digit level benchmark was used in ERA 2010. In 4-digit FoRs
where the combined outputs for 2005-2010 number <800, compare the 2-digit benchmark
with the 4-digit benchmark and decide which to use.
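The low-volume check could be sketched as follows: sum the eligible article counts per 4-digit FoR across all years and flag those under 800, together with the parent 2-digit code (the first two digits of the 4-digit code) whose benchmark should be compared. The function name and counts are hypothetical.

```python
from collections import defaultdict


def low_volume_fors(article_counts, minimum=800):
    """Return {4-digit FoR: parent 2-digit FoR} for FoRs below the minimum.

    article_counts -- {(4-digit FoR code, year): number of eligible articles}
    """
    totals = defaultdict(int)
    for (for_code, _year), n in article_counts.items():
        totals[for_code] += n  # combined outputs across all ERA years
    return {code: code[:2] for code, total in totals.items() if total < minimum}


# Hypothetical counts for two FoRs over two of the six years.
counts = {
    ("0204", 2005): 300, ("0204", 2006): 600,   # 900 in total: keep 4-digit
    ("0299", 2005): 100, ("0299", 2006): 150,   # 250 in total: compare with 02
}
flagged = low_volume_fors(counts)
print(flagged)  # {'0299': '02'}
```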
c. Centile Analysis
The world centile thresholds for a particular FoR code and publication year can be
derived by determining the number of citations required to be in the top 1, 5, 10,
25 and 50 percent of all articles in the Scopus Custom Dataset satisfying the eligibility
criteria outlined in 1, 2 and 3 above.
Put together a “Percentiles Table” showing the minimum number of citations a paper
needs to reach each percentile within each 4-digit FoR, by year of publication. This
comes with the caveat that our calculations will not be the same as the ARC’s.
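A sketch of one way to derive the thresholds for a single FoR and publication year is given below. It assumes a simple rank-based definition: sort the citation counts in descending order and take the count at rank ceil(p% of n). How the ARC handles ties and rounding is not stated here, so this is an assumption, and the citation counts are synthetic.

```python
import math


def centile_thresholds(citation_counts, centiles=(1, 5, 10, 25, 50)):
    """Return {centile: minimum citations needed to be in the top centile %}.

    citation_counts -- citation counts of all eligible articles for one FoR
                       and one publication year
    """
    ranked = sorted(citation_counts, reverse=True)
    n = len(ranked)
    if n == 0:
        raise ValueError("no eligible articles for this FoR and year")
    thresholds = {}
    for p in centiles:
        rank = max(1, math.ceil(n * p / 100))  # size of the top p% group
        thresholds[p] = ranked[rank - 1]       # count held by its last member
    return thresholds


# Synthetic example: 200 articles with 1, 2, ..., 200 citations.
synthetic_counts = list(range(1, 201))
thresholds = centile_thresholds(synthetic_counts)
print(thresholds)  # {1: 199, 5: 191, 10: 181, 25: 151, 50: 101}
```

Running this once per 4-digit FoR and publication year gives the rows of the Percentiles Table. Where many articles share the same citation count, more than p% of articles may meet a threshold.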
Stage 3
1. Australian HEP Benchmarks
It has been decided not to pursue Australian Benchmark Calculations, as we cannot accurately
calculate Australian HEP benchmarks. For ERA, these will be calculated on the basis of the papers
actually submitted by Australian HEPs to ERA, determined by the staff census date approach rather
than addresses on papers.
However, options were discussed and these included the use of “addresses on papers” as proxies.
A number of possibilities were considered: three options are described here. One is to follow the
same methodology as for the world benchmarks but restrict to those items in the Scopus Custom
Dataset with Australia listed in the address (as an affiliation country). This would include all outputs
with Australian research organisations listed on publications, including CSIRO etc and not just the
HEPs.
The second method would be to search for the outputs of each of the Australian HEPs separately,
based on organisation in the Scopus Custom Dataset.
The third method would be to derive Go8 benchmarks (plus, perhaps, some of the others who did
well in ERA 2010 like Griffith). We would then calculate HEP Go8 benchmarks based on address
searches of the Scopus Custom Dataset. This may be useful in FoRs where the Australian RCI is
higher than the world benchmark.
APPENDIX: Suggestions for IntERAct

Add a column here for Average Apportioned RCI (AaRCI).
Only have rows for the following:
• UQ Total Indexed (number of pubs)
• UQ Total Indexed apportioned number
• UQ Total Cites (Scopus)
• AaRCI - Average Apportioned RCI (taking apportionment into account)
Include additional columns here for
RCIja,FoR and aRCIja,FoR for each paper.
The column headings would be
RCI, aRCI
One or more RCIs will need to be presented here
as they will be FoR-code specific. For an output
assigned 100% to one FoR, only one RCI is
required. Where assignment is to more than one
FoR, all relevant RCIs will need to be presented.
Present article-level RCIs and apportioned RCIs
(aRCI)
Present as: FoR code, RCIja,FoR, aRCIja,FoR
e.g. 0302, 2.3, 1.8
Column Heading should be:
“FoR, RCI, aRCI”
This is the article-level apportioned
RCI (i.e. aRCIja,FoR)
Column heading should be:
“aRCI”
This is the article-level RCI: that is, it is RCIja,FoR
It will show the relative performance of the article across all possible
FoRs.
Column heading should be: “RCI”
For each assigned FoR, have columns for:
• RCIja,FoR (article-level RCI)
• aRCIja,FoR (article-level apportioned RCI)
Column headings for the above would be “RCI” and “aRCI”
(“World RCI” column should be relabelled “RCI”, and add
another column for “aRCI”).
Remove World cpp, and all columns with “National”
benchmark information.
Definitions:
Acronym     IntERAct Label  Description                            Definition/Formula
FoR         FoR             Field of Research
a           %               Apportionment
cpp         cpp             Citations per paper                    Derived from Scopus Custom Dataset; this is
                                                                   year- and FoR-dependent
c           Scopus Cites    Citation count of individual paper     Based on static citation count in Scopus
                                                                   Custom Dataset
AaRCI       AaRCI           Average Apportioned RCI (taking        The sum of all of the article-level apportioned
                            apportionment into account)            RCIs divided by the total apportioned count of
                                                                   indexed articles in the FoR (in ERA 2010 this
                                                                   was referred to as the Average RCI (world))
RCIja,FoR   RCI             Article-level RCI (based on the        RCIja,FoR = cja/cppFoR,y
                            citations per paper benchmark for
                            papers in that FoR published in the
                            same year as the article of interest)
aRCIja,FoR  aRCI            Article-level apportioned RCI          aRCIja,FoR = RCIja,FoR * a