Annexe B
Life and Environmental Sciences Division, University of Oxford
General observations
(a) endorsement of the reasons for reviewing the form of assessment;
(b) the combination of philosophical and practical considerations which inform the consultation document is welcomed;
(c) the RAE and its funding consequences have created an ethos where too much attention
is focused on research, and the status of teaching is devalued; any future review should
be more broadly based;
(d) continuing belief in the importance of peer review in the process of assessment, and concern that no algorithm based solely on quantitative metrics could substitute for expert judgement of the value of published research. It is accepted that peer review is time-consuming and could be better organised (e.g. to prevent the disruptive effects of removing cited publications from public use), but expert judgement is a key element of the assessment process;
(e) that a quinquennial exercise is as frequent as can be supported without the exercise
consuming disproportionate amounts of time and money and without too much research
effort being tailored towards its needs;
(f) that institutions should retain considerable discretion in composing their submissions, so
as to reflect their distinctive qualities;
(g) that the number of grades is too few, leading to very large shifts in funding as a result of
marginal changes (for example between 5 and 5*);
(h) there is too much inconsistency between panels and between exercises, for example
with respect to:
(i) the percentage of 5 and 5* departments (a threefold difference); what appears to
be a judgement of absolute strength, on an international or national standard, is in
fact a judgement of relative strength;
(ii) the extent of grade inflation from exercise to exercise; it is hard to believe that
relative improvement is really so variable from subject to subject;
(iii) the weight given to certain metrics such as research income and research
assistant numbers;
(i) the standard of international benchmarking needs to be refined.
Group 1: Expert Review
(j) Assessments should continue to be retrospective and conducted by experts; the fundamental objective should be to consider the quality of published output;
(k) concern about the arbitrariness of dividing research activity by UoA; some method of
indicating the presence and quality of complementary disciplines in the same institution
would give some idea of “disciplinary ecology” as well as the achievements of individual
entities (for example the Oxford panel-member for Archaeology – an Egyptologist – was
actually assessed in a different unit, Oriental Studies; the fact that archaeologists exist
outside UoA 58 is worth recognising, especially if quantitative measures of “critical mass”
within an institution come to be used);
(l) the exercise undervalues subjects which are by their nature interdisciplinary, activities which are at the margins of individual units, and research which is collaborative between institutions, by its focus on ‘agenda setters’ within RAE units on an institution-by-institution basis;
(m) some panel members serve for too long and may in consequence wield undue influence over a discipline;
(n) the contribution made by younger staff is undervalued by the emphasis on established international reputation; this will add an ageing effect to an already ageing academic population.
Group 2: Algorithm
(o) If a metric algorithm is employed, there will be an inevitable drift to precisely calculable indices. (It is not clear how one might measure reputation based on “surveys”; bibliometry, student numbers, and external research income are more easily quantifiable.) This will result in an impoverishment of discrimination. The introduction of numerical targets (for easily quantified variables) has already reduced the scope for clinical judgement in the NHS. The mistake should not be repeated. It would encourage number-chasing at the expense of genuine value, and originality would be the casualty;
(p) the metrics used should be more transparent and should give due weight to long-term
projects and major works of scholarship;
(q) these metrics should include measures of reputation based on surveys (but note the
concern under (o)), external research income, bibliographic measures (refined to cover
the total corpus of material produced by a department), research student numbers, and
numbers of postdoctoral research assistants.
Group 3: Self Assessment
(r) While self-assessment may be morally uplifting for individuals, it is less so for groups; there is no substitute for an outside view.
Group 4: Historical Ratings
(s) The usefulness of a historical rating system depends on the objective in producing the ratings; infrastructural advantages are probably good predictors of present performance, but ratings based on them are inherently unfair to those without such advantages (the “added value” debate in school ratings). They thus tend to reinforce current inequalities. It is a matter of policy whether the objective is to build on strength or to produce new candidates for future greatness.
Group 5: Cross-cutting Themes
(t) What should an assessment of the research base be used for? Not to construct ‘league
tables’: such tables are constantly in danger of being misused, if taken out of context;
(u) Frequency of assessment? As rarely as possible: digging up plants to examine their
roots does not promote growth;
(v) What is excellence? Probably only something which is recognised long after a work is published; and almost certainly not something done specifically with a “research assessment exercise” in mind, since in many areas this simply produces a spate of over-inflated and under-prepared publications. But peer review comes as close as possible to identifying the current value of work done. (Creativity and applicability are two criteria for recognising good work; such independent dimensions of research need to be assessed separately.)
(w) Should assessment determine the proportion of the available funding directed to each subject? Probably not. Comparisons between panels are invidious and will embed existing inconsistency. Quality against an international standard and strategic judgement seem more calculated to promote excellence than metrics based on external funding, which simply reinforce the already successful;
(x) Should all institutions be assessed in the same way? There is no point in comparing
institutions which are self-evidently different. Many of the present problems stem from
attempting to do so following the abolition of the binary line;
(y) Should each subject be assessed in the same way? Practitioners are best placed to
judge the most sensitive means of assessment: subject communities should have as
much autonomy as possible;
(z) How much discretion should institutions have in assembling submissions? They should retain considerable discretion. Individual institutional control allows provision for difference; a one-size-fits-all approach has obvious disadvantages in this respect;
(z1) How can the exercise best support equality of treatment for all groups of staff? It should recognise that these quinquennial exercises most visibly disadvantage the original thinker, and the project which matures over a longer term than the interval between exercises;
(z2) What are the most important features of an assessment process? That it should be simple, flexible, and not burdensome; and that it should not be simply number-driven.
Additional issues
These were identified in the division’s analysis of the last RAE; some have a bearing on the design of a possible successor scheme. They include:
(aa) the apparent desire of the assessors to see even non-laboratory subjects (e.g.
Anthropology, Geography) based on strategic research groups and not simply
collaboration;
(bb) the need for clarity and consistency over which staff are to be submitted and which
excluded; this would avoid much ‘jockeying’ for position through manipulation of returns;
(cc) the weight given to major written outputs – such as monographs – in comparison to
research papers.