Are altmetrics going to change the way we assess research outcomes?
A lot of the focus around altmetrics has tilted towards social media. And that's interesting, but it's a rather superficial proxy for understanding whether research is having an impact on important societal problems, such as changing the practice of the criminal justice system. What's happening in social media can give us some useful information, but I think it would be dangerous to link funding to those indicators.
In general, we are at a very early stage in developing effective indicators for societal impacts. There’s room for developing newer and more helpful impact metrics.
What is the Metric Tide?
The Metric Tide was the final report of an independent review of the role of metrics and quantitative indicators in the management and assessment of UK research. It was commissioned by the UK government and I chaired it. I worked with a group of 12 experts – scientists, social scientists, bibliometricians, research funders – for about 18 months, and the report was published in the summer of 2015. At that time, there was growing discussion across the global research community about metrics and their uses. DORA (the San Francisco Declaration on Research Assessment) and the Leiden Manifesto are two initiatives that had taken this discussion forward.
Why did the UK government commission this work?
The narrow, specific reason was linked to the Research Excellence Framework (REF). Every 5-6 years, the REF assesses the UK national research system through peer-review subject panels and allocates about a third of the public research budget across universities and all disciplinary areas. In 2014, the government wanted to look at whether the whole exercise could be done more efficiently by just using metrics, so the Metric Tide review was initiated.
The broader context was the growing significance attached to quantitative indicators and metrics of various kinds in the management of research, in the allocation of funding, and in the assessment of individuals and research groups in universities. We wanted to look at that broad phenomenon in a more holistic way and see what this "rising tide" of metrics means for research culture, research practice, and the way we govern and direct our science and research system. The report also generated interest outside the UK.
One of the conclusions of the report is that we need more metrics, but they have to be responsible. What does "responsible metrics" mean?
We came up with this term “responsible metrics” to convey both the possibilities and the pitfalls of metrics usage. We are all aware of the many instances in which certain indicators get used inappropriately in research assessment and management processes. The most obvious and egregious example is the misuse of journal impact factors. We know from a large volume of empirical work that the correlations between the quality of an individual paper and the impact factor of the journal it was published in are poor. And yet we constantly see impact factors used in inappropriate ways.
Responsible metrics are metrics used in a sensible, robust way; used like that, they can be a valuable part of managing the research system. But we need to be very alert to the context in which they are used.
What should responsible metrics look like?
Data should be as robust as possible. We want to make sure there is enough coverage of the different disciplines and that different research outcomes are accounted for. And we need humility in the way we use metrics: they should support, not supplant, peer evaluation. Academic research is by its nature a complicated endeavour, and you can achieve a more nuanced assessment of research with some combination of metrics and peer review.
In addition, there are other factors such as transparency, i.e. that those being assessed understand the nature of the measurements and indicators being used to assess their work. And we also need diversity: a diverse set of indicators and research outcomes – from papers to exhibitions to data sets – and recognition of different career paths.
What would be good examples of irresponsible metrics versus responsible metrics?
An example of a bad practice would be the ResearchGate score. The website ResearchGate is used by many academics as a convenient way to share their work with peers. The site also awards you a score, but it is very unclear what algorithm the score is based on. That is not a responsible metric. The other obvious example would be many international university and research rankings, which are methodologically and statistically dubious.
An example of a good practice in recruitment or assessment of individuals, e.g. for promotion, would be to ask researchers to highlight in a narrative way the two or three contributions to research that they consider to be the most important in their career to date and why. And then the panel can read that work. It doesn’t matter what journals the articles were published in. You are bringing more qualitative, evaluative dimensions to that process.
What about the concern that peer review might be very vulnerable to intrinsic and systemic biases?
Ideally, you need a mix of quantitative indicators and qualitative expert judgement. Peer review is not perfect; we are all aware of its weaknesses. But, at the same time, it’s rather like democracy: it’s the least bad system we have developed as the academic community to govern ourselves.
Peer review, when it’s done well, is formative as well as summative, i.e. we are not just trying to evaluate but also to improve the quality of each other’s work, whereas metrics are most commonly only summative.
But it’s true that metrics can also act as a more objective and positive countervailing force in places with a culture of patronage or nepotism or sexism. And this would be, in fact, a responsible use of metrics.
Are you seeing any rapid change following the Metric Tide and other related initiatives?
There has definitely been very visible and interesting discussion about this topic over the last 5 or 6 years, and a growing awareness as a result. And that's to be welcomed. But it would be naïve to say that the tide has completely turned. We are in a period of transition, of contestation and debate. I expect it will take some time for the different actors in the system to align and take action. And it is by no means certain that it will all resolve in the optimal way.