The quality and impact/significance of your research are usually evaluated based on where you publish. The advent of new outlets for scholarly work has raised some interesting issues about how that evaluation is done.
A blog exchange about Melville scholarship (read the comments, and also see this discussion of that post) highlights the particular issues surrounding blogs as an outlet for scholarship. Note the reference in the comments to “responsible bloggers”, which implies that blogging per se is considered an irresponsible medium. At the same time, there is a recognition that much academic scholarship is done in such a way that it is not surprising academics don’t make these kinds of connections and contributions. “The academic failure to think.” Indeed.
It strikes me that there is also an assumption that bloggers (responsible or otherwise) are not academics, and that academics are not bloggers. This is not true, though I suspect more academics would blog if they were clearer about how blogging will be evaluated by those who make decisions about the value of their work.
“New” forms of publishing
It would be a mistake to think this is just about blogs. Or even to think that it is about whether or not a publication is peer-reviewed. There are some very interesting discussions taking place about the value of peer review itself and whether blogging could be treated as a kind of post-publication peer review.
In many disciplines there have been debates about the value of peer-reviewed journals that only publish online with no print edition. And the rapid increase in the number of new peer-reviewed journals (with or without print editions) frequently raises questions when publications are being evaluated.
It has been said that academia is conservative. When it comes to evaluating scholarship there does seem to be a conservative bias towards publishing in venues that have been around for a long time and are well known to your peers. Anything even vaguely new seems to cause all kinds of difficulties.
What are we trying to measure?
There is a question here of both what is being evaluated and the indicators that are used in that evaluation. I’m referring here to the primary contexts in which your work will be evaluated: hiring, promotion, grant applications, and the like.
Usually two things are being measured: quality and impact/significance. Both of these are abstract, so they need to be operationalized: we choose concrete evidence that can stand in as a proxy for the thing itself. In the social sciences, we call that proxy an “indicator”. (Apologies if you know all this stuff but I’m assuming the humanities folks didn’t necessarily get this language rammed down their throats during their training.)
In many academic evaluation processes the relationship between the indicators and the abstract concepts we are trying to measure has become obscured. It is not surprising that over many years of using particular indicators, the focus of discussion has shifted to the indicators themselves with little thought given to what they indicate.
Are we using the best indicators?
Whenever we operationalize an abstract concept and develop indicators, we have to ask ourselves if those indicators are valid and if they are reliable.
A valid indicator is one that measures the thing you want to measure and not something else. To use an unrelated example, if you want to know how much water is in a glass, the height of the water is not a valid indicator because the circumference of the glass will affect the relationship between the volume of water and the height. You should actually measure the volume, perhaps by pouring it into a calibrated measuring cup.
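For anyone who likes to see the arithmetic, here is a minimal sketch of the glass example. It assumes an idealized cylindrical glass, and the radii and water height are made-up numbers chosen only for illustration; the function name is hypothetical too.

```python
import math


def cylinder_volume_ml(radius_cm: float, height_cm: float) -> float:
    """Volume of water in an idealized cylindrical glass (1 cubic cm = 1 mL)."""
    return math.pi * radius_cm ** 2 * height_cm


# Two hypothetical glasses filled to the same height (illustrative numbers only).
narrow = cylinder_volume_ml(radius_cm=3.0, height_cm=10.0)
wide = cylinder_volume_ml(radius_cm=4.0, height_cm=10.0)

# Same height reading, different volumes: height alone does not tell you the volume.
print(f"narrow glass: {narrow:.0f} mL")
print(f"wide glass:   {wide:.0f} mL")
```

With these numbers the wider glass holds nearly twice as much water at the same height, so the height reading is measuring something other than the volume you actually care about.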
A reliable indicator is one which will give you the same measurement each time. Using a calibrated ruler to measure length is more reliable than pacing off the room, for example.
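One rough way to picture reliability is to repeat the same measurement several times and look at how much the readings vary. The sketch below assumes invented readings for a single room (they are not real data) and uses the sample standard deviation as the measure of spread; the variable and function names are hypothetical.

```python
from statistics import stdev

# Invented repeat measurements of one room's length, in metres (not real data).
ruler_readings = [4.02, 4.01, 4.02, 4.03, 4.02]
pacing_readings = [3.8, 4.3, 4.0, 4.4, 3.7]


def spread(readings: list[float]) -> float:
    """Sample standard deviation: a smaller value means more consistent readings."""
    return stdev(readings)


# The more reliable method is the one whose repeat readings vary less.
print(f"ruler spread:  {spread(ruler_readings):.3f} m")
print(f"pacing spread: {spread(pacing_readings):.3f} m")
```

Note that a small spread only says the indicator is consistent. It says nothing about whether the indicator is valid, which is a separate question.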
The questions we need to ask ourselves about how we evaluate the quality and impact of scholarship are:
- What indicators are we using for each concept?
  - Are we clear about which concept we are measuring?
- Are these indicators valid?
  - Given the changes in the publishing environment, are these indicators capturing all instances of quality and/or impact?
  - Have the changes in the publishing environment affected the validity of particular indicators?
  - Are there other possible indicators that could be used?
- Are these indicators reliable?
  - Does a particular indicator always denote the same level of quality or impact?
  - If not, can it be modified in some way (perhaps paired with another indicator) to increase its reliability?
  - Are there other possible indicators that would be more reliable?
Let’s talk about this
I’d love to hear what you think. Please contribute in the comments. Try to be respectful of other commenters and feel free to engage with them as well as with what I’ve said above.
Is there anything missing on my list?
What are your concerns about how your work is being evaluated?
What makes a blogger “responsible”? How does that differ for scholarly blogs?
I’m assuming we all agree that quality and impact are what we should be evaluating? Is that true? Or is the confusion even at this abstract level?
What other questions does this raise for you?
I’m going to write a few more posts taking this framework and looking in more detail at both the existing processes and some of the changes that are challenging the validity and reliability of that evaluation process.
The information in this and related posts has been incorporated into Scholarly Publishing (A Short Guide), available in eBook and paperback.
Edited March 30, 2017. Information about Scholarly Publishing (A Short Guide) added 8 October 2019.