Dr. Mark Read

Research Fellow, Charles Perkins Centre, The University of Sydney

Who should publish scientific papers?

A week ago the journal Science published the results of a year-long sting operation on the world of open-access publishing. Open access is (perhaps, was) touted as the gold standard of scientific publishing: the authors pay the fees for publication, after which the manuscript is freely available to anyone. There is appeal to this model: a paper’s potential readership is not restricted to those with a subscription to a journal, or those willing to pay to see the article (sums of around $30/article are common).

The alternative model of academic publishing is the pay-for-access model, employed by the majority of journals. It is neatly summed up by Curt Rice in a Guardian article: the journals pay the author nothing, they pay the editor nothing, they pay the three or more reviewers who scrutinise the work nothing. These journals do incur some type-setting and infrastructure costs. They then charge the academic authors, editors and reviewers (people who provided so much for free, and without whom the journal would have nothing) through the nose for access. The general public, who pay for both the work and its quality control, have to pay again if they wish to see it. Some perceive ethical concerns with this model, and some years ago there was a movement amongst academics to boycott journals published by Elsevier, one of the biggest publishing houses, which was consistently reporting profits of hundreds of millions of dollars: about 30% of revenue. The sentiment was that Elsevier was leeching huge sums of public money for comparatively little work. The boycott movement demanded that pay-for-access journals make publications freely available after 6 months. They appear to have been successful: I recently reviewed Nature’s license to publish, and it seems authors retain copyright and can post the as-published versions of their papers online after 6 months.

Perhaps in recognition of the ethical dilemmas above, there are now several funding bodies (e.g., NIH and Wellcome Trust) mandating that publications arising from research they fund be published in open-access journals. Science’s article has unearthed an exponential growth in the number of “predatory” journals taking authors’ publication fees whilst having in place none of the quality controls essential for a healthy scientific discipline. Many of these journals go to great lengths to appear as having roots in western scientific communities, presumably to engender trust. Science’s sting is cunning, and the results are shocking. Variations on a bogus paper, intentionally littered with scientific and ethical errors that any half-competent peer reviewer should identify, were submitted to over 300 open-access journals. Of 304 submissions, 157 were accepted and 98 were rejected, with the remaining 49 not yet having reached a decision. Of the 255 papers that underwent the entire editing process, through to acceptance or rejection, 60% showed no sign of peer review.
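As a quick sanity check on those figures, here is a minimal sketch that uses only the numbers quoted above. The variable names are illustrative, not anything from Science's article. It confirms that the 255 is simply the accepted and rejected submissions added together, and works out the acceptance rates.

```python
# Figures as quoted in the paragraph above.
submitted = 304
accepted = 157
rejected = 98
pending = 49

# Papers that reached a final decision, one way or the other.
decided = accepted + rejected
assert decided + pending == submitted

acceptance_rate_all = accepted / submitted      # pending papers count against acceptance
acceptance_rate_decided = accepted / decided    # pending papers ignored

print(f"decided: {decided}")
print(f"accepted, of all submissions: {acceptance_rate_all:.1%}")
print(f"accepted, of decided submissions: {acceptance_rate_decided:.1%}")
```

Roughly half of all submissions, and a little over three in five of those that reached a decision, were accepted.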

There are other stunning results in Science’s paper, but you get the gist: these are scam journals. Their motivations are either to make profit, or earn ill-deserved academic merit for either authors or editors. The interesting question for me is how academic work should be published. Open access is a great ideology, but the incentives don’t stack up – journals have too great an incentive to accept work submitted to them, because that’s how they make money. The pay-for-access system has issues too, explored above. This is an economics problem: how to organise academic publishing such that the incentives line up with the best interests of science.

It's worth pondering what the concept of a journal contributes to science in the first place. Firstly, it is supposed to ensure a high quality of work is published; it is a gatekeeper, affording confidence that results constitute a worthwhile contribution. Secondly, it is a collection. An issue of a journal brings the science to you, rather than you having to go and dig it out. Everything in the journal will relate to some theme or field, and with a subscription to a particular journal scientists can stay up to date with research relevant to them. Thirdly, it constitutes a community platform and provides a forum for discussion and debate, facilitating the emergence of themes and perspectives.

It strikes me that most of these things can be achieved using services now available on the internet: databases and indexes, and social media. PubMed sends me weekly emails containing articles selected on the basis of search terms I have entered. Using a service like this, one can scope out one's own specific ‘journal’, which in fact pulls contributions from thousands of journals. Social networking services such as ResearchGate allow one to follow researchers of interest or in particular labs, and so stay up to date with their publications. Further, they facilitate discussions through message boards specific to particular fields. It is not hard to imagine that readers could rate particular papers, much like Amazon reviews, and that popular or interesting manuscripts would find their way to the top of the pile in this manner, instead of being confined to particular journals. Internet-based services could facilitate discussions and questions relating to a particular publication, as you see on many modern-day blogs and news websites. One would only hope for less vulgar and more intelligible content than what I have seen on YouTube comment feeds.

If the online facilities described above can fulfil the ‘community’ and ‘collection’ aspects of traditional journals, that leaves only the issue of peer review. It strikes me that the best people to manage the peer review process, and accredit those papers that pass, might be the funding bodies themselves. They are taxpayer-owned, and not-for-profit. They are not subject to as much inter-body competition; in the UK, government-run funding bodies do not tend to overlap a great deal in scope (the case with cross-disciplinary research is more complex…), instead focussing on a specific discipline, e.g. the physical or social sciences. Hence, they are not under pressure to accept a particular proportion of submissions for fear of losing public money to another agency; I cannot imagine the government giving the social sciences research council a chunk of the physical sciences budget simply because it accepts a greater proportion of submissions and so appears to be funding better science. Funding bodies would not be unduly benefiting from the efforts of publicly-funded researchers; they are the ones typically funding them. There are other, non-government funding bodies, some of them relatively small charities, and I do not propose that all of these necessarily manage their own peer review process; if the government-run funding bodies required some cost contribution for the reviewing process, I imagine this could be accommodated in a grant award. Funding bodies now typically permit applicants to request money for publication costs, and this money could just as easily be used to pay for a publicly-run peer review process.

This vision is far from complete, and more comprehensive thought is needed than will go into one blog post. However, I believe that a future scientific system devoid of journals in the traditional sense we have now is completely plausible, and may be much healthier for science.

(The article that spawned all this: J Bohannon. Who’s Afraid of Peer Review? Science, 342(6154), 60-65; 2013.)


How well should scientific code be scrutinised?

A friend, Dr. Kieran Alden, forwarded me a Nature article today describing how Mozilla (who make the Firefox browser, amongst other things) are looking through 200-line snippets of code from 9 articles published in the journal PLoS Computational Biology. They are investigating the worth of having competent professional coders, people primarily trained in best coding practice, checking the code of scientists (who are often not formally trained at all) when it is published.

This is a debate that I gather has been gaining some momentum; I keep stumbling onto it. Should code be published, and how well should it be reviewed? I understand the desire of researchers to keep their code to themselves. From a research perspective there’s a lot of up-front cost to developing this sort of code, and once someone has it they can build on it and run experiments with almost no buy-in at all. Research code is very cheap to execute, and very difficult to get right in the first place. Certainly, there are a lot of labs who won’t share code until their first round of results emanating from it has already been published: it’s too easy for competing labs to get ahead based on your effort.

But not sharing code can arguably hold back science, with labs investigating similar phenomena having to develop their own code from scratch. This is not very efficient, and it seems that the incentives are wrong if this is the behaviour that the powers-that-be have produced. But that’s another story. The article briefly skirts around two facets of the debate: bad code can be misleading, and harm a discipline rather than informing it; and researchers who are already unsure about publishing code may be put off completely if it gets reviewed as well as published.

For me this isn’t even a debate, it’s a no-brainer. As a computer scientist by training who has developed several scientific simulations, I am very familiar with the difficulties of simulating complex domains. I smiled at the article’s implicit suggestion that commercial software is the way forward. I’ve used Linux, Windows, Mac, Firefox… so much of this stuff is full of bugs. But that’s not the point. In research we’re simulating things we often don’t understand very well, which makes the simulations tricky to check for bugs: are the weird behaviours an interesting and unexpected result representative of the science, or coding errors? And that’s if you spot them; research labs certainly don’t have the budgets to spend on testing that commercial enterprise does.

For me it boils down to this: what do you think would happen if I, a computer scientist, tried to run experiments in a sterile biological lab with no formal training? Would you trust the results? You might find me drinking coffee out of a petri dish. And what if, when I came to publish, I was vague about my methods? “Just trust me, I did everything right… doesn’t matter how.” Science has demanded stringent adherence to, and reporting of, experimental protocol in most disciplines. I don’t see how scientific results coming from a computer are any different. I think code should be published, and reviewed. The journal(s) that make this their policy (particularly the reviewing part) will earn a reputation for publishing robust, high-quality work. Researchers have to publish in good journals to progress, so they will have to adhere to the policy, or else accept publication in a “lesser” journal. With spurious results coming from bad code holding such potential to mislead and set back a scientific field, I think it’s highly worthwhile that journals employ experienced coding professionals to check code (as suggested in the article). But really, this is too late in the process. What we really need is for these stringent standards to be employed during the research and development stage, so that errors are caught and fixed as they occur, rather than research tracks being killed years after they start because errors are caught only at the publication stage.
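To make "catching errors as they occur" a little more concrete, here is a minimal sketch of the kind of check that can sit alongside simulation code from day one. The toy infection model, the function names (step, run, check_invariants) and all parameter values are hypothetical, invented purely for illustration. The point is only that some properties (cells are never created or destroyed, a seeded run is reproducible, extreme parameter values behave as expected) must hold whatever the science turns out to be, so a violation signals a coding error rather than an interesting result.

```python
import random


def step(healthy, infected, infection_prob, rng):
    """Advance a toy model one time step: each healthy cell is
    independently infected with probability infection_prob."""
    newly_infected = sum(1 for _ in range(healthy) if rng.random() < infection_prob)
    return healthy - newly_infected, infected + newly_infected


def run(healthy, infected, infection_prob, steps, seed):
    """Run the toy model and return the (healthy, infected) history."""
    rng = random.Random(seed)  # seeded so that every run is reproducible
    history = [(healthy, infected)]
    for _ in range(steps):
        healthy, infected = step(healthy, infected, infection_prob, rng)
        history.append((healthy, infected))
    return history


def check_invariants(history):
    """Properties that must hold regardless of the scientific outcome."""
    total = sum(history[0])
    previous_infected = history[0][1]
    for healthy, infected in history:
        assert healthy >= 0 and infected >= 0, "negative cell count"
        assert healthy + infected == total, "cells created or destroyed"
        assert infected >= previous_infected, "infection went backwards"
        previous_infected = infected


# Checks run alongside development, not years later at publication.
check_invariants(run(1000, 10, 0.05, steps=50, seed=1))
assert run(1000, 10, 0.05, 50, seed=1) == run(1000, 10, 0.05, 50, seed=1)  # reproducible
assert run(100, 0, 0.0, 5, seed=2)[-1] == (100, 0)  # no infection: nothing changes
assert run(100, 0, 1.0, 1, seed=2)[-1] == (0, 100)  # certain infection: all cells infected
```

None of this proves the model is scientifically right, but it does narrow down where a weird behaviour can be coming from.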

(The original article: EC Hayden. Mozilla plan seeks to debug scientific code. Nature 501: 472, 2013.)

I’ve read something. It made me think.

Christmas being time off, I started reading Ben Goldacre’s new book Bad Pharma. I hugely enjoyed Bad Science, which was simultaneously humorous and informative. Bad Pharma concerns bad practice in the pharmaceutical industry, but also highlights problems more general to academia, particularly negative results not being published. This is very common in computer science too, where there is a perception that unless your work demonstrates something new or better than what previously existed, it’s not worthy of publication (there’s a frustrating personal thread to this, but I’ll keep shut about that). There’s nothing wrong with publishing negative results: doing so can highlight how not to do something, and can save other people huge amounts of time by keeping them out of dead ends that have already been explored but not published. It can also highlight challenging areas in a field.

Anyway, I digress. Bad Pharma was of particular interest to me because my interest in modelling the immune system could well take me close to the pharma industry. I’m not put off, though I appreciate knowing the sorts of things that go on. As Goldacre highlights near the end, the industry is generally populated with good people, but it is structured in such a way that they can collectively do harm. Society sets out the rules of the pharma game, what can and cannot be done. The pharma industry is an industry, and as with many companies its main motivation is to generate profit for its shareholders. Companies shouldn’t do things that are illegal, and if society wants to keep them honest it should police and regulate them. Many of the horrors in the book are as much a failing of good regulation (and government) as they are failings of the industry. There are parallels with other industries that have recently come to light. We lambast Starbucks, Google and Amazon for paying almost no tax. But they’ve done nothing illegal; if anything they’ve excelled at what they’re supposed to do: maximize profits for their shareholders. We scowl at the investment banking industry for taking huge risks which culminated in a global financial meltdown (I appreciate this is a complex problem), but again, in most cases they were acting within the law. And where does this pressure to take risks ultimately come from? Whose money was being invested to generate a return? Pension funds and personal savings. I feel again that a good chunk of the blame can be laid at the regulators’ door for failing to set appropriate rules for the banking game. The LIBOR-rigging scandal did nothing to make banks more palatable to the general public, but individuals within the banking industry had raised concerns about this system to the regulators years ago. More recently there have been the newspapers and the Leveson inquiry. Papers are under pressure to invade people’s privacy and get big scoops because that sells papers. We say “bad tabloid!”, but they wouldn’t do this if there wasn’t a demand for those sorts of story among the general public in the first place.

Where am I going with all of this? Most of these problems are complex, and there’s plenty of failure to spread around all parties. And if there’s one thing I can be pretty sure of, it’s that my work is not going to impact any of this. Perhaps I should get back to it… I have my own complex robotics and immune-related problems to solve.