Dr. Mark Read

Research Fellow, Charles Perkins Centre, The University of Sydney

How well should scientific code be scrutinised?

A friend, Dr. Kieran Alden, forwarded me a Nature article today describing how Mozilla (who make the Firefox browser, amongst other things) are looking through 200-line snippets of code from 9 articles published in the journal PLoS Computational Biology. They are investigating the worth of having competent professional coders, people trained in best coding practice, check the code that scientists (who are often not formally trained at all) publish.

This is a debate that I gather has been gaining momentum; I keep stumbling onto it. Should code be published, and how thoroughly should it be reviewed? I understand the desire of researchers to keep their code to themselves. From a research perspective there’s a lot of up-front cost to developing this sort of code, and once someone has it they can build on it and run experiments with almost no buy-in at all. Research code is very cheap to execute, and very difficult to get right in the first place. Certainly, there are a lot of labs who won’t share code until the first round of results emanating from it has been published – it’s too easy for competing labs to get ahead based on your effort.

But not sharing code can arguably hold back science, with labs investigating similar phenomena each having to develop their own code from scratch. This is not very efficient, and the incentives seem wrong if this is the behaviour that the powers-that-be have produced. But that’s another story. The article briefly touches on two facets of the debate: bad code can be misleading, and harm a discipline rather than inform it; and researchers who are already unsure about publishing code may be put off completely if it is reviewed as well as published.

For me this isn’t even a debate, it’s a no-brainer. As a computer scientist by training who has developed several scientific simulations, I am very familiar with the complexities of simulating complex domains. I smiled at the paper’s implicit suggestion that commercial software is the way forward – I’ve used Linux, Windows, Mac, Firefox… so much of this stuff is full of bugs. But that’s not the point. In research we’re simulating things we often don’t understand very well, which makes the code tricky to check for bugs: are the weird behaviours an interesting and unexpected result representative of the science, or coding errors? And that’s if you spot them; research labs certainly don’t have the budgets to spend on testing that commercial enterprises do.
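To illustrate how easily a coding error can masquerade as an interesting result, here is a toy sketch (my own, not from the article; all names and numbers are illustrative assumptions). A one-line agent-death step in Python mutates a list while iterating over it, which silently spares some agents from being tested: the population declines more slowly than the stated death rate dictates, an effect that could plausibly be read as real biology rather than a bug.

```python
import random

def step_buggy(agents, death_prob):
    # BUG: removing items from the list we are iterating over makes the
    # loop skip the element that shifts into the vacated position, so some
    # agents are never tested for death. Survival looks anomalously high.
    for agent in agents:
        if random.random() < death_prob:
            agents.remove(agent)
    return agents

def step_correct(agents, death_prob):
    # Build a new list instead of mutating the one being iterated over.
    return [a for a in agents if random.random() >= death_prob]

random.seed(42)
buggy = step_buggy(list(range(10000)), 0.5)
random.seed(42)
correct = step_correct(list(range(10000)), 0.5)

# With a 50% death rate, roughly half should survive the correct step,
# but roughly two-thirds survive the buggy one.
print(len(buggy), len(correct))
```

Nothing crashes and the numbers look plausible, which is precisely the danger: without a tested, reviewed implementation, the inflated survival could be reported as an unexpected finding.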

For me it boils down to this: what do you think would happen if I, a computer scientist, tried to run experiments in a sterile biological lab with no formal training? Would you trust the results? You might find me drinking coffee out of a Petri dish. And suppose that, when I came to publish, I was vague about my methods – “Just trust me, I did everything right… doesn’t matter how.” Science has demanded stringent adherence to, and reporting of, experimental protocol in most disciplines. I don’t see how scientific results coming from a computer are any different. I think code should be published, and reviewed. The journal(s) that make this their policy (particularly the reviewing part) will earn a reputation for publishing robust, high-quality work. Researchers have to publish in good journals to progress, so they will have to adhere to the policy – that, or accept publication in a “lesser” journal. With spurious results from bad code holding such potential to mislead and set back a scientific field, I think it highly worthwhile for journals to employ experienced coding professionals to check code (as suggested in the article). But really, this is too late in the process. What we really need is these stringent standards applied during the research and development stage, so that errors are caught and fixed as they occur, rather than killing research tracks years after they start because the errors only surface at publication.

(The original article: EC Hayden. Mozilla plan seeks to debug scientific code. Nature 501: 472, 2013.)

Confidence in Simulations for Science

Last week I attended the excellent SummerSim conference in Toronto, which focuses on all domains of simulation application. An item high on my research agenda in recent years, and that of other researchers at York, has been establishing confidence that the results of complex system simulations are adequate representations of those systems. Simulation is used to investigate biological, sociological, political, and financial systems, and more. We typically simulate these systems to aid in understanding them; however, if they are not well understood to begin with, how do you create representative simulations? It is this bootstrapping problem that techniques such as the CoSMoS process and argumentation structures seek to address. I have looked at this problem from domain modelling, statistical, and calibration perspectives.

Others at York and I were curious as to whether similar issues were being identified by simulation practitioners in other disciplines. As such, we organised a workshop examining “confidence in simulations for science” at SummerSim, hoping to draw representatives from the eclectic mix of disciplines represented. Our format was perhaps unusual: we did not wish to run a tutorial on techniques being developed at York, nor did we want full paper presentations requiring peer review. This was a very preliminary examination of how far the issues we have been facing permeate simulation endeavour in general. We put out a call for abstracts, hoping mostly to catch people who would already be attending the conference; I did not believe that many would find the finances to visit a conference when there was no publication to be gained (austere times that we live in). We received a number of very encouraging emails from people in the artificial life and synthetic biology communities who sadly could not attend, as our workshop clashed with other conferences they were already committed to. We received one abstract submission, which was later retracted as the author could no longer attend the conference. We needed a plan B. We had prepared introductory material on the problem, and slides posing questions to the audience that we hoped would fuel a discussion. Fearing that we might be faced with a wall of blank stares, we also prepared a lot of slides on activities at York addressing these questions, and hoped that we would not need to use them.

We did not need them. The workshop was run in two 90-minute sessions. In the first session we had the three organisers – Paul Andrews, Kieran Alden and myself (Jon Timmis was sadly unable to attend the conference) – and three participants. The size worked well; it was a very comfortable group for discussions. In the second half we were joined by others and had seven participants. The feedback was very positive: we generated a lot of discussion; the problems we have identified are pertinent to other simulation domains, and have not been solved elsewhere; the articulation of the problems was praised; the format was also well received, and we were given useful suggestions for how to improve it in the future; and we were told that a dedicated outlet for these issues (rather than addressing them in a solely by-discipline manner) would be useful. We shall have to think carefully about what the outputs of the workshop will be; there seems to be scope for a continuation. This shall be decided in the coming week, when Kieran (who is currently on holiday) returns with the minutes from the workshop. At the very least we have a mailing list of interested parties who can stay in contact and work together on this.

This was my first experience of chairing a workshop, and there are some lessons to take away. First, you will rarely struggle to generate discussion at a conference, though it helps to have a plan for how to lead in (I believe that Paul knew this already, having run numerous CoSMoS workshops in the past, but it’s one thing to be told and another to witness it firsthand). Second, try to ensure that the workshop’s details appear next to the main conference timetable in the programme. All conference attendees were given a staple-bound programme that fell open in the centre at some coloured pages detailing where and when sessions were taking place. Alas, the workshops were listed on page 3 or so, and I suspect many never saw them. Third, our initial suspicion that short un-refereed abstracts would not attract attendees to a conference was largely correct (though we did generate interest in a few parties who could not attend). As such, a conference like SummerSim is appropriate, as it captures such a wide range of simulation practitioners. I thoroughly enjoyed running the workshop, and will seek to do it again in the future. I encourage all aspiring academics to give it a shot!

… and Toronto is well worth visiting …