Software as an academic publication

Roger Peng, 2018/05/03

Software has for a while now played a weird and uncomfortable role in the academic statistics world. When I first started out (circa 2000), I think developing software was considered “nice” but, for the most part, was not considered valuable as an academic contribution in the statistical universe. People were generally happy to use the software and extol its virtues, but when it came to evaluating a person’s scholarship, software usually ranked somewhere near the bottom of the list, after papers, grants, and maybe even JSM contributed talks.

Journals like the Journal of Statistical Software tried to remedy the situation by translating the contribution of software into a commodity that people already understood: papers. (Okay, thinking of papers as “commodities” is probably worth another blog post.) The idea with JSS, as I understood it, was that you could publish a paper about your software, or maybe even a manual, and then that would be peer-reviewed in the usual way. If it passed peer review, it would be published (electronically) and then you could list that paper on your CV. Magic! The approach was inefficient from the start, because it placed an additional burden on software developers. Not only did they have to write (and support) the software, but they had to write a separate paper to go along with it. Writers of theorems were not saddled with such requirements. But at the time of its founding in 1996, the JSS approach was perhaps a reasonable compromise.

These days, I think many people would be comfortable seeing a piece of software as a per se academic contribution. A software package can, for the most part, stand alone and people will recognize it. I personally do not think authors should be forced to write a separate paper, essentially translating the software from code into English or whatever natural language. The Journal of Open Source Software is a move in this “opposite” direction, where no paper is required, beyond the software itself and a brief summary of the software’s purpose and how to use it. This process is similar to ideas that people have had regarding the publishing of data: why not just let people who collect data publish the dataset alone and get credit for doing so? I think the JOSS approach is basically the way to go for academic software contributions, with one small caveat.

Evaluation of most software places a heavy emphasis on usefulness. The instructions for authors on the JSS web site indicate how each paper and software would be evaluated for publication:

The review has two parts: both the software and the manuscript are reviewed. The software should work as indicated, be clearly documented, and serve a useful purpose. Reviewers are instructed to evaluate both correctness and usefulness. Special emphasis is given on the reproducibility of the results presented in the submission.

JOSS takes a slightly different approach but ultimately has a similar standard. The software

Should be a significant contribution to the available open source software that either enables some new research challenges to be addressed or makes addressing research challenges significantly better (e.g., faster, easier, simpler)

In both cases, there is an emphasis on usefulness which, as a goal, is difficult to argue against. However, if you look at the goals of most academic journals, usefulness is not amongst the criteria for evaluating contributions. If it were, you’d have to delete most of the journals in existence today. This is a frequent argument that basic scientists and theorists have with policymakers—whether their work is “useful” is not the point. Advancing the state of knowledge about the world is important in and of itself and the usefulness of that knowledge may be difficult to ascertain in the short run. Journals typically strive to publish papers that represent an advance in knowledge in a particular field. Often there is a vague mention of “impact” of the work, but how wide that impact is likely to be will depend on the nature of the field.

What exactly is the advance in knowledge that is obtained when software is developed and distributed? I don’t ask this question because I don’t think there is an advance. I ask it because I think there is definitely knowledge that is gained in the process of developing software, but it’s usually not communicated to anyone. This knowledge is typically not obtained via the scientific process. Rather, it is often gained through personal experience. We have well-established methods for distributing software once it is developed, but we do not have well-established venues for distributing any knowledge that is obtained, particularly any informal knowledge or personal stories.

If you’ve ever seen someone give a presentation or talk about some software they’ve written, you know it’s not the same as reading the manual for the software. One reason is that the presenter will describe the process through which they went about writing the software and, usually as asides, mention some lessons learned along the way. I think these “lessons learned” are critically important, and make up a relevant contribution to the scientific community. If a presenter mentions that they started out writing their software using R’s S4 class/method system but realized it was too complicated and annoying and so went back to S3, that’s a useful lesson learned. As a fellow developer, I might reconsider starting my next project with S4. However, unless we take a critical eye to the git logs for a given software package, we would never know this by simply using the software. It would seem as if the developer went with S3 from the get-go for unknown reasons.

I think it would be nice if software publications included a brief summary of any lessons learned in the process of developing the software. Many developers already do this via blog posts or similar media. But many do not and we are left to wonder. Maybe they did some informal user testing and found that people preferred one interface over another. Anything that other people (and developers) might find useful, beyond the software itself, would be worth including. It might be a paragraph or even just a set of bullet points. It might be more, but I’m not inclined to require any specific length or format. To be clear, this is not equivalent to a scientific study, but it’s potentially useful information nonetheless.

One good example of something like this is the original 1996 R publication in the Journal of Computational and Graphical Statistics by Robert Gentleman and Ross Ihaka. In fact, the abstract for the article says it all:

In this article we discuss our experience designing and implementing a statistical computing language. In developing this new language, we sought to combine what we felt were useful features from two existing computer languages. We feel that the new language provides advantages in the areas of portability, computational efficiency, memory management, and scoping.

With many popular software packages there are often blog posts that people write to describe how they use the software for their particular purpose. These posts will sometimes praise or criticize the software and in aggregate provide a good sense of the user experience. The kind of information I’m talking about gives us insight into the developer experience. For any reasonably well-developed piece of software, there are always some lessons learned that the author ultimately keeps to themselves. I think that’s a shame and it would be nice if we could all learn something from their experience.