Against Winner-Take-All Attribution

This is the anti-Wolfram.

I did not design or write the Stan language. I’m a user of Stan. Lots of people designed and wrote Stan, most notably Bob Carpenter (designed the language and implemented lots of the algorithms), Matt Hoffman (came up with the NUTS algorithm), and Daniel Lee (put together lots of the internals of the program). Also Jiqiang Guo and Ben Goodrich (RStan), Michael Betancourt (improvements in NUTS), Jonah Gabry (ShinyStan), and lots of others. As always, it’s hard to write these lists because whenever you stop, you’re excluding others.

Anyway, my primary role in Stan is “user.” The role of user is important. But I did not design the language, I did not write the language, and I did not come up with most of the algorithms that it uses. Other people did that. We have a great team (including lots of people who have no connection to Columbia University) and it’s great that we can work together in this way.

[edit from Bob Carpenter:

We have 30+ core developers now, which I figure is about 10 full-time equivalents. Also, dozens of people not on that list make contributions.

If you want to see what exactly people contributed in the way of code, see the stan-dev GitHub organization.

It’s harder to trace the ideas, but almost all of the bigger ideas and implementations were highly collaborative. It’s one of the most fun aspects of the project. I still love watching Andrew “use” Stan in our various applied projects.

Ben Goodrich did all of our earlier multivariate stuff, including the LKJ prior, the Cholesky factor codings with Jacobians etc. It was a real hurdle for me to try to understand this stuff and help get it implemented in Stan (you can see the math in the chapter of the reference manual on transformed variables). We were a very small team at the time (basically me, Matt, Daniel Lee, and Ben).

Some of the biggies not on Andrew’s list include our differential equation functionality (the first crude version of which was built by me and Daniel Lee, and since largely driven by Sebastian Weber, Michael Betancourt and more recently, Charles Margossian and Yi Zhang, along with ongoing input from Daniel Lee).

Then there’s our multi-core functionality (largely driven by Sebastian Weber with contemporary API design by Sean Talts). That just came out in Stan 2.18.

The other biggie recently is our GPU functionality (largely driven by Steve Bronder, Erik Strumbelj, and Rok Cesnovar, with a revised API design by Sean Talts [I hope you see the trend here—Sean’s also taking over the design of the Stan 3 language]). This will be in Stan 2.19, but already exists on branches and is beginning to be merged into the math library development branch on GitHub.

Another biggie, which was contentious when we introduced it, was all the diagnostics, the HMC-specific versions of which were driven by Michael Betancourt (yes, you can thank him for the divergence warnings [that wasn’t sarcastic; you really should thank him for sticking to his guns and insisting we include them even if they scared off users]). More recently, the gang (Michael, Andrew Gelman [we have two Andrews], Aki, Sean, Dan Simpson [we have two Dans]) came out with the simulation-based calibration algorithm (again, not Stan-specific, but a fantastic idea that required a lot of cleanup).

Then there’s model comparison advice, which isn’t really about Stan per se, but is largely derived from work by Aki Vehtari et al. with Jonah doing most of the coding and writing up of results.

Marcus Brubaker implemented L-BFGS and did a lot of the early efficient matrix implementations and helped with our low-level memory design at the assembly language generation level. Alp Kucukelbir designed ADVI and had a lot of help from Dustin Tran and Daniel Lee on implementation.

We also need a big shout out for Daniel Lee and Sean Talts, without whom our builds and continuous integration systems would’ve crumbled to dust ages ago. This is super thankless work that doesn’t take up a lot of their time in any given month, but which adds up to a huge effort over time.

We also have Allen Riddell (and more recently Ari Hartikainen) working on PyStan, and a host of other people working on the other (simpler) interfaces. Michael Betancourt built most of CmdStan. Ari and Allen are also working on an HTTP server version of Stan.

Rob Trangucci’s done a lot of great work on making matrix operations more efficient. Krzysztof Sakrejda’s been doing a ton of work on interface and I/O design with Allen and Sean.

And of course, we now have Aki Vehtari (the closer) guiding our work on GPs (along with Rob Trangucci, Michael Betancourt, Andrew, etc.) and various sorts of diagnostics (with Jonah Gabry and Sean Talts and Dan Simpson). Not to mention, Aki organized StanCon Helsinki, which was a blast (sorry I didn’t have time to sit down and talk with everyone; I hadn’t even met half a dozen of our developers who were there, so I was very busy catching up). Dan Simpson’s currently working on adding efficient marginalizations for GLMs a la INLA and Matthias Vákár is working on efficient GLM functions (which Rok and Erik are then going to supercharge with GPUs).

We’ve also been concentrating a lot on workflow, diagnostics, etc., with extensive effort from Andrew Gelman, Michael Betancourt, Aki Vehtari, Jonah Gabry, and Dan Simpson, along with the rest of us trying to implement the recommendations and keep up with best practices. Michael’s case studies, in particular, should be required reading. I believe Jonah and crew are in Cardiff at the moment “reading” their RSS paper on Bayesian workflow. We plan to turn that into a book-length monograph with full details and a couple of long examples.

Most recently, Ben Bales has been knocking every pitch thrown to him out of the park. He’s built the really neat adjoint-Jacobian product formulation of multivariate autodiff and figured out all the parameter pack stuff that’s going to let us massively simplify a ton of our functional interfaces. Mitzi Morris has been paying down all the technical debt I incurred writing the first version of the parser.

And let’s not forget Paul Bürkner, who built brms, which is another interface like rstanarm (which I believe was largely built by Ben Goodrich and Jonah Gabry, but even I can’t keep track of who actually builds what).

I could go on for pages, but I hope everyone gets the picture that this is a hugely collaborative effort!

end edit]