Book Review: Computer Age Statistical Inference

A new book, Computer Age Statistical Inference: Algorithms, Evidence, and Data Science by Bradley Efron and Trevor Hastie, was released in July this year. I finished reading it a few weeks ago and this is a short review from the point of view of a machine learning researcher.

Living in Cambridge I indulge myself every once in a while by taking a break at the Cambridge University Press bookstore at the market square, located just opposite of King’s College it is the oldest book shop in England. Besides having an excellent collection of mathematics and computer science books, at the entrance of the shop they showcase new releases from Cambridge University Press. Most of these new books fall outside my interest, but what a pleasure it was to discover a new bold book on the broad theme of statistics in the modern age, written by two experts in the field! I took a look at the table of contents and a minute later purchased the book.

The book examines statistics broadly through three lenses.

First, it tells the history of the field of statistics, often with interesting remarks about the prevalent views at the time a method was invented. Second, correlated with the chronological order, the authors classify methods by their use of computation. Classic methods use few to none computation but often leverage asymptotic arguments. Newer methods are increasingly realistic in their assumptions but rely on heavy use of machine computation. Third, the flavour of the presented methods is interpreted as Fisherian, frequentist, or Bayesian.

The terminology in the book is easily accessible to a person with basic statistics training, perhaps with the exception of the word “Inference” in the title. In the book the authors use “inference” to describe the means by which legitimacy of statistical results can be established. This sense is different from the common use of the word in the machine learning community, where it would usually refer in a broad sense to “perform computation of consequences given a model and observations”.

From a machine learner’s perspective the most interesting parts of the book are the wide applicability of the empirical Bayes methodology, which is demonstrated in a number of generally relevant applications including large-scale testing and deconvolution.

Another benefit for someone with a machine learning background is the modern view on classic methods such as resampling methods (bootstrap and jackknife), a readable motivation for topics and applications which are popular in statistics but not popular in machine learning (survival analysis, large-scale testing, confidence intervals, etc.), and the historical remarks and subjective commentary on developments in the field.

The subjective commentary in the Epilogue makes predictions about the field of statistics and data science as a whole, with the main trends being a branching out into applications and an increased reliance on computation.

The book is a wonderful book and many readers will enjoy reading it, as I did. There are only two minor points where I feel the book could be improved.

First, while the authors readily acknowledge that many topics could have been added to the book, I feel that certain topics should have been included due to their broad applicability and heavy use of computation in many successful models: variational Bayesian inference, approximate Bayesian computation (ABC), kernel methods more generally, and Bayesian nonparametrics. Perhaps variational inference and kernel methods have not reached the core statistics community yet, but ABC and Bayesian nonparametrics originate with them and are only possible because of the massive computation available today.

Second, in the description of solutions to statistical problems throughout the book there is a strong emphasis on empirical Bayes and the bootstrap.

If you enjoy statistics, computation, or machine learning, get the book! The breadth of topics and the independence between the chapters will make it easy for you to find something interesting.

Acknowledgements. Thanks to Diana Gillooly for corrections.