Highlights of 2018

We end 2018 with a round-up of some of the research, talks, sci-fi, visualizations/art, and a grab bag of other stuff we found particularly interesting, enjoyable, or influential this year (and we’re going to be a bit fuzzy about the definition of “this year”)!

Research

In addition to our own research on recommendation engines, multi-task learning, and federated learning, we found three other themes particularly interesting.

At NIPS in December 2017, Ali Rahimi (and Ben Recht) delivered an address that asserted that modern deep learning is more like alchemy than science. We won’t attempt to paraphrase their short talk, but many of us found it compelling, and it’s certainly worth watching or reading. This led to much discussion in the deep learning community, and the appearance of a subdiscipline that treats deep learning as an observational science (see e.g. How AI Training Scales and How does batch normalization help optimization?).

We published our own research on interpretability in 2017. The area we focused on has developed rapidly, with tools like Anchors and SHAP (see this back issue of our newsletter for more) now the state of the art in black box interpretability. And to the extent interpretability will help deep learning become more scientific and less alchemical, we loved The Building Blocks of Interpretability. But our favorite interpretability work of 2018 questions the entire premise of this family of methods. Cynthia Rudin’s Please Stop Explaining Black Box Models for High Stakes Decisions is a bracing and highly recommended read.

The final theme we found exciting in 2018, and will be keeping an eye on in 2019, is transfer learning applied to NLP. Sebastian Ruder’s NLP’s ImageNet moment has arrived sets the scene really well for 2019, and highlights some of the projects we’re most interested in, which are well covered in Thomas Wolf’s The Current Best of Universal Word Embeddings and Sentence Embeddings. We also wrote a blog post about transfer learning, and a newsletter about its application to NLP in particular.

Talks

Ex-Clouderan Josh Wills’s ten-minute talk on Visibility and Monitoring for Machine Learning Models was our favorite talk of 2018. The highlight of the talk was the koan-like “You should deploy [a model] never or prepare to deploy it over and over and over and over and over again, repeatedly forever, ad infinitum”.

Josh Wills (Image credit: Launch Darkly and the Test In Production Meetup)

Hillel Wayne’s Beyond Unit Tests: Taking Your Testing to the Next Level was an engaging, opinionated and slightly mind-bending view of the relationship between traditional unit/integration testing and formal methods.

Highlighting a talk from 2015 feels a little like cheating, but in a year that saw the implementation of the GDPR, and our own research into federated learning, we rewatched Maciej Ceglowski’s 2015 Strata keynote Haunted by Data. His “don’t collect it, don’t store it, don’t keep it” takeaways feel like better advice than ever. Suresh Venkatasubramanian’s 2018 blog post on regulation of the tech industry vs ethical education is an interesting addendum to Ceglowski’s talk.

Art and sci-fi

In 2018 the machine learning community rediscovered the well-trodden issue of authorship in modern art thanks to Christie’s auction house and the Obvious Collective.

But Mario Klingemann’s explorations of the landscapes and fauna of BigGAN were the most successful AI-inspired art (and sci-fi!) we saw in 2018.

Everything else

Published in 1986, The Making of the Atomic Bomb by Richard Rhodes is perhaps not as cutting edge as some of the other things on this list. But we found it worthwhile for two reasons: first, it’s a compelling story about the management of research in a non-academic context, which is a topic we can’t get enough of at Cloudera Fast Forward Labs. And second, it’s a sobering look at the way researchers attempt (and in many cases fail) to grasp and control the impact of their inventions. The relevance to machine learning research is obvious.

Our favorite periodical was (and is!) Logic. If you’re a follower of Cloudera Fast Forward Labs, you’ll certainly enjoy this interview with an anonymous data scientist from their 2017 debut issue, but everything they’ve published since has been equally worthwhile, and relevant to anyone working in tech.

Finally, this was the best Halloween costume.

Onwards to 2019!