Introduction
Two papers released on arXiv: "Operator Variational Inference" and "Model Criticism for Bayesian Causal Inference"
Two papers of mine were released today on arXiv.
Once Again: Prefer Confidence Intervals to Point Estimates
Today I saw a claim being made on Twitter that 17% of Jill Stein supporters in Louisiana are also David Duke supporters. For anyone familiar with US politics, this claim is a priori implausible, although certainly not impossible.
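To see why a confidence interval matters here, suppose, purely for illustration, that the 17% figure came from 8 Duke supporters out of a subsample of 47 Stein supporters. A quick sketch with statsmodels (the counts are hypothetical):

```python
from statsmodels.stats.proportion import proportion_confint

# Hypothetical counts: 8 of 47 Stein supporters, i.e. about 17%
count, nobs = 8, 47
low, high = proportion_confint(count, nobs, alpha=0.05, method="wilson")
print(f"point estimate: {count / nobs:.1%}")
print(f"95% Wilson interval: [{low:.1%}, {high:.1%}]")
```

Under these made-up numbers the interval runs from roughly 9% to 30%, so reporting the bare 17% badly overstates how much the data actually pin down.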
Interacting with ML Models
The main difference between data analysis today, compared with a decade or two ago, is the way that we interact with it. Previously, the role of statistics was primarily to extend our mental models by discovering new correlations and causal rules. Today, we increasingly delegate parts of our reasoning processes to algorithmic models that live outside our mental models. In my next few posts, I plan to explore some of the issues that arise from this delegation and how ideas such as model interpretability can potentially address them. Throughout this series of posts, I will argue that while current research has barely scratched the surface of understanding the interaction between algorithmic and mental models, these issues will be much more important to the future of data analysis than the technical performance of the models themselves. In this first post, I’ll use a relatively mundane case study – personalized movie recommendations – to demonstrate some of these issues, keeping in mind that the same issues impact models in more serious contexts like healthcare and finance.
Random forest interpretation – conditional feature contributions
In two of my previous blog posts, I explained how the black box of a random forest can be opened up by tracking decision paths along the trees and computing feature contributions. This way, any prediction can be decomposed into contributions from features, such that prediction = bias + feature_1 contribution + ... + feature_n contribution.
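For instance, the treeinterpreter package turns this decomposition into a single call; a minimal sketch, with the dataset and hyperparameters chosen only for illustration:

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from treeinterpreter import treeinterpreter as ti

X, y = load_diabetes(return_X_y=True)
rf = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# Decompose each prediction into bias + per-feature contributions
prediction, bias, contributions = ti.predict(rf, X[:3])
for pred, b, contrib in zip(prediction, bias, contributions):
    # The decomposition is exact: prediction = bias + sum(contributions)
    print(float(np.ravel(pred)[0]), "=",
          float(np.ravel(b)[0]), "+", float(np.sum(contrib)))
```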
AI ‘judge’ doesn’t explain why it reaches certain decisions
The Guardian reports on a recent paper by University College London researchers who are using artificial intelligence to predict the outcomes of trials at the European Court of Human Rights.
DynamoDB Learnings
At Hinge, we have been using DynamoDB in production for more than eight months, and we relaunched at full capacity two weeks ago. Since I own rating processing in the application, I want to share a couple of lessons learned and explain why it made sense for us to store ratings in DynamoDB. We process millions of ratings per day, and so far DynamoDB is holding up well. Ratings are also crucial for making our recommender smarter, so they need careful handling.
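For concreteness, here is roughly what writing a rating can look like with boto3; the table name and attribute schema below are hypothetical, not Hinge's actual design:

```python
import time
import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("ratings")  # hypothetical table name

# Hypothetical schema: partition key = rater_id, sort key = rated_id,
# so all of one user's ratings live under a single partition key.
table.put_item(
    Item={
        "rater_id": "user-123",
        "rated_id": "user-456",
        "rating": 1,                    # e.g. 1 = like, 0 = pass
        "created_at": int(time.time()),
    }
)
```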
Clustering Zeppelin on Zeppelin
Intro to Implicit Matrix Factorization: Classic ALS with Sketchfab Models
Last post I described how I collected implicit feedback data from the website Sketchfab. I then claimed I would write about how to actually build a recommendation system with this data. Well, here we are! Let’s build.
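If you'd rather not hand-roll the solver, the implicit library ships a classic ALS implementation. A small sketch on toy data (note that the expected matrix orientation has changed across implicit versions; 0.5+ expects rows = users):

```python
import scipy.sparse as sparse
import implicit

# Toy implicit-feedback matrix: rows = users, cols = items,
# entries = interaction strength (e.g. Sketchfab "likes")
user_items = sparse.random(100, 50, density=0.05, format="csr", random_state=0)

model = implicit.als.AlternatingLeastSquares(
    factors=32, regularization=0.01, iterations=15
)
model.fit(user_items)

# Predicted preference of user u for item i = dot product of latent factors
u, i = 0, 7
print(model.user_factors[u] @ model.item_factors[i])
```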
Recurrent Neural Network Gradients, and Lessons Learned Therein
I’ve spent the last week hand-rolling recurrent neural networks. I’m currently taking Udacity’s Deep Learning course, and when I arrived at the section on RNNs and LSTMs, I decided to build a few for myself.
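One lesson those hand-rolled gradients make vivid: backpropagation through time multiplies the gradient by the recurrent weight matrix at every step, so it shrinks or blows up geometrically with sequence length. A minimal numpy sketch (ignoring the tanh derivative, which can only shrink the gradient further):

```python
import numpy as np

rng = np.random.default_rng(0)
W = 0.01 * rng.normal(size=(50, 50))  # recurrent weights, small init
grad = np.ones(50)                    # gradient arriving at the final step

# Backprop through time: each earlier step multiplies by W.T
for t in range(81):
    if t % 20 == 0:
        print(f"step {t:2d}: |grad| = {np.linalg.norm(grad):.3e}")
    grad = W.T @ grad
```

With this small initialization the norm collapses toward zero within a few dozen steps; scale W up and the same loop explodes instead. That instability is exactly what gradient clipping and LSTM-style gating are designed to manage.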