Interacting with ML Models

The main difference between data analysis today and data analysis a decade or two ago is the way that we interact with it. Previously, the role of statistics was primarily to extend our mental models by discovering new correlations and causal rules. Today, we increasingly delegate parts of our reasoning processes to algorithmic models that live outside our mental models. In my next few posts, I plan to explore some of the issues that arise from this delegation and how ideas such as model interpretability can potentially address them. Throughout this series of posts, I will argue that while current research has barely scratched the surface of understanding the interaction between algorithmic and mental models, these issues will be much more important to the future of data analysis than the technical performance of the models themselves. In this first post, I’ll use a relatively mundane case study – personalized movie recommendations – to demonstrate some of these issues, keeping in mind that the same issues affect models in more serious contexts like healthcare and finance.

The most common way you delegate a part of your thought process to an algorithmic model these days is probably through personalized recommendation systems. For example, the ratings you see on a movie streaming site are often calculated from what you’ve watched or liked in the past. Such a system might look at things like the genre of each movie, its actors and director, or how much “viewers similar to you” liked it.
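
To make this a little more concrete, here is a rough sketch in Python of how such a rating might be blended together. The field names, weights, and scoring formula are entirely made up; real recommendation systems are far more sophisticated, but the basic ingredients look something like this:

```python
# A hypothetical sketch, not how any real streaming site computes its ratings.

def predict_rating(movie, user_profile, similar_viewer_rating):
    """Blend content features with a collaborative signal into one 1-5 star score."""
    # Overlap between the movie's genres and the genres this user tends to watch.
    genre_affinity = sum(
        user_profile["genre_weights"].get(genre, 0.0) for genre in movie["genres"]
    )
    # Small bonuses for a favorite lead actor or director.
    actor_bonus = 0.5 if movie["lead_actor"] in user_profile["liked_actors"] else 0.0
    director_bonus = 0.3 if movie["director"] in user_profile["liked_directors"] else 0.0
    # Average rating from "viewers similar to you", already on a 1-5 scale.
    collaborative_part = 0.6 * similar_viewer_rating

    score = collaborative_part + 2.0 * genre_affinity + actor_bonus + director_bonus
    return max(1.0, min(5.0, score))


movie = {"genres": ["musical", "comedy"], "lead_actor": "A. Actor", "director": "B. Director"}
profile = {
    "genre_weights": {"comedy": 0.4, "musical": 0.2},
    "liked_actors": {"A. Actor"},
    "liked_directors": set(),
}
print(round(predict_rating(movie, profile, similar_viewer_rating=4.0), 1))  # 4.1
```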

But when you actually select a movie, you consider a number of factors that aren’t part of the model, such as the kind of mood you’re in, how much time you have, or who you’re watching it with. The streaming site could try to account for some of these by adding more factors to the model, but it can never capture all of them. No matter how complex the model becomes, there will always be factors that could not have been anticipated when it was trained. If you completely delegate the decision to the model, picking the highest-rated movie without accounting for these external factors, you’ll probably be in for a rude surprise.

Your mental models, on the other hand, can adapt to account for new and unexpected factors when you make the decision. So when you select a video, you have to combine the algorithmic recommendation with your own mental model of what type of movie you would like in the current context. Your mental model will include complex relationships between some of the factors used in the algorithmic model and the contextual factors that aren’t included. The better you can understand how the algorithmic model used the different factors to arrive at its prediction, the better equipped you will be to adjust the algorithmic recommendation based on contextual factors.

Imagine you see a musical comedy that is rated 4.2 out of 5 stars. From the number alone, you don’t know whether that score reflects its music or its humor. If you’re in the mood for a comedy, you don’t want to pick a movie that has great music but isn’t very funny. So with just the number, you’ll probably have to come up with your own estimate of how much you’ll like the movie, ignoring the algorithmic rating entirely. You effectively have to choose between using the algorithmic model without context and using your mental model without the help of the algorithm.
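
To see why the single number is so limiting, suppose (purely hypothetically) that the site averaged a few per-aspect scores into the star rating. Two very different movies can then collapse to exactly the same number:

```python
# Hypothetical aspect profiles that collapse to the same overall rating.
def overall(aspects):
    # Average the per-aspect scores into a single star rating, rounded to a tenth.
    return round(sum(aspects.values()) / len(aspects), 1)

movie_a = {"music": 4.8, "humor": 3.6}   # great songs, only mildly funny
movie_b = {"music": 3.6, "humor": 4.8}   # forgettable songs, very funny

print(overall(movie_a), overall(movie_b))  # 4.2 4.2 -- same number, very different movies
```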

For the rating to be useful, it needs to come with additional hints about how it was calculated. For example, the result might point to a similar movie that you previously watched or rated highly. Or it might point to the factor that contributed most to the rating, such as the genre or the lead actor. While neither of these completely explains how the rating was calculated, they give you some insight, which you can use to mentally adjust the rating based on additional context. You still don’t want to delegate the entire decision to the model, but you can delegate a part of the thought process. A model that can produce such insights is often called interpretable, though this term is used with a wide range of meanings in the literature.
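
To make these two kinds of hints concrete, here is a deliberately naive sketch that reuses the made-up additive score from the earlier snippet: the “most important factor” is just the largest term in the sum, and the “similar movie” is the previously watched title with the most genres in common. Real systems use far more elaborate explanation techniques, but the spirit is the same – the hints are read off of how the prediction was put together:

```python
# A hypothetical sketch of two simple hints, built on the made-up score from above.

def explain_rating(movie, user_profile, similar_viewer_rating, watch_history):
    """Return the factor that contributed most, plus the most similar past movie."""
    # Feature attribution: in an additive score, each term is its own contribution,
    # so the "most important factor" is simply the largest term.
    contributions = {
        "viewers similar to you": 0.6 * similar_viewer_rating,
        "genre match": 2.0 * sum(
            user_profile["genre_weights"].get(g, 0.0) for g in movie["genres"]
        ),
        "lead actor": 0.5 if movie["lead_actor"] in user_profile["liked_actors"] else 0.0,
        "director": 0.3 if movie["director"] in user_profile["liked_directors"] else 0.0,
    }
    top_factor = max(contributions, key=contributions.get)

    # Example-based hint: the previously watched title sharing the most genres.
    most_similar = max(
        watch_history,
        key=lambda past: len(set(past["genres"]) & set(movie["genres"])),
        default=None,
    )
    return top_factor, most_similar


movie = {"genres": ["musical", "comedy"], "lead_actor": "A. Actor", "director": "B. Director"}
profile = {
    "genre_weights": {"comedy": 0.4, "musical": 0.2},
    "liked_actors": {"A. Actor"},
    "liked_directors": set(),
}
history = [{"title": "A Musical You Loved", "genres": ["musical", "drama"]}]

factor, similar = explain_rating(movie, profile, similar_viewer_rating=4.0, watch_history=history)
print("Biggest factor:", factor)                 # Biggest factor: viewers similar to you
print("Because you watched:", similar["title"])  # Because you watched: A Musical You Loved
```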

In this example, even without hints for interpreting the predictions, you can probably still gain some information from the algorithmic model, because you have a very good mental model of the types of movies you like. But if you’re trying to understand a more complex and less intuitive system or situation – financial markets, human health, politics – you will have a less reliable mental model and will need to rely much more heavily on whatever information you can get from the algorithmic model. If we want algorithmic models to succeed in these kinds of contexts, we need to present their predictions in ways that allow users to seamlessly and accurately interpret them, so they can delegate more of the decision-making process while minimizing the risk of a nasty surprise.

In some sense, an interpretable model pokes holes in the barrier between the algorithmic model and your mental model. The ideal, of course, would be to break down the barrier entirely, so that you can fully incorporate the information from the algorithmic model into your mental model’s assessment. That’s probably impossible, but I’m convinced that we can poke significantly larger holes than have been made so far.

In my next few posts, I will discuss a number of different ways that researchers have tried to understand what interpretability means and to develop interpretable models. This is a subtle problem at the boundary between psychology and technology, with many directions that are waiting to be explored. I’m very excited to see how this field develops over the next few years.
