MRP (multilevel regression and poststratification; Mister P): Clearing up misunderstandings about

Someone pointed me to this thread where I noticed some issues I’d like to clear up:

David Shor: “MRP itself is like, a 2009-era methodology.”

Nope. The first paper on MRP was from 1997. And, even then, the component pieces were not new: we were just basically combining two existing ideas from survey sampling: regression estimation and small-area estimation. It would be more accurate to call MRP a methodology from the 1990s, or even the 1970s.

Will Cubbison: “that MRP isn’t a magic fix for poor sampling seems rather obvious to me?”

Yep. We need to work on both fronts: better data collection and better post-sampling adjustment. In practice, neither alone will be enough.

David Shor: 2012 seems like a perfect example of how focusing on correcting non-response bias and collecting as much data as you can is going to do better than messing around with MRP.

There’s a misconception here. “Correcting non-response bias” is not an alternative to MRP; rather, MRP is a method for correcting non-response bias. The whole point of the “multilevel” (more generally, “regularization”) in MRP is that it allows us to adjust for more factors that could drive nonresponse bias. And of course we used MRP in our paper where we showed the importance of adjusting for non-response bias in 2012.

And “collecting as much data as you can” is something you’ll want to do no matter what. Yair used MRP with tons of data to understand the 2018 election. MRP (or, more generally, RRP) is a great way to correct for non-response bias using as much data as you can.

Also, I’m not quite clear what was meant by “messing around” with MRP. MRP is a statistical method. We use it, we don’t “mess around” with it, any more than we “mess around” with any other statistical method. Any method for correcting non-response bias is going to require some “messing around.”

In short, MRP is a method for adjusting for nonresponse bias and data sparsity to get better survey estimates. There are other ways of getting to basically the same answer. It’s important to adjust for as many factors as possible and, if you’re going for small-area estimation with sparse data, that you use good group-level predictors.

MRP is a 1970s-era method that still works. That’s fine. Least squares regression is a 1790s-era method, and it still works too! In both cases, we continue to do research to improve and better understand what we’re doing.