A potential big problem with placebo tests in econometrics： they’re subject to the “difference between significant and non-significant is not itself statistically significant” issue

In econometrics, or applied economics, a “placebo test” is not a comparison of a drug to a sugar pill. Rather, it’s a sort of conceptual placebo, in which you repeat your analysis using a different dataset, or a different part of your dataset, where no intervention occurred.

For example, if you’re performing some analysis studying the effect of some intervention on outcomes during the following year, you run a placebo test by redoing the analysis but choosing a year in which no intervention was done. Or you redo the analysis but choosing an outcome that should be unrelated to the intervention being studied.

Ummm, what’s a precise definition? I can’t find anything on wikipedia, and I can’t find “placebo” in Mostly Harmless Econometrics . . . hmmm, let me google a bit more . . . OK, here’s something:

A placebo test involves demonstrating that your effect does not exist when it “should not” exist.

That’s about right.

The problem comes in how the idea is applied. What I often see is the following: In the mean analysis the key finding is statistically significant, hence publishable and treated as real. Then the placebo controls show nothing statistically significant, they just look like noise, and the researcher concludes that the main model is fine. But if the researcher is not careful, this process runs into the “difference between significant and non-significant is not itself statistically significant” issue.

I don’t have any easy answers here, in part because placebo tests are often imprecise, in that there are various ways in which correlations between the “placebo treatment” and the outcome, or between the treatment and the “placebo outcome” can leak into the data. Hence a positive “placebo effect” does not necessarily invalidate a causal finding, and one would typically not expect the true “placebo effect” to be zero anyway—that is, with a large enough sample size, one would expect to reject the null hypothesis on the placebo. But in any case I think there are serious problems with the standard practice in which the researcher hopes to not reject the placebo and then take this as evidence supporting the favored theory.

P.S. The above cat doesn’t care at all about the placebo effect. Not one bit.