Some clues that this study has big big problems

Paul Alper writes:

This article from the New York Daily News, reproduced in the Minneapolis Star Tribune, is so terrible in so many ways. Very sad commentary regarding all aspects of statistics education and journalism.

The news article, by Joe Dziemianowicz, is called “Study says drinking alcohol is key to living past 90,” with subheading, “When it comes to making it into your 90s, booze actually beats exercise, according to a long-term study,” and it continues:

The research, led by University of California neurologist Claudia Kawas, tracked 1,700 nonagenarians enrolled in the 90+ Study that began in 2003 to explore impacts of daily habits on longevity. Researchers discovered that subjects who drank about two glasses of beer or wine a day were 18 percent less likely to experience a premature death, the Independent reports. Meanwhile, participants who exercised 15 to 45 minutes a day cut the same risk by 11 percent. . . . Other factors were found to boost longevity, including weight. Participants who were slightly overweight — but not obese — cut their odds of an early death by 3 percent. . . . Subjects who kept busy with a daily hobby two hours a day were 21 percent less likely to die early, while those who drank two cups of coffee a day cut that risk by 10 percent.

At first, this seems like reasonable science reporting. But right away there are a couple flags that raise suspicion, such as the oddly specific “15 to 45 minutes a day”—what about people who exercise more or less than that?—and the bit about “overweight — but not obese.” It’s harder than you might think to estimate nonlinear effects. In this case the implication is not just nonlinearity but nonmonotonicity, and I’m starting to worry that the researchers are fishing through the data looking for patterns. Data exploration is great, but you should realize that you’ll be dredging up a lot of noise along with your signal. As we’ve said before, correlation (in your data) does not even imply correlation (in the underlying population, or in future data).
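To make the point about dredging concrete, here is a minimal simulation sketch. Everything in it is hypothetical: the 200 people, the 50 candidate "habits," and the outcome are all independent random noise, and none of it uses data from the 90+ Study. The function names `pearson` and `dredge` are just illustrative. It picks the habit with the strongest in-sample correlation and then checks what happens in fresh data.

```python
import random

def pearson(x, y):
    """Sample Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def dredge(n_people=200, n_habits=50, seed=1):
    """Pick the habit most correlated with a pure-noise outcome, then retest."""
    rng = random.Random(seed)

    def draw():
        return [rng.gauss(0, 1) for _ in range(n_people)]

    outcome = draw()
    habits = [draw() for _ in range(n_habits)]  # all unrelated to the outcome
    cors = [pearson(h, outcome) for h in habits]
    best = max(range(n_habits), key=lambda j: abs(cors[j]))
    # "Replication": new people, same setup. Habit and outcome are still
    # independent by construction, so the true correlation is zero.
    replication = pearson(draw(), draw())
    return cors[best], replication

best_cor, rep_cor = dredge()
print(f"best of 50 in-sample: {best_cor:+.2f}; in new data: {rep_cor:+.2f}")
```

With 200 people, a single noise correlation has a standard deviation of roughly 0.07, so the largest of 50 will typically look much more impressive than that, even though the true value is zero for every one of them.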

The claims produced by the 90+ Study can also be criticized on more specific grounds. Alper points to this news article by Michael Joyce, who writes:

their survey [found] that drinking the equivalent of two beers or two glasses of wine per day was associated with 18% fewer deaths, it also found that daily exercise of around 15 to 45 minutes was only associated with 11% fewer premature deaths. TechTimes opted to blend these two findings into a single whopper of a headline:

Drinking Alcohol Helps Better Than Exercise If You Want To Live Past 90 Years Old

Not only is this language unjustified in referring to a study that can only show association, not causation, but the survey did not directly compare alcohol and exercise. So the headline is very misleading. . . .

Other reported findings of the study included:

– being slightly overweight (not obese) was associated with 3% fewer early deaths
– being involved in a daily hobby two hours a day was associated with a 21% lower rate of premature deaths
– drinking two cups of coffee a day was associated with a 10% lower rate of early death

But these are observations and nothing more. Furthermore, they are based on self-reporting by the study subjects. That’s a notoriously unreliable way to get accurate information regarding people’s daily habits or behaviors.

Just after we published this piece we heard back from Dr. Michael Bierer, MD, MPH — one of our regular contributors — who we had reached out to for comment . . .:

Observational studies that demonstrate benefits to people engaged in a certain activity — in this case drinking — are difficult to do well. That’s because the behavior in question may co-vary with other features that predict health outcomes. For example, those who abstain from alcohol completely may do so for a variety of reasons. In older adults, perhaps that reason is taking a medication that makes alcohol dangerous; such as anticoagulants, psychotropics, or aspirin. So not drinking might be a marker for other health conditions that themselves are associated — weakly or not-so-weakly — with negative outcomes. Or, abstaining may signal a history of problematic drinking and the advice to cut back. Likewise, there are many health conditions (like liver disease) that are reasons to abstain. Conversely, moderate drinking might be a marker for more robust health. There is an established link between physical activity and drinking alcohol. People who take some alcohol may simply have more social contacts than those who abstain, and pro-social behaviors are linked to health.
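Bierer's argument can be illustrated with a toy simulation. The numbers are invented for illustration and are not estimates from the 90+ Study: I assume 30% of people are "frail," that frail people are less likely to drink (20% vs. 60%) and more likely to die during follow-up (50% vs. 20%), and that drinking itself has no effect on death at all.

```python
import random

def simulate(n=100_000, seed=0):
    """Return counts[(frail, drinks)] = [people, deaths] for a made-up cohort."""
    rng = random.Random(seed)
    counts = {(f, d): [0, 0] for f in (False, True) for d in (False, True)}
    for _ in range(n):
        frail = rng.random() < 0.3                       # assumed share of frail people
        drinks = rng.random() < (0.2 if frail else 0.6)  # frailty discourages drinking
        died = rng.random() < (0.5 if frail else 0.2)    # drinking plays no role here
        counts[(frail, drinks)][0] += 1
        counts[(frail, drinks)][1] += died
    return counts

def death_rate(counts, drinks, frail=None):
    """Death rate among drinkers or abstainers, optionally within one health group."""
    cells = [v for (f, d), v in counts.items()
             if d == drinks and (frail is None or f == frail)]
    return sum(c[1] for c in cells) / sum(c[0] for c in cells)

counts = simulate()
print(f"all:    drinkers {death_rate(counts, True):.3f}, "
      f"abstainers {death_rate(counts, False):.3f}")
for frail in (False, True):
    print(f"frail={frail}: drinkers {death_rate(counts, True, frail):.3f}, "
          f"abstainers {death_rate(counts, False, frail):.3f}")
```

In the pooled comparison the drinkers die at a noticeably lower rate, purely because abstainers are disproportionately frail. Within each health group the two rates are about the same. A real study can adjust only for the health variables it measures, which is why this kind of result is hard to interpret.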

P.S. I’d originally titled this post, “In Watergate, the saying was, ‘It’s not the crime, it’s the coverup.’ In science reporting, it’s not the results, it’s the hype.” But I changed the title to avoid the association with criminality. One thing I’ve said a lot is that, in science, honesty and transparency are not enough: You can be a scrupulous researcher, but if your noise overwhelms your signal, and you’re using statistical methods (such as selection on statistical significance) that emphasize and amplify noise, then you can end up with junk science. Which, when put through the hype machine, becomes hyped junk science. Gladwell bait. Freakonomics bait. NPR bait. PNAS bait.
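Here is a small sketch of how selection on statistical significance amplifies noise. The setup is an assumption chosen for illustration: a true effect of 0.1, a standard error of 0.5, and many hypothetical replications of the same honest study, of which we keep only the estimates that clear the usual 1.96-standard-error threshold.

```python
import random

def exaggeration(true_effect=0.1, se=0.5, n_studies=100_000, seed=0):
    """Simulate many noisy studies and summarize the 'significant' ones."""
    rng = random.Random(seed)
    estimates = [rng.gauss(true_effect, se) for _ in range(n_studies)]
    # Keep only estimates more than 1.96 standard errors from zero.
    significant = [e for e in estimates if abs(e) > 1.96 * se]
    share = len(significant) / n_studies
    mean_abs = sum(abs(e) for e in significant) / len(significant)
    wrong_sign = sum(e < 0 for e in significant) / len(significant)
    return share, mean_abs / true_effect, wrong_sign

share, ratio, wrong_sign = exaggeration()
print(f"share significant: {share:.3f}")
print(f"average |estimate| / true effect: {ratio:.1f}")
print(f"share of significant estimates with the wrong sign: {wrong_sign:.2f}")
```

Any estimate that passes the filter must be at least 1.96 × 0.5 = 0.98 in absolute value, so every "significant" result overstates the assumed true effect of 0.1 by nearly a factor of ten or more, and a fair share of them point in the wrong direction. Nobody in this simulation is cheating. The filter does the damage.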

So, again:

(1) If someone points out problems with your data and statistical procedures, don’t assume they’re saying you’re dishonest.

(2) If you are personally honest, just trying to get at the scientific truth, accept that concerns about “questionable research practices” might apply to you too.