Integrating collection, analysis, and interpretation of data in social and behavioral research
Andrew Gelman, Department of Statistics and Department of Political Science, Columbia University
The replication crisis has made us increasingly aware of the flaws of conventional statistical reasoning based on hypothesis testing. The problem is not just a technical issue with p-values, nor can it be solved using preregistration or other purely procedural approaches. Rather, appropriate solutions have three aspects. First, in collecting your data there should be a concordance between theory and measurement: for example, in studying the effect of an intervention applied to individuals, you should measure within-person comparisons. Second, in analyzing your data, you should study all comparisons of potential interest, rather than selecting based on statistical significance or other inherently noisy measures. Third, you should interpret your results in the context of theory, background knowledge, and the data collection and analysis you have performed. We discuss these issues on a theoretical level and with examples in psychology, political science, and policy analysis.
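The second point, studying all comparisons rather than selecting on significance, can be illustrated with a minimal simulation. This sketch is not from the talk; the numbers (a small true effect of 0.1 with standard error 1) are hypothetical, and the partial-pooling step uses a simple normal shrinkage formula as a stand-in for a full multilevel model. It shows how the significance filter exaggerates effect sizes, while shrinking all the noisy estimates toward the grand mean does not.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 1000 comparisons, each with the same small true
# effect, measured with noise that is large relative to that effect.
n = 1000
true_effect = 0.1
se = 1.0
estimates = rng.normal(true_effect, se, size=n)

# The "significance filter": keep only estimates with |z| > 1.96.
significant = estimates[np.abs(estimates / se) > 1.96]

# Selected estimates are guaranteed to be at least 1.96 in magnitude,
# so they overstate the true effect of 0.1 by a large factor
# (a "type M" or magnitude error).
exaggeration = np.abs(significant).mean() / true_effect

# In contrast, analyze ALL comparisons with simple partial pooling:
# shrink each noisy estimate toward the grand mean, with a shrinkage
# factor set by the (assumed known) scale of the true effects.
grand_mean = estimates.mean()
prior_var = true_effect ** 2
shrinkage = prior_var / (prior_var + se ** 2)
pooled = grand_mean + shrinkage * (estimates - grand_mean)

print(f"mean |significant estimate|: {np.abs(significant).mean():.2f}")
print(f"exaggeration factor: {exaggeration:.1f}")
print(f"mean pooled estimate: {pooled.mean():.3f}")
```

The pooled estimates average close to the true effect of 0.1, whereas the significance-filtered estimates average more than twenty times too large: selecting on significance is itself a noisy measure, and conditioning on it distorts everything downstream.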
Here are some relevant references:
Some natural solutions to the p-value communication problem—and why they won’t work.
Honesty and transparency are not enough.
The connection between varying treatment effects and the crisis of unreplicable research: A Bayesian perspective.
And this:
No guru, no method, no teacher, Just you and I and nature . . . in the garden. Of forking paths.
The talk will be Tuesday, December 4, 2018, 12:00pm, in A32 Peretsman Scully Hall.