Rolf Zwaan (who we last encountered here in “From zero to Ted talk in 18 simple steps”), Alexander Etz, Richard Lucas, and M. Brent Donnellan wrote an article, “Making replication mainstream,” which begins:
Many philosophers of science and methodologists have argued that the ability to repeat studies and obtain similar results is an essential component of science. . . . To address the need for an integrative summary, we review various types of replication studies and then discuss the most commonly voiced concerns about direct replication. We provide detailed responses to these concerns and consider different statistical ways to evaluate replications. We conclude there are no theoretical or statistical obstacles to making direct replication a routine aspect of psychological science.
The article was published in Behavioral and Brain Sciences, a journal that runs articles with many discussants (see here for an example from a few years back).
I wrote a discussion, “Don’t characterize replications as successes or failures”:
No replication is truly direct, and I recommend moving away from the classification of replications as “direct” or “conceptual” to a framework in which we accept that treatment effects vary across conditions. Relatedly, we should stop labeling replications as successes or failures and instead use continuous measures to compare different studies, again using meta-analysis of raw data where possible. . . . I also agree that various concerns about the difficulty of replication should, in fact, be interpreted as arguments in favor of replication. For example, if effects can vary by context, this provides more reason why replication is necessary for scientific progress. . . . It may well make sense to assign lower value to replications than to original studies, when considered as intellectual products, as we can assume the replication requires less creative effort. When considered as scientific evidence, however, the results from a replication can well be better than those of the original study, in that the replication can have more control in its design, measurement, and analysis. . . . Beyond this, I would like to add two points from a statistician’s perspective. First, the idea of replication is central not just to scientific practice but also to formal statistics, even though this has not always been recognized. Frequentist statistics relies on the reference set of repeated experiments, and Bayesian statistics relies on the prior distribution which represents the population of effects—and in the analysis of replication studies it is important for the model to allow effects to vary across scenarios. My second point is that in the analysis of replication studies I recommend continuous analysis and multilevel modeling (meta-analysis), in contrast to the target article, which recommends binary decision rules that I think are contrary to the spirit of inquiry that motivates replication in the first place.
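To make the contrast between a binary verdict and a continuous comparison concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the analysis from the discussion: the effect estimates and standard errors are made up, the function names are hypothetical, and the DerSimonian-Laird moment estimator stands in as a simple substitute for a full multilevel model fit to raw data.

```python
import math


def compare_estimates(est_a, est_b, se_a, se_b):
    """Continuous comparison of two studies: the difference between the
    estimates and its standard error (assuming independent studies)."""
    diff = est_a - est_b
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    return diff, se_diff


def random_effects_meta(estimates, std_errors):
    """Random-effects meta-analysis using the DerSimonian-Laird estimator.

    Returns (pooled mean, standard error of the pooled mean, tau), where
    tau is the estimated standard deviation of true effects across studies.
    """
    k = len(estimates)
    # Inverse-variance weights that ignore between-study variation.
    w = [1.0 / se ** 2 for se in std_errors]
    mu_fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum(w)
    # Cochran's Q measures how much the estimates disagree.
    q = sum(wi * (yi - mu_fixed) ** 2 for wi, yi in zip(w, estimates))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    # Moment estimate of the between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c)
    # Re-weight, now allowing true effects to vary across studies.
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errors]
    mu = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum(w_star)
    se_mu = math.sqrt(1.0 / sum(w_star))
    return mu, se_mu, math.sqrt(tau2)


if __name__ == "__main__":
    # Hypothetical numbers: one original study followed by three replications.
    estimates = [0.60, 0.20, 0.15, 0.30]
    std_errors = [0.20, 0.15, 0.10, 0.12]

    diff, se_diff = compare_estimates(
        estimates[0], estimates[1], std_errors[0], std_errors[1]
    )
    print(f"original minus first replication: {diff:.2f} (se {se_diff:.2f})")

    mu, se_mu, tau = random_effects_meta(estimates, std_errors)
    print(f"pooled effect: {mu:.2f} (se {se_mu:.2f}), between-study sd: {tau:.2f}")
```

Nothing in the output says "replicated" or "failed to replicate." The summary is a difference with its uncertainty, plus a pooled estimate and an estimate of how much the effect varies across studies. With only a handful of studies the between-study standard deviation is estimated very noisily, which is one reason to prefer a fuller multilevel model where the raw data are available.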
Jennifer Tackett and Blake McShane wrote a discussion, “Conceptualizing and evaluating replication across domains of behavioral research,” which begins:
We discuss the authors’ conceptualization of replication, in particular the false dichotomy of direct versus conceptual replication intrinsic to it, and suggest a broader one that better generalizes to other domains of psychological research. We also discuss their approach to the evaluation of replication results and suggest moving beyond their dichotomous statistical paradigms and employing hierarchical / meta-analytic statistical models.
Also relevant is this talk on Bayes, statistics, and reproducibility from earlier this year.