From 34f99997e7182af828f8f938ef13622ceebc51bf Mon Sep 17 00:00:00 2001
From: Lukas Wallrich
Date: Mon, 29 Jun 2026 00:20:56 +0200
Subject: [PATCH 1/2] Broaden and caveat the replication-rate evidence (Ch1, #14)

The replication-rate discussion previously rested on a single, narrow education paper (Perry et al. 2022). Added a short, caveated passage on replication success rates drawing on multiple large-scale projects across fields: Open Science Collaboration 2015 (psychology), Klein et al. 2014 (Many Labs), Camerer et al. 2016 (experimental economics), Camerer et al. 2018 (Social Sciences Replication Project), and Errington et al. 2021b (cancer biology). The caveat notes that reported success rates vary by field, study selection, definition of replication, and success criterion (significance vs effect size vs subjective judgement).

Also broadened the prevalence claim beyond education by adding Makel et al. 2012 (psychology) alongside Perry et al. 2022.

New bib keys: CamererEtAl2016 (doi 10.1126/science.aaf0918) and CamererEtAl2018 (doi 10.1038/s41562-018-0399-z), both Crossref-verified. Reused existing keys: OpenScienceCollab2015, KleinEtAl2014, ErringtonEtAl2021b, MakelEtAl2012, PerryEtAl2022.
---
 background.qmd |  4 +++-
 references.bib | 21 +++++++++++++++++++++
 2 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/background.qmd b/background.qmd
index 604c9bb..80ea153 100644
--- a/background.qmd
+++ b/background.qmd
@@ -12,6 +12,8 @@ other science, which does not conform to this requirement*.” –

 Repeatability is the cornerstone of many sciences: Although scientific claims are often assumed to be robust, without explicit reproduction and replication — that is, retesting a hypothesis with the same (reproduction) or different (replication) data — it remains unclear whether this assumption holds.

+Large-scale replication projects have begun to examine how often published findings hold up on retesting, with sobering but also highly variable results. In psychology, the Reproducibility Project successfully replicated roughly a third of the studies it examined when success was judged by statistical significance [@OpenScienceCollab2015], while coordinated multi-laboratory efforts have shown that replicability varies markedly from one effect to another [@KleinEtAl2014]. Comparable projects in experimental economics [@CamererEtAl2016], across the social sciences more broadly [@CamererEtAl2018], and in preclinical cancer biology [@ErringtonEtAl2021b] have each confirmed some original findings while failing to reproduce others. These headline figures should be read with caution, however, because reported success rates vary widely across fields and depend heavily on how studies are selected, how a replication is defined, and which criterion of success is applied, whether statistical significance, the size of the original effect, or a subjective judgement by the replicating team. Any single percentage therefore reflects a particular set of methodological choices as much as the underlying reliability of a field.
+
 Cumulative science without repetition is costly: building on unreliable findings leads to research waste and misdirected effort. The aim of this guide is to empower researchers to conduct high-quality reproductions and replications and thereby contribute to making their fields of research more cumulative and robust. Issues of replicability have been discussed across many disciplines, such as psychology [@OpenScienceCollab2015], economics [@DreberJohannesson2024], biology [@ErringtonEtAl2021b], marketing [@UrminskyDietvorst2024], linguistics [@McManus2024], computer science [@HummelManner2024] and epidemiology [@LashEtAl2018] and the number of replications has been rising sharply (see @fig-replication-growth, which shows replications only as there is no comprehensive database for reproductions yet).

 ```{r}
@@ -26,7 +28,7 @@ counts <- read.csv("data/flora_replication_counts.csv")
 plot_replication_growth(counts)
 ```

-While the number of replication and reproduction studies has increased, the overall proportion of them is still very small, with reviews finding that replications make up well below 1% of published papers [@PerryEtAl2022]. Moreover, much of the guidance on replications is still being developed [@ClarkeEtAl2026] and in narrow parts of science, which leads to fragmentation, siloing, and potentially inconsistent information.
+While the number of replication and reproduction studies has increased, the overall proportion of them is still very small, with reviews finding that replications make up only a small fraction, on the order of one percent or less, of published papers [@MakelEtAl2012; @PerryEtAl2022]. Moreover, much of the guidance on replications is still being developed [@ClarkeEtAl2026] and in narrow parts of science, which leads to fragmentation, siloing, and potentially inconsistent information.

 Here we attempt to integrate useful guidelines [e.g., @BlockKuckertz2018; @JekelEtAl2020] into a comprehensive overview that
diff --git a/references.bib b/references.bib
index 22d468a..f916340 100644
--- a/references.bib
+++ b/references.bib
@@ -288,6 +288,27 @@ @article{CalderEtAl1981
   doi = {10.1086/208856}
 }

+@article{CamererEtAl2016,
+  author = {Camerer, C. F. and Dreber, A. and Forsell, E. and Ho, T.-H. and Huber, J. and Johannesson, M. and Kirchler, M. and others},
+  title = {Evaluating replicability of laboratory experiments in economics},
+  journal = {Science},
+  volume = {351},
+  number = {6280},
+  pages = {1433-1436},
+  year = {2016},
+  doi = {10.1126/science.aaf0918}
+}
+
+@article{CamererEtAl2018,
+  author = {Camerer, C. F. and Dreber, A. and Holzmeister, F. and Ho, T.-H. and Huber, J. and Johannesson, M. and Kirchler, M. and others},
+  title = {Evaluating the replicability of social science experiments in Nature and Science between 2010 and 2015},
+  journal = {Nature Human Behaviour},
+  volume = {2},
+  pages = {637-644},
+  year = {2018},
+  doi = {10.1038/s41562-018-0399-z}
+}
+
 @article{CarterEtAl2019,
   author = {Carter, E. C. and Schönbrodt, F. D. and Gervais, W. M. and Hilgard, J.},
   title = {Correcting for bias in psychology: A comparison of meta-analytic methods},

From 4de17b7b3f7ca20f85bd6668dcf791d3bdf2f492 Mon Sep 17 00:00:00 2001
From: Lukas Wallrich
Date: Mon, 29 Jun 2026 17:43:54 +0200
Subject: [PATCH 2/2] Address codex review: scope replication-rate claims (#14)

---
 background.qmd | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/background.qmd b/background.qmd
index 80ea153..473ec44 100644
--- a/background.qmd
+++ b/background.qmd
@@ -12,7 +12,7 @@ other science, which does not conform to this requirement*.” –

 Repeatability is the cornerstone of many sciences: Although scientific claims are often assumed to be robust, without explicit reproduction and replication — that is, retesting a hypothesis with the same (reproduction) or different (replication) data — it remains unclear whether this assumption holds.

-Large-scale replication projects have begun to examine how often published findings hold up on retesting, with sobering but also highly variable results. In psychology, the Reproducibility Project successfully replicated roughly a third of the studies it examined when success was judged by statistical significance [@OpenScienceCollab2015], while coordinated multi-laboratory efforts have shown that replicability varies markedly from one effect to another [@KleinEtAl2014]. Comparable projects in experimental economics [@CamererEtAl2016], across the social sciences more broadly [@CamererEtAl2018], and in preclinical cancer biology [@ErringtonEtAl2021b] have each confirmed some original findings while failing to reproduce others. These headline figures should be read with caution, however, because reported success rates vary widely across fields and depend heavily on how studies are selected, how a replication is defined, and which criterion of success is applied, whether statistical significance, the size of the original effect, or a subjective judgement by the replicating team. Any single percentage therefore reflects a particular set of methodological choices as much as the underlying reliability of a field.
+Large-scale replication projects have begun to examine how often published findings hold up on retesting, with sobering but also highly variable results. In a sample of psychology studies from selected journals, the Reproducibility Project successfully replicated roughly a third of the studies it examined when success was judged by statistical significance [@OpenScienceCollab2015], while coordinated multi-laboratory efforts have shown that replicability varies markedly from one effect to another [@KleinEtAl2014]. Comparable projects in experimental economics [@CamererEtAl2016], among social science experiments published in *Nature* and *Science* [@CamererEtAl2018], and in preclinical cancer biology [@ErringtonEtAl2021b] have each produced mixed evidence, supporting some original findings while leaving others unsupported or inconclusive. These headline figures should be read with caution, however, because reported success rates vary widely across fields and depend heavily on how studies are selected, how a replication is defined, and which criterion of success is applied, whether statistical significance, the size of the original effect, or a subjective judgement by the replicating team. Any single percentage therefore reflects a particular set of methodological choices as much as the underlying reliability of a field.

 Cumulative science without repetition is costly: building on unreliable findings leads to research waste and misdirected effort. The aim of this guide is to empower researchers to conduct high-quality reproductions and replications and thereby contribute to making their fields of research more cumulative and robust. Issues of replicability have been discussed across many disciplines, such as psychology [@OpenScienceCollab2015], economics [@DreberJohannesson2024], biology [@ErringtonEtAl2021b], marketing [@UrminskyDietvorst2024], linguistics [@McManus2024], computer science [@HummelManner2024] and epidemiology [@LashEtAl2018] and the number of replications has been rising sharply (see @fig-replication-growth, which shows replications only as there is no comprehensive database for reproductions yet).