We would like our results to be as fully reproducible as possible:
A. Reproducibility is one of the pillars of science
B. Reproducibility may greatly benefit you
A result is reproducible when the same analysis steps performed on the same dataset consistently produce the same answer.
Research results are replicable if there is sufficient information available for independent researchers to make the same findings using the same procedures.
In computational sciences such as statistics, simply having the data and code means that the results are not only replicable but fully reproducible.
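The definition of reproducibility above can be checked mechanically: run the same analysis twice on the same data and compare the answers. Below is a minimal sketch in Python, assuming a hypothetical five-value dataset and a hypothetical `bootstrap_mean` analysis step; neither comes from the course materials. The point is that any randomness must be tied to a fixed seed (in R, the analogous call is `set.seed()`).

```python
import random
import statistics


def bootstrap_mean(data, n_resamples=200, seed=2024):
    """Bootstrap estimate of the mean; the seed fixes the resampling."""
    rng = random.Random(seed)  # local generator, so no hidden global state
    means = []
    for _ in range(n_resamples):
        resample = rng.choices(data, k=len(data))  # sample with replacement
        means.append(statistics.fmean(resample))
    return statistics.fmean(means)


data = [2.1, 3.4, 1.9, 5.0, 4.2]  # hypothetical dataset, for illustration only

first_run = bootstrap_mean(data)
second_run = bootstrap_mean(data)

# Same steps + same data + same seed -> identical answer.
is_reproducible = first_run == second_run
print(is_reproducible)
```

Without the fixed seed, two runs would generally give slightly different estimates, and the result would no longer meet the definition.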
R scripts
Reproducible research is not the norm:
74% of R files failed to complete without error
Course aims include the development of a publication-ready reproducible research compendium that contains:
In our course, students are taught various tools and languages, such as Quarto markdown, version control with Git, and reproducible environments for R with renv.
Students develop fundamental knowledge and understanding of the state of the art in statistical markup languages and reproducible programming and development
They can determine the most effective markup strategies to address a typesetting problem
They can efficiently organize a reproducible programming and development process
They can produce repositories up to the standards of international programming and coding conventions and initiatives
They can produce publications up to the typesetting standards of international peer-reviewed journals
R code
A research compendium is a collection of all digital parts of a research project including data, code, texts…
The collection is created in such a way that reproducing all results is straightforward¹
The compendium serves as a means for distributing, managing, and updating the collection²
A basic research compendium is just a folder…
compendium/
├── data
│   └── my_data.csv
├── analysis
│   └── my_script.R
├── requirements.txt
└── README.md
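Because a basic compendium is only a folder, its skeleton can be created by a short script. Below is a minimal sketch in Python, assuming a hypothetical helper named `create_compendium` (not part of any course tooling); the file names are copied from the tree above and the files are created empty.

```python
from pathlib import Path
import tempfile

# File layout copied from the basic compendium tree above.
BASIC_LAYOUT = [
    "data/my_data.csv",
    "analysis/my_script.R",
    "requirements.txt",
    "README.md",
]


def create_compendium(root):
    """Create empty placeholder files for a basic compendium under `root`."""
    root = Path(root)
    for relative in BASIC_LAYOUT:
        target = root / relative
        target.parent.mkdir(parents=True, exist_ok=True)  # creates data/, analysis/
        target.touch(exist_ok=True)  # existing file contents are left as they are
    # Report the files found on disk, relative to the root, in sorted order.
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )


# Usage: build the skeleton in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    created = create_compendium(Path(tmp) / "compendium")
    print(created)
```

Starting every project from the same skeleton keeps data, code, and documentation in predictable places, which is what makes the folder a compendium rather than a loose collection of files.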
… but it can become extensive…
│
├── paper/
│   ├── paper.qmd
│   └── references.bib
│
├── figures/
│
├── data/
│   ├── raw_data/
│   └── clean_data/
│
└── templates
    └── journal_template.csl
…or even executable!
│
├── _targets.R
├── R/
│   ├── functions_data.R
│   ├── functions_analysis.R
│   └── functions_visualization.R
└── data/
    └── input_data.csv
Research Data Management Support workshop:
Adapted from The Turing Way
Markup Languages and Reproducible Programming in Statistics team (2024). Course materials. URL: www.gerkovink.com/markup
Utrecht University (2024). Course description. URL: https://osiris-student.uu.nl/#/onderwijscatalogus/extern/cursus?cursuscode=202000010&taal=en&collegejaar=huidig
The Turing Way Community (2022). The Turing Way: A handbook for reproducible, ethical and collaborative research (1.0.2). DOI: 10.5281/ZENODO.3233853
Utrecht University (2023). Best Practices for Writing Reproducible Code. URL: utrechtuniversity.github.io/workshop-computational-reproducibility
Utrecht University (2023). Writing Reproducible Manuscripts in R & Python. URL: utrechtuniversity.github.io/workshop-reproducible-manuscripts
Eglen, S., & Nüst, D. (2024). CODECHECK. URL: codecheck.org.uk
Hanne Oberman, hanneoberman.github.io/presentations