More than ninety percent of published Jupyter notebooks do not state their dependencies on external packages. This makes them non-executable and thus hinders the reproducibility of scientific results. We present SnifferDog, an approach that 1) collects the APIs of Python packages and versions, creating a database of APIs; 2) analyzes notebooks to determine candidates for required packages and versions; and 3) checks which packages are required to make the notebook executable (and ideally, reproduce its stored results). In our evaluation, we show that SnifferDog precisely restores execution environments for the large majority of notebooks, making them immediately executable for end users.
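The first two steps of the approach can be illustrated with a minimal Python sketch. This is not SnifferDog's implementation. The function names `imported_modules` and `candidate_versions`, the shape of the API database (package name, then version, then a set of API names), and the toy data in the usage note are assumptions made for illustration only. The sketch assumes the notebook is stored in the standard JSON layout with a top-level `cells` list. Step 3, which checks candidates by actually executing the notebook, is not shown.

```python
import ast
import json


def imported_modules(notebook_json):
    """Return the top-level module names imported by a notebook's code cells."""
    notebook = json.loads(notebook_json)
    modules = set()
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # Cell source may be stored as one string or as a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        try:
            tree = ast.parse(source)
        except SyntaxError:
            # Cells with IPython magics or shell escapes are not plain Python;
            # this sketch simply skips them.
            continue
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                for alias in node.names:
                    modules.add(alias.name.split(".")[0])
            elif isinstance(node, ast.ImportFrom):
                # Relative imports (level > 0) refer to local code, not packages.
                if node.module and node.level == 0:
                    modules.add(node.module.split(".")[0])
    return modules


def candidate_versions(used_apis, api_db):
    """For each package, list the versions whose API set covers every used API.

    used_apis: {package: set of API names used by the notebook}
    api_db:    {package: {version: set of API names}}  (hypothetical layout)
    """
    candidates = {}
    for package, apis in used_apis.items():
        versions = api_db.get(package, {})
        # Versions are sorted as plain strings, which is enough for a sketch.
        candidates[package] = sorted(
            version for version, known in versions.items() if apis <= known
        )
    return candidates
```

As a usage example, passing a notebook whose only code cell contains `import numpy as np` to `imported_modules` yields `{"numpy"}`. With a toy database such as `{"numpy": {"1.16": {"numpy.array"}, "1.17": {"numpy.array", "numpy.random.default_rng"}}}`, a notebook that uses both APIs is matched to version `1.17` only.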
Preferred Citation
Jiawei Wang, Tzu-yang Kuo, Li Li and Andreas Zeller. Restoring reproducibility of Jupyter notebooks. In: International Conference on Software Engineering (ICSE). 2021.
Primary Research Area
Secure Connected and Mobile Systems
Name of Conference
International Conference on Software Engineering (ICSE)
Legacy Posted Date
2022-10-13
Open Access Type
Green
Presentation Type
Presentation (no conference)
BibTeX
@inproceedings{cispa_all_3827,
  title = "Restoring reproducibility of Jupyter notebooks",
  author = "Wang, Jiawei and Kuo, Tzu-yang and Li, Li and Zeller, Andreas",
  booktitle = "{International Conference on Software Engineering (ICSE)}",
  year = "2021",
}