# enceladus_bayesian_methanogenesis

**Repository Path**: mirrors_ndmitchell/enceladus_bayesian_methanogenesis

## Basic Information

- **Project Name**: enceladus_bayesian_methanogenesis
- **Description**: Fork of https://gitlab.com/antonin.affholder/enceladus_bayesian_methanogenesis
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-03-19
- **Last Updated**: 2026-03-29

## README

# Code for inferring habitability of Enceladus


This Python code provides in-house modules for computing the biogeochemical model of Enceladus, which is based on a thermodynamic model of hydrogenotrophic methanogens described in:

Affholder, A., Guyot, F., Sauterey, B. et al. *Bayesian analysis of Enceladus’s plume data to assess methanogenesis.* Nat Astron __5__, 805–814 (2021). https://doi.org/10.1038/s41550-021-01372-6.

## Installation
You can clone this repository, or download it as a .zip file and unzip it in your working directory.
Pseudo-data are generated by running a Python script from the command line, whereas the code for their analysis is in Jupyter notebooks.

Developed using Python 3.7.5. Requires the packages [`numpy`](https://numpy.org/doc), [`scipy`](https://www.scipy.org/), [`pyabc`](https://pyabc.readthedocs.io/en/latest/) and [`scikit-learn`](https://scikit-learn.org/stable/index.html) for computation, and `matplotlib` and `seaborn` for plotting. Jupyter notebooks were generated using [JupyterLab](https://jupyter.org/) 2.1.0.

Tested on macOS Mojave 10.14.6 and Ubuntu Linux 19.10.
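
Before running anything, it can help to check that the packages listed above are importable. The snippet below is a minimal sketch and not part of the repository; the helper name `missing_packages` is hypothetical, and `scikit-learn` is looked up under its import name `sklearn`.

```python
from importlib.util import find_spec

# Import names of the packages listed above
# (scikit-learn is imported as "sklearn").
REQUIRED = ["numpy", "scipy", "pyabc", "sklearn", "matplotlib", "seaborn"]


def missing_packages(names=REQUIRED):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

This only checks that the packages can be found; it does not check their versions.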

## Overview

### Scripts
Scripts to run the simulations and make some of the figures:
- `batch_runs.py` is a command-line executable that generates pseudo-data tables from the priors, to feed the ABC-RF analysis.
- `data_setup_scripts.py` turns simulation outputs into data that can be plotted and interpreted.
- `makefigures.py` makes the statistical figures (including some that ended up in the article).
- `likelihoods.py` makes the likelihood figures.

### Code ("pyEnceladus")
The core functions, as well as basic data for Enceladus and hydrogenotrophy, are in the `pyEnceladus` module.
- `biosims.py` contains functions related to running simulations.
- `data_htv.py` contains data on the enzymes and thermodynamics of hydrogenotrophic methanogens.
- `physical.py` contains functions related to the physical model of mixing in the mixing layer (ML) and to the plume composition.
- `plot_abc.py` contains functions and scripts to plot ABC-related figures.
- `plot_tools.py` contains general plotting functions.
- `stats_analysis.py` contains functions to estimate likelihoods.
- `simulation_qstar.py` contains the function that runs one simulation.
- `universal_htv.py` contains functions related to the biological model, in particular to solve for the steady-state composition of a water mass with the theoretical population.

### Notebooks
This code relies extensively on notebooks, which run the ABC-RF analysis and plot the figures.
- `biogeomodel.ipynb` contains illustrations of the biological and physical model.
- `poolABC.ipynb` contains the ABC-RF analysis of the standard dataset for P(I|H)=0.5.
- `highermethane.ipynb` replicates the approach in `poolABC.ipynb`, along with bootstraps, using priors that allow a higher methane content in the hydrothermal fluid.
- `sensitivity.ipynb` replicates the approach in `poolABC.ipynb`, along with bootstraps, while drawing model parameters at random.

### Data
- `demo` contains dummy priors that are used in `biogeomodel.ipynb`.
- `taubner2018` contains data from Taubner et al. (2018), used in particular to estimate tau.
- `higher_methane` contains simulations for the 'higher methane' setup.
- `hydrothermal_methane` contains simulations for the 'standard' setup.
- `sensitivity` contains simulation data for the sensitivity analysis.
- `standard_priors.csv` contains the bounds of the (log-)uniform laws defining the prior space for the standard run.
- `moremethane_priors.csv` contains the bounds of the (log-)uniform laws defining the prior space for the 'higher methane' run.
- `hydrothermal_methaneseed.pck` contains the pickled seed for the standard run.
- `hydrothermal_methane.log` is the log of the standard run, showing that it took around one hour of computation.
- `higher_methaneseed.pck` contains the pickled seed for the extended CH4 run.
- `higher_methane.log` is the log of the extended CH4 run.
- `tree.dot` is an example of a decision tree.
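
The prior files above store bounds of uniform or log-uniform laws. As an illustration of what drawing from such a prior space means, here is a minimal sketch using only the standard library. It does not reproduce the repository's sampling code: the column layout of the CSV files is not documented here, so the `priors` dictionary, its parameter names and its bounds are hypothetical placeholders.

```python
import math
import random


def sample_prior(low, high, log_scale=False, rng=random):
    """Draw one value from a uniform or log-uniform law on [low, high]."""
    if log_scale:
        # Log-uniform: draw uniformly in log10 space, then map back.
        return 10 ** rng.uniform(math.log10(low), math.log10(high))
    return rng.uniform(low, high)


# Hypothetical prior table: name -> (lower bound, upper bound, log-uniform?)
priors = {
    "param_a": (1.0, 10.0, False),
    "param_b": (1e-8, 1e-2, True),
}

rng = random.Random(0)  # fixed seed so the draw is repeatable
draw = {
    name: sample_prior(low, high, log_scale, rng)
    for name, (low, high, log_scale) in priors.items()
}
print(draw)
```

A log-uniform law spreads draws evenly across orders of magnitude, which is why it is a common choice for parameters whose plausible range spans several decades.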


## Usage
### Generate simulation files
`./batch_runs.py --prior <prior_bdries_file> --dirname <project_folder_name> -d <list of d values to test separated with spaces> -p <list of p values to test separated with spaces> -b`
```
usage: batch_runs.py [-h] [-n NSIM] [-s SEED] [-b] [--no-pbar] [--save-raw] [--prior PRIOR] [--dirname DIRNAME] -d DLIST [DLIST ...] -p PLIST [PLIST ...]

    Runs simulations in a batch for various values of the death rate
                                     

optional arguments:
  -h, --help            show this help message and exit
  -n NSIM, --nsim NSIM
  -s SEED, --seed SEED
  -b, --progress-bar    displays a progress bar
  --no-pbar
  --save-raw            save raw sumstats
  --prior PRIOR         filename of priors
  --dirname DIRNAME     output dirname
  -d DLIST [DLIST ...], --list-d DLIST [DLIST ...]
                        <Required> list of d values
  -p PLIST [PLIST ...], --list-p PLIST [PLIST ...]
                        <Required> list of d values
```
The standard dataset, from which the main results were derived, was generated by calling the `batch_runs.py` script as follows:
```
./batch_runs.py --prior <mypath>/standard_priors.csv --dirname <mypath>/hydrothermal_methane -n 50000 -p 0.5 -d 0.03 --save-raw
```

For the extended CH4 dataset:
```
./batch_runs.py --prior <mypath>/moremethane_priors.csv --dirname <mypath>/higher_methane -n 50000 -p 0.5 -d 0.03 --save-raw
```
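
Calls like the ones above can also be assembled from Python, for example to loop over several setups. The following is a minimal sketch that only uses the flags shown in the help text. The helper names `build_batch_command` and `run_batch`, and the `mypath` value, are hypothetical and not part of the repository.

```python
import subprocess


def build_batch_command(prior, dirname, d_values, p_values,
                        nsim=None, save_raw=False,
                        script="./batch_runs.py"):
    """Assemble the argument list for one batch_runs.py call."""
    cmd = [script, "--prior", str(prior), "--dirname", str(dirname)]
    if nsim is not None:
        cmd += ["-n", str(nsim)]
    cmd += ["-p", *map(str, p_values)]  # required list of p values
    cmd += ["-d", *map(str, d_values)]  # required list of d values
    if save_raw:
        cmd.append("--save-raw")
    return cmd


def run_batch(cmd):
    """Launch the simulation batch; raises if the script exits with an error."""
    subprocess.run(cmd, check=True)


mypath = "."  # assumption: replace with the folder holding the prior files
cmd = build_batch_command(
    f"{mypath}/standard_priors.csv",
    f"{mypath}/hydrothermal_methane",
    d_values=[0.03],
    p_values=[0.5],
    nsim=50000,
    save_raw=True,
)
print(" ".join(cmd))
# run_batch(cmd)  # uncomment to actually start the simulations
```

Passing the command as a list rather than a single string avoids shell quoting issues when paths contain spaces.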

The exact same pseudo-data should be obtainable by running `batch_runs.py` with the appropriate pickle file specified as the seed (`hydrothermal_methaneseed.pckl` or `higher_methaneseed.pckl`).
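
To illustrate why a pickled seed makes a run repeatable, here is a minimal sketch of the general idea using Python's standard `random` module. It assumes the pickle holds a random-generator state. The actual content of the repository's seed files, and the generator that `batch_runs.py` uses, may differ, and the function and file names below are hypothetical.

```python
import os
import pickle
import random
import tempfile


def save_rng_state(path, rng):
    """Pickle the internal state of a random.Random generator to `path`."""
    with open(path, "wb") as fh:
        pickle.dump(rng.getstate(), fh)


def load_rng(path):
    """Return a new generator restored to the state pickled at `path`."""
    rng = random.Random()
    with open(path, "rb") as fh:
        rng.setstate(pickle.load(fh))
    return rng


def demo():
    """Restore two generators from the same file and compare their draws."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo_seed.pck")  # hypothetical file name
        save_rng_state(path, random.Random(2021))
        rng_a, rng_b = load_rng(path), load_rng(path)
        draws_a = [rng_a.random() for _ in range(5)]
        draws_b = [rng_b.random() for _ in range(5)]
    return draws_a, draws_b


draws_a, draws_b = demo()
print(draws_a == draws_b)
```

Both generators start from the same stored state, so they produce the same sequence of draws.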

Data for the sensitivity analysis were generated on the fly from the `sensitivity.ipynb` notebook.

### Run analysis
The analysis is performed in the Jupyter notebooks and can therefore be run by manually executing the notebook cells.


## Authors
Antonin Affholder (1,4), François Guyot (2), Boris Sauterey (1), Régis Ferrière (1,3), Stéphane Mazevet (4).
## Contributors
Antonin Affholder (1,4), antonin.affholder@biologie.ens.fr
## Affiliations
- (1) Institut de Biologie de L’École Normale Supérieure (IBENS), Université Paris Sciences et Lettres, 75005 Paris, France.
- (2) Institut de Minéralogie, Physique des Matériaux et Cosmochimie (IMPMC), Muséum National d’Histoire Naturelle (MNHN), CNRS, 75005 Paris, France.
- (3) International Center for Interdisciplinary Global Environment Studies (iGLOBES), CNRS, ENS-PSL University, University of Arizona, Tucson, AZ 85721, USA.
- (4) Institut de Mécanique Céleste et Calcul des Éphémérides (IMCCE), Observatoire de Paris, PSL, CNRS, 75014 Paris, France.

## References
Pedregosa, F., Varoquaux, G., Gramfort, A., et al. (2011). Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research, 12, 2825-2830.

Klinger, E., Rickert, D., & Hasenauer, J. (2018). pyABC: distributed, likelihood-free inference. Bioinformatics, 34(20), 3591-3593.

Pudlo, P., Marin, J. M., Estoup, A., Cornuet, J. M., Gautier, M., & Robert, C. P. (2016). Reliable ABC model choice via random forests. Bioinformatics, 32(6), 859-866.

## Cite this work
Affholder, A., Guyot, F., Sauterey, B. et al. *Bayesian analysis of Enceladus’s plume data to assess methanogenesis.* Nat Astron __5__, 805–814 (2021). https://doi.org/10.1038/s41550-021-01372-6.