DEXOM in python
The package can be installed using pip: pip install dexom-python
git clone https://forge.inrae.fr/metexplore/cbm/dexom-pythonpoetry install (if poetry is
already installed in your python environment) or pip install -e .-h option to
display help messages.Requirements
Python 3.7 - 3.9
CPLEX 12.10 - 22.10
Installing CPLEX
Free license (Trial version): this version is limited to 1000 variables and 1000 constraints, and is therefore not useable on larger models
Academic license: for this, you must sign up using an academic email address. - after logging in, you can access the download for “ILOG CPLEX Optimization Studio” - download version 12.10 or higher of the appropriate installer for your operating system - install the solver
setup.py file appropriate for your OS and python versionpython "C:\Program Files\IBM\ILOG\CPLEX_Studio1210\python\setup.py" install
and/or pip install cplex==12.10 (with the appropriate CPLEX
version number)Available functions
These are the different functions which are available for context-specific metabolic subnetwork extraction
apply_gpr
gpr_rules.py script can be used to transform gene expression
data into reaction weights.--convert flag will instead create semi-quantitative
reaction weights with values in {-1, 0, 1}. The default proportion of
these three weights is {25%, 50%, 25%}, it can be adjusted with the
--quantiles parameter.iMAT
imat_functions.py contains a modified version of the iMAT
algorithm as defined by (Shlomi et
al. 2008).The remaining inputs of imat are: - epsilon: the activation
threshold of reactions with weight > 0 - threshold: the activation
threshold for unweighted reactions - full: a bool parameter for
switching between the partial & full-DEXOM implementation
In addition, the following solver parameters have been made available
through the solver API: - timelimit: the maximum amount of time
allowed for solver optimization (in seconds) - feasibility: the
solver feasibility tolerance - mipgaptol: the solver MIP gap
tolerance
epsilon > threshold > ub*feasibility (where ub is the maximal
upper bound for reaction flux in the model).create_new_partial_variables function.
In this version, binary flux indicator variables are created for each
reaction with a non-zero weight.parsimonious_imat function, which first maximizes
the original iMAT objective, then minimizes the absolute sum of all reaction fluxes,
thus producing a parsimonious flux distribution.enum_functions
Four methods for enumerating context-specific networks are available: -
rxn_enum_functions.py contains reaction-enumeration (function name:
rxn_enum) - icut_functions.py contains integer-cut (function
name: icut) - maxdist_functions.py contains
distance-maximization (function name: maxdist) -
diversity_enum_functions.py contains diversity-enumeration (function
name: diversity_enum)
prev_sol: an imat
solution used as a starting point (if none is provided, a new one will
be computed)obj_tol: the relative tolerance on the imat objective value for
the optimality of the solutionsmaxiter: the maximum number of iterations to run - full: set
to True to use the full-DEXOM implementationicut: if True, an integer-cut constraint will be applied to
prevent this enumeration to produce duplicate solutionsParallelized DEXOM for computation clusters
The folder dexom_python/cluster_utils/ contains batch scripts which
can be used for running dexom_python functions on a slurm cluster, as
well as a snakemake workflow which can be used to launch enumeration
functions in multiple jobs.
cluster_install_dexom_python.sh contains the necessary
commands for cloning the dexom-python git repository, setting up a
python virtual environement and installing all required dependencies.system/Python-3.7.4 - otherwise you must use a
python version which is installed on your cluster.The snakemake workflow can be launched through the following command:
(note that you must replace the "path/to/solver" string with the
actual path to your CPLEX solver.)
sbatch dexom_python/cluster_utils/submit_slurm.sh
If you run this command without modifying any parameters, it will execute a short DEXOM pipeline (with reaction-enumeration followed by diversity-enumeration) on a toy model.
cluster_config.yaml.dexom_python/default_parameter_values.py file.This workflow uses a reaction-weights file as an input. The
pathway_enrichment.py can be used to perform a pathway
enrichment analysis using a one-sided hypergeometric testresult_functions.py contains the plot_pca function, which
performs Principal Component Analysis on the enumeration solutionsSome older scripts for running enumeration functions on a slurm cluster can be found in ``dexom_python/cluster_utils/legacy``. However, it is strongly recommended to use the snakemake workflow, which is more reliable and can be adapted more easily for different applications.
Example scripts
Toy models
toy_models.py script contains code for generating some small
metabolic models and reaction weights.toy_models/ folder contains some ready-to-use models and
reaction weight files.main.py script contains a simple example of the DEXOM workflow
using one of the toy models.dexom_python/cluster_utils/ also uses a toy model as an example.Recon 2.2
example_data/ folder contains a modified version of the Recon
2.2 model (Swainston et
al. 2016) as well as
some differential gene expression data which can be used to test this
implementation.python dexom_python/gpr_rules -m example_data/recon2v2_corrected.json -g example_data/pval_0-01_geneweights.csv -o example_data/pval_0-01_reactionweights
Alternatively an example of how this command can be submitted to a slurm
cluster is shown in slurm_example_gpr.sh (again, you must insert the
path to your CPLEX solver in the appropriate location).
In order to use the snakemake workflow on this example dataset, you must
modify some parameters in cluster_config.yaml:
model: example_data/recon2v2_corrected.json
reaction_weights: example_data/pval_0-01_reactionweights.csv
output_path: example_data_cluster_output/
Additionally, when using continuous reaction-weights, the solver may
have difficulty finding solutions if the constraints are too strict. To
relax the optimality tolerance on the objective value, modify the
following parameter in the file
dexom_python/default_parameter_values.py:
'obj_tol': 2e-3,
You can then once again start the snakemake workflow with the command:
sbatch dexom_python/cluster_utils/submit_slurm.sh
After all jobs are completed, you can analyze the results with the following commands:
python dexom_python/pathway_enrichment.py -s example_data_cluster_output/all_unique_solutions.csv -m example_data/recon2v2_corrected.json -o example_data/
python dexom_python/result_functions.py -s example_data_cluster_output/all_unique_solutions.csv -o example_data/
example_data_cluster_output/all_unique_solutions.csv
contains all unique solutions enumerated with DEXOM..png files in the example_data folder contain boxplots of
the pathway enrichment tests as well as a 2D PCA plot of the binary
solution vectors.Cell-specific reconstruction
An example of how to use DEXOM-python as a part of a cell-specific network reconstruction pipeline, including a more complete snakemake workflow, can be found here: https://forge.inrae.fr/metexplore/cbm/ocmmed
Latest version
2.3