DEXOM in python
API documentation is available here: https://dexom-python.readthedocs.io/en/stable/
The package can be installed using pip:
pip install dexom-python
You can also clone the git repository and then install the dependencies:
git clone https://github.com/MetExplore/dexom-python
python setup.py install
Requirements
Python 3.7 - 3.9
CPLEX 12.10 - 22.10
Installing CPLEX
Free license (Trial version): this version is limited to 1000 variables and 1000 constraints, and is therefore not usable on larger models.
Academic license: for this, you must sign up using an academic email address.
- after logging in, you can access the download for “ILOG CPLEX Optimization Studio”
- download version 12.10 or higher of the appropriate installer for your operating system
- install the solver
After installing the solver, use the setup.py file appropriate for your OS and Python version:
python "C:\Program Files\IBM\ILOG\CPLEX_Studio1210\python\setup.py" install
and/or
pip install cplex==12.10
Functions
The following functions are available for context-specific metabolic subnetwork extraction.
apply_gpr
The gpr_rules.py script can be used to transform gene expression data into reaction weights, for a limited selection of models.
The --convert flag will instead create semi-quantitative reaction weights with values in {-1, 0, 1}. By default, the proportion of these three weights will be {25%, 50%, 25%}.
iMAT
imat.py contains a modified version of the iMAT algorithm as defined by Shlomi et al. (2008).
The remaining inputs of imat are:
- epsilon: the activation threshold of reactions with weight > 0
- threshold: the activation threshold for unweighted reactions
- full: a bool parameter for switching between the partial & full-DEXOM implementation
In addition, the following solver parameters have been made available through the solver API:
- timelimit: the maximum amount of time allowed for solver optimization (in seconds)
- feasibility: the solver feasibility tolerance
- mipgaptol: the solver MIP gap tolerance
Note: the feasibility tolerance determines the solver’s capacity to return correct results. In particular, it is necessary that epsilon > threshold > ub*feasibility (where ub is the maximal upper bound for reaction flux in the model).
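The ordering required by the note above can be checked before a run. The following is a minimal sketch: the function name check_imat_thresholds is hypothetical and not part of the package, and the numeric values are illustrative only, not package defaults.

```python
def check_imat_thresholds(epsilon, threshold, ub, feasibility):
    """Return True if epsilon > threshold > ub * feasibility holds.

    epsilon, threshold and feasibility are the imat inputs described above;
    ub is the maximal upper bound for reaction flux in the model.
    """
    return epsilon > threshold > ub * feasibility


# Illustrative values only (assumptions, not package defaults).
# Here ub * feasibility is about 1e-4, which lies below threshold = 1e-3.
ok = check_imat_thresholds(epsilon=1e-2, threshold=1e-3, ub=1000, feasibility=1e-7)

# With a threshold of 1e-5 the ordering is broken, since 1e-5 < ub * feasibility.
too_small = check_imat_thresholds(epsilon=1e-2, threshold=1e-5, ub=1000, feasibility=1e-7)

print(ok, too_small)
```

If the check fails, either raise threshold and epsilon or tighten the feasibility tolerance.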
The partial implementation uses the create_new_partial_variables function. In this version, binary flux indicator variables are created for each reaction with a non-zero weight.
enum_functions
Four methods for enumerating context-specific networks are available:
- rxn_enum_functions.py contains reaction-enumeration (function name: rxn_enum)
- icut_functions.py contains integer-cut (function name: icut)
- maxdist_functions.py contains distance-maximization (function name: maxdistm)
- diversity_enum_functions.py contains diversity-enumeration (function name: diversity_enum)
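To illustrate the general idea behind the integer-cut method, the sketch below builds the textbook integer cut for a binary solution vector. It is not taken from icut_functions.py. The function names integer_cut and satisfies_cut are hypothetical, and the package may formulate its constraint differently.

```python
def integer_cut(solution):
    """Build the integer cut that excludes one binary solution.

    For a binary vector x*, the constraint
        sum(x_i for x*_i == 1) - sum(x_i for x*_i == 0) <= (number of ones in x*) - 1
    is violated by x* itself and satisfied by every other binary vector.
    Returns (coefficients, rhs).
    """
    coefficients = [1 if value == 1 else -1 for value in solution]
    rhs = sum(solution) - 1
    return coefficients, rhs


def satisfies_cut(candidate, cut):
    """Check whether a binary candidate vector satisfies a cut."""
    coefficients, rhs = cut
    lhs = sum(c * x for c, x in zip(coefficients, candidate))
    return lhs <= rhs


previous = [1, 0, 1]          # a previously found binary solution
cut = integer_cut(previous)   # coefficients [1, -1, 1], rhs 1

print(satisfies_cut(previous, cut))    # the previous solution is excluded
print(satisfies_cut([1, 1, 1], cut))   # a different solution remains feasible
```

Adding one such constraint per solution found prevents the solver from returning the same binary vector twice.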
Inputs of the enumeration functions include:
- prev_sol: an imat solution used as a starting point (if none is provided, a new one will be computed)
- obj_tol: the relative tolerance on the imat objective value for the optimality of the solutions
- maxiter: the maximum number of iterations to run
- full: set to True to use the full-DEXOM implementation
- icut: if True, an icut constraint will be applied to prevent duplicate solutions
Parallelized DEXOM
enumeration.py contains the write_batch_script1 function, which is used for creating a parallelization of DEXOM on a slurm computation cluster. The main inputs of this function are:
- filenums: the number of parallel batches which should be launched on slurm
- iters: the number of div-enum iterations per batch
Other inputs are used for personalizing the directories and filenames on the cluster.
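The file layout this produces (numbered batch files plus a runfiles.sh submission script) can be sketched as follows. This is not the code of write_batch_script1. The function name build_batch_files, the job name and the placeholder command inside each batch file are assumptions made for illustration.

```python
def build_batch_files(filenums, iters, prefix="file"):
    """Return {filename: content} for `filenums` batch files plus runfiles.sh."""
    files = {}
    for i in range(filenums):
        files[f"{prefix}_{i}.sh"] = "\n".join([
            "#!/bin/bash",
            f"#SBATCH --job-name=dexom_{i}",  # hypothetical job name
            # Placeholder command: the call written by the package is not reproduced here.
            f'echo "batch {i}: run {iters} div-enum iterations"',
            "",
        ])
    # runfiles.sh submits every batch file to slurm with sbatch.
    files["runfiles.sh"] = "\n".join(
        [f"sbatch {prefix}_{i}.sh" for i in range(filenums)] + [""]
    )
    return files


files = build_batch_files(filenums=3, iters=100)
print(sorted(files))
print(files["runfiles.sh"])
```

Returning the contents as a dictionary keeps the sketch free of side effects; writing each entry to disk is a separate step.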
The batch files are named file_0.sh, file_1.sh etc., depending on the filenums parameter that was provided.
A runfiles.sh file is also created. This file contains the commands to submit the other files as job batches on the slurm cluster.
dexom_cluster_results.py compiles the results of a parallel DEXOM run and removes duplicate solutions.
pathway_enrichment.py can be used to perform a pathway enrichment analysis using a one-sided hypergeometric test.
result_functions.py contains the plot_pca function, which performs Principal Component Analysis on the enumeration solutions.
Examples
Toy models
The toy_models.py script contains code for generating some small metabolic models and reaction weights.
The main.py script contains a simple example of the DEXOM workflow using one of the toy models.
Recon 2.2
First, compute the reaction weights by calling the gpr_rules script from the command line:
python dexom_python/gpr_rules.py -m example_data/recon2v2_corrected.json -g example_data/pval_0-01_geneweights.csv -o example_data/pval_0-01_reactionweights
Then, call imat to produce a first context-specific subnetwork. This will create a file named “imat_solution.csv” in the example_data folder:
python dexom_python/imat.py -m example_data/recon2v2_corrected.json -r example_data/pval_0-01_reactionweights.csv -o example_data/imat_solution
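To run DEXOM on a slurm cluster, first create the batch files with the enumeration script. The path given after the -c argument is specific to one user's CPLEX installation; replace it with your own. This assumes that you have a copy of the dexom-python project on the cluster, which contains the dexom_python folder and the example_data folder in the same directory.
python dexom_python/enum_functions/enumeration.py -m example_data/recon2v2_corrected.json -r example_data/pval_0-01_reactionweights.csv -p example_data/imat_solution.csv -o example_data/ -n 100 -i 100 -c /home/mstingl/save/CPLEX_Studio1210/cplex/python/3.7/x86-64_linux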
Then submit the jobs to the slurm cluster. If runfiles.sh has Windows line endings, run dos2unix runfiles.sh before sbatch runfiles.sh:
cd example_data/
sbatch runfiles.sh
cd ..
After all jobs are completed, you can analyze the results using the following scripts:
python dexom_python/dexom_cluster_results.py -i example_data/ -o example_data/ -n 100
python dexom_python/pathway_enrichment.py -s example_data/all_dexom_sols.csv -m example_data/recon2v2_corrected.json -o example_data/
python dexom_python/result_functions.py -s example_data/all_dexom_sols.csv -o example_data/
- all_dexom_sols.csv contains all unique solutions enumerated with DEXOM.
- output.txt contains the average computation time per iteration and the proportion of duplicate solutions.
- The .png files contain boxplots of the pathway enrichment tests as well as a 2D PCA plot of the binary solution vectors.
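For readers unfamiliar with the one-sided hypergeometric test behind the pathway enrichment step, the sketch below computes its p-value with the standard library only. It is not the code of pathway_enrichment.py. The function and parameter names are hypothetical, and treating the active reactions of one solution as the selected set is an assumption.

```python
from math import factorial


def n_choose_k(n, k):
    """Binomial coefficient, returning 0 when k is out of range."""
    if k < 0 or k > n:
        return 0
    return factorial(n) // (factorial(k) * factorial(n - k))


def enrichment_pvalue(total, in_pathway, selected, selected_in_pathway):
    """One-sided hypergeometric p-value P(X >= selected_in_pathway).

    total:               number of reactions in the model
    in_pathway:          number of model reactions belonging to the pathway
    selected:            number of reactions active in a solution (assumption)
    selected_in_pathway: number of active reactions that belong to the pathway
    """
    upper = min(in_pathway, selected)
    tail = sum(
        n_choose_k(in_pathway, k) * n_choose_k(total - in_pathway, selected - k)
        for k in range(selected_in_pathway, upper + 1)
    )
    return tail / n_choose_k(total, selected)


# Toy numbers: 10 reactions, 4 in the pathway, 3 active, 2 of them in the pathway.
p = enrichment_pvalue(total=10, in_pathway=4, selected=3, selected_in_pathway=2)
print(p)  # (6*6 + 4*1) / 120 = 1/3
```

A small p-value indicates that the pathway contains more active reactions than expected from random selection.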