README
API documentation is available here: https://dexom-python.readthedocs.io/en/latest/
Requirements
Python 3.7+
CPLEX 12.10+
Installing CPLEX
Free license (Trial version): this version is limited to 1000 variables and 1000 constraints, and is therefore not useable on larger models
setup.py file appropriate for your OS and Python version:
python "C:\Program Files\IBM\ILOG\CPLEX_Studio1210\python\setup.py" install
Functions
These are the different functions which are available for context-specific metabolic subnetwork extraction
apply_gpr
The gpr_rules.py script can be used to transform gene expression
data into reaction weights, for a limited selection of models.
The --convert flag will instead create semi-quantitative reaction weights with
values in {-1, 0, 1}. By default, the proportion of these three
weights will be {25%, 50%, 25%}.
iMAT
imat.py contains a modified version of the iMAT algorithm as
defined by (Shlomi et al. 2008).
The remaining inputs of imat are:
- epsilon: the activation threshold of reactions with weight > 0
- threshold: the activation threshold for unweighted reactions
- timelimit: the solver time limit
- feasibility: the solver feasibility tolerance
- mipgaptol: the solver MIP gap tolerance
- full: a bool parameter for switching between the partial & full-DEXOM implementation
Note: the feasibility tolerance determines the solver’s capacity to return correct
results. In particular, the relation epsilon > threshold >
ub*feasibility is required (where ub is the maximal upper bound
for reaction flux in the model).
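The relation epsilon > threshold > ub*feasibility from the note above can be checked before launching a long solver run. The following is a minimal sketch; the helper name check_imat_tolerances is hypothetical and is not part of dexom-python, and the numeric values are illustrative rather than the package defaults.

```python
def check_imat_tolerances(epsilon, threshold, feasibility, ub):
    """Return True if epsilon > threshold > ub * feasibility holds.

    Hypothetical helper for illustration; not part of dexom-python.
    """
    return epsilon > threshold > ub * feasibility


# Illustrative values only (assumed, not the package defaults).
ok = check_imat_tolerances(epsilon=1.0, threshold=1e-2, feasibility=1e-6, ub=1000)
bad = check_imat_tolerances(epsilon=1.0, threshold=1e-4, feasibility=1e-6, ub=1000)
print(ok, bad)
```

In the second call the threshold (1e-4) is smaller than ub*feasibility (1e-3), so the relation is violated and the solver tolerance could mask the difference between an active and an inactive reaction.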
enum_functions
Four methods for enumerating context-specific networks are available:
- rxn-enum.py contains reaction-enumeration
- icut.py contains integer-cut
- maxdist.py contains distance-maximization
- div-enum.py contains diversity-enumeration
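To make the integer-cut idea from the list above concrete, here is a minimal sketch of the textbook integer-cut constraint on a binary solution vector. It is an assumption that icut.py uses this form; the function names integer_cut and satisfies are hypothetical and only serve the illustration.

```python
def integer_cut(solution):
    """Build coefficients and right-hand side of a standard integer cut.

    For a binary vector x*, the cut is
        sum_{i: x*_i = 1} x_i - sum_{i: x*_i = 0} x_i <= (number of ones in x*) - 1
    Hypothetical helper; not taken from dexom-python.
    """
    coeffs = [1 if value == 1 else -1 for value in solution]
    rhs = sum(solution) - 1
    return coeffs, rhs


def satisfies(cut, candidate):
    """Check whether a binary candidate vector satisfies the cut."""
    coeffs, rhs = cut
    return sum(c * x for c, x in zip(coeffs, candidate)) <= rhs


previous = [1, 0, 1, 1, 0]
cut = integer_cut(previous)
print(satisfies(cut, previous))         # False: the previous solution is excluded
print(satisfies(cut, [1, 0, 1, 0, 0]))  # True: a different solution stays feasible
```

Only the vector used to build the cut violates it, so adding one such constraint per found solution prevents the solver from returning that exact solution again.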
Inputs of the enumeration functions include:
- prev_sol: a starting imat solution (if none is provided, a new one will be computed)
- obj_tol: a relative tolerance on the imat objective value for the optimality of the solutions
- maxiter: the maximum number of iterations to run
- full: set to True to use the full-DEXOM implementation
- icut: if True, an icut constraint will be applied to prevent duplicate solutions
Parallelized DEXOM
enumeration.py contains the write_batch_script1 function,
which is used to parallelize DEXOM on a slurm
computation cluster. The main inputs of this function are:
- filenums: the number of parallel batches which should be launched on slurm
- iters: the number of div-enum iterations per batch
Other inputs are used for personalizing the directories and filenames on the cluster.
file_0.sh, file_1.sh etc. depending
on the filenum parameter that was provided.
runfiles.sh file. This file
contains the commands to submit the other files as job batches on the
slurm cluster.
dexom_cluster_results.py compiles the results of a parallel DEXOM run
and removes duplicate solutions.
pathway_enrichment.py can be used to perform a pathway
enrichment analysis using a one-sided hypergeometric test.
result_functions.py contains the plot_pca function, which
performs Principal Component Analysis on the enumeration solutions.
Examples
Toy models
The toy_models.py script contains code for generating some small
metabolic models and reaction weights.
The main.py script contains a simple example of the DEXOM workflow
using one of the toy models.
Recon 2.2
gpr_rules script from the command line.
python dexom_python/gpr_rules -m recon2v2/recon2v2_corrected.json -n recon2 -g recon2v2/pval_0-01_geneweights.csv -o recon2v2/pval_0-01_reactionweights
Then, call imat to produce a first context-specific subnetwork. This will create a file named “imat_solution.csv” in the recon2v2 folder:
python dexom_python/imat -m recon2v2/recon2v2_corrected.json -r recon2v2/pval_0-01_reactionweights.csv -o recon2v2/imat_solution
To run DEXOM on a slurm cluster, call the enumeration.py script to
create the necessary batch files (here: 100 batches with 100
iterations). Be careful to use your own username after the -u input.
This script assumes that you have cloned the dexom-python project
into a work folder on the cluster, and that you have installed CPLEX
v12.10 in the same work folder. Note that this step creates a file
called “recon2v2_reactions_shuffled.csv”, which shows the order in which
rxn-enum will call the reactions from the model.
python dexom_python/enum_functions/enumeration -m recon2v2/recon2v2_corrected.json -r recon2v2/pval_0-01_reactionweights.csv -p recon2v2/imat_solution.csv -o recon2v2/ -u mstingl -n 100 -i 100
Then, submit the job to the slurm cluster. Note that if you created the
files on a Windows PC, you must run the command dos2unix runfiles.sh
before sbatch runfiles.sh:
cd recon2v2/
sbatch runfiles.sh
cd -
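If dos2unix is not available, the line-ending conversion mentioned above can be approximated in Python before submitting the job. This is a minimal sketch under the assumption that the batch files are plain text with CRLF line endings; the function name to_unix_line_endings is hypothetical and not part of dexom-python.

```python
from pathlib import Path


def to_unix_line_endings(path):
    """Rewrite a file in place with LF line endings; return True if it changed."""
    path = Path(path)
    data = path.read_bytes()
    converted = data.replace(b"\r\n", b"\n")
    if converted != data:
        path.write_bytes(converted)
        return True
    return False


if __name__ == "__main__":
    # Convert every .sh file in the current directory (e.g. run from recon2v2/).
    for script in sorted(Path(".").glob("*.sh")):
        changed = to_unix_line_endings(script)
        print(f"{script}: {'converted' if changed else 'unchanged'}")
```

Working on bytes avoids any assumption about the text encoding of the scripts, and files that already use LF endings are left untouched.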
After all jobs are completed, you can analyze the results using the following scripts:
python dexom_python/dexom_cluster_results -i recon2v2/ -o recon2v2/ -n 100
python dexom_python/pathway_enrichment -s recon2v2/all_dexom_sols.csv -m recon2v2/recon2v2_corrected.json -o recon2v2/
python dexom_python/result_functions -s recon2v2/all_dexom_sols.csv -o recon2v2/
all_dexom_sols.csv contains all unique solutions enumerated with DEXOM.
output.txt contains the average computation time per iteration and the proportion of duplicate solutions.
The .png files contain boxplots of the pathway enrichment tests as well as a 2D PCA plot of the binary solution vectors.
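For readers unfamiliar with the one-sided hypergeometric test behind the pathway enrichment results, the following minimal sketch computes the upper-tail probability with the standard library only. It illustrates the statistical test in general and is not the code in pathway_enrichment.py; the function name, argument names and example counts are all hypothetical.

```python
from math import comb  # math.comb requires Python 3.8 or later


def hypergeom_enrichment_pvalue(population, successes, draws, observed):
    """Return P(X >= observed) for X ~ Hypergeometric(population, successes, draws).

    population: total number of reactions in the model
    successes:  number of reactions belonging to the pathway
    draws:      number of active reactions in one solution
    observed:   number of active reactions that belong to the pathway
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return tail / total


# Made-up counts: 100 reactions, 10 in the pathway, 20 active, 5 active in the pathway.
p_value = hypergeom_enrichment_pvalue(population=100, successes=10, draws=20, observed=5)
print(f"one-sided p-value: {p_value:.4f}")
```

A small p-value indicates that the pathway contains more active reactions than expected if active reactions were drawn at random from the model.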