Skip to content

Commit

Permalink
Merge branch 'master' into fix-issue-55
Browse files Browse the repository at this point in the history
  • Loading branch information
ArnaudBelcour committed Jan 9, 2024
2 parents ac96007 + 8424d5a commit 49412a1
Show file tree
Hide file tree
Showing 44 changed files with 2,622 additions and 613 deletions.
1 change: 1 addition & 0 deletions .github/workflows/pythonpackage.yml
Original file line number Diff line number Diff line change
Expand Up @@ -67,3 +67,4 @@ jobs:
pytest test_m2m_iscope.py
pytest test_m2m_metacom.py
pytest test_m2m_mincom.py
pytest test_tiny_toy_metacom.py
6 changes: 6 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -92,3 +92,9 @@ ENV/

# Data directories
.vscode

# m2m output directories
test/addedvalue_output/
test/metabolic_data/toy_bact/
tutorials/method_tutorial/output_folder/
tutorials/method_tutorial/tutorial_output_folder/
4 changes: 2 additions & 2 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -69,8 +69,8 @@ pip install -r requirements.txt --no-cache-dir
In particular, m2m relies on:
* [mpwt](https://github.com/AuReMe/mpwt) to automatize metabolic network reconstruction with Pathway Tools
* [padmet](https://github.com/AuReMe/padmet) to manage metabolic networks
* [menetools](https://github.com/cfrioux/MeneTools) to analyze individual metabolic capabilities using logic programming
* [miscoto](https://github.com/cfrioux/miscoto) to analyze collective metabolic capabilities and select communities within microbiota using logic programming
* [menetools](https://github.com/cfrioux/MeneTools) to analyze individual metabolic capabilities using logic programming. **Requires MeneTools > 3.4**
* [miscoto](https://github.com/cfrioux/miscoto) to analyze collective metabolic capabilities and select communities within microbiota using logic programming. **Requires MiSCoTo > 3.2**

Also, m2m_analysis relies on other packages:
* [networkx](https://github.com/networkx/networkx) to create graph from miscoto results
Expand Down
2 changes: 1 addition & 1 deletion docs/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -53,7 +53,7 @@ M2M relies on packages that can also be used independantly with more features an
Authors
=======

`Clémence Frioux <https://cfrioux.github.io/>`__ and `Arnaud Belcour <https://arnaudbelcour.github.io/blog/>`__, Univ Rennes, Inria, CNRS, IRISA, Rennes, France.
`Clémence Frioux <https://cfrioux.github.io/>`__ and `Arnaud Belcour <https://arnaudbelcour.github.io/blog/>`__, Univ Bordeaux, Inria, INRAE, Bordeaux, France and Univ Rennes, Inria, CNRS, IRISA, Rennes, France.

Acknowledgement
===============
Expand Down
5 changes: 5 additions & 0 deletions docs/m2m_analysis.rst
Original file line number Diff line number Diff line change
Expand Up @@ -562,3 +562,8 @@ If you have used the ``--taxon``, two new files have been created:
``taxonomy_species.tsv``: it is a tsv file with 9 columns. The row corresponds to the species in your community. For each species, you will have its name in your dataset, its taxID (from taxon_id.tsv), an attributed taxonomic name (used in the power graph), then the taxonomic classification: phylum, class, order, family, genus and species.

``taxon_tree.txt``: the topology of the taxonomic classification of your species according to the NCBI taxonomy.

More information
-----------------

Take a look at the complete tutorial in the `Github repository <https://github.com/AuReMe/metage2metabo/tree/master/tutorials/method_tutorial>`__
435 changes: 264 additions & 171 deletions docs/tutorial.rst

Large diffs are not rendered by default.

33 changes: 20 additions & 13 deletions metage2metabo/__main__.py
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,7 @@
from metage2metabo.m2m.community_addedvalue import addedvalue
from metage2metabo.m2m.minimal_community import mincom
from metage2metabo.m2m.m2m_workflow import run_workflow, metacom_analysis
from metage2metabo.sbml_management import get_compounds

from metage2metabo import sbml_management, utils

Expand Down Expand Up @@ -209,7 +210,8 @@ def main():
parent_parser_xml
],
description=
"Run metabolic network reconstruction for each annotated genome of the input directory, using Pathway Tools"
"Run metabolic network reconstruction for each annotated genome of the input directory, using Pathway Tools",
allow_abbrev=False
)
indivscope_parser = subparsers.add_parser(
"iscope",
Expand All @@ -218,7 +220,8 @@ def main():
parent_parser_n, parent_parser_s, parent_parser_o, parent_parser_q, parent_parser_c
],
description=
"Compute individual scopes (reachable metabolites from seeds) for each metabolic network of the input directory"
"Compute individual scopes (reachable metabolites from seeds) for each metabolic network of the input directory",
allow_abbrev=False
)
comscope_parser = subparsers.add_parser(
"cscope",
Expand All @@ -227,7 +230,7 @@ def main():
parent_parser_n, parent_parser_s, parent_parser_o, parent_parser_m,
parent_parser_q, parent_parser_t_optional
],
description="Compute the community scope of all metabolic networks")
description="Compute the community scope of all metabolic networks", allow_abbrev=False)
added_value_parser = subparsers.add_parser(
"addedvalue",
help="added value of microbiota's metabolism over individual's",
Expand All @@ -236,7 +239,8 @@ def main():
parent_parser_q
],
description=
"Compute metabolites that are reachable by the community/microbiota and not by individual organisms"
"Compute metabolites that are reachable by the community/microbiota and not by individual organisms",
allow_abbrev=False
)
mincom_parser = subparsers.add_parser(
"mincom",
Expand All @@ -246,13 +250,15 @@ def main():
parent_parser_q, parent_parser_t_required
],
description=
"Select minimal-size community to make reachable a set of metabolites")
"Select minimal-size community to make reachable a set of metabolites",
allow_abbrev=False)
seeds_parser = subparsers.add_parser(
"seeds",
help="creation of seeds SBML file",
parents=[parent_parser_o, parent_parser_q],
description=
"Create a SBML file starting for a simple text file with metabolic compounds identifiers"
"Create a SBML file starting for a simple text file with metabolic compounds identifiers",
allow_abbrev=False
)
seeds_parser.add_argument(
"--metabolites",
Expand All @@ -268,7 +274,8 @@ def main():
parent_parser_t_optional, parent_parser_cl, parent_parser_xml
],
description=
"Run the whole workflow: metabolic network reconstruction, individual and community scope analysis and community selection"
"Run the whole workflow: metabolic network reconstruction, individual and community scope analysis and community selection",
allow_abbrev=False
)
metacom_parser = subparsers.add_parser(
"metacom",
Expand All @@ -278,16 +285,16 @@ def main():
parent_parser_t_optional, parent_parser_q, parent_parser_c
],
description=
"Run the whole metabolism community analysis: individual and community scope analysis and community selection"
"Run the whole metabolism community analysis: individual and community scope analysis and community selection",
allow_abbrev=False
)
test_parser = subparsers.add_parser(
"test",
help="test on sample data from rumen experiments",
parents=[
parent_parser_q, parent_parser_c, parent_parser_o
],
description=
"Test the whole workflow on a data sample")
description="Test the whole workflow on a data sample", allow_abbrev=False)

args = parser.parse_args()

Expand Down Expand Up @@ -354,7 +361,7 @@ def main():
# test if some targets are seeds
itsct_seeds_targets = sbml_management.compare_seeds_and_targets(args.seeds, args.targets)
if itsct_seeds_targets != set():
logger.warning(f"\nWARNING: compounds {*list(itsct_seeds_targets),} are both in seeds and targets. Since they are in seeds, they will be in each organism's individual producibility scope (iscope), but not appear in the community scope (cscope). To be certain that they are produced (through an activable reaction and not just because they are seeds), check the output file: indiv_scopes/indiv_produced_seeds.json and the key 'individually producible' in the file producibility_targets.json.\n")
logger.warning(f"\nWARNING: compounds {*list(itsct_seeds_targets),} are both in seeds and targets. As such, they will be considered already reachable during community selection and will be ignored. However, their producibility can be assessed in individual and community scopes. If a host is provided, the community scope computation differs slightly and the producibility of the compound will have to be checked in the output files: indiv_scopes/seeds_in_indiv_scopes.json and the key 'individually producible' in the file producibility_targets.json. \n")
if args.cmd == "iscope":
main_iscope(network_dir, args.seeds, args.out, args.cpu)
elif args.cmd == "cscope":
Expand Down Expand Up @@ -415,7 +422,7 @@ def main_cscope(*allargs):
"""Run cscope command.
"""
instance_com, comscope = cscope(*allargs)
logger.info("\n" + str(len(comscope)) + " metabolites (excluding the seeds) reachable by the whole community/microbiota: \n")
logger.info("\n" + str(len(comscope)) + " metabolites reachable by the whole community/microbiota: \n")
logger.info('\n'.join(comscope))
#delete intermediate file
os.unlink(instance_com)
Expand Down Expand Up @@ -455,7 +462,7 @@ def main_mincom(sbmldir, seedsfiles, outdir, targets, host):
#create instance
instance = instance_community(sbmldir, seedsfiles, outdir, targets, host)
#run mincom
mincom(instance, outdir)
mincom(instance, seedsfiles, set(get_compounds(targets)), outdir)
#delete intermediate file
os.unlink(instance)

Expand Down
12 changes: 8 additions & 4 deletions metage2metabo/__main_analysis__.py
Original file line number Diff line number Diff line change
Expand Up @@ -183,7 +183,8 @@ def main():
parent_parser_s, parent_parser_n, parent_parser_t, parent_parser_m, parent_parser_o, parent_parser_q
],
description=
"Run miscoto enumeration on sbml species with seeds and targets"
"Run miscoto enumeration on sbml species with seeds and targets",
allow_abbrev=False
)
graph_parser = subparsers.add_parser(
"graph",
Expand All @@ -192,7 +193,8 @@ def main():
parent_parser_j, parent_parser_o, parent_parser_t, parent_parser_taxon, parent_parser_q,
parent_parser_level
],
description="Create the solution graph using the JSON from miscoto enumeration")
description="Create the solution graph using the JSON from miscoto enumeration",
allow_abbrev=False)
powergraph_parser = subparsers.add_parser(
"powergraph",
help="powergraph creation and visualization",
Expand All @@ -201,7 +203,8 @@ def main():
parent_parser_level, parent_parser_o
],
description=
"Compress the GMl graph of solution and create a powergraph (bbl), a website format of the powergraph and a svg of the graph (if you use the --oog option)"
"Compress the GMl graph of solution and create a powergraph (bbl), a website format of the powergraph and a svg of the graph (if you use the --oog option)",
allow_abbrev=False
)
wkf_parser = subparsers.add_parser(
"workflow",
Expand All @@ -211,7 +214,8 @@ def main():
parent_parser_taxon, parent_parser_q, parent_parser_level
],
description=
"Run the whole workflow: miscoto enumeration, graph on solution and powergraph creation"
"Run the whole workflow: miscoto enumeration, graph on solution and powergraph creation",
allow_abbrev=False
)

args = parser.parse_args()
Expand Down
6 changes: 6 additions & 0 deletions metage2metabo/m2m/community_addedvalue.py
Original file line number Diff line number Diff line change
Expand Up @@ -32,6 +32,12 @@ def addedvalue(iscope_rm, cscope_rm, out_dir):
Returns:
set: set of metabolites that can only be reached by a community
"""
logger.info('\n###############################################')
logger.info('# #')
logger.info('# Added-value of metabolic interactions #')
logger.info('# #')
logger.info('###############################################\n')

# Community targets = what can be produced only if cooperation occurs between species
newtargets = cscope_rm - iscope_rm
logger.info("\nAdded value of cooperation over individual metabolism: " +
Expand Down
91 changes: 76 additions & 15 deletions metage2metabo/m2m/community_scope.py
Original file line number Diff line number Diff line change
Expand Up @@ -18,10 +18,11 @@
import sys
import tempfile
import time
import csv

from metage2metabo import utils

from miscoto import run_scopes, run_instance
from miscoto import run_focus, run_instance, run_scopes

from shutil import copyfile

Expand All @@ -43,11 +44,30 @@ def cscope(sbmldir, seeds, out_dir, targets_file=None, host=None):
tuple: instance file (str) and community scope (set)
"""
starttime = time.time()
logger.info('\n###############################################')
logger.info('# #')
logger.info('# Metabolic potential of the community #')
logger.info('# #')
logger.info('###############################################\n')

# Create instance for community analysis
instance_com = instance_community(sbmldir, seeds, out_dir, targets_file, host)
# Run community scope
logger.info("Running whole-community metabolic scopes")
community_reachable_metabolites = comm_scope_run(instance_com, out_dir, host)
logger.info("Running whole-community metabolic scopes...")
community_reachable_metabolites, contributions_of_microbes = comm_scope_run(instance_com, out_dir, host)
# compute the reverse cscope
contrib_microbes_path = os.path.join(*[out_dir, 'community_analysis', 'contributions_of_microbes.json'])
# reverse the dict to have compounds as keys, and species as values
reverse_contrib = {}
for species in contributions_of_microbes:
for compound in contributions_of_microbes[species]['produced_in_community']:
if compound in reverse_contrib:
reverse_contrib[compound].append(species)
else:
reverse_contrib[compound] = [species]
# export the reverse cscope to json and tsv
rev_cscopes_json_path, rev_cscopes_tsv_path = reverse_cscope(contributions_of_microbes, reverse_contrib, out_dir)
logger.info('Reverse community scopes for all metabolic networks available in ' + rev_cscopes_json_path + ' and ' + rev_cscopes_tsv_path + '. They higlight the producibility of metabolites by species in the community.\n')
logger.info("--- Community scope runtime %.2f seconds ---\n" %
(time.time() - starttime))
return instance_com, community_reachable_metabolites
Expand Down Expand Up @@ -84,7 +104,7 @@ def instance_community(sbml_dir, seeds, output_dir, targets_file=None, host_mn=N
targets_file=targets_file,
output=outputfile)

logger.info("Created instance in " + instance_filepath)
logger.info("Created temporary instance file in " + instance_filepath)
return instance_filepath


Expand All @@ -98,26 +118,67 @@ def comm_scope_run(instance, output_dir, host_mn=None):
Returns:
set: microbiota scope
dict: contribution of microbes to the scope
"""
miscoto_dir = os.path.join(output_dir, 'community_analysis')
com_scopes_path = os.path.join(miscoto_dir, 'comm_scopes.json')
contrib_microbes_path = os.path.join(miscoto_dir, "contributions_of_microbes.json")

if not utils.is_valid_dir(miscoto_dir):
logger.critical('Impossible to access/create output directory')
sys.exit(1)
microbiota_scope = run_scopes(lp_instance_file=instance)

# Remove keys "host_prodtargets", "host_scope", "comhost_scope" and "host_unprodtargets" if there is no host:
if host_mn is None:
del microbiota_scope['host_prodtargets']
del microbiota_scope['host_unprodtargets']
del microbiota_scope['host_scope']
del microbiota_scope['comhost_scope']
if host_mn is not None:
scopes_results = run_scopes(lp_instance_file=instance)
microbiota_scope = scopes_results['com_scope']
contributions_of_microbes = None
logger.info('The computation of the community scope with a host is of limited functionality. It will not highlight the contribution of each microbe to the community scope. Additionally, the producibility of seeds by the microbes will not be computed. Consider running the community scope without a host (i.e. all metabolic networks, including the host, in the same directory) to get the full functionality of the community scope. \n')
with open(com_scopes_path, 'w') as dumpfile:
json.dump(microbiota_scope, dumpfile, indent=4, sort_keys=True)
logger.info(f'Community scope for all metabolic networks available in {com_scopes_path}.\n')
else:
contributions_of_microbes = run_focus(seeds_file = None, bacteria_dir = None, focus_bact=[], all_networks=True, lp_instance_file=instance)
microbiota_scope = set()
for bacteria in contributions_of_microbes:
microbiota_scope.update(contributions_of_microbes[bacteria]['produced_in_community'])
dict_comscope = {'com_scope': list(microbiota_scope)}
with open(com_scopes_path, 'w') as dumpfile:
json.dump(dict_comscope, dumpfile, indent=4, sort_keys=True)
logger.info(f'Community scope for all metabolic networks available in {com_scopes_path}')
with open(os.path.join(miscoto_dir, 'contributions_of_microbes.json'), 'w') as dumpfile:
json.dump(contributions_of_microbes, dumpfile, indent=4, sort_keys=True)
logger.info(f'Contributions of microbes to community scope available in {contrib_microbes_path}.\n')

logger.info(f'\nNumber of metabolites producible in community: {len(microbiota_scope)}. \n')

return microbiota_scope, contributions_of_microbes

def reverse_cscope(bact_contrib, reverse_dict, output_dir):
"""Reverse a scope dictionary by focusing on metabolite producers.
Args:
bact_contrib (dict): dict of bacteria contributions to community scope
reverse_dict (dict): dict of metabolite producers in community
output_dir (str): path to output directory
Returns:
(str, str): paths to the JSON and TSV outputs
"""
rev_cscopes_json_path = os.path.join(*[output_dir, 'community_analysis', 'rev_cscope.json'])
rev_cscopes_tsv_path = os.path.join(*[output_dir, 'community_analysis', 'rev_cscope.tsv'])

with open(rev_cscopes_json_path, 'w') as g:
json.dump(reverse_dict, g, indent=True, sort_keys=True)

with open(com_scopes_path, 'w') as dumpfile:
json.dump(microbiota_scope, dumpfile, indent=4)
all_compounds = [compound for compound in reverse_dict]
all_species = [species for species in bact_contrib]

logger.info('Community scopes for all metabolic networks available in ' +
com_scopes_path)
# For each species get the possibility of production of each compounds.
with open(rev_cscopes_tsv_path, 'w') as output_file:
csvwriter = csv.writer(output_file, delimiter='\t')
csvwriter.writerow(['', *all_compounds])
for species in all_species:
csvwriter.writerow([species, *[1 if species in reverse_dict[compound] else 0 for compound in all_compounds]])

return set(microbiota_scope['com_scope'])
return rev_cscopes_json_path, rev_cscopes_tsv_path
Loading

0 comments on commit 49412a1

Please sign in to comment.