diff --git a/.github/workflows/pythonpackage.yml b/.github/workflows/pythonpackage.yml index a255914..f870254 100644 --- a/.github/workflows/pythonpackage.yml +++ b/.github/workflows/pythonpackage.yml @@ -67,3 +67,4 @@ jobs: pytest test_m2m_iscope.py pytest test_m2m_metacom.py pytest test_m2m_mincom.py + pytest test_tiny_toy_metacom.py diff --git a/.gitignore b/.gitignore index f0ddec6..64db5d8 100644 --- a/.gitignore +++ b/.gitignore @@ -92,3 +92,9 @@ ENV/ # Data directories .vscode + +# m2m output directories +test/addedvalue_output/ +test/metabolic_data/toy_bact/ +tutorials/method_tutorial/output_folder/ +tutorials/method_tutorial/tutorial_output_folder/ diff --git a/README.md b/README.md index 83d8d22..228021a 100644 --- a/README.md +++ b/README.md @@ -69,8 +69,8 @@ pip install -r requirements.txt --no-cache-dir In particular, m2m relies on: * [mpwt](https://github.com/AuReMe/mpwt) to automatize metabolic network reconstruction with Pathway Tools * [padmet](https://github.com/AuReMe/padmet) to manage metabolic networks -* [menetools](https://github.com/cfrioux/MeneTools) to analyze individual metabolic capabilities using logic programming -* [miscoto](https://github.com/cfrioux/miscoto) to analyze collective metabolic capabilities and select communities within microbiota using logic programming +* [menetools](https://github.com/cfrioux/MeneTools) to analyze individual metabolic capabilities using logic programming. **Requires MeneTools > 3.4** +* [miscoto](https://github.com/cfrioux/miscoto) to analyze collective metabolic capabilities and select communities within microbiota using logic programming. 
**Requires MiSCoTo > 3.2** Also, m2m_analysis relies on other packages: * [networkx](https://github.com/networkx/networkx) to create graph from miscoto results diff --git a/docs/index.rst b/docs/index.rst index 4be7bf6..b006865 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -53,7 +53,7 @@ M2M relies on packages that can also be used independantly with more features an Authors ======= -`Clémence Frioux `__ and `Arnaud Belcour `__, Univ Rennes, Inria, CNRS, IRISA, Rennes, France. +`Clémence Frioux `__ and `Arnaud Belcour `__, Univ Bordeaux, Inria, INRAE, Bordeaux, France and Univ Rennes, Inria, CNRS, IRISA, Rennes, France. Acknowledgement =============== diff --git a/docs/m2m_analysis.rst b/docs/m2m_analysis.rst index 336c0ec..e55be7f 100644 --- a/docs/m2m_analysis.rst +++ b/docs/m2m_analysis.rst @@ -562,3 +562,8 @@ If you have used the ``--taxon``, two new files have been created: ``taxonomy_species.tsv``: it is a tsv file with 9 columns. The row corresponds to the species in your community. For each species, you will have its name in your dataset, its taxID (from taxon_id.tsv), an attributed taxonomic name (used in the power graph), then the taxonomic classification: phylum, class, order, family, genus and species. ``taxon_tree.txt``: the topology of the taxonomic classification of your species according to the NCBI taxonomy. 
+ +More information +----------------- + +Take a look at the complete tutorial in the `Github repository `__ diff --git a/docs/tutorial.rst b/docs/tutorial.rst index 9aa48ab..063e5c3 100644 --- a/docs/tutorial.rst +++ b/docs/tutorial.rst @@ -66,6 +66,11 @@ Please go check the `documentation of mpwt 13 - GCA_003437885 + GCA_003437055 + GCA_003437595 + GCA_003437295 + GCA_003437345 GCA_003437715 + GCA_003437815 GCA_003437905 + GCA_003437375 + GCA_003437195 + GCA_003438055 + GCA_003437885 GCA_003437665 GCA_003437255 - GCA_003437295 - GCA_003437815 - GCA_003437945 - GCA_003438055 - GCA_003437195 - GCA_003437375 - GCA_003437055 - GCA_003437595 ######### Key species: Union of minimal communities ######### # Bacteria occurring in at least one minimal community enabling the producibility of the target metabolites given as inputs Number of key species => 17 - GCA_003437885 + GCA_003437055 + GCA_003437325 + GCA_003437595 + GCA_003437345 GCA_003437715 GCA_003437905 - GCA_003437665 - GCA_003437295 + GCA_003437945 GCA_003438055 - GCA_003437195 - GCA_003437175 - GCA_003437345 - GCA_003437055 - GCA_003437595 - GCA_003437785 GCA_003437255 + GCA_003437295 + GCA_003437785 GCA_003437815 - GCA_003437945 + GCA_003437175 GCA_003437375 - GCA_003437325 + GCA_003437195 + GCA_003437885 + GCA_003437665 ######### Essential symbionts: Intersection of minimal communities ######### # Bacteria occurring in ALL minimal communities enabling the producibility of the target metabolites given as inputs Number of essential symbionts => 12 - GCA_003437885 + GCA_003437055 + GCA_003437595 + GCA_003437295 GCA_003437715 + GCA_003437815 GCA_003437905 + GCA_003437375 + GCA_003437195 + GCA_003438055 + GCA_003437885 GCA_003437665 GCA_003437255 - GCA_003437295 - GCA_003437815 - GCA_003438055 - GCA_003437195 - GCA_003437375 - GCA_003437055 - GCA_003437595 ######### Alternative symbionts: Difference between Union and Intersection ######### # Bacteria occurring in at least one minimal community but not all minimal 
communities enabling the producibility of the target metabolites given as inputs Number of alternative symbionts => 5 - GCA_003437785 - GCA_003437945 GCA_003437345 - GCA_003437175 GCA_003437325 + GCA_003437945 + GCA_003437785 + GCA_003437175 - --- Mincom runtime 3.70 seconds --- - - --- Total runtime 6.50 seconds --- - --- Logs written in output_directory//m2m_mincom.log --- + --- Mincom runtime 5.61 seconds --- + --- Total runtime 7.72 seconds --- + --- Logs written in output_directory/m2m_mincom.log --- This output gives the result of minimal community selection. It means that for producing the 120 metabolic targets, a minimum of 13 bacteria out of the 17 is required. One example of such minimal community is given. In addition, the whole space of solution is studied. All bacteria (17) occur in at least one minimal community (key species). Finally, the intersection gives the following information: a set of 12 bacteria occurs in each minimal community. This means that these 12 bacteria are needed in any case (essential symbionts), and that any of the remaining 5 bacteria (alternative symbionts) can complete the missing function(s). * files outputs - * As for other commands, a json file with the results is produced in ``output_directory/community_analysis/comm_scopes.json``, together with logs at the root of the results directory. + * As for other commands, a json file with the results is produced in ``output_directory/community_analysis/mincom.json``, together with logs at the root of the results directory. m2m metacom ------------ @@ -674,17 +738,21 @@ Optional arguments: .. 
code :: output_directory/ - ├── m2m_metacom.log - ├── producibility_targets.json + ├── m2m_mincom.log ├── community_analysis - │ ├── addedvalue.json + │   ├── addedvalue.json │   ├── comm_scopes.json + │   ├── contributions_of_microbes.json │   ├── mincom.json - │   ├── targets.sbml + │   ├── rev_cscope.json + │   ├── rev_cscope.tsv + │   └── targets.sbml ├── indiv_scopes │   └── indiv_scopes.json │   └── rev_iscope.json │   └── rev_iscope.tsv + │   └── seeds_in_indiv_scopes.json + m2m workflow and m2m test ------------------------- @@ -734,87 +802,100 @@ Optional arguments: .. code :: - Uncompressing test data to output_directory - Launching workflow on test data - ######### Running metabolic network reconstruction with Pathway Tools ######### - ~~~~~~~~~~Creation of input data from Genbank/GFF/PF~~~~~~~~~~ - Checking inputs for GCA_003433675: no missing files. - Checking inputs for GCA_003433665: no missing files. - ----------End of creation of input data from Genbank/GFF/PF: 0.02s---------- - ~~~~~~~~~~Inference on the data~~~~~~~~~~ - pathway-tools -no-web-cel-overview -no-cel-overview -no-patch-download -disable-metadata-saving -nologfile -patho output_directory/workflow_genomes/GCA_003433665/ - pathway-tools -no-web-cel-overview -no-cel-overview -no-patch-download -disable-metadata-saving -nologfile -patho output_directory/workflow_genomes/GCA_003433675/ - ~~~~~~~~~~Check inference~~~~~~~~~~ - No log directory, it will be created. + WARNING: compounds ('M_MANNITOL_c',) are both in seeds and targets. As such, they will be considered already reachable during community selection and will be ignored. However, their producibility can be assessed in individual and community scopes. If a host is provided, the community scope computation differs slightly and the producibility of the compound will have to be checked in the output files: indiv_scopes/seeds_in_indiv_scopes.json and the key 'individually producible' in the file producibility_targets.json. 
- 2 builds have passed! - ----------End of PathoLogic inference: 403.90s---------- - ~~~~~~~~~~Creation of the .dat files~~~~~~~~~~ - pathway-tools -no-patch-download -disable-metadata-saving -nologfile -load output_directory/workflow_genomes/GCA_003433675/dat_creation.lisp - pathway-tools -no-patch-download -disable-metadata-saving -nologfile -load output_directory/workflow_genomes/GCA_003433665/dat_creation.lisp - ~~~~~~~~~~Check .dat~~~~~~~~~~ - gca_003433675cyc: 23 out of 23 dat files created. - gca_003433665cyc: 23 out of 23 dat files created. - ----------End of dat files creation: 163.21s---------- - ~~~~~~~~~~End of Pathway Tools~~~~~~~~~~ - ~~~~~~~~~~Moving result files~~~~~~~~~~ - ----------End of moving fimes: 0.12s---------- - ----------mpwt has finished in 567.29s! Thank you for using it. - ######### Creating SBML files ######### - ######### Stats GSMN reconstruction ######### - Number of genomes: 2 - Number of reactions in all GSMN: 2026 - Number of compounds in all GSMN: 2095 - Average reactions per GSMN: 1437.00(+/- 678.82) - Average compounds per GSMN: 1560.00(+/- 615.18) - Average genes per GSMN: 893.00(+/- 475.18) - Average pathways per GSMN: 257.00(+/- 134.35) - Percentage of reactions associated with genes: 79.90(+/- 3.20) - --- Recon runtime 574.26 seconds --- + ############################################### + # # + # Individual metabolic potentials # + # # + ############################################### - ######### Running individual metabolic scopes ######### - Individual scopes for all metabolic networks available in output_directory/indiv_scopes/indiv_scopes.json - 2 metabolic models considered. - 123 metabolites in core reachable by all organisms (intersection) + Individual scopes for all metabolic networks available in output_directory/indiv_scopes/indiv_scopes.json. The scopes have been filtered a way that if a seed is in a scope, it means the corresponding species is predicted to be able to produce it. 
+ + Information regarding the producibility of seeds, and the possible absence of seeds in some metabolic networks is stored in output_directory/indiv_scopes/seeds_in_indiv_scopes.json. + + 17 metabolic models considered. + + 50 metabolites in core reachable by all organisms (intersection) ... - 325 metabolites reachable by individual organisms altogether (union), among which 26 seeds (growth medium) + 576 metabolites reachable by individual organisms altogether (union), among which 44 metabolites that are also part of the seeds (growth medium) ... - intersection of scope 123 - union of scope 325 - max metabolites in scope 321 - min metabolites in scope 127 - average number of metabolites in scope 224.00 (+/- 137.18) + Summary: + - intersection of scopes 50 + - union of scopes 576 + - max metabolites in scopes 422 + - min metabolites in scopes 116 + - average number of metabolites in scopes 239.06 (+/- 89.51) + Analysis of functional redundancy (producers of all metabolites) is computed as a dictionary in output_directory/indiv_scopes/rev_iscope.json and as a matrix in output_directory/indiv_scopes/rev_iscope.tsv. - --- Indiv scopes runtime 1.21 seconds --- + --- Indiv scopes runtime 25.26 seconds --- + + + ############################################### + # # + # Metabolic potential of the community # + # # + ############################################### ######### Creating metabolic instance for the whole community ######### - Created instance in /shared/programs/metage2metabo/test/output_directory/community_analysis/miscoto_17f5ygw7.lp - Running whole-community metabolic scopes - Community scopes for all metabolic networks available in output_directory/community_analysis/comm_scopes.json - --- Community scope runtime 0.85 seconds --- + Created temporary instance file in ..metage2metabo/test/metabolic_data/output_directory/community_analysis/miscoto_h00llsyy.lp + Running whole-community metabolic scopes... 
+ Community scope for all metabolic networks available in output_directory/community_analysis/comm_scopes.json + Contributions of microbes to community scope available in output_directory/community_analysis/contributions_of_microbes.json. - Added value of cooperation over individual metabolism: 33 newly reachable metabolites: + Number of metabolites producible in community: 698. - ... + Reverse community scopes for all metabolic networks available in output_directory/community_analysis/rev_cscope.json and output_directory/community_analysis/rev_cscope.tsv. They higlight the producibility of metabolites by species in the community. + + --- Community scope runtime 6.05 seconds --- + + + ############################################### + # # + # Added-value of metabolic interactions # + # # + ############################################### + Added value of cooperation over individual metabolism: 122 newly reachable metabolites: + + ... + Added-value of cooperation written in output_directory/community_analysis/addedvalue.json - Target file created with the addedvalue targets in: output_directory/community_analysis/targets.sbml - Setting 33 compounds as targets + The following 1 targets are individually reachable by at least one organism: + + M_MANNITOL_c + + The following 119 targets are additionally reachable through putative cross-feeding events: + + ... + + Target file created with the targets provided by the user in: output_directory/community_analysis/targets.sbml + Setting 120 compounds as targets. + + + ############################################### + # # + # Minimal community selection # + # # + ############################################### + + WARNING: The following seeds are among the targets: {'M_MANNITOL_c'}. They will not be considered as targets during the computation of minimal communities: they will be considered as already reachable according to the network expansion definition. 
Running minimal community selection + ...metage2metabo/lib/python3.10/site-packages/miscoto/encodings/community_soup.lp - In the initial and minimal communities 33 targets are producible and 0 remain unproducible. + In the initial and minimal communities 120 targets are producible and 0 remain unproducible. - 33 producible targets: + 120 producible targets: ... 0 still unproducible targets: @@ -824,33 +905,33 @@ Optional arguments: ######### One minimal community ######### # One minimal community enabling the producibility of the target metabolites given as inputs - Minimal number of bacteria in communities => 2 + Minimal number of bacteria in communities => 13 + + ... - GCA_003433665 - GCA_003433675 ######### Key species: Union of minimal communities ######### # Bacteria occurring in at least one minimal community enabling the producibility of the target metabolites given as inputs - Number of key species => 2 + Number of key species => 17 + + ... - GCA_003433665 - GCA_003433675 ######### Essential symbionts: Intersection of minimal communities ######### # Bacteria occurring in ALL minimal communities enabling the producibility of the target metabolites given as inputs - Number of essential symbionts => 2 + Number of essential symbionts => 12 + + ... - GCA_003433665 - GCA_003433675 ######### Alternative symbionts: Difference between Union and Intersection ######### # Bacteria occurring in at least one minimal community but not all minimal communities enabling the producibility of the target metabolites given as inputs - Number of alternative symbionts => 0 - + Number of alternative symbionts => 5 + ... 
- --- Mincom runtime 1.36 seconds --- + --- Mincom runtime 8.76 seconds --- Targets producibility are available at output_directory/producibility_targets.json - --- Total runtime 577.84 seconds --- - --- Logs written in output_directory/m2m_test.log --- + --- Total runtime 40.18 seconds --- + --- Logs written in output_directory/m2m_metacom.log --- * files outputs @@ -862,14 +943,18 @@ Optional arguments: ├── m2m_workflow.log ├── producibility_targets.json ├── community_analysis - │ ├── addedvalue.json + │   ├── addedvalue.json │   ├── comm_scopes.json + │   ├── contributions_of_microbes.json │   ├── mincom.json - │   ├── targets.sbml + │   ├── rev_cscope.json + │   ├── rev_cscope.tsv + │   └── targets.sbml ├── indiv_scopes │   └── indiv_scopes.json │   └── rev_iscope.json │   └── rev_iscope.tsv + │   └── seeds_in_indiv_scopes.json ├── padmet │   ├── GCA_003433665.padmet │   └── GCA_003433675.padmet @@ -929,6 +1014,8 @@ Optional arguments: These files are the same as the ones presented in the previous commands: metabolic networks reconstructions (Pathway Tools data, SBML), individual and collective scopes, minimal community selection. + ``m2m metacom`` runs the whole workflow except the reconstruction of metabolic networks. We advise to use this command to explore the metabolism of the microbial community when you already have metabolic networks. + Including a host in the picture ------------------------------- @@ -946,3 +1033,9 @@ Then back to the effect of the host in the other commands. More generally, for more information and analysis on the usage of hosts in addition to the microbiota, we refer the interested user to the `Miscoto `_ `Python package `__, on which m2m relies. Miscoto can be used as a standalone package for such analyses, with additional options, such as the identification of putative exchanges among the minimal communities. 
+ + +More information +----------------- + +Take a look at the complete tutorial in the `Github repository `__ \ No newline at end of file diff --git a/metage2metabo/__main__.py b/metage2metabo/__main__.py index 6b5588c..2935d27 100644 --- a/metage2metabo/__main__.py +++ b/metage2metabo/__main__.py @@ -30,6 +30,7 @@ from metage2metabo.m2m.community_addedvalue import addedvalue from metage2metabo.m2m.minimal_community import mincom from metage2metabo.m2m.m2m_workflow import run_workflow, metacom_analysis +from metage2metabo.sbml_management import get_compounds from metage2metabo import sbml_management, utils @@ -209,7 +210,8 @@ def main(): parent_parser_xml ], description= - "Run metabolic network reconstruction for each annotated genome of the input directory, using Pathway Tools" + "Run metabolic network reconstruction for each annotated genome of the input directory, using Pathway Tools", + allow_abbrev=False ) indivscope_parser = subparsers.add_parser( "iscope", @@ -218,7 +220,8 @@ def main(): parent_parser_n, parent_parser_s, parent_parser_o, parent_parser_q, parent_parser_c ], description= - "Compute individual scopes (reachable metabolites from seeds) for each metabolic network of the input directory" + "Compute individual scopes (reachable metabolites from seeds) for each metabolic network of the input directory", + allow_abbrev=False ) comscope_parser = subparsers.add_parser( "cscope", @@ -227,7 +230,7 @@ def main(): parent_parser_n, parent_parser_s, parent_parser_o, parent_parser_m, parent_parser_q, parent_parser_t_optional ], - description="Compute the community scope of all metabolic networks") + description="Compute the community scope of all metabolic networks", allow_abbrev=False) added_value_parser = subparsers.add_parser( "addedvalue", help="added value of microbiota's metabolism over individual's", @@ -236,7 +239,8 @@ def main(): parent_parser_q ], description= - "Compute metabolites that are reachable by the community/microbiota and not by 
individual organisms" + "Compute metabolites that are reachable by the community/microbiota and not by individual organisms", + allow_abbrev=False ) mincom_parser = subparsers.add_parser( "mincom", @@ -246,13 +250,15 @@ def main(): parent_parser_q, parent_parser_t_required ], description= - "Select minimal-size community to make reachable a set of metabolites") + "Select minimal-size community to make reachable a set of metabolites", + allow_abbrev=False) seeds_parser = subparsers.add_parser( "seeds", help="creation of seeds SBML file", parents=[parent_parser_o, parent_parser_q], description= - "Create a SBML file starting for a simple text file with metabolic compounds identifiers" + "Create a SBML file starting from a simple text file with metabolic compounds identifiers", + allow_abbrev=False ) seeds_parser.add_argument( "--metabolites", @@ -268,7 +274,8 @@ def main(): parent_parser_t_optional, parent_parser_cl, parent_parser_xml ], description= - "Run the whole workflow: metabolic network reconstruction, individual and community scope analysis and community selection" + "Run the whole workflow: metabolic network reconstruction, individual and community scope analysis and community selection", + allow_abbrev=False ) metacom_parser = subparsers.add_parser( "metacom", @@ -278,7 +285,8 @@ def main(): parent_parser_t_optional, parent_parser_q, parent_parser_c ], description= - "Run the whole metabolism community analysis: individual and community scope analysis and community selection" + "Run the whole metabolism community analysis: individual and community scope analysis and community selection", + allow_abbrev=False ) test_parser = subparsers.add_parser( "test", @@ -286,8 +294,7 @@ def main(): parents=[ parent_parser_q, parent_parser_c, parent_parser_o ], - description= - "Test the whole workflow on a data sample") + description="Test the whole workflow on a data sample", allow_abbrev=False) args = parser.parse_args() @@ -354,7 +361,7 @@ def main(): # test if some 
targets are seeds itsct_seeds_targets = sbml_management.compare_seeds_and_targets(args.seeds, args.targets) if itsct_seeds_targets != set(): - logger.warning(f"\nWARNING: compounds {*list(itsct_seeds_targets),} are both in seeds and targets. Since they are in seeds, they will be in each organism's individual producibility scope (iscope), but not appear in the community scope (cscope). To be certain that they are produced (through an activable reaction and not just because they are seeds), check the output file: indiv_scopes/indiv_produced_seeds.json and the key 'individually producible' in the file producibility_targets.json.\n") + logger.warning(f"\nWARNING: compounds {*list(itsct_seeds_targets),} are both in seeds and targets. As such, they will be considered already reachable during community selection and will be ignored. However, their producibility can be assessed in individual and community scopes. If a host is provided, the community scope computation differs slightly and the producibility of the compound will have to be checked in the output files: indiv_scopes/seeds_in_indiv_scopes.json and the key 'individually producible' in the file producibility_targets.json. \n") if args.cmd == "iscope": main_iscope(network_dir, args.seeds, args.out, args.cpu) elif args.cmd == "cscope": @@ -415,7 +422,7 @@ def main_cscope(*allargs): """Run cscope command. 
""" instance_com, comscope = cscope(*allargs) - logger.info("\n" + str(len(comscope)) + " metabolites (excluding the seeds) reachable by the whole community/microbiota: \n") + logger.info("\n" + str(len(comscope)) + " metabolites reachable by the whole community/microbiota: \n") logger.info('\n'.join(comscope)) #delete intermediate file os.unlink(instance_com) @@ -455,7 +462,7 @@ def main_mincom(sbmldir, seedsfiles, outdir, targets, host): #create instance instance = instance_community(sbmldir, seedsfiles, outdir, targets, host) #run mincom - mincom(instance, outdir) + mincom(instance, seedsfiles, set(get_compounds(targets)), outdir) #delete intermediate file os.unlink(instance) diff --git a/metage2metabo/__main_analysis__.py b/metage2metabo/__main_analysis__.py index 458b398..10d8a58 100644 --- a/metage2metabo/__main_analysis__.py +++ b/metage2metabo/__main_analysis__.py @@ -183,7 +183,8 @@ def main(): parent_parser_s, parent_parser_n, parent_parser_t, parent_parser_m, parent_parser_o, parent_parser_q ], description= - "Run miscoto enumeration on sbml species with seeds and targets" + "Run miscoto enumeration on sbml species with seeds and targets", + allow_abbrev=False ) graph_parser = subparsers.add_parser( "graph", @@ -192,7 +193,8 @@ def main(): parent_parser_j, parent_parser_o, parent_parser_t, parent_parser_taxon, parent_parser_q, parent_parser_level ], - description="Create the solution graph using the JSON from miscoto enumeration") + description="Create the solution graph using the JSON from miscoto enumeration", + allow_abbrev=False) powergraph_parser = subparsers.add_parser( "powergraph", help="powergraph creation and visualization", @@ -201,7 +203,8 @@ def main(): parent_parser_level, parent_parser_o ], description= - "Compress the GMl graph of solution and create a powergraph (bbl), a website format of the powergraph and a svg of the graph (if you use the --oog option)" + "Compress the GMl graph of solution and create a powergraph (bbl), a website 
format of the powergraph and a svg of the graph (if you use the --oog option)", + allow_abbrev=False ) wkf_parser = subparsers.add_parser( "workflow", @@ -211,7 +214,8 @@ def main(): parent_parser_taxon, parent_parser_q, parent_parser_level ], description= - "Run the whole workflow: miscoto enumeration, graph on solution and powergraph creation" + "Run the whole workflow: miscoto enumeration, graph on solution and powergraph creation", + allow_abbrev=False ) args = parser.parse_args() diff --git a/metage2metabo/m2m/community_addedvalue.py b/metage2metabo/m2m/community_addedvalue.py index cc7af2e..b07081b 100644 --- a/metage2metabo/m2m/community_addedvalue.py +++ b/metage2metabo/m2m/community_addedvalue.py @@ -32,6 +32,12 @@ def addedvalue(iscope_rm, cscope_rm, out_dir): Returns: set: set of metabolites that can only be reached by a community """ + logger.info('\n###############################################') + logger.info('# #') + logger.info('# Added-value of metabolic interactions #') + logger.info('# #') + logger.info('###############################################\n') + # Community targets = what can be produced only if cooperation occurs between species newtargets = cscope_rm - iscope_rm logger.info("\nAdded value of cooperation over individual metabolism: " + diff --git a/metage2metabo/m2m/community_scope.py b/metage2metabo/m2m/community_scope.py index e55bd1b..ce3305e 100644 --- a/metage2metabo/m2m/community_scope.py +++ b/metage2metabo/m2m/community_scope.py @@ -18,10 +18,11 @@ import sys import tempfile import time +import csv from metage2metabo import utils -from miscoto import run_scopes, run_instance +from miscoto import run_focus, run_instance, run_scopes from shutil import copyfile @@ -43,11 +44,30 @@ def cscope(sbmldir, seeds, out_dir, targets_file=None, host=None): tuple: instance file (str) and community scope (set) """ starttime = time.time() + logger.info('\n###############################################') + logger.info('# #') + 
logger.info('# Metabolic potential of the community #') + logger.info('# #') + logger.info('###############################################\n') + # Create instance for community analysis instance_com = instance_community(sbmldir, seeds, out_dir, targets_file, host) # Run community scope - logger.info("Running whole-community metabolic scopes") - community_reachable_metabolites = comm_scope_run(instance_com, out_dir, host) + logger.info("Running whole-community metabolic scopes...") + community_reachable_metabolites, contributions_of_microbes = comm_scope_run(instance_com, out_dir, host) + # compute the reverse cscope + contrib_microbes_path = os.path.join(*[out_dir, 'community_analysis', 'contributions_of_microbes.json']) + # reverse the dict to have compounds as keys, and species as values + reverse_contrib = {} + for species in contributions_of_microbes: + for compound in contributions_of_microbes[species]['produced_in_community']: + if compound in reverse_contrib: + reverse_contrib[compound].append(species) + else: + reverse_contrib[compound] = [species] + # export the reverse cscope to json and tsv + rev_cscopes_json_path, rev_cscopes_tsv_path = reverse_cscope(contributions_of_microbes, reverse_contrib, out_dir) + logger.info('Reverse community scopes for all metabolic networks available in ' + rev_cscopes_json_path + ' and ' + rev_cscopes_tsv_path + '. 
They highlight the producibility of metabolites by species in the community.\n') logger.info("--- Community scope runtime %.2f seconds ---\n" % (time.time() - starttime)) return instance_com, community_reachable_metabolites @@ -84,7 +104,7 @@ def instance_community(sbml_dir, seeds, output_dir, targets_file=None, host_mn=N targets_file=targets_file, output=outputfile) - logger.info("Created instance in " + instance_filepath) + logger.info("Created temporary instance file in " + instance_filepath) return instance_filepath @@ -98,26 +118,67 @@ def comm_scope_run(instance, output_dir, host_mn=None): Returns: set: microbiota scope + dict: contribution of microbes to the scope """ miscoto_dir = os.path.join(output_dir, 'community_analysis') com_scopes_path = os.path.join(miscoto_dir, 'comm_scopes.json') + contrib_microbes_path = os.path.join(miscoto_dir, "contributions_of_microbes.json") if not utils.is_valid_dir(miscoto_dir): logger.critical('Impossible to access/create output directory') sys.exit(1) - microbiota_scope = run_scopes(lp_instance_file=instance) # Remove keys "host_prodtargets", "host_scope", "comhost_scope" and "host_unprodtargets" if there is no host: - if host_mn is None: - del microbiota_scope['host_prodtargets'] - del microbiota_scope['host_unprodtargets'] - del microbiota_scope['host_scope'] - del microbiota_scope['comhost_scope'] + if host_mn is not None: + scopes_results = run_scopes(lp_instance_file=instance) + microbiota_scope = scopes_results['com_scope'] + contributions_of_microbes = None + logger.info('The computation of the community scope with a host is of limited functionality. It will not highlight the contribution of each microbe to the community scope. Additionally, the producibility of seeds by the microbes will not be computed. Consider running the community scope without a host (i.e. all metabolic networks, including the host, in the same directory) to get the full functionality of the community scope. 
\n') + with open(com_scopes_path, 'w') as dumpfile: + json.dump(microbiota_scope, dumpfile, indent=4, sort_keys=True) + logger.info(f'Community scope for all metabolic networks available in {com_scopes_path}.\n') + else: + contributions_of_microbes = run_focus(seeds_file = None, bacteria_dir = None, focus_bact=[], all_networks=True, lp_instance_file=instance) + microbiota_scope = set() + for bacteria in contributions_of_microbes: + microbiota_scope.update(contributions_of_microbes[bacteria]['produced_in_community']) + dict_comscope = {'com_scope': list(microbiota_scope)} + with open(com_scopes_path, 'w') as dumpfile: + json.dump(dict_comscope, dumpfile, indent=4, sort_keys=True) + logger.info(f'Community scope for all metabolic networks available in {com_scopes_path}') + with open(os.path.join(miscoto_dir, 'contributions_of_microbes.json'), 'w') as dumpfile: + json.dump(contributions_of_microbes, dumpfile, indent=4, sort_keys=True) + logger.info(f'Contributions of microbes to community scope available in {contrib_microbes_path}.\n') + + logger.info(f'\nNumber of metabolites producible in community: {len(microbiota_scope)}. \n') + + return microbiota_scope, contributions_of_microbes + +def reverse_cscope(bact_contrib, reverse_dict, output_dir): + """Reverse a scope dictionary by focusing on metabolite producers. 
+ + Args: + bact_contrib (dict): dict of bacteria contributions to community scope + reverse_dict (dict): dict of metabolite producers in community + output_dir (str): path to output directory + + Returns: + (str, str): paths to the JSON and TSV outputs + """ + rev_cscopes_json_path = os.path.join(*[output_dir, 'community_analysis', 'rev_cscope.json']) + rev_cscopes_tsv_path = os.path.join(*[output_dir, 'community_analysis', 'rev_cscope.tsv']) + + with open(rev_cscopes_json_path, 'w') as g: + json.dump(reverse_dict, g, indent=True, sort_keys=True) - with open(com_scopes_path, 'w') as dumpfile: - json.dump(microbiota_scope, dumpfile, indent=4) + all_compounds = [compound for compound in reverse_dict] + all_species = [species for species in bact_contrib] - logger.info('Community scopes for all metabolic networks available in ' + - com_scopes_path) + # For each species, record whether it can produce each compound. + with open(rev_cscopes_tsv_path, 'w') as output_file: + csvwriter = csv.writer(output_file, delimiter='\t') + csvwriter.writerow(['', *all_compounds]) + for species in all_species: + csvwriter.writerow([species, *[1 if species in reverse_dict[compound] else 0 for compound in all_compounds]]) - return set(microbiota_scope['com_scope']) + return rev_cscopes_json_path, rev_cscopes_tsv_path diff --git a/metage2metabo/m2m/individual_scope.py b/metage2metabo/m2m/individual_scope.py index 8368d9e..3465a68 100644 --- a/metage2metabo/m2m/individual_scope.py +++ b/metage2metabo/m2m/individual_scope.py @@ -52,13 +52,14 @@ def iscope(sbmldir, seeds, out_dir, cpu_number=1): name for name in os.listdir(sbmldir) if os.path.isfile(os.path.join(sbmldir, name)) and utils.get_extension(os.path.join(sbmldir, name)).lower() in ["xml", "sbml"] ]) > 1: - scope_json = indiv_scope_run(sbmldir, seeds, out_dir, cpu_number) - logger.info("Individual scopes for all metabolic networks available in " + scope_json) + scope_dict, seeds_dict, scope_json, seeds_status_path = 
indiv_scope_run(sbmldir, seeds, out_dir, cpu_number) + logger.info(f'\nIndividual scopes for all metabolic networks available in {scope_json}. The scopes have been filtered in such a way that if a seed is in a scope, it means the corresponding species is predicted to be able to produce it.') + logger.info(f'\nInformation regarding the producibility of seeds, and the possible absence of seeds in some metabolic networks is stored in {seeds_status_path}.\n') # Analyze the individual scopes results (json file) - reachable_metabolites_union = analyze_indiv_scope(scope_json, seeds) + reachable_metabolites_union = analyze_indiv_scope(scope_dict, seeds_dict, seeds) # Compute the reverse iscopes (who produces each metabolite) - reverse_scope_json, reverse_scope_tsv = reverse_scope(scope_json, out_dir) - logger.info(f"Analysis of functional redundancy (producers of all metabolites) is computed as a dictionary in {reverse_scope_json} and as a matrix in {reverse_scope_tsv}.") + reverse_scope_json, reverse_scope_tsv = reverse_scope(scope_dict, out_dir) + logger.info(f"\nAnalysis of functional redundancy (producers of all metabolites) is computed as a dictionary in {reverse_scope_json} and as a matrix in {reverse_scope_tsv}.") logger.info("--- Indiv scopes runtime %.2f seconds ---\n" % (time.time() - starttime)) return reachable_metabolites_union @@ -79,11 +80,15 @@ def indiv_scope_run(sbml_dir, seeds, output_dir, cpu_number=1): Returns: str: path to output file for scope from Menetools analysis """ - logger.info('######### Running individual metabolic scopes #########') + logger.info('\n###############################################') + logger.info('# #') + logger.info('# Individual metabolic potentials #') + logger.info('# #') + logger.info('###############################################\n') menetools_dir = os.path.join(output_dir, 'indiv_scopes') indiv_scopes_path = os.path.join(menetools_dir, 'indiv_scopes.json') - produced_seeds_path = os.path.join(menetools_dir,
'indiv_produced_seeds.json') + produced_seeds_path = os.path.join(menetools_dir, 'seeds_in_indiv_scopes.json') if not utils.is_valid_dir(menetools_dir): logger.critical('Impossible to access/create output directory') @@ -96,6 +101,8 @@ def indiv_scope_run(sbml_dir, seeds, output_dir, cpu_number=1): ] all_scopes = {} all_produced_seeds = {} + all_absent_seeds = {} + all_non_produced_seeds = {} multiprocessing_indiv_scopes = [] for f in all_files: bname = utils.get_basename(f) @@ -115,17 +122,32 @@ def indiv_scope_run(sbml_dir, seeds, output_dir, cpu_number=1): menescope_results = result[2] all_scopes[bname] = menescope_results['scope'] all_produced_seeds[bname] = menescope_results['produced_seeds'] + all_absent_seeds[bname] = menescope_results['absent_seeds'] + all_non_produced_seeds[bname] = menescope_results['non_produced_seeds'] menescope_pool.close() menescope_pool.join() + seeds_status = {} + seeds_status['individually_producible_seeds'] = all_produced_seeds + seeds_status['seeds_absent_in_metabolic_network'] = all_absent_seeds + seeds_status['individually_non_producible_seeds'] = all_non_produced_seeds + + # Some seeds might be producible by some metabolic networks. By default, menescope include all seeds in the scope, regardless of their real producibility by the network. seed_status_dict holds this information. + # We'll remove them here + # remove from each scope the seeds that are not producible by the corresponding species. 
+ for species in all_scopes: + # non producible seeds are in seeds_status['individually_non_producible_seeds'][species] + all_scopes[species] = list(set(all_scopes[species]) - set(seeds_status['individually_non_producible_seeds'][species])) + + with open(indiv_scopes_path, 'w') as dumpfile: json.dump(all_scopes, dumpfile, indent=4) with open(produced_seeds_path, 'w') as dumpfile: - json.dump(all_produced_seeds, dumpfile, indent=4) + json.dump(seeds_status, dumpfile, indent=4, sort_keys=True) - return indiv_scopes_path + return all_scopes, seeds_status, indiv_scopes_path, produced_seeds_path def indiv_scope_on_species(sbml_path, bname, seeds_path): @@ -155,33 +177,33 @@ def indiv_scope_on_species(sbml_path, bname, seeds_path): return [error, bname, menescope_results] -def analyze_indiv_scope(jsonfile, seeds): - """Analyze the output of Menescope, stored in a json +def analyze_indiv_scope(scope_dict, seeds_status_dict, seeds): + """Analyze the output of Menescope, stored in two dictionaries Args: - jsonfile (str): output of menescope + scope_dict (dict): output of all menescope runs + seeds_status_dict (dict): production status of seeds in all menescope runs seeds (str): SBML seeds file Returns: set: union of all the individual scopes """ - with open(jsonfile) as json_data: - d = json.load(json_data) - if not d: - logger.critical('Json file is empty. Individual scopes calculation failed.
Please fill an issue on Github') - sys.exit(1) - d_set = {} + scope_dict_set = {} + individually_producible_seeds_set = {} + + for elem in scope_dict: + scope_dict_set[elem] = set(scope_dict[elem]) - for elem in d: - d_set[elem] = set(d[elem]) + for elem in seeds_status_dict['individually_producible_seeds']: + individually_producible_seeds_set[elem] = set(seeds_status_dict['individually_producible_seeds'][elem]) try: seed_metabolites = readSBMLspecies_clyngor(seeds, 'seeds') except FileNotFoundError: - logger.critical('File not found: '+seeds) + logger.critical('File not found: '+ seeds) sys.exit(1) except etree.ParseError: - logger.critical('Invalid syntax in SBML file: '+seeds) + logger.critical(f'Invalid syntax in SBML file: {seeds}') sys.exit(1) except: traceback_str = traceback.format_exc() @@ -191,29 +213,31 @@ def analyze_indiv_scope(jsonfile, seeds): logger.critical('---------------Something went wrong running Menetools on " + seeds + "---------------') sys.exit(1) - logger.info('%i metabolic models considered.' %(len(d_set))) - intersection_scope = set.intersection(*list(d_set.values())) + logger.info('%i metabolic models considered.' 
%(len(scope_dict_set))) + intersection_scope = set.intersection(*list(scope_dict_set.values())) logger.info('\n' + str(len(intersection_scope)) + ' metabolites in core reachable by all organisms (intersection) \n') logger.info("\n".join(intersection_scope)) - union_scope = set.union(*list(d_set.values())) - logger.info('\n' + str(len(union_scope)) + ' metabolites reachable by individual organisms altogether (union), among which ' + str(len(seed_metabolites)) + ' seeds (growth medium) \n') + union_scope = set.union(*list(scope_dict_set.values())) + union_producible_seeds = set.union(*list(individually_producible_seeds_set.values())) + logger.info('\n' + str(len(union_scope)) + ' metabolites reachable by individual organisms altogether (union), among which ' + str(len(union_producible_seeds)) + ' metabolites that are also part of the seeds (growth medium) \n') logger.info("\n".join(union_scope)) - len_scope = [len(d[elem]) for elem in d] - logger.info('\nintersection of scope ' + str(len(intersection_scope))) - logger.info('union of scope ' + str(len(union_scope))) - logger.info('max metabolites in scope ' + str(max(len_scope))) - logger.info('min metabolites in scope ' + str(min(len_scope))) - logger.info('average number of metabolites in scope %.2f (+/- %.2f)' % + len_scope = [len(scope_dict[elem]) for elem in scope_dict] + logger.info('\nSummary:') + logger.info('- intersection of scopes ' + str(len(intersection_scope))) + logger.info('- union of scopes ' + str(len(union_scope))) + logger.info('- max metabolites in scopes ' + str(max(len_scope))) + logger.info('- min metabolites in scopes ' + str(min(len_scope))) + logger.info('- average number of metabolites in scopes %.2f (+/- %.2f)' % (statistics.mean(len_scope), statistics.stdev(len_scope))) return union_scope -def reverse_scope(json_scope, output_dir): +def reverse_scope(scope_dict, output_dir): """Reverse a scope dictionary by focusing on metabolite producers. 
Args: - json_scope (str): path to JSON dict of scope + scope_dict (dict): dict of scope output_dir (str): path to output directory Returns: @@ -222,11 +246,8 @@ def reverse_scope(json_scope, output_dir): rev_indiv_scopes_json_path = os.path.join(*[output_dir, 'indiv_scopes', 'rev_iscope.json']) rev_indiv_scopes_tsv_path = os.path.join(*[output_dir, 'indiv_scopes', 'rev_iscope.tsv']) - with open(json_scope, 'r') as f: - initial_dict = json.load(f) - new_dic = {} - for k,v in initial_dict.items(): + for k,v in scope_dict.items(): for x in v: new_dic.setdefault(x,[]).append(k) @@ -234,7 +255,7 @@ def reverse_scope(json_scope, output_dir): json.dump(new_dic, g, indent=True, sort_keys=True) all_compounds = [compound for compound in new_dic] - all_species = [species for species in initial_dict] + all_species = [species for species in scope_dict] # For each species get the possibility of production of each compounds. with open(rev_indiv_scopes_tsv_path, 'w') as output_file: @@ -243,4 +264,4 @@ def reverse_scope(json_scope, output_dir): for species in all_species: csvwriter.writerow([species, *[1 if species in new_dic[compound] else 0 for compound in all_compounds]]) - return(rev_indiv_scopes_json_path, rev_indiv_scopes_tsv_path) + return rev_indiv_scopes_json_path, rev_indiv_scopes_tsv_path diff --git a/metage2metabo/m2m/m2m_workflow.py b/metage2metabo/m2m/m2m_workflow.py index 659680b..b992836 100644 --- a/metage2metabo/m2m/m2m_workflow.py +++ b/metage2metabo/m2m/m2m_workflow.py @@ -22,7 +22,7 @@ from metage2metabo import utils, sbml_management from metage2metabo.m2m.reconstruction import recon from metage2metabo.m2m.individual_scope import iscope -from metage2metabo.m2m.community_scope import cscope +from metage2metabo.m2m.community_scope import cscope, reverse_cscope from metage2metabo.m2m.community_addedvalue import addedvalue from metage2metabo.m2m.minimal_community import mincom @@ -81,10 +81,10 @@ def metacom_analysis(sbml_dir, out_dir, seeds, host_mn, 
targets_file, cpu_number logger.info("\n".join(individually_producible_targets)) commonly_producible_targets = user_targets.intersection(addedvalue_targets) if len(commonly_producible_targets) > 0: - logger.info('\n The following ' + str(len(commonly_producible_targets)) + " targets are additionally reachable through putative cooperation events: \n") + logger.info('\n The following ' + str(len(commonly_producible_targets)) + " targets are additionally reachable through putative cross-feeding events: \n") logger.info("\n".join(commonly_producible_targets)) else: - logger.info("Cooperation events do not enable the producibility of additional targets") + logger.info("Cross-feeding interactions do not enable the producibility of additional targets") else: user_targets = None newtargets = addedvalue_targets @@ -103,7 +103,7 @@ def metacom_analysis(sbml_dir, out_dir, seeds, host_mn, targets_file, cpu_number sbml_management.create_species_sbml(newtargets, target_file_path) # Add these targets to the instance - logger.info("Setting " + str(len(newtargets)) + " compounds as targets \n") + logger.info("Setting " + str(len(newtargets)) + " compounds as targets.
\n") # if len(newtargets) != len(addedvalue_targets): # logger.info("\n".join(newtargets)) @@ -111,7 +111,7 @@ def metacom_analysis(sbml_dir, out_dir, seeds, host_mn, targets_file, cpu_number instance_com, out_dir, newtargets) # MINCOM - mincom(instance_w_targets, out_dir) + mincom(instance_w_targets, seeds, newtargets, out_dir) # remove intermediate files os.unlink(instance_com) os.unlink(instance_w_targets) @@ -171,8 +171,9 @@ def targets_producibility(m2m_out_dir, union_targets_iscope, targets_cscope, add prod_targets['indiv_producible'] = list(indiv_producible) indiv_scopes_path = os.path.join(*[m2m_out_dir, 'indiv_scopes', 'indiv_scopes.json']) - produced_seeds_path = os.path.join(*[m2m_out_dir, 'indiv_scopes', 'indiv_produced_seeds.json']) + produced_seeds_path = os.path.join(*[m2m_out_dir, 'indiv_scopes', 'seeds_in_indiv_scopes.json']) comm_scopes_path = os.path.join(*[m2m_out_dir, 'community_analysis', 'comm_scopes.json']) + reverse_cscope_path = os.path.join(*[m2m_out_dir, 'community_analysis', 'rev_cscope.json']) mincom_path = os.path.join(*[m2m_out_dir, 'community_analysis', 'mincom.json']) producibility_targets_path = os.path.join(m2m_out_dir, 'producibility_targets.json') @@ -200,15 +201,26 @@ def targets_producibility(m2m_out_dir, union_targets_iscope, targets_cscope, add if os.path.exists(comm_scopes_path): prod_targets['com_only_producers'] = {} - with open(comm_scopes_path) as json_data: - com_producible_compounds = json.load(json_data) - for target in selected_targets: - if target in com_producible_compounds['targets_producers']: - if target in prod_targets['individual_producers']: - only_com_producing_species = list(set(com_producible_compounds['targets_producers'][target]) - set(prod_targets['individual_producers'][target])) - else: - only_com_producing_species = com_producible_compounds['targets_producers'][target] - prod_targets['com_only_producers'][target] = only_com_producing_species + if os.path.exists(reverse_cscope_path): + with 
open(reverse_cscope_path) as json_data: + rev_cscope = json.load(json_data) + for target in selected_targets: + if target in rev_cscope: + if target in prod_targets['individual_producers']: + only_com_producing_species = list(set(rev_cscope[target]) - set(prod_targets['individual_producers'][target])) + else: + only_com_producing_species = rev_cscope[target] + prod_targets['com_only_producers'][target] = only_com_producing_species + else: + with open(comm_scopes_path) as json_data: + com_producible_compounds = json.load(json_data) + for target in selected_targets: + if target in com_producible_compounds['targets_producers']: + if target in prod_targets['individual_producers']: + only_com_producing_species = list(set(com_producible_compounds['targets_producers'][target]) - set(prod_targets['individual_producers'][target])) + else: + only_com_producing_species = com_producible_compounds['targets_producers'][target] + prod_targets['com_only_producers'][target] = only_com_producing_species if os.path.exists(mincom_path): with open(mincom_path) as json_data: diff --git a/metage2metabo/m2m/minimal_community.py b/metage2metabo/m2m/minimal_community.py index 23a1633..22d2079 100644 --- a/metage2metabo/m2m/minimal_community.py +++ b/metage2metabo/m2m/minimal_community.py @@ -19,6 +19,7 @@ import time from metage2metabo import utils +from metage2metabo.sbml_management import get_compounds from miscoto import run_mincom @@ -28,16 +29,33 @@ logging.getLogger("mpwt").setLevel(logging.INFO) -def mincom(instance_w_targets, out_dir): +def mincom(instance_w_targets, seeds, targets, out_dir): """Compute minimal community selection and show analyses. 
Args: instance_w_targets (str): ASP instance filepath + seeds (str): seeds filepath + targets (set): set of target compounds out_dir (str): results directory """ starttime = time.time() + + logger.info('\n###############################################') + logger.info('# #') + logger.info('# Minimal community selection #') + logger.info('# #') + logger.info('###############################################\n') + miscoto_dir = os.path.join(out_dir, 'community_analysis') miscoto_mincom_path = os.path.join(miscoto_dir, 'mincom.json') + + # check if seeds are among the targets + seeds = set(get_compounds(seeds)) + targets = set(targets) + intersection = seeds.intersection(targets) + if len(intersection) > 0: + logger.warning(f'WARNING: The following seeds are among the targets: {intersection}. They will not be considered as targets during the computation of minimal communities: they will be considered as already reachable according to the network expansion definition.\n') + if not utils.is_valid_dir(miscoto_dir): logger.critical('Impossible to access/create output directory') sys.exit(1) diff --git a/metage2metabo/m2m/reconstruction.py b/metage2metabo/m2m/reconstruction.py index d9f7701..b1f79fe 100644 --- a/metage2metabo/m2m/reconstruction.py +++ b/metage2metabo/m2m/reconstruction.py @@ -53,6 +53,11 @@ def recon(inp_dir, out_dir, noorphan_bool, padmet_bool, sbml_level, nb_cpu, clea tuple: PGDB directory (str), SBML directory (str) """ starttime = time.time() + logger.info('\n###############################################') + logger.info('# #') + logger.info('# Metabolic network reconstruction #') + logger.info('# #') + logger.info('###############################################\n') if use_pwt_xml and padmet_bool: logger.critical("-p/padmet_bool and --pwt-xml/use_pwt_xml are incompatible arguments") diff --git a/test/metabolic_data/tiny_toy/networks/bact1.sbml b/test/metabolic_data/tiny_toy/networks/bact1.sbml new file mode 100644 index 0000000..7929a4f --- /dev/null +++
b/test/metabolic_data/tiny_toy/networks/bact1.sbml @@ -0,0 +1,88 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/test/metabolic_data/tiny_toy/networks/bact2.sbml b/test/metabolic_data/tiny_toy/networks/bact2.sbml new file mode 100644 index 0000000..eb8869d --- /dev/null +++ b/test/metabolic_data/tiny_toy/networks/bact2.sbml @@ -0,0 +1,89 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/test/metabolic_data/tiny_toy/networks/bact3.sbml b/test/metabolic_data/tiny_toy/networks/bact3.sbml new file mode 100644 index 0000000..43fe515 --- /dev/null +++ b/test/metabolic_data/tiny_toy/networks/bact3.sbml @@ -0,0 +1,102 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/test/metabolic_data/tiny_toy/networks/bact4.sbml b/test/metabolic_data/tiny_toy/networks/bact4.sbml new file mode 100644 index 0000000..71a0533 --- /dev/null +++ b/test/metabolic_data/tiny_toy/networks/bact4.sbml @@ -0,0 +1,71 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/test/metabolic_data/tiny_toy/networks/bact5.sbml b/test/metabolic_data/tiny_toy/networks/bact5.sbml new file mode 100644 index 0000000..e5e4bc1 --- /dev/null +++ b/test/metabolic_data/tiny_toy/networks/bact5.sbml @@ -0,0 +1,63 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/test/metabolic_data/tiny_toy/networks/bact6.sbml b/test/metabolic_data/tiny_toy/networks/bact6.sbml new file mode 
100644 index 0000000..26cedf1 --- /dev/null +++ b/test/metabolic_data/tiny_toy/networks/bact6.sbml @@ -0,0 +1,65 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/test/metabolic_data/tiny_toy/networks_viz.pdf b/test/metabolic_data/tiny_toy/networks_viz.pdf new file mode 100644 index 0000000..d7298fe Binary files /dev/null and b/test/metabolic_data/tiny_toy/networks_viz.pdf differ diff --git a/test/metabolic_data/tiny_toy/seeds_community.sbml b/test/metabolic_data/tiny_toy/seeds_community.sbml new file mode 100644 index 0000000..a391f36 --- /dev/null +++ b/test/metabolic_data/tiny_toy/seeds_community.sbml @@ -0,0 +1,13 @@ + + + + + + + + + + + + diff --git a/test/metabolic_data/tiny_toy/targets_community.sbml b/test/metabolic_data/tiny_toy/targets_community.sbml new file mode 100644 index 0000000..298fd57 --- /dev/null +++ b/test/metabolic_data/tiny_toy/targets_community.sbml @@ -0,0 +1,15 @@ + + + + + + + + + + + + + + diff --git a/test/metabolic_data/tiny_toy/targets_community_allprod.sbml b/test/metabolic_data/tiny_toy/targets_community_allprod.sbml new file mode 100644 index 0000000..cb012f9 --- /dev/null +++ b/test/metabolic_data/tiny_toy/targets_community_allprod.sbml @@ -0,0 +1,14 @@ + + + + + + + + + + + + + diff --git a/test/test_m2m_addedvalue.py b/test/test_m2m_addedvalue.py index 68ba2f5..498490e 100644 --- a/test/test_m2m_addedvalue.py +++ b/test/test_m2m_addedvalue.py @@ -77,12 +77,15 @@ 'M_REDUCED__45__MENAQUINONE_c', 'M_CPD__45__14378_c', 'M_2__45__METHYL__45__ACETO__45__ACETYL__45__COA_c', 'M_CPD0__45__2121_c', 'M_CPD__45__12773_c', 'M_CPD__45__13644_c', - 'M_ALPHA__45__GLC__45__6__45__P_c' + 'M_ALPHA__45__GLC__45__6__45__P_c', + 'M_CELLULOSE_c', + 'M_DODECANOATE_c', + 'M_STEARIC_ACID_c', } NUMBER_BACT = 17 -SIZE_UNION = 625 -SIZE_INTERSECTION = 135 -SIZE_CSCOPE = 651 +SIZE_UNION = 576 +SIZE_INTERSECTION = 50 +SIZE_CSCOPE = 698 def 
test_m2m_addedvalue_call(): @@ -129,7 +132,7 @@ def test_m2m_addedvalue_call(): reader = SBMLReader() document = reader.readSBML(target_file) new_targets = set([specie.getId() for specie in document.getModel().getListOfSpecies()]) - assert new_targets == EXPECTED_TARGETS + assert sorted(new_targets) == sorted(EXPECTED_TARGETS) # clean shutil.rmtree(respath) diff --git a/test/test_m2m_cscope.py b/test/test_m2m_cscope.py index 421adc4..078fe79 100644 --- a/test/test_m2m_cscope.py +++ b/test/test_m2m_cscope.py @@ -3,7 +3,7 @@ """ Description: -Test m2m addedvalue on 17 metabolic networks and a file representing growth medium (seeds). +Test m2m cscope on 17 metabolic networks and a file representing growth medium (seeds). """ import os @@ -11,6 +11,7 @@ import subprocess import json +from metage2metabo.sbml_management import get_compounds from metage2metabo import utils EXPECTED_TARGETS = { @@ -75,10 +76,11 @@ 'M_REDUCED__45__MENAQUINONE_c', 'M_CPD__45__14378_c', 'M_2__45__METHYL__45__ACETO__45__ACETYL__45__COA_c', 'M_CPD0__45__2121_c', 'M_CPD__45__12773_c', 'M_CPD__45__13644_c', - 'M_ALPHA__45__GLC__45__6__45__P_c' + 'M_ALPHA__45__GLC__45__6__45__P_c', + 'M_MANNITOL_c' } -SIZE_CSCOPE = 651 +SIZE_CSCOPE = 698 def test_m2m_cscope_call(): @@ -107,9 +109,12 @@ def test_m2m_cscope_call(): # CSCOPE ANALYSIS with open(cscope_file, 'r') as json_cdata: d_cscope = json.load(json_cdata) + producible_compounds = set(d_cscope['com_scope']) + targets = set(get_compounds(targets_path)) + producible_targets = producible_compounds.intersection(targets) + print(producible_targets) assert len(d_cscope['com_scope']) == SIZE_CSCOPE - assert sorted(d_cscope['com_prodtargets']) == sorted(EXPECTED_TARGETS) - assert d_cscope['com_unprodtargets'] == [] + assert sorted(producible_targets) == sorted(EXPECTED_TARGETS) # clean shutil.rmtree(respath) diff --git a/test/test_m2m_iscope.py b/test/test_m2m_iscope.py index 11d6b26..a93d9db 100644 --- a/test/test_m2m_iscope.py +++ 
b/test/test_m2m_iscope.py @@ -14,13 +14,10 @@ from metage2metabo import utils EXPECTED_PRODUCED_COMPOUNDS = { - 'M_10__45__FORMYL__45__THF_c': 17, 'M_1__45__AMINO__45__PROPAN__45__2__45__ONE__45__3__45__PHOSPHATE_c': 13, 'M_1__45__KESTOTRIOSE_c': 1, 'M_1__45__KETO__45__2__45__METHYLVALERATE_c': 13, 'M_1__45__L__45__MYO__45__INOSITOL__45__1__45__P_c': 7, - 'M_2C__45__METH__45__D__45__ERYTHRITOL__45__CYCLODIPHOSPHATE_c': 9, - 'M_2K__45__4CH3__45__PENTANOATE_c': 3, 'M_2__45__3__45__DIHYDROXYBENZOATE_c': 2, 'M_2__45__5__45__TRIPHOSPHORIBOSYL__45__3__45__DEPHOSPHO__45___c': 1, 'M_2__45__ACETO__45__2__45__HYDROXY__45__BUTYRATE_c': 13, @@ -30,63 +27,62 @@ 'M_2__45__C__45__METHYL__45__D__45__ERYTHRITOL__45__4__45__PHOSPHATE_c': 15, 'M_2__45__D__45__THREO__45__HYDROXY__45__3__45__CARBOXY__45__ISOCAPROATE_c': 2, 'M_2__45__HYDROXY__45__3__45__KETO__45__5__45__METHYLTHIO__45__1__45__PHOSPHOP_c': 2, - 'M_2__45__KETOGLUTARATE_c': 16, 'M_2__45__KETO__45__3__45__DEOXY__45__6__45__P__45__GLUCONATE_c': 3, 'M_2__45__KETO__45__3__45__METHYL__45__VALERATE_c': 15, 'M_2__45__KETO__45__GLUTARAMATE_c': 2, 'M_2__45__KETO__45__ISOVALERATE_c': 15, - 'M_2__45__METHYL__45__3__45__PHYTYL__45__14__45__NAPHTHOQUINONE_c': 17, + 'M_2__45__KETOGLUTARATE_c': 16, 'M_2__45__METHYL__45__BUTYRYL__45__COA_c': 1, 'M_2__45__OXOBUTANOATE_c': 13, 'M_2__45__PG_c': 17, 'M_2__45__PHOSPHO__45__4__45__CYTIDINE__45__5__45__DIPHOSPHO__45__2__45__C__45__MET_c': 9, - 'M_3OH__45__4P__45__OH__45__ALPHA__45__KETOBUTYRATE_c': 13, - 'M_3S__45__CITRYL__45__COA_c': 1, + 'M_2C__45__METH__45__D__45__ERYTHRITOL__45__CYCLODIPHOSPHATE_c': 9, + 'M_2K__45__4CH3__45__PENTANOATE_c': 3, 'M_3__45__4__45__DIHYDROXYBENZOATE_c': 1, 'M_3__45__CARBOXY__45__3__45__HYDROXY__45__ISOCAPROATE_c': 2, 'M_3__45__DEHYDRO__45__SHIKIMATE_c': 15, 'M_3__45__DEOXY__45__D__45__ARABINO__45__HEPTULOSONATE__45__7__45__P_c': 15, 'M_3__45__ENOLPYRUVYL__45__SHIKIMATE__45__5P_c': 15, 'M_3__45__HYDROXY__45__PROPIONATE_c': 2, - 'M_3__45__KETOBUTYRATE_c': 1, 
'M_3__45__KETO__45__ADIPATE_c': 1, 'M_3__45__KETO__45__L__45__GULONATE_c': 3, + 'M_3__45__KETOBUTYRATE_c': 1, 'M_3__45__MERCAPTO__45__PYRUVATE_c': 2, 'M_3__45__OXOADIPATE__45__ENOL__45__LACTONE_c': 1, 'M_3__45__P__45__HYDROXYPYRUVATE_c': 14, 'M_3__45__P__45__SERINE_c': 14, 'M_3__45__SULFINOALANINE_c': 3, 'M_3__45__UREIDO__45__PROPIONATE_c': 2, + 'M_3OH__45__4P__45__OH__45__ALPHA__45__KETOBUTYRATE_c': 13, + 'M_3S__45__CITRYL__45__COA_c': 1, 'M_4__45__AMINO__45__4__45__DEOXYCHORISMATE_c': 13, 'M_4__45__AMINO__45__BUTYRALDEHYDE_c': 2, 'M_4__45__AMINO__45__BUTYRATE_c': 5, 'M_4__45__CYTIDINE__45__5__45__DIPHOSPHO__45__2__45__C_c': 9, 'M_4__45__FUMARYL__45__ACETOACETATE_c': 1, + 'M_4__45__hydroxybenzoate_c': 1, 'M_4__45__IMIDAZOLONE__45__5__45__PROPIONATE_c': 4, 'M_4__45__MALEYL__45__ACETOACETATE_c': 1, - 'M_4__45__PHOSPHONOOXY__45__THREONINE_c': 13, 'M_4__45__P__45__PANTOTHENATE_c': 8, - 'M_4__45__hydroxybenzoate_c': 1, - 'M_5__45__HYDROXYISOURATE_c': 2, + 'M_4__45__PHOSPHONOOXY__45__THREONINE_c': 13, 'M_5__45__HYDROXY__45__CTP_c': 4, + 'M_5__45__HYDROXYISOURATE_c': 2, 'M_5__45__METHYLTHIOADENOSINE_c': 5, - 'M_5__45__METHYL__45__THF_c': 17, 'M_5__45__OXOPROLINE_c': 15, - 'M_5__45__PHOSPHO__45__RIBOSYL__45__GLYCINEAMIDE_c': 15, 'M_5__45__P__45__BETA__45__D__45__RIBOSYL__45__AMINE_c': 15, + 'M_5__45__PHOSPHO__45__RIBOSYL__45__GLYCINEAMIDE_c': 15, 'M_6__45__KESTOSE_c': 1, 'M_7__45__8__45__DIHYDROPTEROATE_c': 5, 'M_7__45__AMINOMETHYL__45__7__45__DEAZAGUANINE_c': 5, 'M_7__45__CYANO__45__7__45__DEAZAGUANINE_c': 5, + 'M_ACET_c': 6, 'M_ACETALD_c': 10, - 'M_ACETYLSERINE_c': 3, 'M_ACETYL__45__COA_c': 3, 'M_ACETYL__45__GLU_c': 1, 'M_ACETYL__45__P_c': 3, - 'M_ACET_c': 6, + 'M_ACETYLSERINE_c': 3, 'M_ADENINE_c': 9, 'M_ADENOSINE_c': 16, - 'M_ADENOSYLCOBALAMIN_c': 17, 'M_ADENOSYL__45__HOMO__45__CYS_c': 3, 'M_ADENOSYL__45__P4_c': 2, 'M_ADENYLOSUCC_c': 10, @@ -98,10 +94,9 @@ 'M_AICAR_c': 14, 'M_ALLANTOATE_c': 2, 'M_ALPHA__45__GLUCOSE_c': 17, - 'M_ALPHA__45__TOCOPHEROL_c': 17, 
'M_AMINO__45__ACETONE_c': 6, - 'M_AMINO__45__HYDROXYMETHYL__45__METHYLPYRIMIDINE__45__PP_c': 4, 'M_AMINO__45__HYDROXYMETHYL__45__METHYL__45__PYR__45__P_c': 4, + 'M_AMINO__45__HYDROXYMETHYL__45__METHYLPYRIMIDINE__45__PP_c': 4, 'M_AMINO__45__OH__45__HYDROXYMETHYL__45__DIHYDROPTERIDINE_c': 6, 'M_AMINO__45__OXOBUT_c': 6, 'M_AMINO__45__RIBOSYLAMINO__45__1H__45__3H__45__PYR__45__DIONE_c': 6, @@ -110,20 +105,18 @@ 'M_AMP_c': 17, 'M_ANTHRANILATE_c': 13, 'M_ARABINOSE__45__5P_c': 2, - 'M_ARACHIDIC_ACID_c': 17, - 'M_ARACHIDONIC_ACID_c': 17, - 'M_ARG_c': 17, - 'M_ASCORBATE_c': 17, + 'M_ARG_c': 13, + 'M_ASCORBATE_c': 3, 'M_ASN_c': 9, 'M_ATP_c': 17, - 'M_BETA__45__D__45__FRUCTOSE_c': 17, + 'M_B__45__ALANINE_c': 7, + 'M_BETA__45__D__45__FRUCTOSE_c': 8, 'M_BIFURCOSE_c': 1, - 'M_BIOTIN_c': 17, 'M_BIO__45__5__45__AMP_c': 14, - 'M_BUTYRIC_ACID_c': 17, + 'M_BUTYRIC_ACID_c': 6, 'M_BUTYRYL__45__COA_c': 2, 'M_BUTYRYL__45__P_c': 6, - 'M_B__45__ALANINE_c': 7, + 'M_C__45__DI__45__GMP_c': 2, 'M_C3_c': 1, 'M_CADAVERINE_c': 2, 'M_CAMP_c': 4, @@ -132,37 +125,18 @@ 'M_CARBAMYUL__45__L__45__ASPARTATE_c': 13, 'M_CARBON__45__DIOXIDE_c': 17, 'M_CARBOXYPHENYLAMINO__45__DEOXYRIBULOSE__45__P_c': 13, - 'M_CA__43__2_c': 17, 'M_CDP__45__D__45__GLUCOSE_c': 3, 'M_CDP_c': 10, 'M_CELLOBIOSE_c': 1, - 'M_CELLULOSE_c': 17, + 'M_Cellodextrins_c': 4, 'M_CGMP_c': 1, 'M_CH33ADO_c': 12, - 'M_CHOLESTEROL_c': 17, 'M_CHORISMATE_c': 15, 'M_CIS__45__ACONITATE_c': 10, 'M_CIT_c': 12, - 'M_CL__45___c': 17, 'M_CMP_c': 10, - 'M_CO3_c': 13, 'M_CO__45__A_c': 3, - 'M_CPD0__45__1108_c': 10, - 'M_CPD0__45__1456_c': 3, - 'M_CPD0__45__1699_c': 5, - 'M_CPD0__45__1905_c': 9, - 'M_CPD0__45__2015_c': 1, - 'M_CPD0__45__2101_c': 2, - 'M_CPD0__45__2208_c': 17, - 'M_CPD0__45__2298_c': 2, - 'M_CPD0__45__2461_c': 9, - 'M_CPD0__45__2467_c': 2, - 'M_CPD0__45__2468_c': 2, - 'M_CPD0__45__2472_c': 17, - 'M_CPD0__45__2474_c': 10, - 'M_CPD0__45__2483_c': 2, - 'M_CPD1F__45__129_c': 17, - 'M_CPD__45__10244_c': 17, + 'M_CO3_c': 13, 'M_CPD__45__10267_c': 
2, 'M_CPD__45__10330_c': 10, 'M_CPD__45__10353_c': 9, @@ -179,14 +153,13 @@ 'M_CPD__45__10774_c': 3, 'M_CPD__45__10775_c': 3, 'M_CPD__45__10776_c': 3, - 'M_CPD__45__1086_c': 6, 'M_CPD__45__108_c': 12, + 'M_CPD__45__1086_c': 6, 'M_CPD__45__1091_c': 2, 'M_CPD__45__11281_c': 2, 'M_CPD__45__1133_c': 2, 'M_CPD__45__11770_c': 2, 'M_CPD__45__11855_c': 2, - 'M_CPD__45__12189_c': 17, 'M_CPD__45__12258_c': 1, 'M_CPD__45__12279_c': 12, 'M_CPD__45__12365_c': 3, @@ -196,14 +169,11 @@ 'M_CPD__45__12427_c': 3, 'M_CPD__45__12575_c': 8, 'M_CPD__45__12601_c': 4, - 'M_CPD__45__12653_c': 17, - 'M_CPD__45__12826_c': 17, 'M_CPD__45__13043_c': 5, 'M_CPD__45__13118_c': 1, 'M_CPD__45__13357_c': 14, 'M_CPD__45__13469_c': 17, 'M_CPD__45__13489_c': 5, - 'M_CPD__45__13792_c': 17, 'M_CPD__45__13847_c': 1, 'M_CPD__45__13851_c': 9, 'M_CPD__45__13907_c': 3, @@ -211,7 +181,6 @@ 'M_CPD__45__13912_c': 3, 'M_CPD__45__13913_c': 3, 'M_CPD__45__13914_c': 3, - 'M_CPD__45__14292_c': 17, 'M_CPD__45__14443_c': 16, 'M_CPD__45__14553_c': 8, 'M_CPD__45__14808_c': 2, @@ -227,28 +196,24 @@ 'M_CPD__45__15382_c': 7, 'M_CPD__45__15403_c': 2, 'M_CPD__45__15435_c': 13, - 'M_CPD__45__15590_c': 15, 'M_CPD__45__155_c': 1, + 'M_CPD__45__15590_c': 15, 'M_CPD__45__15700_c': 3, 'M_CPD__45__15709_c': 17, 'M_CPD__45__15818_c': 10, - 'M_CPD__45__15972_c': 17, + 'M_CPD__45__159_c': 1, 'M_CPD__45__15975_c': 3, 'M_CPD__45__15979_c': 2, - 'M_CPD__45__159_c': 1, 'M_CPD__45__16013_c': 13, 'M_CPD__45__16015_c': 16, 'M_CPD__45__16606_c': 1, 'M_CPD__45__16876_c': 2, - 'M_CPD__45__17188_c': 17, - 'M_CPD__45__17322_c': 17, + 'M_CPD__45__18_c': 2, 'M_CPD__45__18085_c': 17, 'M_CPD__45__18118_c': 2, 'M_CPD__45__18238_c': 13, 'M_CPD__45__187_c': 2, - 'M_CPD__45__18_c': 2, 'M_CPD__45__19306_c': 2, - 'M_CPD__45__195_c': 17, 'M_CPD__45__196_c': 2, 'M_CPD__45__19753_c': 2, 'M_CPD__45__20826_c': 8, @@ -265,11 +230,9 @@ 'M_CPD__45__318_c': 3, 'M_CPD__45__334_c': 3, 'M_CPD__45__335_c': 1, - 'M_CPD__45__3617_c': 17, 'M_CPD__45__365_c': 2, 
'M_CPD__45__375_c': 7, 'M_CPD__45__380_c': 2, - 'M_CPD__45__387_c': 17, 'M_CPD__45__389_c': 2, 'M_CPD__45__444_c': 3, 'M_CPD__45__469_c': 1, @@ -283,114 +246,117 @@ 'M_CPD__45__602_c': 6, 'M_CPD__45__606_c': 3, 'M_CPD__45__6124_c': 2, - 'M_CPD__45__622_c': 1, 'M_CPD__45__62_c': 2, + 'M_CPD__45__622_c': 1, 'M_CPD__45__645_c': 2, 'M_CPD__45__653_c': 17, 'M_CPD__45__658_c': 3, 'M_CPD__45__667_c': 1, - 'M_CPD__45__6972_c': 2, 'M_CPD__45__69_c': 13, + 'M_CPD__45__6972_c': 2, 'M_CPD__45__7100_c': 2, 'M_CPD__45__7224_c': 2, 'M_CPD__45__7671_c': 10, 'M_CPD__45__7737_c': 1, - 'M_CPD__45__7830_c': 17, - 'M_CPD__45__7836_c': 17, 'M_CPD__45__8050_c': 2, 'M_CPD__45__8052_c': 2, 'M_CPD__45__8259_c': 16, 'M_CPD__45__8268_c': 1, 'M_CPD__45__827_c': 2, - 'M_CPD__45__8462_c': 17, 'M_CPD__45__85_c': 2, 'M_CPD__45__8999_c': 2, 'M_CPD__45__9000_c': 2, - 'M_CPD__45__9245_c': 17, 'M_CPD__45__9451_c': 2, 'M_CPD__45__9550_c': 1, 'M_CPD__45__9923_c': 4, 'M_CPD__45__9924_c': 4, 'M_CPD__45__9925_c': 2, + 'M_CPD0__45__1108_c': 10, + 'M_CPD0__45__1456_c': 3, + 'M_CPD0__45__1699_c': 5, + 'M_CPD0__45__1905_c': 9, + 'M_CPD0__45__2015_c': 1, + 'M_CPD0__45__2101_c': 2, + 'M_CPD0__45__2298_c': 2, + 'M_CPD0__45__2461_c': 9, + 'M_CPD0__45__2467_c': 2, + 'M_CPD0__45__2468_c': 2, + 'M_CPD0__45__2472_c': 17, + 'M_CPD0__45__2474_c': 10, + 'M_CPD0__45__2483_c': 2, 'M_CTP_c': 10, - 'M_CU__43__2_c': 17, 'M_CYS__45__GLY_c': 9, - 'M_CYS_c': 17, + 'M_CYS_c': 9, 'M_CYTIDINE_c': 10, - 'M_C__45__DI__45__GMP_c': 2, - 'M_Cellodextrins_c': 4, + 'M_D__45__6__45__P__45__GLUCONO__45__DELTA__45__LACTONE_c': 13, + 'M_D__45__ALA__45__D__45__ALA_c': 17, + 'M_D__45__ALANINE_c': 17, + 'M_D__45__ALPHABETA__45__D__45__HEPTOSE__45__7__45__PHOSPHATE_c': 3, + 'M_D__45__arabinofuranose_c': 3, + 'M_D__45__arabinopyranose_c': 3, + 'M_D__45__BETA__45__D__45__HEPTOSE__45__1__45__P_c': 2, + 'M_D__45__BETA__45__D__45__HEPTOSE__45__17__45__DIPHOSPHATE_c': 3, + 'M_D__45__ERYTHRO__45__IMIDAZOLE__45__GLYCEROL__45__P_c': 14, + 
'M_D__45__galactopyranose_c': 15, + 'M_D__45__GLT_c': 17, + 'M_D__45__glucopyranose__45__6__45__phosphate_c': 17, + 'M_D__45__GLUCOSAMINE__45__6__45__P_c': 17, + 'M_D__45__LACTATE_c': 10, + 'M_D__45__mannopyranose_c': 4, + 'M_D__45__METHYL__45__MALONYL__45__COA_c': 2, + 'M_D__45__PROLINE_c': 1, + 'M_D__45__Ribofuranose_c': 10, + 'M_D__45__Ribopyranose_c': 10, + 'M_D__45__RIBULOSE__45__1__45__P_c': 3, + 'M_D__45__RIBULOSE__45__15__45__P2_c': 2, + 'M_D__45__RIBULOSE_c': 5, + 'M_D__45__SEDOHEPTULOSE__45__7__45__P_c': 15, + 'M_D__45__Xylopyranose_c': 4, + 'M_D__45__XYLULOSE_c': 3, 'M_DATP_c': 9, 'M_DCDP_c': 2, 'M_DCMP_c': 2, 'M_DCTP_c': 4, 'M_DEAMIDO__45__NAD_c': 16, 'M_DEHYDROQUINATE_c': 15, + 'M_DEOXY__45__D__45__RIBOSE__45__1__45__PHOSPHATE_c': 8, + 'M_DEOXY__45__RIBOSE__45__5P_c': 8, 'M_DEOXYGUANOSINE_c': 8, 'M_DEOXYINOSINE_c': 8, 'M_DEOXYXYLULOSE__45__5P_c': 16, - 'M_DEOXY__45__D__45__RIBOSE__45__1__45__PHOSPHATE_c': 8, - 'M_DEOXY__45__RIBOSE__45__5P_c': 8, 'M_DEPHOSPHO__45__COA_c': 3, 'M_DGDP_c': 1, 'M_DGMP_c': 1, 'M_DGTP_c': 9, + 'M_DI__45__H__45__OROTATE_c': 13, + 'M_DI__45__H__45__URACIL_c': 2, 'M_DIAMINO__45__OH__45__PHOSPHORIBOSYLAMINO__45__PYR_c': 7, + 'M_DIHYDRO__45__DIOH__45__BENZOATE_c': 2, + 'M_DIHYDRO__45__NEO__45__PTERIN_c': 6, 'M_DIHYDROFOLATE_c': 5, 'M_DIHYDROMONAPTERIN__45__TRIPHOSPHATE_c': 2, - 'M_DIHYDRONEOPTERIN__45__P3_c': 8, 'M_DIHYDRONEOPTERIN__45__P_c': 6, + 'M_DIHYDRONEOPTERIN__45__P3_c': 8, 'M_DIHYDROPTERIN__45__CH2OH__45__PP_c': 6, + 'M_DIHYDROXY__45__ACETONE__45__PHOSPHATE_c': 17, + 'M_DIHYDROXY__45__BUTANONE__45__P_c': 7, 'M_DIHYDROXYACETONE_c': 9, 'M_DIHYDROXYNAPHTHOATE_c': 2, 'M_DIHYDROXYPENTANEDIONE_c': 3, - 'M_DIHYDROXY__45__ACETONE__45__PHOSPHATE_c': 17, - 'M_DIHYDROXY__45__BUTANONE__45__P_c': 7, - 'M_DIHYDRO__45__DIOH__45__BENZOATE_c': 2, - 'M_DIHYDRO__45__NEO__45__PTERIN_c': 6, 'M_DIMETHYL__45__D__45__RIBITYL__45__LUMAZINE_c': 6, - 'M_DI__45__H__45__OROTATE_c': 13, - 'M_DI__45__H__45__URACIL_c': 2, - 'M_DOCOSANOATE_c': 17, - 
'M_DODECANOATE_c': 17, 'M_DPG_c': 17, 'M_DUMP_c': 4, 'M_DUTP_c': 4, - 'M_D__45__6__45__P__45__GLUCONO__45__DELTA__45__LACTONE_c': 13, - 'M_D__45__ALANINE_c': 17, - 'M_D__45__ALA__45__D__45__ALA_c': 17, - 'M_D__45__ALPHABETA__45__D__45__HEPTOSE__45__7__45__PHOSPHATE_c': 3, - 'M_D__45__BETA__45__D__45__HEPTOSE__45__17__45__DIPHOSPHATE_c': 3, - 'M_D__45__BETA__45__D__45__HEPTOSE__45__1__45__P_c': 2, - 'M_D__45__ERYTHRO__45__IMIDAZOLE__45__GLYCEROL__45__P_c': 14, - 'M_D__45__GLT_c': 17, - 'M_D__45__GLUCOSAMINE__45__6__45__P_c': 17, - 'M_D__45__LACTATE_c': 10, - 'M_D__45__METHYL__45__MALONYL__45__COA_c': 2, - 'M_D__45__PROLINE_c': 1, - 'M_D__45__RIBULOSE__45__15__45__P2_c': 2, - 'M_D__45__RIBULOSE__45__1__45__P_c': 3, - 'M_D__45__RIBULOSE_c': 5, - 'M_D__45__Ribofuranose_c': 10, - 'M_D__45__Ribopyranose_c': 10, - 'M_D__45__SEDOHEPTULOSE__45__7__45__P_c': 15, - 'M_D__45__XYLULOSE_c': 3, - 'M_D__45__Xylopyranose_c': 4, - 'M_D__45__arabinofuranose_c': 3, - 'M_D__45__arabinopyranose_c': 3, - 'M_D__45__galactopyranose_c': 17, - 'M_D__45__glucopyranose__45__6__45__phosphate_c': 17, - 'M_D__45__mannopyranose_c': 4, 'M_ENOL__45__PHENYLPYRUVATE_c': 4, 'M_ENTEROBACTIN_c': 2, 'M_ERYTHRONATE__45__4P_c': 13, 'M_ERYTHROSE__45__4P_c': 15, 'M_ETHYLENE__45__CMPD_c': 2, - 'M_ETOH_c': 17, + 'M_ETOH_c': 10, 'M_FAD_c': 16, - 'M_FE__43__2_c': 17, - 'M_FE__43__3_c': 17, - 'M_FMNH2_c': 6, + 'M_FE__43__3_c': 9, 'M_FMN_c': 16, + 'M_FMNH2_c': 6, 'M_FORMAMIDE_c': 2, 'M_FORMATE_c': 9, 'M_FORMYL__45__ISOGLUTAMINE_c': 2, @@ -409,46 +375,46 @@ 'M_GLN_c': 17, 'M_GLT_c': 17, 'M_GLUCONATE_c': 2, + 'M_Glucopyranose_c': 12, 'M_GLUCOSAMINE__45__1P_c': 17, + 'M_Glucose_c': 4, 'M_GLUTATHIONE_c': 9, 'M_GLUTATHIONYLSPERMIDINE_c': 2, + 'M_GLY_c': 7, 'M_GLYCERATE_c': 1, 'M_GLYCEROL__45__3P_c': 10, + 'M_GLYCOL_c': 2, 'M_GLYCOLALDEHYDE_c': 7, 'M_GLYCOLLATE_c': 4, - 'M_GLYCOL_c': 2, 'M_GLYOX_c': 13, - 'M_GLY_c': 17, 'M_GMP_c': 10, 'M_GTP_c': 10, 'M_GUANINE_c': 10, 'M_GUANOSINE__45__5DP__45__3DP_c': 10, 
'M_GUANOSINE_c': 10, - 'M_Glucopyranose_c': 12, - 'M_Glucose_c': 4, 'M_H2CO3_c': 13, 'M_HCO3_c': 13, + 'M_HIS_c': 14, 'M_HISTAMINE_c': 1, 'M_HISTIDINAL_c': 14, 'M_HISTIDINOL_c': 14, - 'M_HIS_c': 17, 'M_HMP_c': 4, - 'M_HOMOGENTISATE_c': 1, 'M_HOMO__45__CYS_c': 9, + 'M_HOMOGENTISATE_c': 1, 'M_HS_c': 9, 'M_HYDROGEN__45__MOLECULE_c': 3, 'M_HYDROGEN__45__PEROXIDE_c': 9, 'M_HYPOXANTHINE_c': 10, 'M_IDP_c': 2, - 'M_ILE_c': 17, + 'M_ILE_c': 15, 'M_IMIDAZOLE__45__ACETOL__45__P_c': 14, 'M_IMINOASPARTATE_c': 16, 'M_IMP_c': 10, + 'M_INDOLE__45__3__45__GLYCEROL__45__P_c': 13, 'M_INDOLE_ACETALDEHYDE_c': 2, 'M_INDOLE_ACETATE_AUXIN_c': 2, - 'M_INDOLE_PYRUVATE_c': 2, - 'M_INDOLE__45__3__45__GLYCEROL__45__P_c': 13, 'M_INDOLE_c': 13, + 'M_INDOLE_PYRUVATE_c': 2, 'M_INOSINE_c': 10, 'M_ISOBUTYRYL__45__COA_c': 1, 'M_ISOCHORISMATE_c': 4, @@ -457,18 +423,11 @@ 'M_ITP_c': 7, 'M_KDO__45__8P_c': 2, 'M_KDO_c': 2, - 'M_K__43___c': 17, - 'M_LAUROYLCOA__45__CPD_c': 3, - 'M_LEU_c': 17, - 'M_LINOLEIC_ACID_c': 17, - 'M_LINOLENIC_ACID_c': 17, - 'M_LINOLENOYL__45__COA_c': 2, - 'M_LYS_c': 17, 'M_L__45__1__45__LYSOPHOSPHATIDATE_c': 1, 'M_L__45__ALPHA__45__ALANINE_c': 17, 'M_L__45__ARGININO__45__SUCCINATE_c': 13, 'M_L__45__ASPARTATE__45__SEMIALDEHYDE_c': 17, - 'M_L__45__ASPARTATE_c': 17, + 'M_L__45__ASPARTATE_c': 15, 'M_L__45__BETA__45__ASPARTYL__45__P_c': 17, 'M_L__45__CITRULLINE_c': 13, 'M_L__45__CYSTATHIONINE_c': 9, @@ -477,8 +436,8 @@ 'M_L__45__DELTA1__45__PYRROLINE_5__45__CARBOXYLATE_c': 16, 'M_L__45__DI__45__GMP_c': 2, 'M_L__45__GAMMA__45__GLUTAMYLCYSTEINE_c': 9, - 'M_L__45__GLUTAMATE_GAMMA__45__SEMIALDEHYDE_c': 16, 'M_L__45__GLUTAMATE__45__5__45__P_c': 15, + 'M_L__45__GLUTAMATE_GAMMA__45__SEMIALDEHYDE_c': 16, 'M_L__45__GLYCERALDEHYDE__45__3__45__PHOSPHATE_c': 7, 'M_L__45__HISTIDINOL__45__P_c': 14, 'M_L__45__LACTATE_c': 15, @@ -487,45 +446,46 @@ 'M_L__45__THREO__45__3__45__METHYL__45__ASPARTATE_c': 1, 'M_L__45__XYLULOSE__45__5__45__P_c': 8, 'M_Large__45__branched__45__glucans_c': 4, + 
'M_LAUROYLCOA__45__CPD_c': 3, + 'M_LEU_c': 3, + 'M_LINOLENOYL__45__COA_c': 2, 'M_Long__45__linear__45__glucans_c': 1, + 'M_MAL_c': 10, 'M_MALONATE__45__S__45__ALD_c': 2, 'M_MALONYL__45__COA_c': 3, - 'M_MALTOSE_c': 17, - 'M_MAL_c': 10, - 'M_MANNITOL_c': 17, + 'M_MALTOSE_c': 4, + 'M_MANNITOL_c': 4, 'M_MANNOSE__45__1P_c': 2, 'M_MANNOSE_c': 1, 'M_MESACONATE_c': 1, + 'M_MET_c': 12, 'M_METHYL__45__GLYOXAL_c': 9, 'M_METHYL__45__MALONYL__45__COA_c': 1, - 'M_MET_c': 17, - 'M_MG__43__2_c': 17, - 'M_MN__43__2_c': 17, 'M_MYO__45__INOSITOL_c': 7, + 'M_N__45__23__45__DIHYDROXYBENZOYL__45__L__45__SERINE_c': 2, + 'M_N__45__5__45__PHOSPHORIBOSYL__45__ANTHRANILATE_c': 13, + 'M_N__45__ACETYL__45__D__45__GLUCOSAMINE__45__1__45__P_c': 3, + 'M_N__45__ACETYL__45__GLUTAMYL__45__P_c': 1, + 'M_N__45__ALPHA__45__ACETYLORNITHINE_c': 1, + 'M_N__45__FORMIMINO__45__L__45__GLUTAMATE_c': 4, 'M_N2__45__SUCCINYLGLUTAMATE_c': 1, + 'M_NA__43___e': 3, + 'M_NAD_c': 17, 'M_NADH_c': 17, - 'M_NADPH_c': 17, 'M_NADP_c': 17, - 'M_NAD_c': 17, - 'M_NA__43___c': 17, - 'M_NA__43___e': 3, + 'M_NADPH_c': 17, 'M_NEOKESTOSE_c': 1, - 'M_NIACINAMIDE_c': 17, - 'M_NIACINE_c': 17, + 'M_NIACINAMIDE_c': 2, + 'M_NIACINE_c': 8, 'M_NICOTINAMIDE_NUCLEOTIDE_c': 17, 'M_NICOTINAMIDE_RIBOSE_c': 17, 'M_NICOTINATE_NUCLEOTIDE_c': 16, 'M_NMNH_c': 13, 'M_NYSTOSE_c': 1, - 'M_N__45__23__45__DIHYDROXYBENZOYL__45__L__45__SERINE_c': 2, - 'M_N__45__5__45__PHOSPHORIBOSYL__45__ANTHRANILATE_c': 13, - 'M_N__45__ACETYL__45__D__45__GLUCOSAMINE__45__1__45__P_c': 3, - 'M_N__45__ACETYL__45__GLUTAMYL__45__P_c': 1, - 'M_N__45__ALPHA__45__ACETYLORNITHINE_c': 1, - 'M_N__45__FORMIMINO__45__L__45__GLUTAMATE_c': 4, + 'M_O__45__SUCCINYL__45__L__45__HOMOSERINE_c': 9, + 'M_O__45__SUCCINYLBENZOATE_c': 4, 'M_OH__45__PYR_c': 1, 'M_OH_c': 9, - 'M_OLEATE__45__CPD_c': 17, 'M_OLEOYL__45__COA_c': 2, 'M_OROTATE_c': 11, 'M_OROTIDINE__45__5__45__PHOSPHATE_c': 10, @@ -535,114 +495,105 @@ 'M_OXAMATE_c': 2, 'M_OXIDIZED__45__GLUTATHIONE_c': 3, 'M_OXYGEN__45__MOLECULE_c': 9, - 
'M_O__45__SUCCINYLBENZOATE_c': 4, - 'M_O__45__SUCCINYL__45__L__45__HOMOSERINE_c': 9, + 'M_P__45__AMINO__45__BENZOATE_c': 13, + 'M_P__45__HYDROXY__45__PHENYLPYRUVATE_c': 3, + 'M_P__45__RIBOSYL__45__4__45__SUCCCARB__45__AMINOIMIDAZOLE_c': 1, 'M_P3I_c': 9, - 'M_PALMITATE_c': 17, + 'M_PALMITATE_c': 1, 'M_PALMITYL__45__COA_c': 3, 'M_PANTETHEINE__45__P_c': 3, - 'M_PANTOTHENATE_c': 17, + 'M_PANTOTHENATE_c': 1, + 'M_PHE_c': 2, 'M_PHENYL__45__PYRUVATE_c': 4, - 'M_PHE_c': 17, + 'M_PHOSPHO__45__ENOL__45__PYRUVATE_c': 17, 'M_PHOSPHORIBOSYL__45__AMP_c': 14, 'M_PHOSPHORIBOSYL__45__ATP_c': 14, 'M_PHOSPHORIBOSYL__45__FORMAMIDO__45__CARBOXAMIDE_c': 1, 'M_PHOSPHORIBOSYL__45__FORMIMINO__45__AICAR__45__P_c': 14, 'M_PHOSPHORIBULOSYL__45__FORMIMINO__45__AICAR__45__P_c': 14, - 'M_PHOSPHO__45__ENOL__45__PYRUVATE_c': 17, + 'M_Pi_c': 17, 'M_PPI_c': 17, 'M_PREPHENATE_c': 10, + 'M_PRO_c': 1, 'M_PROPIONATE_c': 1, 'M_PROPIONYL__45__COA_c': 1, 'M_PROPIONYL__45__P_c': 1, 'M_PROTON_c': 17, 'M_PROTON_e': 17, - 'M_PRO_c': 17, 'M_PRPP_c': 16, 'M_PUTRESCINE_c': 6, + 'M_PYRIDOXAL_c': 9, 'M_PYRIDOXAL_PHOSPHATE_c': 15, - 'M_PYRIDOXAL_c': 17, 'M_PYRIDOXAMINE__45__5P_c': 15, - 'M_PYRIDOXAMINE_c': 17, + 'M_PYRIDOXAMINE_c': 2, 'M_PYRIDOXINE__45__5P_c': 15, - 'M_PYRIDOXINE_c': 17, + 'M_PYRIDOXINE_c': 2, 'M_PYRUVATE_c': 17, - 'M_P__45__AMINO__45__BENZOATE_c': 13, - 'M_P__45__HYDROXY__45__PHENYLPYRUVATE_c': 3, - 'M_P__45__RIBOSYL__45__4__45__SUCCCARB__45__AMINOIMIDAZOLE_c': 1, - 'M_Pi_c': 17, 'M_QUINATE_c': 1, 'M_QUINOLINATE_c': 16, - 'M_RIBOFLAVIN_c': 17, + 'M_R__45____45__ALLANTOIN_c': 2, + 'M_R__45__4__45__PHOSPHOPANTOTHENOYL__45__L__45__CYSTEINE_c': 3, + 'M_RIBOFLAVIN_c': 9, 'M_RIBOSE__45__1P_c': 10, 'M_RIBOSE__45__5P_c': 16, 'M_RIBULOSE__45__5P_c': 16, - 'M_R__45__4__45__PHOSPHOPANTOTHENOYL__45__L__45__CYSTEINE_c': 3, - 'M_R__45____45__ALLANTOIN_c': 2, - 'M_Retinols_c': 17, + 'M_S__45__ADENOSYLMETHIONINAMINE_c': 6, + 'M_S__45__ADENOSYLMETHIONINE_c': 17, + 'M_S__45__ALLANTOIN_c': 2, + 
'M_S__45__CITRAMALATE_c': 1, + 'M_S__45__LACTOYL__45__GLUTATHIONE_c': 3, + 'M_SER_c': 15, 'M_SERYL__45__AMP_c': 2, - 'M_SER_c': 17, 'M_SHIKIMATE__45__5P_c': 15, 'M_SHIKIMATE_c': 15, + 'M_Short__45__glucans_c': 1, 'M_SO3_c': 1, - 'M_SORBITOL_c': 17, 'M_SPERMIDINE_c': 5, - 'M_STEARIC_ACID_c': 17, 'M_STEAROYL__45__COA_c': 2, - 'M_SUCC__45__S__45__ALD_c': 2, - 'M_SUCROSE_c': 17, 'M_SUC__45__COA_c': 1, 'M_SUC_c': 15, + 'M_SUCC__45__S__45__ALD_c': 2, + 'M_SUCROSE_c': 1, 'M_SUPER__45__OXIDE_c': 8, - 'M_S__45__ADENOSYLMETHIONINAMINE_c': 6, - 'M_S__45__ADENOSYLMETHIONINE_c': 17, - 'M_S__45__ALLANTOIN_c': 2, - 'M_S__45__CITRAMALATE_c': 1, - 'M_S__45__LACTOYL__45__GLUTATHIONE_c': 3, - 'M_Short__45__glucans_c': 1, - 'M_Starch_c': 17, 'M_TARTRONATE__45__S__45__ALD_c': 1, - 'M_TETRACOSANOATE_c': 17, - 'M_THF_c': 17, - 'M_THIAMINE__45__PYROPHOSPHATE_c': 16, + 'M_THF_c': 5, 'M_THIAMINE__45__P_c': 4, - 'M_THIAMINE_c': 17, + 'M_THIAMINE__45__PYROPHOSPHATE_c': 16, + 'M_THIAMINE_c': 4, + 'M_THR_c': 6, 'M_THREO__45__DS__45__ISO__45__CITRATE_c': 13, - 'M_THR_c': 17, 'M_THZ__45__P_c': 4, 'M_THZ_c': 4, - 'M_TRP_c': 17, + 'M_TRP_c': 13, + 'M_TYR_c': 3, 'M_TYRAMINE_c': 1, - 'M_TYR_c': 17, 'M_UDP__45__AA__45__GLUTAMATE_c': 3, 'M_UDP__45__ACETYL__45__CARBOXYVINYL__45__GLUCOSAMINE_c': 3, 'M_UDP__45__D__45__GALACTO__45__14__45__FURANOSE_c': 7, - 'M_UDP__45__MANNACA_c': 1, 'M_UDP__45__MANNAC_c': 3, - 'M_UDP__45__N__45__ACETYLMURAMATE_c': 3, + 'M_UDP__45__MANNACA_c': 1, 'M_UDP__45__N__45__ACETYL__45__D__45__GLUCOSAMINE_c': 3, + 'M_UDP__45__N__45__ACETYLMURAMATE_c': 3, 'M_UDP_c': 10, 'M_UMP_c': 10, 'M_URACIL_c': 3, - 'M_URATE_c': 17, + 'M_URATE_c': 11, 'M_UREA_c': 17, 'M_URIDINE_c': 10, 'M_UROCANATE_c': 4, 'M_UTP_c': 10, - 'M_VAL_c': 17, - 'M_VITAMIN_D3_c': 17, + 'M_VAL_c': 15, 'M_WATER_c': 17, 'M_XANTHINE_c': 10, 'M_XANTHOSINE__45__5__45__PHOSPHATE_c': 10, 'M_XANTHOSINE_c': 10, 'M_XTP_c': 7, - 'M_XYLITOL_c': 17, - 'M_XYLULOSE__45__5__45__PHOSPHATE_c': 16, - 'M_ZN__43__2_c': 17 - } + 
'M_XYLITOL_c': 4, + 'M_XYLULOSE__45__5__45__PHOSPHATE_c': 16 +} -def test_m2m_cscope_call(): +def test_m2m_iscope_call(): """ Test m2m addedvalue when called in terminal. """ @@ -692,4 +643,4 @@ def test_m2m_cscope_call(): shutil.rmtree(respath) if __name__ == "__main__": - test_m2m_cscope_call() \ No newline at end of file + test_m2m_iscope_call() \ No newline at end of file diff --git a/test/test_m2m_metacom.py b/test/test_m2m_metacom.py index 35654ee..6f32e11 100644 --- a/test/test_m2m_metacom.py +++ b/test/test_m2m_metacom.py @@ -3,7 +3,7 @@ """ Description: -Test m2m addedvalue on 17 metabolic networks and a file representing growth medium (seeds). +Test m2m metacom on 17 metabolic networks and a file representing growth medium (seeds). """ import os @@ -77,7 +77,10 @@ 'M_REDUCED__45__MENAQUINONE_c', 'M_CPD__45__14378_c', 'M_2__45__METHYL__45__ACETO__45__ACETYL__45__COA_c', 'M_CPD0__45__2121_c', 'M_CPD__45__12773_c', 'M_CPD__45__13644_c', - 'M_ALPHA__45__GLC__45__6__45__P_c' + 'M_ALPHA__45__GLC__45__6__45__P_c', + 'M_CELLULOSE_c', + 'M_DODECANOATE_c', + 'M_STEARIC_ACID_c', } UNION = { @@ -94,70 +97,75 @@ } MIN_SIZE_COM = 13 PROD_TARGETS = { - "M_ALPHA__45__GLC__45__6__45__P_c", "M_CPD__45__448_c", "M_CPD__45__650_c", - "M_GLC__45__6__45__P_c", "M_CPD__45__12117_c", - "M_ALL__45__TRANS__45__HEPTAPRENYL__45__DIPHOSPHATE_c", "M_FADH2_c", - "M_UNDECAPRENYL__45__DIPHOSPHATE_c", "M_CPD__45__15199_c", - "M_CPD__45__18529_c", "M_HOMO__45__SER_c", "M_ACETOACETYL__45__COA_c", - "M_K__45__HEXANOYL__45__COA_c", - "M_OCTAPRENYL__45__METHYL__45__METHOXY__45__BENZQ_c", "M_CPD__45__12304_c", - "M_CPD__45__568_c", "M_PHOSPHORIBOSYL__45__CARBOXY__45__AMINOIMIDAZOLE_c", - "M_OCTAPRENYL__45__DIPHOSPHATE_c", "M_FARNESYL__45__PP_c", - "M_CPD__45__421_c", "M_CPD__45__12179_c", "M_CPD__45__597_c", - "M_CPD__45__221_c", "M_2__45__OCTAPRENYL__45__6__45__HYDROXYPHENOL_c", - "M_CPD__45__507_c", "M_CPD__45__7000_c", "M_MEVALONATE_c", - "M_CPD__45__4211_c", "M_GERANYL__45__PP_c", - 
"M_2__45__METHYL__45__ACETO__45__ACETYL__45__COA_c", - "M_2__45__OCTAPRENYL__45__6__45__METHOXYPHENOL_c", "M_DAMP_c", - "M_3__45__HYDROXY__45__3__45__METHYL__45__GLUTARYL__45__COA_c", - "M_CARBON__45__MONOXIDE_c", "M_CPD0__45__2123_c", "M_CPD__45__12305_c", - "M_ACETYL__45__ETCETERA__45__GLUCOSAMINYLDIPHOSPHOUND_c", - "M_ACETYL__45__D__45__GLUCOSAMINYLDIPHOSPHO__45__UNDECAPRE_c", - "M_CPD__45__12115_c", "M_CPD__45__466_c", "M_CPD__45__12173_c", - "M_ACETONE_c", "M_UDP__45__GLUCURONATE_c", "M_CPD__45__12175_c", - "M_LACTOYL__45__COA_c", "M_Teichoic__45__P__45__Gro__45__Glc_c", - "M_DEOXYADENOSINE_c", "M_DEOXYURIDINE_c", - "M_5__45__P__45__RIBOSYL__45__N__45__FORMYLGLYCINEAMIDE_c", - "M_CPD__45__14017_c", "M_CPD__45__9852_c", "M_CPD__45__569_c", - "M_CPD0__45__2339_c", "M_ACRYLYL__45__COA_c", "M_DEOXYCYTIDINE_c", - "M_C55__45__PP__45__GLCNAC__45__MANNACA_c", "M_HSO3_c", "M_CPD__45__804_c", - "M_Teichoic__45__P__45__Gro_c", "M_CPD__45__12307_c", - "M_2__45__OCTAPRENYLPHENOL_c", "M_TREHALOSE__45__6P_c", - "M_D__45__LACTOYL__45__COA_c", "M_TREHALOSE_c", - "M_O__45__PHOSPHO__45__L__45__HOMOSERINE_c", "M_CPD__45__9853_c", - "M_DELTA3__45__ISOPENTENYL__45__PP_c", - "M_5__45__BETA__45__L__45__THREO__45__PENTAPYRANOSYL__45__4__45__ULOSE__45___c", - "M_REDUCED__45__MENAQUINONE_c", "M_C4_c", "M_DADP_c", "M_CPD0__45__2121_c", - "M_CPD__45__592_c", - "M_UDP__45__4__45__AMINO__45__4__45__DEOXY__45__L__45__ARABINOSE_c", - "M_CPD0__45__2338_c", "M_CPD__45__12306_c", "M_CPD__45__9646_c", - "M_N2__45__SUCCINYLORNITHINE_c", "M_CPD__45__13644_c", - "M_CPD__45__12303_c", "M_CPD0__45__2244_c", "M_CPD__45__16607_c", - "M_S__45__3__45__HYDROXYBUTANOYL__45__COA_c", "M_S2O3_c", - "M_Teichoic__45__P__45__Gro__45__Glc_e", "M_CPD__45__16020_c", - "M_CPD__45__12773_c", "M_CPD0__45__2331_c", - "M_2__45__KETO__45__3__45__DEOXY__45__D__45__GLUCARATE_c", - "M_CPD0__45__2340_c", "M_FORMYL__45__COA_c", - "M_OCTAPRENYL__45__METHOXY__45__BENZOQUINONE_c", "M_CPD__45__499_c", - "M_CPD__45__822_c", 
"M_CPD__45__3462_c", "M_CPD__45__12309_c", - "M_CPD0__45__181_c", "M_CPD__45__12308_c", "M_RETINAL_c", - "M_CPD__45__672_c", "M_CMP__45__KDO_c", "M_METHACRYLYL__45__COA_c", - "M_CPD__45__12177_c", "M_CROTONYL__45__COA_c", - "M_3__45__OCTAPRENYL__45__4__45__HYDROXYBENZOATE_c", - "M_OH__45__HEXANOYL__45__COA_c", "M_CPD__45__7695_c", - "M_OXALYL__45__COA_c", "M_CPD__45__641_c", "M_CPD__45__12125_c", - "M_CPD__45__5802_c", "M_CH3__45__MALONATE__45__S__45__ALD_c", - "M_4__45__GUANIDO__45__BUTYRAMIDE_c", "M_CPD__45__19958_c", - "M_CPD__45__14378_c", "M_INDOLEYL__45__CPD_c", - "M_5__45__PHOSPHORIBOSYL__45__N__45__FORMYLGLYCINEAMIDINE_c", - "M_5__45__PHOSPHORIBOSYL__45__5__45__AMINOIMIDAZOLE_c", - "M_CPD__45__14021_c", "M_MANNITOL_c" + 'M_3__45__OCTAPRENYL__45__4__45__HYDROXYBENZOATE_c', + 'M_CPD__45__16607_c', 'M_CPD__45__221_c', 'M_CPD__45__16020_c', + 'M_D__45__LACTOYL__45__COA_c', 'M_CPD__45__7695_c', + 'M_CPD0__45__2331_c', 'M_CARBON__45__MONOXIDE_c', 'M_CPD__45__12306_c', + 'M_CPD0__45__2123_c', 'M_HSO3_c', 'M_CPD__45__12179_c', + 'M_CPD0__45__2338_c', 'M_CPD__45__641_c', 'M_CPD__45__421_c', + 'M_ACETOACETYL__45__COA_c', 'M_FADH2_c', 'M_TREHALOSE__45__6P_c', + 'M_ACETONE_c', 'M_CPD__45__592_c', 'M_DEOXYCYTIDINE_c', + 'M_C55__45__PP__45__GLCNAC__45__MANNACA_c', + 'M_ACETYL__45__ETCETERA__45__GLUCOSAMINYLDIPHOSPHOUND_c', + 'M_HOMO__45__SER_c', 'M_RETINAL_c', + 'M_S__45__3__45__HYDROXYBUTANOYL__45__COA_c', 'M_CPD__45__672_c', + 'M_CPD__45__18529_c', + 'M_5__45__PHOSPHORIBOSYL__45__N__45__FORMYLGLYCINEAMIDINE_c', + 'M_DELTA3__45__ISOPENTENYL__45__PP_c', 'M_CPD__45__507_c', + 'M_K__45__HEXANOYL__45__COA_c', 'M_OCTAPRENYL__45__DIPHOSPHATE_c', + 'M_CMP__45__KDO_c', 'M_DEOXYADENOSINE_c', 'M_OXALYL__45__COA_c', + 'M_C4_c', 'M_METHACRYLYL__45__COA_c', 'M_FARNESYL__45__PP_c', + 'M_PHOSPHORIBOSYL__45__CARBOXY__45__AMINOIMIDAZOLE_c', + 'M_CROTONYL__45__COA_c', 'M_MEVALONATE_c', + 'M_4__45__GUANIDO__45__BUTYRAMIDE_c', 'M_CPD__45__12305_c', + 'M_DEOXYURIDINE_c', 
'M_2__45__OCTAPRENYL__45__6__45__METHOXYPHENOL_c', + 'M_CPD__45__650_c', 'M_CPD__45__14017_c', 'M_CPD__45__14021_c', + 'M_5__45__PHOSPHORIBOSYL__45__5__45__AMINOIMIDAZOLE_c', + 'M_INDOLEYL__45__CPD_c', 'M_CPD__45__569_c', + 'M_O__45__PHOSPHO__45__L__45__HOMOSERINE_c', 'M_CPD__45__7000_c', + 'M_CPD__45__568_c', 'M_LACTOYL__45__COA_c', 'M_CPD__45__12303_c', + 'M_CPD__45__12173_c', 'M_FORMYL__45__COA_c', 'M_CPD__45__499_c', + 'M_CPD0__45__2340_c', + 'M_OCTAPRENYL__45__METHYL__45__METHOXY__45__BENZQ_c', + 'M_2__45__OCTAPRENYL__45__6__45__HYDROXYPHENOL_c', + 'M_GERANYL__45__PP_c', 'M_Teichoic__45__P__45__Gro__45__Glc_c', + 'M_CPD__45__4211_c', 'M_CPD__45__9646_c', 'M_CPD__45__12177_c', + 'M_OCTAPRENYL__45__METHOXY__45__BENZOQUINONE_c', 'M_CPD__45__466_c', + 'M_DADP_c', 'M_CPD__45__5802_c', 'M_CPD__45__804_c', + 'M_ACRYLYL__45__COA_c', 'M_UNDECAPRENYL__45__DIPHOSPHATE_c', + 'M_Teichoic__45__P__45__Gro_c', + 'M_2__45__KETO__45__3__45__DEOXY__45__D__45__GLUCARATE_c', + 'M_CPD__45__12307_c', 'M_OH__45__HEXANOYL__45__COA_c', + 'M_CPD0__45__181_c', 'M_CPD__45__822_c', + 'M_Teichoic__45__P__45__Gro__45__Glc_e', 'M_CPD0__45__2339_c', + 'M_CPD__45__12304_c', 'M_CPD__45__448_c', 'M_CPD__45__3462_c', + 'M_ACETYL__45__D__45__GLUCOSAMINYLDIPHOSPHO__45__UNDECAPRE_c', + 'M_2__45__OCTAPRENYLPHENOL_c', 'M_UDP__45__GLUCURONATE_c', + 'M_CPD__45__9852_c', 'M_DAMP_c', 'M_CPD__45__15199_c', + 'M_CPD0__45__2244_c', + 'M_UDP__45__4__45__AMINO__45__4__45__DEOXY__45__L__45__ARABINOSE_c', + 'M_CPD__45__12117_c', 'M_S2O3_c', 'M_CPD__45__12175_c', + 'M_CPD__45__12125_c', 'M_CPD__45__19958_c', + 'M_5__45__P__45__RIBOSYL__45__N__45__FORMYLGLYCINEAMIDE_c', + 'M_CPD__45__12115_c', 'M_N2__45__SUCCINYLORNITHINE_c', + 'M_CPD__45__597_c', + 'M_ALL__45__TRANS__45__HEPTAPRENYL__45__DIPHOSPHATE_c', + 'M_3__45__HYDROXY__45__3__45__METHYL__45__GLUTARYL__45__COA_c', + 'M_TREHALOSE_c', 'M_CPD__45__12309_c', + 'M_CH3__45__MALONATE__45__S__45__ALD_c', 'M_GLC__45__6__45__P_c', + 'M_CPD__45__12308_c', 
'M_CPD__45__9853_c', + 'M_5__45__BETA__45__L__45__THREO__45__PENTAPYRANOSYL__45__4__45__ULOSE__45___c', + 'M_REDUCED__45__MENAQUINONE_c', 'M_CPD__45__14378_c', + 'M_2__45__METHYL__45__ACETO__45__ACETYL__45__COA_c', + 'M_CPD0__45__2121_c', 'M_CPD__45__12773_c', 'M_CPD__45__13644_c', + 'M_ALPHA__45__GLC__45__6__45__P_c', + 'M_MANNITOL_c' } NUMBER_BACT = 17 -SIZE_UNION = 625 -SIZE_INTERSECTION = 135 -SIZE_CSCOPE = 651 +SIZE_UNION = 576 +SIZE_INTERSECTION = 50 +SIZE_CSCOPE = 698 def test_m2m_metacom_call(): diff --git a/test/test_tiny_toy_metacom.py b/test/test_tiny_toy_metacom.py new file mode 100644 index 0000000..4c5337e --- /dev/null +++ b/test/test_tiny_toy_metacom.py @@ -0,0 +1,495 @@ +#!/usr/bin/env python3 +# -*- coding: utf-8 -*- + +""" +Description: +Test m2m metacom on a tiny dataset that is easily visualized and understood. +""" + +import os +import shutil +import subprocess +import tarfile +import json +from libsbml import SBMLReader +from metage2metabo.m2m.m2m_workflow import metacom_analysis + +ISCOPE = { + "bact1": [ + "M_A_c", + "M_B_c", + "M_R_c", + "M_H_c", + "M_D_c" + ], + "bact6": [], + "bact4": [ + "M_B_c" + ], + "bact5": [], + "bact2": [ + "M_B_c", + "M_N_c", + "M_A_c", + "M_S2_c" + ], + "bact3": [ + "M_A_c", + "M_B_c" + ] +} + +PRODUCED_SEEDS_ISCOPE = { + "individually_producible_seeds": { + "bact1": [], + "bact2": [ + "M_S2_c" + ], + "bact3": [], + "bact4": [], + "bact5": [], + "bact6": [] + }, + "seeds_absent_in_metabolic_network": { + "bact1": [], + "bact2": [], + "bact3": [], + "bact4": [ + "M_S1_c" + ], + "bact5": [ + "M_S2_c" + ], + "bact6": [ + "M_S2_c" + ] + } +} + +REV_ISCOPE = { + "M_A_c": [ + "bact1", + "bact2", + "bact3" + ], + "M_B_c": [ + "bact1", + "bact4", + "bact2", + "bact3" + ], + "M_D_c": [ + "bact1" + ], + "M_H_c": [ + "bact1" + ], + "M_N_c": [ + "bact2" + ], + "M_R_c": [ + "bact1" + ], + "M_S2_c": [ + "bact2" + ] +} + +COMSCOPE = { + "com_scope": [ + "M_X_c", + "M_N_c", + "M_A_c", + "M_K_c", + "M_F_c", + "M_B_c", + "M_C_c", 
+ "M_R_c", + "M_H_c", + "M_V_c", + "M_S1_c", + "M_D_c", + "M_E_c", + "M_G_c", + "M_S2_c" + ] +} + +REV_CSCOPE = { + "M_A_c": [ + "bact1", + "bact2", + "bact3" + ], + "M_B_c": [ + "bact1", + "bact2", + "bact3", + "bact4" + ], + "M_C_c": [ + "bact5", + "bact6" + ], + "M_D_c": [ + "bact1" + ], + "M_E_c": [ + "bact2", + "bact3" + ], + "M_F_c": [ + "bact1" + ], + "M_G_c": [ + "bact2", + "bact3" + ], + "M_H_c": [ + "bact1", + "bact2", + "bact3" + ], + "M_K_c": [ + "bact5", + "bact6" + ], + "M_N_c": [ + "bact2", + "bact3" + ], + "M_R_c": [ + "bact1" + ], + "M_S1_c": [ + "bact3" + ], + "M_S2_c": [ + "bact2" + ], + "M_V_c": [ + "bact5" + ], + "M_X_c": [ + "bact4", + "bact6" + ] +} + +MICROBE_CONTRIBUTIONS = { + "bact1": { + "community_metabolic_gain": [ + "M_F_c" + ], + "produced_alone": [ + "M_A_c", + "M_H_c", + "M_D_c", + "M_B_c", + "M_R_c" + ], + "produced_in_community": [ + "M_H_c", + "M_D_c", + "M_R_c", + "M_A_c", + "M_F_c", + "M_B_c" + ] + }, + "bact2": { + "community_metabolic_gain": [ + "M_H_c", + "M_E_c", + "M_G_c" + ], + "produced_alone": [ + "M_B_c", + "M_N_c", + "M_S2_c", + "M_A_c" + ], + "produced_in_community": [ + "M_H_c", + "M_E_c", + "M_S2_c", + "M_A_c", + "M_B_c", + "M_N_c", + "M_G_c" + ] + }, + "bact3": { + "community_metabolic_gain": [ + "M_N_c", + "M_E_c", + "M_H_c", + "M_S1_c", + "M_G_c" + ], + "produced_alone": [ + "M_B_c", + "M_A_c" + ], + "produced_in_community": [ + "M_H_c", + "M_E_c", + "M_A_c", + "M_N_c", + "M_S1_c", + "M_B_c", + "M_G_c" + ] + }, + "bact4": { + "community_metabolic_gain": [ + "M_X_c" + ], + "produced_alone": [ + "M_B_c" + ], + "produced_in_community": [ + "M_B_c", + "M_X_c" + ] + }, + "bact5": { + "community_metabolic_gain": [ + "M_K_c", + "M_V_c", + "M_C_c" + ], + "produced_alone": [], + "produced_in_community": [ + "M_K_c", + "M_V_c", + "M_C_c" + ] + }, + "bact6": { + "community_metabolic_gain": [ + "M_X_c", + "M_K_c", + "M_C_c" + ], + "produced_alone": [], + "produced_in_community": [ + "M_X_c", + "M_K_c", + "M_C_c" + ] + } +} 
+ +EXPECTED_TARGETS_ADVAL = { + 'M_X_c', + 'M_K_c', + 'M_F_c', + 'M_C_c', + 'M_E_c', + 'M_V_c', + 'M_S1_c', + 'M_G_c' +} + +UNION_MINCOM = { + 'bact1', 'bact2', 'bact3', + 'bact5', 'bact6' +} +INTERSECTION_MINCOM = { + 'bact1' +} +MIN_SIZE_COM = 3 +PROD_TARGETS = { + 'M_C_c', + 'M_F_c', + 'M_H_c' +} + +UNPROD_TARGETS = { + 'M_foo_c' +} + +PRODUCIBILITY_TARGETS = { + "unproducible": [ + "M_foo_c" + ], + "producible": [ + "M_F_c", + "M_C_c", + "M_H_c" + ], + "indiv_producible": [ + "M_H_c" + ], + "individual_producers": { + "M_H_c": [ + "bact1" + ] + }, + "com_only_producers": { + "M_C_c": [ + "bact5", + "bact6" + ], + "M_H_c": [ + "bact2", + "bact3" + ], + "M_F_c": [ + "bact1" + ] + }, + "mincom_producible": [ + "M_F_c", + "M_C_c", + "M_H_c" + ], + "key_species": [ + "bact2", + "bact1", + "bact5", + "bact3", + "bact6" + ], + "mincom_union_producers": { + "M_C_c": [ + "bact5", + "bact6" + ], + "M_H_c": [ + "bact1", + "bact3", + "bact2" + ], + "M_F_c": [ + "bact1" + ] + }, + "mincom_inter_producers": { + "M_H_c": [ + "bact1" + ], + "M_F_c": [ + "bact1" + ] + } +} + +NUMBER_BACT = 6 +SIZE_UNION = 7 +SIZE_INTERSECTION = 0 +SIZE_CSCOPE = 15 + + +def test_m2m_metacom_tiny_toy(): + """ + Test m2m metacom when called from the API using the tiny toy dataset. 
+ """ + # RUN THE COMMAND + inppath = 'metabolic_data/tiny_toy' + respath = 'tiny_metacom_output' + networks_path = os.path.join(inppath, 'networks') + seeds_path = os.path.join(inppath, 'seeds_community.sbml') + targets_path = os.path.join(inppath, 'targets_community.sbml') + + metacom_analysis(sbml_dir = networks_path, out_dir = respath, seeds = seeds_path, host_mn = None, targets_file = targets_path, cpu_number=1) + + # result files + iscope_file = os.path.join(*[respath, 'indiv_scopes', 'indiv_scopes.json']) + rev_iscope_file = os.path.join(*[respath, 'indiv_scopes', 'rev_iscope.json']) + seeds_iscope_file = os.path.join(*[respath, 'indiv_scopes', 'seeds_in_indiv_scopes.json']) + cscope_file = os.path.join(*[respath, 'community_analysis', 'comm_scopes.json']) + rev_cscope_file = os.path.join(*[respath, 'community_analysis', 'rev_cscope.json']) + contrib_microbes_file = os.path.join(*[respath, 'community_analysis', 'contributions_of_microbes.json']) + addedvalue_file = os.path.join(*[respath, 'community_analysis', 'addedvalue.json']) + mincom_file = os.path.join(*[respath, 'community_analysis', 'mincom.json']) + targets_file = os.path.join(*[respath, 'community_analysis', 'targets.sbml']) + producibility_targets_file = os.path.join(*[respath, 'producibility_targets.json']) + + # open and load all result files + with open(iscope_file, 'r') as json_idata: + iscope = json.load(json_idata) + with open(rev_iscope_file, 'r') as json_idata: + rev_iscope = json.load(json_idata) + with open(seeds_iscope_file, 'r') as json_idata: + seeds_iscope = json.load(json_idata) + with open(cscope_file, 'r') as json_cdata: + cscope = json.load(json_cdata) + with open(rev_cscope_file, 'r') as json_cdata: + rev_cscope = json.load(json_cdata) + with open(contrib_microbes_file, 'r') as json_cdata: + contrib_microbes = json.load(json_cdata) + with open(addedvalue_file, 'r') as json_cdata: + addedvalue = json.load(json_cdata) + with open(mincom_file, 'r') as json_cdata: + mincom = 
json.load(json_cdata) + with open(producibility_targets_file, 'r') as json_data: + producibility_targets = json.load(json_data) + + # ISCOPE ANALYSIS + # ensure there is the right number of computed indiv scopes + assert len(iscope) == NUMBER_BACT + # ensure the union and intersection are ok + iscope_set = {} + for elem in iscope: + iscope_set[elem] = set(iscope[elem]) + # union of iscopes + union_iscope = set.union(*list(iscope_set.values())) + assert len(union_iscope) == SIZE_UNION + intersection_iscope = set.intersection(*list(iscope_set.values())) + # intersection of iscopes + assert len(intersection_iscope) == SIZE_INTERSECTION + # scope content + for bact in iscope: + assert set(iscope[bact]) == set(ISCOPE[bact]) + # reverse iscope + for compound in rev_iscope: + assert set(rev_iscope[compound]) == set(REV_ISCOPE[compound]) + # seeds in iscope + for category in PRODUCED_SEEDS_ISCOPE: + for bact in PRODUCED_SEEDS_ISCOPE[category]: + assert set(seeds_iscope[category][bact]) == set(PRODUCED_SEEDS_ISCOPE[category][bact]) + + # CSCOPE ANALYSIS + # comscope content + assert set(cscope['com_scope']) == set(COMSCOPE['com_scope']) + # reverse cscope + for compound in rev_cscope: + assert set(rev_cscope[compound]) == set(REV_CSCOPE[compound]) + # contributions of microbes + for bact in contrib_microbes: + for category in contrib_microbes[bact]: + assert set(contrib_microbes[bact][category]) == set(MICROBE_CONTRIBUTIONS[bact][category]) + + # ADDEDVALUE ANALYSIS + # newly producible compounds + assert set(addedvalue['addedvalue']) == EXPECTED_TARGETS_ADVAL + reader = SBMLReader() + document = reader.readSBML(targets_file) + new_targets = set([specie.getId() for specie in document.getModel().getListOfSpecies()]) + assert new_targets == PROD_TARGETS.union(UNPROD_TARGETS) + + # MINCOM ANALYSIS + # ensure the minimal number of bacteria in a minimal community is ok + assert len(mincom['bacteria']) == MIN_SIZE_COM + # ensure the bacteria in union are ok + assert 
set(mincom['union_bacteria']) == UNION_MINCOM + # ensure the bacteria in intersection are ok + assert set(mincom['inter_bacteria']) == INTERSECTION_MINCOM + # ensure the newly producible targets are ok + assert set(mincom['producible']) == PROD_TARGETS + + # PRODUCIBILITY ANALYSIS + for key in PRODUCIBILITY_TARGETS: + if key in ["unproducible", "producible", "indiv_producible", "mincom_producible", "key_species"]: + assert set(producibility_targets[key]) == set(PRODUCIBILITY_TARGETS[key]) + else: + for compound in producibility_targets[key]: + assert set(producibility_targets[key][compound]) == set(PRODUCIBILITY_TARGETS[key][compound]) + + # clean + shutil.rmtree(respath) + +if __name__ == "__main__": + test_m2m_metacom_tiny_toy() diff --git a/tutorials/method_tutorial/images/tiny_toy_association_graph.pdf b/tutorials/method_tutorial/images/tiny_toy_association_graph.pdf new file mode 100644 index 0000000..25b25cd Binary files /dev/null and b/tutorials/method_tutorial/images/tiny_toy_association_graph.pdf differ diff --git a/tutorials/method_tutorial/images/tiny_toy_association_graph.png b/tutorials/method_tutorial/images/tiny_toy_association_graph.png new file mode 100644 index 0000000..40cf183 Binary files /dev/null and b/tutorials/method_tutorial/images/tiny_toy_association_graph.png differ diff --git a/tutorials/method_tutorial/images/tiny_toy_cscope.pdf b/tutorials/method_tutorial/images/tiny_toy_cscope.pdf new file mode 100644 index 0000000..196d28f Binary files /dev/null and b/tutorials/method_tutorial/images/tiny_toy_cscope.pdf differ diff --git a/tutorials/method_tutorial/images/tiny_toy_cscope.png b/tutorials/method_tutorial/images/tiny_toy_cscope.png new file mode 100644 index 0000000..9d477fd Binary files /dev/null and b/tutorials/method_tutorial/images/tiny_toy_cscope.png differ diff --git a/tutorials/method_tutorial/images/tiny_toy_cscope_metaorg.pdf b/tutorials/method_tutorial/images/tiny_toy_cscope_metaorg.pdf new file mode 100644 index 
0000000..49376e2 Binary files /dev/null and b/tutorials/method_tutorial/images/tiny_toy_cscope_metaorg.pdf differ diff --git a/tutorials/method_tutorial/images/tiny_toy_cscope_metaorg.png b/tutorials/method_tutorial/images/tiny_toy_cscope_metaorg.png new file mode 100644 index 0000000..a7d1fff Binary files /dev/null and b/tutorials/method_tutorial/images/tiny_toy_cscope_metaorg.png differ diff --git a/tutorials/method_tutorial/images/tiny_toy_enum_sol.pdf b/tutorials/method_tutorial/images/tiny_toy_enum_sol.pdf new file mode 100644 index 0000000..cb94fd0 Binary files /dev/null and b/tutorials/method_tutorial/images/tiny_toy_enum_sol.pdf differ diff --git a/tutorials/method_tutorial/images/tiny_toy_enum_sol.png b/tutorials/method_tutorial/images/tiny_toy_enum_sol.png new file mode 100644 index 0000000..af560d3 Binary files /dev/null and b/tutorials/method_tutorial/images/tiny_toy_enum_sol.png differ diff --git a/tutorials/method_tutorial/images/tiny_toy_iscopes.pdf b/tutorials/method_tutorial/images/tiny_toy_iscopes.pdf new file mode 100644 index 0000000..b2fc3b7 Binary files /dev/null and b/tutorials/method_tutorial/images/tiny_toy_iscopes.pdf differ diff --git a/tutorials/method_tutorial/images/tiny_toy_iscopes.png b/tutorials/method_tutorial/images/tiny_toy_iscopes.png new file mode 100644 index 0000000..1ae5e80 Binary files /dev/null and b/tutorials/method_tutorial/images/tiny_toy_iscopes.png differ diff --git a/tutorials/method_tutorial/images/tiny_toy_mincom_onesol.png b/tutorials/method_tutorial/images/tiny_toy_mincom_onesol.png new file mode 100644 index 0000000..3d973ec Binary files /dev/null and b/tutorials/method_tutorial/images/tiny_toy_mincom_onesol.png differ diff --git a/tutorials/method_tutorial/images/tiny_toy_networks.pdf b/tutorials/method_tutorial/images/tiny_toy_networks.pdf new file mode 100644 index 0000000..5172aa4 Binary files /dev/null and b/tutorials/method_tutorial/images/tiny_toy_networks.pdf differ diff --git 
a/tutorials/method_tutorial/images/tiny_toy_networks.png b/tutorials/method_tutorial/images/tiny_toy_networks.png new file mode 100644 index 0000000..06bd00f Binary files /dev/null and b/tutorials/method_tutorial/images/tiny_toy_networks.png differ diff --git a/tutorials/method_tutorial/images/tiny_toy_powergraph.png b/tutorials/method_tutorial/images/tiny_toy_powergraph.png new file mode 100644 index 0000000..73be024 Binary files /dev/null and b/tutorials/method_tutorial/images/tiny_toy_powergraph.png differ diff --git a/tutorials/method_tutorial/m2m_tutorial.ipynb b/tutorials/method_tutorial/m2m_tutorial.ipynb index 81188b7..7bf1fce 100644 --- a/tutorials/method_tutorial/m2m_tutorial.ipynb +++ b/tutorials/method_tutorial/m2m_tutorial.ipynb @@ -10,13 +10,15 @@ "\n", "This jupyter notebook uses the following versions of these dependencies:\n", "\n", - "- Metage2Metabo==1.5.0\n", + "- Metage2Metabo==1.6.0\n", "- mpwt==0.6.0\n", "- padmet==5.0.1\n", - "- MeneTools==3.1.1\n", - "- Miscoto==3.1.1\n", + "- MeneTools==3.4.0\n", + "- Miscoto==3.2.0\n", "- bubbletools==0.6.11\n", - "- networkx==2.5" + "- networkx==2.5\n", + "- pandas\n", + "- tabulate==0.9.0" ] }, { @@ -40,34 +42,42 @@ "\n", "The seeds are composed of metabolites A and B.\n", "\n", - "With these inputs, the scope will be compute. As A and B are available as seed, they can be used as reactant to activate the reaction R1, which will produce metabolite D. But the reaction R2 will not be activated as the metabolite C is not availabe in the seeds.\n", + "With these inputs, the scope will be computed. As A and B are available as seed, they can be used as reactant to activate the reaction R1, which will produce metabolite D. But the reaction R2 will not be activated as the metabolite C is not availabe in the seeds.\n", "\n", "The metabolite D, produced by the reaction R1, can activate the reaction R3 to produce the metabolite F.\n", "\n", "

\n", " \n", - "

Figue 1: Individual scope
\n", + "
Figure 1: Individual scope
\n", "

\n", "\n", "### Tutorial\n", "\n", - "For this tutorial, we will use this example:" + "For this tutorial, we will use an example of a community consisting of 6 bacteria, i.e. 6 metabolic networks. The medium used for the simulation consists of 2 metabolites, $S_1$ and $S_2$. Additionally, we are interested in the production of 4 metabolites, $H$, $C$, $E$ and $foo$. In the SBML files of the dataset, each metabolite has the prefix $M\\_$, and a suffix corresponding to the compartment it belongs to, e.g. \"M_S1_c\" is the metabolite $S_1$ in the cytosol. \n", + "\n", + "The metabolic networks of bact1, bact2, bact3, bact4, bact5 and bact6 are depicted below:\n", + "\n", + "

\n", + " \n", + "

Figure 2: Metabolic networks of the toy community
\n", + "

\n", + "\n", + "In each of the networks, the compounds of the nutritional environment (i.e. the seeds) are depicted in yellow, and the metabolites we are interested in (i.e. the targets) are depicted in green.\n", + "\n", + "Let's look at the individual scope of each metabolic network in the medium consisting of the two seeds. " ] }, { "cell_type": "code", - "execution_count": 2, + "execution_count": 20, "metadata": {}, "outputs": [ { - "data": { - "text/plain": [ - "{'M_A_c', 'M_B_c', 'M_D_c', 'M_F_c'}" - ] - }, - "execution_count": 2, - "metadata": {}, - "output_type": "execute_result" + "name": "stdout", + "output_type": "stream", + "text": [ + "{'M_A_c', 'M_S2_c', 'M_H_c', 'M_B_c', 'M_R_c', 'M_N_c', 'M_D_c'}\n" + ] } ], "source": [ @@ -81,12 +91,23 @@ "# m2m iscope -n data/community -s data/seeds.sbml -o output_folder\n", "from metage2metabo.m2m import individual_scope\n", "\n", - "individual_scope.iscope('data/community', 'data/seeds.sbml', 'output_folder')\n" + "union_of_iscopes = individual_scope.iscope('../../test/metabolic_data/tiny_toy/networks/', '../../test/metabolic_data/tiny_toy/seeds_community.sbml', 'output_folder')\n", + "\n", + "## The scopes for each species will be computed in the output_folder. The union of all individual scopes will be returned as a set. \n", + "\n", + "print(union_of_iscopes)\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The union of the individual scopes encompasses 7 metabolites. More details are provided in the output files. Let's look at them." 
] }, { "cell_type": "code", - "execution_count": 3, + "execution_count": 21, "metadata": {}, "outputs": [ { @@ -94,7 +115,7 @@ "output_type": "stream", "text": [ "Individual scope:\n", - "OrgA: 4 metabolites (M_A_c, M_B_c, M_D_c, M_F_c).\n" + "bact1: 5 metabolites (M_A_c, M_B_c, M_D_c, M_H_c, M_R_c).\n" ] } ], @@ -108,7 +129,192 @@ "\n", "print('Individual scope:')\n", "\n", - "print('{0}: {1} metabolites ({2}).'.format('OrgA', len(iscope_json_data['OrgA']), ', '.join(sorted(iscope_json_data['OrgA']))))" + "print('{0}: {1} metabolites ({2}).'.format('bact1', len(iscope_json_data['bact1']), ', '.join(sorted(iscope_json_data['bact1']))))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can check the individual scope of each network (circled in orange below):\n", + "\n", + "\n", + "

\n", + " \n", + "

Figure 3: Individual scopes
\n", + "

" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "bact1: 5 metabolites (M_A_c, M_B_c, M_D_c, M_H_c, M_R_c).\n", + "bact6: 0 metabolites ().\n", + "bact4: 1 metabolites (M_B_c).\n", + "bact5: 0 metabolites ().\n", + "bact2: 4 metabolites (M_A_c, M_B_c, M_N_c, M_S2_c).\n", + "bact3: 2 metabolites (M_A_c, M_B_c).\n" + ] + } + ], + "source": [ + "for org in iscope_json_data.keys():\n", + " print('{0}: {1} metabolites ({2}).'.format(org, len(iscope_json_data[org]), ', '.join(sorted(iscope_json_data[org]))))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We notice that two species cannot produce anything from the seeds: bact5 and bact6. \n", + "\n", + "Species bact1 has the largest scope, 5 metabolites. \n", + "\n", + "Species bact2 has the potential to produce one of the seeds, $S_2$. \n", + "\n", + "We notice that contrary to the initial definition of the scope above, only the seeds that can really be produced, i.e are the product of a reaction that is predicted to be activated, are included in the scopes. 
\n", + "\n", + "More generally, one might be interested in checking the status of the seeds, and take a look at the dedicated output:" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Seeds that cannot be individually produced:\n", + "{'bact1': ['M_S2_c', 'M_S1_c'], 'bact2': ['M_S1_c'], 'bact3': ['M_S2_c', 'M_S1_c'], 'bact4': ['M_S2_c'], 'bact5': ['M_S1_c'], 'bact6': ['M_S1_c']}\n", + "\n", + "Seeds that can be individually produced:\n", + "{'bact1': [], 'bact2': ['M_S2_c'], 'bact3': [], 'bact4': [], 'bact5': [], 'bact6': []}\n", + "\n", + "Seeds that do not appear in the metabolic networks:\n", + "{'bact1': [], 'bact2': [], 'bact3': [], 'bact4': ['M_S1_c'], 'bact5': ['M_S2_c'], 'bact6': ['M_S2_c']}\n" + ] + } + ], + "source": [ + "with open('output_folder/indiv_scopes/seeds_in_indiv_scopes.json') as f:\n", + " iscope_seeds_data = json.load(f)\n", + "\n", + "print('Seeds that cannot be individually produced:')\n", + "print(iscope_seeds_data['individually_non_producible_seeds'])\n", + "\n", + "print('\\nSeeds that can be individually produced:')\n", + "print(iscope_seeds_data['individually_producible_seeds'])\n", + "\n", + "print('\\nSeeds that do not appear in the metabolic networks:')\n", + "print(iscope_seeds_data['seeds_absent_in_metabolic_network'])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Notable information here is that $S_1$ does not appear in bact4, and $S_2$ appears in neither bact5 nor bact6. A context in which a seed never occurs in any of the networks might indicate an error in the seed definition, a typo for instance. \n", + "\n", + "The fact that a species has the ability to produce a seed could be interesting to suggest a renewal of the seed, and less competition for that molecule. \n", + "\n", + "Other outputs related to the individual scope computation are the reversed iscopes. 
They provide the same information but with a focus on the molecules rather than the species: which species are predicted to produce each molecule? The information is stored as a json file and as a matrix." + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "{\n", + " \"M_A_c\": [\n", + " \"bact1\",\n", + " \"bact2\",\n", + " \"bact3\"\n", + " ],\n", + " \"M_B_c\": [\n", + " \"bact1\",\n", + " \"bact4\",\n", + " \"bact2\",\n", + " \"bact3\"\n", + " ],\n", + " \"M_D_c\": [\n", + " \"bact1\"\n", + " ],\n", + " \"M_H_c\": [\n", + " \"bact1\"\n", + " ],\n", + " \"M_N_c\": [\n", + " \"bact2\"\n", + " ],\n", + " \"M_R_c\": [\n", + " \"bact1\"\n", + " ],\n", + " \"M_S2_c\": [\n", + " \"bact2\"\n", + " ]\n", + "}\n" + ] + } + ], + "source": [ + "with open('output_folder/indiv_scopes/rev_iscope.json') as f:\n", + " rev_iscope = json.load(f)\n", + "\n", + "print(json.dumps(rev_iscope, indent = 4, sort_keys=True))\n" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "| \tM_A_c\tM_H_c\tM_B_c\tM_R_c\tM_D_c\tM_S2_c\tM_N_c |\n", + "|:--|\n", + "| bact1\t1\t1\t1\t1\t1\t0\t0 |\n", + "| bact6\t0\t0\t0\t0\t0\t0\t0 |\n", + "| bact4\t0\t0\t1\t0\t0\t0\t0 |\n", + "| bact5\t0\t0\t0\t0\t0\t0\t0 |\n", + "| bact2\t1\t0\t1\t0\t0\t1\t1 |\n", + "| bact3\t1\t0\t1\t0\t0\t0\t0 |\n" + ] + } + ], + "source": [ + "import pandas \n", + "import csv\n", + "from io import StringIO\n", + "\n", + "print(pandas.read_csv('output_folder/indiv_scopes/rev_iscope.tsv').to_markdown(index=False))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Command line from the terminal\n", + "\n", + "Most of the time, you will launch m2m from the terminal rather than from the API. 
\n", + "The command for individual scope computation is the following, it will create all output files:\n", + "\n", + "```sh\n", + "m2m iscope -n ../../test/metabolic_data/tiny_toy/networks/ -s ../../test/metabolic_data/tiny_toy/seeds_community.sbml -o output_folder\n", + "```\n" ] }, { @@ -117,7 +323,7 @@ "source": [ "## `m2m cscope`: Community scope\n", "\n", - "This part encompasses the computation of the community scope.\n", + "This part encompasses the computation of the community scope, i.e. the metabolic potential of the community considering that interactions could occur between members. By interactions, we hereby mean cross feedings, or metabolic exchanges. We consider that any molecule can become a by-product or be shared through cross-feeding to other bacterial populations. We therefore assess the upper bound of the metabolic potential, considering that it is unlikely that all metabolites can be shared to all species. \n", "\n", "### Community scope formalisation\n", "\n", @@ -125,7 +331,7 @@ "\n", "$$CommunityScope(G_1.. G_N, S)=Scope \\left( \\left(\\bigcup_{i \\in {\\{1..n\\}}} R_i, \\bigcup_{i \\in {\\{1..n\\}}} M_i, \\bigcup_{i \\in {\\{1..n\\}}} E_i \\right), S \\right).$$ \n", "\n", - "### An example\n", + "### Application to the example dataset. \n", "\n", "In this example, we have 3 metabolic networks (OrgA, OrgB and OrgC). OrgA has 3 reactions (R1, R2 and R3). OrgB and OrgC have each 2 reactions, one they shared (R5) and one specific to each (R4 an R6).\n", "The formalism behind the community scope of m2m uses a Mixed Bag modelling, which considers the metabolic networks of the community as a meta-organism allowing exchanges between them without cost (described in the figure as a dotted line). \n", @@ -136,24 +342,31 @@ "\n", "No reaction in the metabolic network OrgA takes the metabolite F as reactant but the reaction R5 of metabolic networks OrgB and OrgC does. 
With the Mixed Bag modelling, the metabolite F is then available to be used by these organisms. This activates both R5 reactions and produces H. The reactions R4 and R6 can not be activated as their reactants are not available.\n", "\n", - "The Community Scope of OrgA, OrgB and OrgC is composed of the metabolites D, F and H.\n", + "The Community Scope of OrgA, OrgB and OrgC is composed of 15 metabolites, all circled in blue.\n", + "\n", + "

\n", + " \n", + "

Figure 4: Community scope
\n", + "

\n", + "\n", + "We could visualise the meta-metabolism of an organism with all the merged reactions of the 6 species. \n", "\n", "

\n", - " \n", - "

Figue 2: Community scope
\n", + " \n", + "
Figure 5: Corresponding meta-organism sharing all metabolic capabilities
\n", "

" ] }, { "cell_type": "code", - "execution_count": 4, + "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ - "The community scope contains 3 metabolites: M_D_c, M_F_c, M_H_c.\n" + "The community scope contains 15 metabolites: M_A_c, M_B_c, M_C_c, M_D_c, M_E_c, M_F_c, M_G_c, M_H_c, M_K_c, M_N_c, M_R_c, M_S1_c, M_S2_c, M_V_c, M_X_c.\n" ] } ], @@ -163,18 +376,337 @@ "#m2m cscope -n data/community -s data/seeds.sbml -o output_folder\n", "from metage2metabo.m2m import community_scope\n", "\n", - "instance_path, network_scopes = community_scope.cscope('data/community', 'data/seeds.sbml', 'output_folder')\n", + "instance_path, network_scopes = community_scope.cscope('../../test/metabolic_data/tiny_toy/networks/', '../../test/metabolic_data/tiny_toy/seeds_community.sbml', 'output_folder')\n", "\n", "print('The community scope contains {0} metabolites: {1}.'.format(len(network_scopes), ', '.join(sorted(network_scopes))))" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "It is however important to know which species are responsible for the prodicibility of each molecule, i.e. has an activated reaction that produce the molecules (although the activation of this reaction might rely on interactions with other species). \n", + "\n", + "This information is also provided by `m2m cscope`, in a dedicated output file. 
\n", + "\n", + "Overall the community scope command creates 4 output files: \n", + "\n", + "- the list of the compounds in the community scope, stored in `output_folder/community_analysis/comm_scopes.json`\n", + "- the contribution of each microbe to the community scope in `output_folder/community_analysis/contributions_of_microbes.json`\n", + "- the reverse community scope, with a focus on the molecules rather than the species, in json format\n", + "- the reverse community scope, with a focus on the molecules rather than the species, in tabulated format\n", + "\n", + "Let's look at the outputs:\n", + "\n", + "First the community scope itself:\n" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "{\n", + " \"com_scope\": [\n", + " \"M_K_c\",\n", + " \"M_G_c\",\n", + " \"M_A_c\",\n", + " \"M_F_c\",\n", + " \"M_X_c\",\n", + " \"M_H_c\",\n", + " \"M_S2_c\",\n", + " \"M_B_c\",\n", + " \"M_E_c\",\n", + " \"M_C_c\",\n", + " \"M_R_c\",\n", + " \"M_V_c\",\n", + " \"M_S1_c\",\n", + " \"M_N_c\",\n", + " \"M_D_c\"\n", + " ]\n", + "}\n" + ] + } + ], + "source": [ + "with open('output_folder/community_analysis/comm_scopes.json') as f:\n", + " cscope = json.load(f)\n", + "\n", + "print(json.dumps(cscope, indent = 4, sort_keys=True))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The contribution of microbes file describes what each community member produces individually (iscope), and in community, and highlights the added-value of interactions, i.e. what can be produced in community that cannot be produced individually." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "{\n", + " \"bact1\": {\n", + " \"community_metabolic_gain\": [\n", + " \"M_F_c\"\n", + " ],\n", + " \"produced_alone\": [\n", + " \"M_R_c\",\n", + " \"M_H_c\",\n", + " \"M_A_c\",\n", + " \"M_B_c\",\n", + " \"M_D_c\"\n", + " ],\n", + " \"produced_in_community\": [\n", + " \"M_A_c\",\n", + " \"M_F_c\",\n", + " \"M_R_c\",\n", + " \"M_H_c\",\n", + " \"M_D_c\",\n", + " \"M_B_c\"\n", + " ]\n", + " },\n", + " \"bact2\": {\n", + " \"community_metabolic_gain\": [\n", + " \"M_G_c\",\n", + " \"M_E_c\",\n", + " \"M_H_c\"\n", + " ],\n", + " \"produced_alone\": [\n", + " \"M_B_c\",\n", + " \"M_S2_c\",\n", + " \"M_A_c\",\n", + " \"M_N_c\"\n", + " ],\n", + " \"produced_in_community\": [\n", + " \"M_H_c\",\n", + " \"M_B_c\",\n", + " \"M_E_c\",\n", + " \"M_G_c\",\n", + " \"M_S2_c\",\n", + " \"M_A_c\",\n", + " \"M_N_c\"\n", + " ]\n", + " },\n", + " \"bact3\": {\n", + " \"community_metabolic_gain\": [\n", + " \"M_G_c\",\n", + " \"M_H_c\",\n", + " \"M_E_c\",\n", + " \"M_S1_c\",\n", + " \"M_N_c\"\n", + " ],\n", + " \"produced_alone\": [\n", + " \"M_B_c\",\n", + " \"M_A_c\"\n", + " ],\n", + " \"produced_in_community\": [\n", + " \"M_A_c\",\n", + " \"M_N_c\",\n", + " \"M_S1_c\",\n", + " \"M_H_c\",\n", + " \"M_E_c\",\n", + " \"M_B_c\",\n", + " \"M_G_c\"\n", + " ]\n", + " },\n", + " \"bact4\": {\n", + " \"community_metabolic_gain\": [\n", + " \"M_X_c\"\n", + " ],\n", + " \"produced_alone\": [\n", + " \"M_B_c\"\n", + " ],\n", + " \"produced_in_community\": [\n", + " \"M_X_c\",\n", + " \"M_B_c\"\n", + " ]\n", + " },\n", + " \"bact5\": {\n", + " \"community_metabolic_gain\": [\n", + " \"M_K_c\",\n", + " \"M_C_c\",\n", + " \"M_V_c\"\n", + " ],\n", + " \"produced_alone\": [],\n", + " \"produced_in_community\": [\n", + " \"M_K_c\",\n", + " \"M_C_c\",\n", + " \"M_V_c\"\n", + " ]\n", + " },\n", + " \"bact6\": {\n", + " 
\"community_metabolic_gain\": [\n", + " \"M_K_c\",\n", + " \"M_C_c\",\n", + " \"M_X_c\"\n", + " ],\n", + " \"produced_alone\": [],\n", + " \"produced_in_community\": [\n", + " \"M_K_c\",\n", + " \"M_X_c\",\n", + " \"M_C_c\"\n", + " ]\n", + " }\n", + "}\n" + ] + } + ], + "source": [ + "with open('output_folder/community_analysis/contributions_of_microbes.json') as f:\n", + " contribs = json.load(f)\n", + "\n", + "print(json.dumps(contribs, indent = 4, sort_keys=True))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The reverse community scopes is a dictionnary with compounds as keys and species that produce them in community as values." + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "{\n", + " \"M_A_c\": [\n", + " \"bact1\",\n", + " \"bact2\",\n", + " \"bact3\"\n", + " ],\n", + " \"M_B_c\": [\n", + " \"bact1\",\n", + " \"bact4\",\n", + " \"bact2\",\n", + " \"bact3\"\n", + " ],\n", + " \"M_C_c\": [\n", + " \"bact6\",\n", + " \"bact5\"\n", + " ],\n", + " \"M_D_c\": [\n", + " \"bact1\"\n", + " ],\n", + " \"M_E_c\": [\n", + " \"bact2\",\n", + " \"bact3\"\n", + " ],\n", + " \"M_F_c\": [\n", + " \"bact1\"\n", + " ],\n", + " \"M_G_c\": [\n", + " \"bact2\",\n", + " \"bact3\"\n", + " ],\n", + " \"M_H_c\": [\n", + " \"bact1\",\n", + " \"bact2\",\n", + " \"bact3\"\n", + " ],\n", + " \"M_K_c\": [\n", + " \"bact6\",\n", + " \"bact5\"\n", + " ],\n", + " \"M_N_c\": [\n", + " \"bact2\",\n", + " \"bact3\"\n", + " ],\n", + " \"M_R_c\": [\n", + " \"bact1\"\n", + " ],\n", + " \"M_S1_c\": [\n", + " \"bact3\"\n", + " ],\n", + " \"M_S2_c\": [\n", + " \"bact2\"\n", + " ],\n", + " \"M_V_c\": [\n", + " \"bact5\"\n", + " ],\n", + " \"M_X_c\": [\n", + " \"bact6\",\n", + " \"bact4\"\n", + " ]\n", + "}\n" + ] + } + ], + "source": [ + "with open('output_folder/community_analysis/rev_cscope.json') as f:\n", + " rev_cscope = json.load(f)\n", + "\n", + 
"print(json.dumps(rev_cscope, indent = 4, sort_keys=True))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The matrix of the community scope" + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "| \tM_A_c\tM_F_c\tM_R_c\tM_H_c\tM_D_c\tM_B_c\tM_K_c\tM_X_c\tM_C_c\tM_V_c\tM_E_c\tM_G_c\tM_S2_c\tM_N_c\tM_S1_c |\n", + "|:--|\n", + "| bact1\t1\t1\t1\t1\t1\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0 |\n", + "| bact6\t0\t0\t0\t0\t0\t0\t1\t1\t1\t0\t0\t0\t0\t0\t0 |\n", + "| bact4\t0\t0\t0\t0\t0\t1\t0\t1\t0\t0\t0\t0\t0\t0\t0 |\n", + "| bact5\t0\t0\t0\t0\t0\t0\t1\t0\t1\t1\t0\t0\t0\t0\t0 |\n", + "| bact2\t1\t0\t0\t1\t0\t1\t0\t0\t0\t0\t1\t1\t1\t1\t0 |\n", + "| bact3\t1\t0\t0\t1\t0\t1\t0\t0\t0\t0\t1\t1\t0\t1\t1 |\n" + ] + } + ], + "source": [ + "print(pandas.read_csv('output_folder/community_analysis/rev_cscope.tsv').to_markdown(index=False))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Command line from the terminal\n", + "\n", + "Most of the time, you will launch m2m from the terminal rather than from the API. \n", + "The command for community scope computation is the following, it will create all output files:\n", + "\n", + "```sh\n", + "m2m cscope -n ../../test/metabolic_data/tiny_toy/networks/ -s ../../test/metabolic_data/tiny_toy/seeds_community.sbml -o output_folder\n", + "```\n" + ] + }, { "cell_type": "markdown", "metadata": {}, "source": [ "## `m2m addedvalue`: Cooperation potential\n", "\n", - "This part encompasses the computation of the cooperation potential.\n", + "This part encompasses the computation of the cooperation potential, i.e. 
the set of molecules that can be produced in community but not individually given the seeds provided as input.\n", "\n", "### Cooperation potential formalisation\n", "\n", @@ -187,17 +719,17 @@ "\n", "### An example\n", "\n", - "In the **Figure 2**, the cooperation potential is:\n", + "In the **Figure 4**, the cooperation potential is:\n", "\n", - "$\\coopPot(OrgA,OrgB,OrgC,S) = \\{D, F, H\\} - \\{D, F\\} $\n", + "$\\coopPot(bact1, bact2, bact3, bact4, bact5, bact6) = \\{A, B, C, D, E, F, G, H, K, N, R, S2, S1, V, X\\} - \\{A, B, D, H, N, R, S2\\} $\n", "\n", - "$\\coopPot(OrgA,OrgB,OrgC,S) = \\{H\\} $\n", + "$\\coopPot(bact1, bact2, bact3, bact4, bact5, bact6) = \\{C, E, F, G, K, S1, V, X\\} $\n", "\n" ] }, { "cell_type": "code", - "execution_count": 5, + "execution_count": 22, "metadata": {}, "outputs": [], "source": [ @@ -206,8 +738,8 @@ "# m2m addedvalue -n data/community -s data/seeds.sbml -o output_folder\n", "from metage2metabo.m2m import community_addedvalue, individual_scope, community_scope\n", "\n", - "networks_path = 'data/community/'\n", - "seeds_path = 'data/seeds.sbml'\n", + "networks_path = '../../test/metabolic_data/tiny_toy/networks/'\n", + "seeds_path = '../../test/metabolic_data/tiny_toy/seeds_community.sbml'\n", "output_folder = 'output_folder'\n", "\n", "iscope_metabolites = individual_scope.iscope(networks_path, seeds_path, output_folder)\n", @@ -218,19 +750,37 @@ }, { "cell_type": "code", - "execution_count": 6, + "execution_count": 23, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ - "The cooperation potential contains 1 metabolites: M_H_c.\n" + "The union of individual scopes contains 7 metabolites: M_A_c, M_B_c, M_D_c, M_H_c, M_N_c, M_R_c, M_S2_c.\n", + "The community scope contains 15 metabolites: M_A_c, M_B_c, M_C_c, M_D_c, M_E_c, M_F_c, M_G_c, M_H_c, M_K_c, M_N_c, M_R_c, M_S1_c, M_S2_c, M_V_c, M_X_c.\n", + "The cooperation potential contains 8 metabolites: M_C_c, M_E_c, M_F_c, M_G_c, M_K_c, M_S1_c, M_V_c, 
M_X_c.\n" ] } 
], "source": [ - "print('The cooperation potential contains {0} metabolites: {1}.'.format(len(addedvalue), ', '.join(addedvalue)))" + "print('The union of individual scopes contains {0} metabolites: {1}.'.format(len(iscope_metabolites), ', '.join(sorted(iscope_metabolites))))\n", + "print('The community scope contains {0} metabolites: {1}.'.format(len(cscope_metabolites), ', '.join(sorted(cscope_metabolites))))\n", + "print('The cooperation potential contains {0} metabolites: {1}.'.format(len(addedvalue), ', '.join(sorted(addedvalue))))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Command line from the terminal\n", + "\n", + "Most of the time, you will launch m2m from the terminal rather than from the API. \n", + "The command for added value computation is the following, it will create all output files:\n", + "\n", + "```sh\n", + "m2m addedvalue -n ../../test/metabolic_data/tiny_toy/networks/ -s ../../test/metabolic_data/tiny_toy/seeds_community.sbml -o output_folder\n", + "```\n" ] }, { @@ -264,8 +814,8 @@ "Green, orange, cyan and purple organisms occur in some minimal communities but not in all, they are alternatives symbionts.\n", "\n", "

\n", - " \n", - "

Figue 3: Key species
\n", + " \n", + "
Figure 6: Key species
\n", "

\n" ] }, @@ -289,28 +839,220 @@ }, { "cell_type": "code", - "execution_count": 7, + "execution_count": 29, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "/Users/cfrioux/wd/code/metage2metabo/tutorials/method_tutorial/output_folder/community_analysis/miscoto_heuiouid.lp\n", + "/Users/cfrioux/.pyenv/versions/metage2metabo/lib/python3.10/site-packages/miscoto/encodings/community_soup.lp\n", + "None\n" + ] + } + ], "source": [ "# Run the minimal community with metage2metabo on the folder of SBML using the seeds files.\n", "# This is similar to the command:\n", "#m2m metacom -n data/community -s data/seeds.sbml -o output_folder\n", - "from metage2metabo.m2m import community_addedvalue, individual_scope, community_scope\n", + "from metage2metabo.m2m import community_addedvalue, individual_scope, community_scope, minimal_community\n", "from metage2metabo import sbml_management\n", "\n", - "networks_path = 'data/community/'\n", - "seeds_path = 'data/seeds.sbml'\n", + "networks_path = '../../test/metabolic_data/tiny_toy/networks/'\n", + "seeds_path = '../../test/metabolic_data/tiny_toy/seeds_community.sbml'\n", "output_folder = 'output_folder'\n", - "target_path = 'targets.sbml'\n", + "target_path = '../../test/metabolic_data/tiny_toy/targets_community.sbml'\n", "\n", "iscope_metabolites = individual_scope.iscope(networks_path, seeds_path, output_folder)\n", "instance_comscope, cscope_metabolites = community_scope.cscope(networks_path, seeds_path, output_folder)\n", "addedvalue = community_addedvalue.addedvalue(iscope_metabolites, cscope_metabolites, output_folder)\n", "\n", - "if len(addedvalue):\n", - " sbml_management.create_species_sbml(addedvalue, target_path)\n", - "instance_mincom = community_scope.instance_community(networks_path, seeds_path, output_folder, target_path)" + "instance_mincom = community_scope.instance_community(networks_path, seeds_path, output_folder, target_path)\n", + "\n", + "# The 
path contains the instance for Answer Set Programming computation of minimal communities.\n", + "print(instance_mincom)\n", + "\n", + "# let's run mincom\n", + "\n", + "minimal_community.mincom(instance_mincom, seeds_path, target_path, output_folder)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The mincom command creates the output `community_analysis/mincom.json`. \n", + "\n", + "When called with the command line, logs provide an explanation of the results:\n", + "\n", + "```\n", + "m2m mincom -n ../../test/metabolic_data/tiny_toy/networks -s ../../test/metabolic_data/tiny_toy/seeds_community.sbml -t ../../test/metabolic_data/tiny_toy/targets_community.sbml -o output_folder\n", + "######### Creating metabolic instance for the whole community #########\n", + "Created temporary instance file in output_folder/community_analysis/miscoto__3lz5o7z.lp\n", + "\n", + "###############################################\n", + "# #\n", + "# Minimal community selection #\n", + "# #\n", + "###############################################\n", + "\n", + "Running minimal community selection\n", + "/Users/cfrioux/.pyenv/versions/metage2metabo/lib/python3.10/site-packages/miscoto/encodings/community_soup.lp\n", + "\n", + "In the initial and minimal communities 3 targets are producible and 1 remain unproducible.\n", + "\n", + "3 producible targets:\n", + "M_F_c\n", + "M_C_c\n", + "M_H_c\n", + "\n", + "1 still unproducible targets:\n", + "M_foo_c\n", + "\n", + "Minimal communities are available in output_folder/community_analysis/mincom.json\n", + "\n", + "######### One minimal community #########\n", + "# One minimal community enabling the producibility of the target metabolites given as inputs\n", + "Minimal number of bacteria in communities => 3\n", + "\n", + "bact5\n", + "bact1\n", + "bact3\n", + "######### Key species: Union of minimal communities #########\n", + "# Bacteria occurring in at least one minimal community enabling the producibility of the 
target metabolites given as inputs\n", + "Number of key species => 5\n", + "\n", + "bact1\n", + "bact5\n", + "bact2\n", + "bact6\n", + "bact3\n", + "######### Essential symbionts: Intersection of minimal communities #########\n", + "# Bacteria occurring in ALL minimal communities enabling the producibility of the target metabolites given as inputs\n", + "Number of essential symbionts => 1\n", + "\n", + "bact1\n", + "######### Alternative symbionts: Difference between Union and Intersection #########\n", + "# Bacteria occurring in at least one minimal community but not all minimal communities enabling the producibility of the target metabolites given as inputs\n", + "Number of alternative symbionts => 4\n", + "\n", + "bact2\n", + "bact3\n", + "bact6\n", + "bact5\n", + "\n", + "--- Mincom runtime 4.99 seconds ---\n", + "\n", + "--- Total runtime 5.06 seconds ---\n", + "\n", + "```\n", + "\n", + "The figure below illustrates the one solution that mincom provides, with the underlying minimal metabolic exchanges that would be necessary (can be computed with the `miscoto` package, a dependency of m2m)\n", + "\n", + "

\n", + " \n", + "

Figure 6: One minimal solution enabling the production of the 3 producible targets
\n", + "

\n", + "\n", + "One target cannot be produced. It is the $foo$ metabolite that actually occurs in none of the metabolic networks. \n", + "\n", + "More information can be computed regarding target producibility.\n", + "It is done automatically when calling `m2m metacom`, e.g. `m2m metacom -n ../../test/metabolic_data/tiny_toy/networks -s ../../test/metabolic_data/tiny_toy/seeds_community.sbml -t ../../test/metabolic_data/tiny_toy/targets_community.sbml -o output_folder` and is stored in `output_folder/producibility_targets.json`.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The file `producibility_targets.json` provides such information, stored as a dictionary:\n", + "\n", + "- Producible targets\n", + " ```\n", + " \"producible\": [\n", + " \"M_H_c\",\n", + " \"M_C_c\",\n", + " \"M_F_c\"\n", + " ```\n", + "- Unproducible targets\n", + " ```\n", + " \"unproducible\": [\n", + " \"M_foo_c\"\n", + " ],\n", + " ```\n", + "- Targets that can be produced by individual species, i.e. 
do not need cooperation events, together with their producers:\n", + " ```\n", + " \"indiv_producible\": [\n", + " \"M_H_c\"\n", + " ],\n", + " \"individual_producers\": {\n", + " \"M_H_c\": [\n", + " \"bact1\"\n", + " ]\n", + " },\n", + " ```\n", + "- Targets that require cooperation or cross-feeding for their production, together with their final producers\n", + " ```\n", + " \"com_only_producers\": {\n", + " \"M_H_c\": [\n", + " \"bact3\",\n", + " \"bact2\"\n", + " ],\n", + " \"M_F_c\": [\n", + " \"bact1\"\n", + " ],\n", + " \"M_C_c\": [\n", + " \"bact6\",\n", + " \"bact5\"\n", + " ]\n", + " },\n", + " ```\n", + "- Key species\n", + " ```\n", + " \"key_species\": [\n", + " \"bact3\",\n", + " \"bact2\",\n", + " \"bact1\",\n", + " \"bact6\",\n", + " \"bact5\"\n", + " ],\n", + " ```\n", + "- Targets producible in the minimal community\n", + " ```\n", + " \"mincom_producible\": [\n", + " \"M_H_c\",\n", + " \"M_C_c\",\n", + " \"M_F_c\"\n", + " ],\n", + " ```\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Please note that for computational efficiency, only one solution, the key species and essential/alternative symbionts are computed. Enumerating all equivalent minimal communities takes more time but can be computed. That is the purpose of `m2m_analysis`." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Command line from the terminal\n", + "\n", + "Most of the time, you will launch m2m from the terminal rather than from the API. 
\n", + "The command for minimal community computation is the following; it creates all output files:\n", + "\n", + "```sh\n", + "m2m mincom -n ../../test/metabolic_data/tiny_toy/networks/ -s ../../test/metabolic_data/tiny_toy/seeds_community.sbml -t ../../test/metabolic_data/tiny_toy/targets_community.sbml -o output_folder\n", + "```\n", + "\n", + "`m2m metacom` runs the complete analysis from the individual scope to the computation of target producibility after minimal community selection. \n", + "We advise using this command directly to explore the metabolic potential of communities. \n", + "\n", + "```sh\n", + "m2m metacom -n ../../test/metabolic_data/tiny_toy/networks/ -s ../../test/metabolic_data/tiny_toy/seeds_community.sbml -t ../../test/metabolic_data/tiny_toy/targets_community.sbml -o output_folder\n", + "```" ] }, { @@ -319,25 +1061,33 @@ "source": [ "## `m2m_analysis`: Visualisation of minimal communities\n", "\n", - "To visualize the results, we will use the m2m_analysis part of m2m." + "To complete the analysis of minimal communities and visualise the results, we will use the m2m_analysis part of m2m." ] }, { "cell_type": "code", - "execution_count": 8, + "execution_count": 31, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ - "/home/abelcour/.local/lib/python3.8/site-packages/miscoto/encodings/community_soup.lp\n" + "/Users/cfrioux/.pyenv/versions/metage2metabo/lib/python3.10/site-packages/miscoto/encodings/community_soup.lp\n" ] } ], "source": [ "from metage2metabo.m2m_analysis import m2m_analysis_workflow\n", "\n", + "# we need a target file in which all molecules are producible. 
\n", + "# foo is not producible, so we'll use another target file that does not contain it\n", + "\n", + "networks_path = '../../test/metabolic_data/tiny_toy/networks/'\n", + "seeds_path = '../../test/metabolic_data/tiny_toy/seeds_community.sbml'\n", + "output_folder = 'output_folder'\n", + "target_path = '../../test/metabolic_data/tiny_toy/targets_community_allprod.sbml'\n", + "\n", "m2m_analysis_workflow.run_analysis_workflow(networks_path, target_path, seeds_path, output_folder, None, None)" ] }, @@ -345,29 +1095,43 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Minimal communities:\n", - "- OrgA X OrgB\n", - "- OrgA X OrgC\n", + "Minimal communities are described in `output_folder/json/targets_community_allprod.json`.\n", + "There are 4 of them:\n", "\n", - "First, m2m_analysis will compute the solution graph. This graph node corresponds to organism from the community. An edge is drawn between two nodes if the two organisms occur in the same minimal community.\n", + "- bact1 X bact3 X bact6\n", + "- bact1 X bact3 X bact5\n", + "- bact1 X bact2 X bact6\n", + "- bact1 X bact2 X bact5\n", "\n", - "In this example, m2m_analysis will draw an edge between OrgA and OrgB and another edge between OrgA and OrgC.\n", + "The 4 communities are illustrated below:\n", "\n", "

\n", - " \n", - "

Figue 4: Solution graph
\n", + " \n", + "
Figure 7: All minimal community solutions
\n", "

\n", "\n", - "Then m2m_analysis will compress the solution graph into a powergraph. To do this, m2m_analysis uses powergrasp, which will search for pattern (like hub or clique) and compress them into powernode and poweredge. Here, as node OrgA is a hub in the graph (as connected to the other nodes of the graph), a poweredge will be drawn between OrgA node and a powernode containing both OrgB and OrgC.\n", + "First, m2m_analysis will compute the solution graph. The nodes of this graph correspond to the organisms of the community. An edge is drawn between two nodes if the two organisms occur in the same minimal community.\n", + "\n", + "This graph can be very dense and hard to visualise if a lot of minimal solutions exist. \n", + "It is stored in `output_folder/gml/targets_community_allprod.gml`\n", "\n", - "OrgA occurs in all the minimal community, it is an essential symbiont (in dark pink).\n", + "

\n", + " \n", + "

Figure 8: Solution graph
\n", + "

\n", "\n", - "OrgB and OrgC occur in one minimal community but not all, they are alternative symbionts (in blue).\n", + "Then m2m_analysis will compress the solution graph into a [power graph](https://en.wikipedia.org/wiki/Power_graph_analysis). To do this, m2m_analysis uses the `powergrasp` package, which will search for specific patterns (like hubs or cliques) and compress them into power nodes and power edges. Here, node bact1 is a hub in the graph, as it is connected to the other nodes of the graph. We notice that bact2 and bact3 have the same role in minimal communities, and likewise for bact5 and bact6. Each pair will belong to a power node, indicating that its two members can replace each other, i.e. a minimal community holds one of the two elements of the pair. \n", + "\n", + "bact1 occurs in all the minimal communities; it is an essential symbiont (in dark pink).\n", + "\n", + "bact2, bact3, bact5 and bact6 occur in at least one minimal community but not all; they are alternative symbionts (in blue).\n", "\n", "

\n", - " \n", - "

Figue 5: Powergraph
\n", - "

" + " \n", + "
Figure 9: Powergraph
\n", + "

\n", + "\n", + "The power graph information is stored in `output_folder/bbl`. It can be visualised as an html page `html/targets_community_allprod_powergraph.html`." ] }, { @@ -385,19 +1149,18 @@ }, { "cell_type": "code", - "execution_count": 2, + "execution_count": 34, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ - "Average individual scope: 308.70588235294116\n", - "The community scope contains 651 metabolites.\n", - "Cooperation potential: 119 metabolites.\n", - "/mnt/c/Users/Arnaud/Downloads/Work_directory/programs/miscoto/miscoto/encodings/community_soup.lp\n", - "12 Essential symbionts: GCA_003437055,GCA_003437905,GCA_003437815,GCA_003437885,GCA_003437715,GCA_003437295,GCA_003437195,GCA_003437665,GCA_003437595,GCA_003437255,GCA_003438055,GCA_003437375\n", - "5 Alternative symbionts: GCA_003437325,GCA_003437175,GCA_003437785,GCA_003437345,GCA_003437945\n" + "Average individual scope: 239.05882352941177\n", + "The community scope contains 698 metabolites.\n", + "Cooperation potential: 122 metabolites.\n", + "12 Essential symbionts: GCA_003437815,GCA_003437055,GCA_003437195,GCA_003437885,GCA_003438055,GCA_003437375,GCA_003437665,GCA_003437905,GCA_003437715,GCA_003437295,GCA_003437255,GCA_003437595\n", + "5 Alternative symbionts: GCA_003437945,GCA_003437785,GCA_003437345,GCA_003437325,GCA_003437175\n" ] } ], @@ -415,13 +1178,13 @@ "# m2m iscope -n data/community -s data/seeds.sbml -o output_folder\n", "from metage2metabo.m2m import individual_scope\n", "\n", - "networks_path = 'toy_bact'\n", - "seeds_path = 'seeds_toy.sbml'\n", + "networks_path = '../../test/metabolic_data/toy_bact'\n", + "seeds_path = '../../test/metabolic_data/seeds_toy.sbml'\n", "output_folder = 'tutorial_output_folder'\n", "indiv_output = os.path.join(output_folder, 'indiv_scopes')\n", "indiv_scope_json = os.path.join(indiv_output, 'indiv_scopes.json')\n", "key_species_file = os.path.join(output_folder, 'key_species.json')\n", - "target_path = 
'targets.sbml'\n", + "target_path = '../../test/metabolic_data/targets_toy.sbml'\n", "\n", "individual_scope.iscope(networks_path, seeds_path, output_folder)\n", "\n", @@ -466,10 +1229,10 @@ "with open(key_species_file) as json_key_species:\n", " key_species_json_data = json.load(json_key_species)\n", " \n", - "nb_essential_symbionts = len(key_species_json_data['targets']['essential_symbionts']['data'])\n", - "essential_symbionts = ','.join(key_species_json_data['targets']['essential_symbionts']['data'])\n", - "nb_alternative_symbionts = len(key_species_json_data['targets']['alternative_symbionts']['data'])\n", - "alternative_symbiotns = ','.join(key_species_json_data['targets']['alternative_symbionts']['data'])\n", + "nb_essential_symbionts = len(key_species_json_data['targets_toy']['essential_symbionts']['data'])\n", + "essential_symbionts = ','.join(key_species_json_data['targets_toy']['essential_symbionts']['data'])\n", + "nb_alternative_symbionts = len(key_species_json_data['targets_toy']['alternative_symbionts']['data'])\n", + "alternative_symbiotns = ','.join(key_species_json_data['targets_toy']['alternative_symbionts']['data'])\n", "print('{0} Essential symbionts: {1}'.format(nb_essential_symbionts, essential_symbionts))\n", "print('{0} Alternative symbionts: {1}'.format(nb_alternative_symbionts, alternative_symbiotns))" ] @@ -483,11 +1246,36 @@ "- result files of m2m are described [here](https://metage2metabo.readthedocs.io/en/latest/output.html).\n", "- result files of m2m_analysis are explained [here](https://metage2metabo.readthedocs.io/en/latest/m2m_analysis.html#m2m-analysis-output-files)." ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Command line from the terminal\n", + "\n", + "Most of the time, you will launch m2m_analysis from the terminal rather than from the API. 
\n", + "The command is the following; it creates all output files:\n", + "\n", + "```sh\n", + "m2m_analysis workflow -n ../../test/metabolic_data/tiny_toy/networks/ -s ../../test/metabolic_data/tiny_toy/seeds_community.sbml -t ../../test/metabolic_data/tiny_toy/targets_community.sbml -o output_folder\n", + "```\n", + "\n", + "Other options are available, for instance for creating the svg image of the power graph, or taking into account the taxonomy of the community members. Take a look at the command line help `m2m_analysis workflow --help` and the documentation.\n", + "\n", + "Here the whole workflow is run. You can also launch individual parts of the workflow: see the documentation and the help of the tool using `m2m_analysis -h`.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] } ], "metadata": { "kernelspec": { - "display_name": "Python 3", + "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, @@ -501,7 +1289,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.8.5" + "version": "3.10.12" } }, "nbformat": 4,