Add reproducibility info to pipeline status page (#123)

* [BUG] inside unit_tests workflow
* Browsing all issues pages from GitHub API
* Get all pages of GitHub issues
* [TEST] Updating test for status module
* [TEST] fetch several issues
* Dealing with single page of issues
* Removing per_page query parameter
* [TEST] adjusting tests
* Add reproducibility columns to status report
* [TEST] Add reproducibility columns to status report
* [TEST] updating narps_open.data.description tests
* Add links to share code + exclusion reasons
* : reproducibility score + added source code links
* [TEST] : reproducibility score
* [DOC] : reproducibility score
* [BUG] : reproducibility score
* [BUG] : reproducibility score
Inria-Empenn · Nov 14, 2023 · 7b2e33a
1 parent 93bb718
commit 7b2e33a
Showing 9 changed files with 185 additions and 111 deletions.
diff --git a/docs/status.md b/docs/status.md
@@ -1,10 +1,18 @@
 # Access the work progress status pipelines
 
 The class `PipelineStatusReport` of module `narps_open.utils.status` allows to create a report containing the following information for each pipeline:
+* a work progress status : `idle`, `progress`, or `done`;
 * the software it uses (collected from the `categorized_for_analysis.analysis_SW` of the [team description](/docs/description.md)) ;
 * whether it uses data from fMRIprep or not ;
 * a list of issues related to it (the opened issues of the project that have the team ID inside their title or description) ;
-* a work progress status : `idle`, `progress`, or `done`.
+* a list of pull requests related to it (the opened pull requests of the project that have the team ID inside their title or description) ;
+* whether it was excluded from the original NARPS analysis ;
+* a reproducibility rating :
+  * default score is 4;
+  * -1 if the team did not use fmriprep data;
+  * -1 if the team used several pieces of software (e.g.: FSL and AFNI);
+  * -1 if the team used custom or marginal software (i.e.: something other than SPM, FSL, AFNI or nistats);
+  * -1 if the team did not provide its source code.
 
 This allows contributors to best select the pipeline they want/need to contribute to. For this purpose, the GitHub Actions workflow [`.github/workflows/pipeline_status.yml`](/.github/workflows/pipeline_status.yml) allows to dynamically generate the report and to store it in the [project's wiki](https://github.com/Inria-Empenn/narps_open_pipelines/wiki).
 
@@ -55,22 +63,31 @@ python narps_open/utils/status --json
 #         "softwares": "FSL",
 #         "fmriprep": "No",
 #         "issues": {},
-#         "status": "idle"
+#         "excluded": "No",
+#         "reproducibility": 3,
+#         "reproducibility_comment": "",
+#         "pulls": {},
+#         "status": "2-idle"
 #     },
 #     "0C7Q": {
 #         "softwares": "FSL, AFNI",
 #         "fmriprep": "Yes",
 #         "issues": {},
+#         "excluded": "No",
+#         "reproducibility": 3,
+#         "reproducibility_comment": "",
+#         "pulls": {},
 #         "status": "idle"
 #     },
 # ...
 
 python narps_open/utils/status --md
-# | team_id | status | softwares used | fmriprep used ? | related issues |
-# | --- |:---:| --- | --- | --- |
-# | 08MQ | :red_circle: | FSL | No |  |
-# | 0C7Q | :red_circle: | FSL, AFNI | Yes |  |
-# | 0ED6 | :red_circle: | SPM | No |  |
-# | 0H5E | :red_circle: | SPM | No |  |
+# ...
+# | team_id | status | main software | fmriprep used ? | related issues | related pull requests | excluded from NARPS analysis | reproducibility |
+# | --- |:---:| --- | --- | --- | --- | --- | --- |
+# | Q6O0 | :green_circle: | SPM | Yes |  |  | No | :star::star::star::black_small_square:<br /> |
+# | UK24 | :orange_circle: | SPM | No | [2](url_issue_2),  |  | No | :star::star::black_small_square::black_small_square:<br /> |
 # ...
 ```
diff --git a/narps_open/data/description/analysis_pipelines_comments.tsv b/narps_open/data/description/analysis_pipelines_comments.tsv
@@ -1,71 +1,71 @@
 teamID	excluded_from_narps_analysis	exclusion_comment	reproducibility	reproducibility_comment
-50GV	no	N/A	?	Uses custom software (Denoiser)
-9Q6R	no	N/A		
-O21U	no	N/A		
-U26C	no	N/A		
-43FJ	no	N/A		
-C88N	no	N/A		
-4TQ6	yes	Resampled image offset and too large compared to template.		
-T54A	no	N/A		
-2T6S	no	N/A		
-L7J7	no	N/A		
-0JO0	no	N/A		
-X1Y5	no	N/A		
-51PW	no	N/A		
-6VV2	no	N/A		
-O6R6	no	N/A		
-C22U	no	N/A	?	Custom Matlab script for white matter PCA confounds
-3PQ2	no	N/A		
-UK24	no	N/A		
-4SZ2	yes	Resampled image offset from template brain.		
-9T8E	no	N/A		
-94GU	no	N/A	?	Multiple software dependencies : SPM + ART + TAPAS + Matlab.
-I52Y	no	N/A		
-5G9K	no	N/A	?	? 
-2T7P	yes	Missing thresholded images.	?	?
-UI76	no	N/A		
-B5I6	no	N/A		
-V55J	yes	Bad histogram : very small values.		
-X19V	no	N/A		
-0C7Q	yes	Appears to be a p-value distribution, with slight excursions below and above zero.		
-R5K7	no	N/A		
-0I4U	no	N/A		
-3C6G	no	N/A		
-R9K3	no	N/A		
-O03M	no	N/A		
-08MQ	no	N/A		
-80GC	no	N/A		
-J7F9	no	N/A		
-R7D1	no	N/A		
-Q58J	yes	Bad histogram : bimodal, zero-inflated with a second distribution centered around 5.		
-L3V8	yes	Rejected due to large amount of missing brain in center.		
-SM54	no	N/A		
-1KB2	no	N/A		
-0H5E	yes	Rejected due to large amount of missing brain in center.		
-P5F3	yes	Rejected due to large amounts of missing data across brain.		
-Q6O0	no	N/A		
-R42Q	no	N/A	?	Uses fMRIflows, a custom software based on NiPype.
-L9G5	no	N/A		
-DC61	no	N/A		
-E3B6	yes	Bad histogram : very long tail, with substantial inflation at a value just below zero.		
-16IN	no	N/A	?	Multiple software dependencies : matlab + SPM + FSL + R + TExPosition + neuroim
-46CD	no	N/A		
-6FH5	yes	Missing much of the central brain.		
-K9P0	no	N/A		
-9U7M	no	N/A		
-VG39	no	N/A		
-1K0E	yes	Used surface-based analysis, only provided data for cortical ribbon.	?	?
-X1Z4	yes	Used surface-based analysis, only provided data for cortical ribbon.	?	Multiple software dependencies : FSL + fmriprep + ciftify + HCP workbench + Freesurfer + ANTs
-I9D6	no	N/A		
-E6R3	no	N/A		
-27SS	no	N/A		
-B23O	no	N/A		
-AO86	no	N/A		
-L1A8	yes	Resampled image much smaller than template brain.	?	?
-IZ20	no	N/A		
-3TR7	no	N/A		
-98BT	yes	Rejected due to very bad normalization.		
-XU70	no	N/A	?	Uses custom software : FSL + 4drealign
-0ED6	no	N/A	?	? 
-I07H	yes	Bad histogram : bimodal, with second distribution centered around 2.5.		
-1P0Y	no	N/A		
+50GV	No	N/A	3	Uses custom software (Denoiser)
+9Q6R	No	N/A	2	
+O21U	No	N/A	3	
+U26C	No	N/A	4	Link to shared analysis code : https://github.com/gladomat/narps
+43FJ	No	N/A	2	
+C88N	No	N/A	3	
+4TQ6	Yes	Resampled image offset and too large compared to template.	3	
+T54A	No	N/A	3	
+2T6S	No	N/A	3	
+L7J7	No	N/A	3	
+0JO0	No	N/A	3	
+X1Y5	No	N/A	2	
+51PW	No	N/A	3	
+6VV2	No	N/A	2	
+O6R6	No	N/A	3	
+C22U	No	N/A	1	Custom Matlab script for white matter PCA confounds
+3PQ2	No	N/A	2	
+UK24	No	N/A	2	
+4SZ2	Yes	Resampled image offset from template brain.	3	
+9T8E	No	N/A	3	
+94GU	No	N/A	1	Multiple software dependencies : SPM + ART + TAPAS + Matlab.
+I52Y	No	N/A	2	
+5G9K	Yes	Values in the unthresholded images are not z / t stats	3	
+2T7P	Yes	Missing thresholded images.	2	Link to shared analysis code : https://osf.io/3b57r
+UI76	No	N/A	3	
+B5I6	No	N/A	3	
+V55J	Yes	Bad histogram : very small values.	2	
+X19V	No	N/A	3	
+0C7Q	Yes	Appears to be a p-value distribution, with slight excursions below and above zero.	2	
+R5K7	No	N/A	2	
+0I4U	No	N/A	2	
+3C6G	No	N/A	2	
+R9K3	No	N/A	3	
+O03M	No	N/A	3	
+08MQ	No	N/A	2	
+80GC	No	N/A	3	
+J7F9	No	N/A	3	
+R7D1	No	N/A	3	Link to shared analysis code : https://github.com/IMTAltiStudiLucca/NARPS_R7D1
+Q58J	Yes	Bad histogram : bimodal, zero-inflated with a second distribution centered around 5.	3	Link to shared analysis code : https://github.com/amrka/NARPS_Q58J
+L3V8	Yes	Rejected due to large amount of missing brain in center.	2	
+SM54	No	N/A	3	
+1KB2	No	N/A	2	
+0H5E	Yes	Rejected due to large amount of missing brain in center.	2	
+P5F3	Yes	Rejected due to large amounts of missing data across brain.	2	
+Q6O0	No	N/A	3	
+R42Q	No	N/A	2	Uses fMRIflows, a custom software based on NiPype. Code available here : https://github.com/ilkayisik/narps_R42Q
+L9G5	No	N/A	2	
+DC61	No	N/A	3	
+E3B6	Yes	Bad histogram : very long tail, with substantial inflation at a value just below zero.	4	Link to shared analysis code : doi.org/10.5281/zenodo.3518407
+16IN	Yes	Values in the unthresholded images are not z / t stats	2	Multiple software dependencies : matlab + SPM + FSL + R + TExPosition + neuroim. Link to shared analysis code : https://github.com/jennyrieck/NARPS
+46CD	No	N/A	1	
+6FH5	Yes	Missing much of the central brain.	2	
+K9P0	No	N/A	3	
+9U7M	No	N/A	2	
+VG39	Yes	Performed small volume corrected instead of whole-brain analysis	3	
+1K0E	Yes	Used surface-based analysis, only provided data for cortical ribbon.	1	
+X1Z4	Yes	Used surface-based analysis, only provided data for cortical ribbon.	1	Multiple software dependencies : FSL + fmriprep + ciftify + HCP workbench + Freesurfer + ANTs
+I9D6	No	N/A	2	
+E6R3	No	N/A	2	
+27SS	No	N/A	2	
+B23O	No	N/A	3	
+AO86	No	N/A	2	
+L1A8	Yes	Not in MNI standard space.	2	
+IZ20	No	N/A	1	
+3TR7	No	N/A	3	
+98BT	Yes	Rejected due to very bad normalization.	2	
+XU70	No	N/A	1	Uses custom software : FSL + 4drealign
+0ED6	No	N/A	2	
+I07H	Yes	Bad histogram : bimodal, with second distribution centered around 2.5.	2	
+1P0Y	No	N/A	2	
diff --git a/narps_open/utils/status.py b/narps_open/utils/status.py
@@ -76,10 +76,18 @@ def generate(self):
 
             # Get software used in the pipeline, from the team description
             description = TeamDescription(team_id)
-            self.contents[team_id]['softwares'] = \
+            self.contents[team_id]['software'] = \
                 description.categorized_for_analysis['analysis_SW']
             self.contents[team_id]['fmriprep'] = description.preprocessing['used_fmriprep_data']
 
+            # Get comments about the pipeline
+            self.contents[team_id]['excluded'] = \
+                description.comments['excluded_from_narps_analysis']
+            self.contents[team_id]['reproducibility'] = \
+                int(description.comments['reproducibility'])
+            self.contents[team_id]['reproducibility_comment'] = \
+                description.comments['reproducibility_comment']
+
             # Get issues and pull requests related to the team
             issues = {}
             pulls = {}
@@ -109,10 +117,11 @@ def generate(self):
             else:
                 self.contents[team_id]['status'] = '1-progress'
 
-        # Sort contents with the following priorities : 1-"status", 2-"softwares" and 3-"fmriprep"
+        # Sort contents with the following priorities :
+        #    1-"status", 2-"software", 3-"fmriprep"
         self.contents = OrderedDict(sorted(
             self.contents.items(),
-            key=lambda k: (k[1]['status'], k[1]['softwares'], k[1]['fmriprep'])
+            key=lambda k: (k[1]['status'], k[1]['software'], k[1]['fmriprep'])
             ))
 
     def markdown(self):
@@ -124,14 +133,23 @@ def markdown(self):
         output_markdown += '<br>:red_circle: not started yet\n'
         output_markdown += '<br>:orange_circle: in progress\n'
         output_markdown += '<br>:green_circle: completed\n'
-        output_markdown += '<br><br>The *softwares used* column gives a simplified version of '
+        output_markdown += '<br><br>The *main software* column gives a simplified version of '
         output_markdown += 'what can be found in the team descriptions under the '
         output_markdown += '`general.software` column.\n'
+        output_markdown += '<br><br>The *reproducibility* column rates the pipeline as follows:\n'
+        output_markdown += ' * default score is :star::star::star::star:;\n'
+        output_markdown += ' * -1 if the team did not use fmriprep data;\n'
+        output_markdown += ' * -1 if the team used several pieces of software '
+        output_markdown += '(e.g.: FSL and AFNI);\n'
+        output_markdown += ' * -1 if the team used custom or marginal software '
+        output_markdown += '(i.e.: something else than SPM, FSL, AFNI or nistats);\n'
+        output_markdown += ' * -1 if the team did not provided his source code.\n'
 
         # Start table
-        output_markdown += '| team_id | status | softwares used | fmriprep used ? |'
-        output_markdown += ' related issues | related pull requests |\n'
-        output_markdown += '| --- |:---:| --- | --- | --- | --- |\n'
+        output_markdown += '\n| team_id | status | main software | fmriprep used ? |'
+        output_markdown += ' related issues | related pull requests |'
+        output_markdown += ' excluded from NARPS analysis | reproducibility |\n'
+        output_markdown += '| --- |:---:| --- | --- | --- | --- | --- | --- |\n'
 
         # Add table contents
         for team_key, team_values in self.contents.items():
@@ -146,7 +164,7 @@ def markdown(self):
                 status = ':red_circle:'
 
             output_markdown += f'| {status} '
-            output_markdown += f'| {team_values["softwares"]} '
+            output_markdown += f'| {team_values["software"]} '
             output_markdown += f'| {team_values["fmriprep"]} '
 
             issues = ''
@@ -159,8 +177,15 @@ def markdown(self):
             for issue_number, issue_url in team_values['pulls'].items():
                 pulls += f'[{issue_number}]({issue_url}), '
 
-            output_markdown += f'| {pulls} |\n'
+            output_markdown += f'| {pulls} '
+            output_markdown += f'| {team_values["excluded"]} '
 
+            reproducibility_ranking = ''
+            for _ in range(team_values['reproducibility']):
+                reproducibility_ranking += ':star:'
+            for _ in range(4-team_values['reproducibility']):
+                reproducibility_ranking += ':black_small_square:'
+            output_markdown += f'| {reproducibility_ranking}<br />{team_values["reproducibility_comment"]} |\n'
 
         return output_markdown
 

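The star rendering added to `markdown()` above can be sketched in isolation as follows. This is a minimal illustration, not the code from the commit: the helper name `render_rating` and its `maximum` parameter are hypothetical, and clamping the score to the 0..4 range is an added assumption (the commit converts the TSV value with `int()` and uses it directly).

```python
def render_rating(score, comment='', maximum=4):
    """Hypothetical helper: build the markdown cell for a reproducibility score.

    Filled positions are rendered as :star:, the remaining ones as
    :black_small_square:, followed by a line break and the optional comment.
    """
    # Assumption: clamp out-of-range values instead of letting range() silently
    # produce nothing for negative numbers.
    score = max(0, min(int(score), maximum))
    stars = ':star:' * score
    blanks = ':black_small_square:' * (maximum - score)
    return f'{stars}{blanks}<br />{comment}'


if __name__ == '__main__':
    print(render_rating(3))
    print(render_rating(2, 'Uses custom software'))
```

Note that `int('')` raises a `ValueError`, so a sketch like this still relies on every row of the TSV file having a numeric `reproducibility` value, as is the case after this commit.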
diff --git a/tests/data/test_description.py b/tests/data/test_description.py
@@ -55,7 +55,7 @@ def test_arguments_properties():
         assert description['analysis.RT_modeling'] == 'duration'
         assert description['categorized_for_analysis.analysis_SW_with_version'] == 'SPM12'
         assert description['derived.func_fwhm'] == '8'
-        assert description['comments.excluded_from_narps_analysis'] == 'no'
+        assert description['comments.excluded_from_narps_analysis'] == 'No'
 
         # 4 - Check properties
         assert isinstance(description.general, dict)
@@ -84,7 +84,7 @@ def test_arguments_properties():
         assert description.analysis['RT_modeling'] == 'duration'
         assert description.categorized_for_analysis['analysis_SW_with_version'] == 'SPM12'
         assert description.derived['func_fwhm'] == '8'
-        assert description.comments['excluded_from_narps_analysis'] == 'no'
+        assert description.comments['excluded_from_narps_analysis'] == 'No'
 
         # 6 - Test another team
         description = TeamDescription('9Q6R')

diff --git a/tests/test_data/data/description/test_markdown.md b/tests/test_data/data/description/test_markdown.md
@@ -97,7 +97,7 @@ Model EVs (2): eq_indiff, eq_range
 * `func_fwhm` : 5
 * `con_fwhm` : 
 ## Comments
-* `excluded_from_narps_analysis` : no
+* `excluded_from_narps_analysis` : No
 * `exclusion_comment` : N/A
-* `reproducibility` : 
+* `reproducibility` : 2
 * `reproducibility_comment` : 
diff --git a/tests/test_data/data/description/test_str.json b/tests/test_data/data/description/test_str.json
@@ -54,8 +54,8 @@
     "derived.excluded_participants": "018, 030, 088, 100",
     "derived.func_fwhm": "5",
     "derived.con_fwhm": "",
-    "comments.excluded_from_narps_analysis": "no",
+    "comments.excluded_from_narps_analysis": "No",
     "comments.exclusion_comment": "N/A",
-    "comments.reproducibility": "",
+    "comments.reproducibility": "2",
     "comments.reproducibility_comment": ""
 }
diff --git a/tests/test_data/utils/status/test_markdown.md b/tests/test_data/utils/status/test_markdown.md
@@ -3,11 +3,18 @@ The *status* column tells whether the work on the pipeline is :
 <br>:red_circle: not started yet
 <br>:orange_circle: in progress
 <br>:green_circle: completed
-<br><br>The *softwares used* column gives a simplified version of what can be found in the team descriptions under the `general.software` column.
-| team_id | status | softwares used | fmriprep used ? | related issues | related pull requests |
-| --- |:---:| --- | --- | --- | --- |
-| Q6O0 | :green_circle: | SPM | Yes |  |  |
-| UK24 | :orange_circle: | SPM | No | [2](url_issue_2),  |  |
-| 2T6S | :orange_circle: | SPM | Yes | [5](url_issue_5),  | [3](url_pull_3),  |
-| 1KB2 | :red_circle: | FSL | No |  |  |
-| C88N | :red_circle: | SPM | Yes |  |  |
+<br><br>The *main software* column gives a simplified version of what can be found in the team descriptions under the `general.software` column.
+<br><br>The *reproducibility* column rates the pipeline as follows:
+ * default score is :star::star::star::star:;
+ * -1 if the team did not use fmriprep data;
+ * -1 if the team used several pieces of software (e.g.: FSL and AFNI);
+ * -1 if the team used custom or marginal software (i.e.: something else than SPM, FSL, AFNI or nistats);
+ * -1 if the team did not provided his source code.
+
+| team_id | status | main software | fmriprep used ? | related issues | related pull requests | excluded from NARPS analysis | reproducibility |
+| --- |:---:| --- | --- | --- | --- | --- | --- |
+| Q6O0 | :green_circle: | SPM | Yes |  |  | No | :star::star::star::black_small_square:<br /> |
+| UK24 | :orange_circle: | SPM | No | [2](url_issue_2),  |  | No | :star::star::black_small_square::black_small_square:<br /> |
+| 2T6S | :orange_circle: | SPM | Yes | [5](url_issue_5),  | [3](url_pull_3),  | No | :star::star::star::black_small_square:<br /> |
+| 1KB2 | :red_circle: | FSL | No |  |  | No | :star::star::black_small_square::black_small_square:<br /> |
+| C88N | :red_circle: | SPM | Yes |  |  | No | :star::star::star::black_small_square:<br /> |
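For reference, the rating rule listed in the report header can be written as a small function. In this commit the scores are stored by hand in `analysis_pipelines_comments.tsv`; the function below is only a sketch of the documented rule, and its name, parameters, and the comma-separated software format are assumptions made for illustration.

```python
# Software considered mainstream by the documented rule.
MAINSTREAM_SOFTWARE = ('SPM', 'FSL', 'AFNI', 'nistats')


def reproducibility_score(used_fmriprep, software, shared_code):
    """Hypothetical helper applying the documented rating rule.

    used_fmriprep: bool, whether the team used fmriprep data
    software: str, comma-separated software names (e.g. 'FSL, AFNI')
    shared_code: bool, whether the team provided its source code
    """
    packages = [name.strip() for name in software.split(',') if name.strip()]

    score = 4  # default score
    if not used_fmriprep:
        score -= 1  # did not use fmriprep data
    if len(packages) > 1:
        score -= 1  # several pieces of software
    if any(name not in MAINSTREAM_SOFTWARE for name in packages):
        score -= 1  # custom or marginal software
    if not shared_code:
        score -= 1  # no source code provided
    return score


if __name__ == '__main__':
    # SPM, fmriprep data used, no shared code: one penalty
    print(reproducibility_score(True, 'SPM', False))
    # SPM, no fmriprep data, no shared code: two penalties
    print(reproducibility_score(False, 'SPM', False))
```

Under the assumption that neither team shared its code, these two calls give 3 and 2, which matches the star counts shown for Q6O0 and UK24 in the table above.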