forked from hubmapconsortium/ingest-validation-tools
Commit
Showing 357 changed files with 20,528 additions and 485 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -16,7 +16,7 @@ Related files: | |
- [📝 TSV template](https://raw.githubusercontent.com/hubmapconsortium/dataset-metadata-spreadsheet/main/codex/latest/codex.tsv): Alternative for metadata entry. | ||
|
||
|
||
**For the PhenoCycler specification, please click [here](https://hubmapconsortium.github.io/ingest-validation-tools/phenocycler/current/).** See the following link for the set of fields that are required in the OME TIFF file XML header. https://docs.google.com/spreadsheets/d/1YnmdTAA0Z9MKN3OjR3Sca8pz-LNQll91wdQoRPSP6Q4/edit#gid=0 | ||
**For the PhenoCycler specification, please click [here](https://hubmapconsortium.github.io/ingest-validation-tools/phenocycler/current/).** [This link](https://docs.google.com/spreadsheets/d/1YnmdTAA0Z9MKN3OjR3Sca8pz-LNQll91wdQoRPSP6Q4/edit#gid=0) lists the set of fields that are required in the OME TIFF file XML header. | ||
|
||
## Metadata schema | ||
|
||
|
@@ -28,5 +28,33 @@ Related files: | |
<br> | ||
|
||
## Directory schemas | ||
<summary><a href="https://docs.google.com/spreadsheets/d/1pZD2e51e4QkxzIk6xjHPPu1RBZpx5mzoykMmlaDK8rA"><b>Version 2 (use this one)</b> (draft - submission of data prepared using this schema will be supported by Sept. 30) </a></summary> | ||
<summary><b>Version 2 (use this one)</b></summary> | ||
|
||
| pattern | required? | description | dependent on | | ||
| --- | --- | --- | --- | | ||
| <code>extras\/.*</code> | ✓ | Folder for general lab-specific files related to the dataset. [Exists in all assays] | | | ||
| <code>extras\/microscope_hardware\.json</code> | ✓ | **[QA/QC]** A file generated by the micro-meta app that contains a description of the hardware components of the microscope. Email HuBMAP Consortium Help Desk <[email protected]> if help is required in generating this document. | | | ||
| <code>extras\/microscope_settings\.json</code> | | **[QA/QC]** A file generated by the micro-meta app that contains a description of the settings that were used to acquire the image data. Email HuBMAP Consortium Help Desk <[email protected]> if help is required in generating this document. | | | ||
| <code>raw\/.*</code> | ✓ | This is a directory containing raw data. | | | ||
| <code>lab_processed\/images\/[^\/]+\.ome\.tiff</code> | ✓ | OME-TIFF file (multichannel, multi-layered) produced by the experiment. If compressed, it must use a lossless compression algorithm. See the following link for the set of fields that are required in the OME TIFF file XML header. <https://docs.google.com/spreadsheets/d/1YnmdTAA0Z9MKN3OjR3Sca8pz-LNQll91wdQoRPSP6Q4/edit#gid=0> | | | ||
| <code>lab_processed\/images\/[^\/]*ome-tiff\.channels\.csv</code> | ✓ | This file should describe any processing that was done to generate the images in each channel of the accompanying OME TIFF. The file should contain one row per OME TIFF channel. Two columns should be booleans: "is this a channel to use for nuclei segmentation" and "is this a channel to use for cell segmentation". | | | ||
| <code>lab_processed\/annotations\/.*</code> | | Directory containing segmentation masks. | | | ||
| <code>lab_processed\/annotations\/[^\/]+\.segmentations\.ome\.tiff</code> | | The segmentation masks should be stored as multi-channel pyramidal OME TIFF bitmasks with one channel per mask, where a single mask contains all instances of a type of object (e.g., all cells, a class of FTUs, etc). The class of objects contained in the mask is documented in the segmentation-masks.csv file. Each individual object in a mask should be represented by a unique integer pixel value starting at 1, with 0 meaning background (e.g., all pixels belonging to the first instance of a T-cell have a value of 1, the pixels for the second instance of a T-cell have a value of 2, etc). The pixel values should be unique within a mask. FTUs and other structural elements should be captured the same way as cells with segmentation masks and the appropriate channel feature definitions. | lab_processed\/annotations\/.* | | ||
| <code>lab_processed\/annotations\/segmentation-masks\.csv</code> | | This file contains details about each mask, with one row per mask. Each column in this file contains details describing the mask (e.g., channel number, mask name, ontological ID, etc). Each mask is stored as a channel in the segmentations.ome.tiff file and the mask name should be ontologically based and linked to the ASCT+B table where possible. The number of rows in this file should equal the number of channels in the segmentations.ome.tiff. For example, one row in this file would ontologically describe cells, if the segmentations.ome.tiff file contained a mask of all cells. A minimum set of fields (required and optional) is included below. If multiple segmentations.ome.tiff files are used, this segmentation-masks.csv file should document the masks across all of the OME TIFF files. | lab_processed\/annotations\/.* | | ||
| <code>lab_processed\/annotations\/[^\/]+-objects\.csv</code> | | This is a matrix where each row describes an individual object (e.g., one row per cell in the case where a mask contains all cells) and columns are features (i.e., object type, marker intensity, classification strategies, etc). One file should be created per mask with the name of the mask prepended to the file name. For example, if there’s a cell segmentation map called “cells” then you would include a file called “cells-objects.csv” and that file would contain one row per cell in the “cells” mask and one column per feature, such as marker intensity and/or cell type. A minimum set of fields (required and optional) is included below. | lab_processed\/annotations\/.* | | ||
| <code>lab_processed\/annotations\/[^\/]+\.geojson</code> | | GeoJSON file(s) containing the geometries of each object within a mask. For example, if the mask contains multiple FTUs, multiple cells, etc., each of the objects in the mask would be independently documented in the GeoJSON file. There would be a single GeoJSON file per mask and the name of the file should be the name of the mask. If this file is generated by QuPath, the coordinates will be in pixel units with the origin (0, 0) as the top left corner of the full-resolution image. | lab_processed\/annotations\/.* | ||
| <code>lab_processed\/annotations\/tissue-boundary\.geojson</code> | | **[QA/QC]** If the boundaries of the tissue have been identified (e.g., by manual efforts), then the boundary geometry can be included as a GeoJSON file named “tissue-boundary.geojson”. | lab_processed\/annotations\/.* | | ||
| <code>lab_processed\/annotations\/regions-of-concern\.csv</code> | | This file and the associated GeoJSON file can be used to denote any regions in the image that may contain QA/QC concerns. For example, if there are folds in the tissue, the region of the fold can be highlighted. This file should contain one row per region and include documentation about the region and why it's being flagged. | lab_processed\/annotations\/.* | | ||
| <code>lab_processed\/annotations\/regions-of-concern\.geojson</code> | | This file and the associated CSV file can be used to denote any regions in the image that may contain QA/QC concerns. For example, if there are folds in the tissue, the region of the fold can be highlighted. This file should contain the geometric coordinates of each region being flagged. | lab_processed\/annotations\/.* | | ||
| <code>[^\/]*NAV[^\/]*\.tif</code> (example: <code>NAV.tif</code>) | | Navigational Image showing Region of Interest (Keyence Microscope only) | | | ||
| <code>[^\/]+\.pdf</code> (example: <code>summary.pdf</code>) | | **[QA/QC]** PDF export of PowerPoint slide deck containing the Image Analysis Report | | | ||
| <code>extras\/dir-schema-v2-with-dataset-json</code> | ✓ | Empty file whose presence indicates the version of the directory schema in use | | | ||
| <code>processed\/drv_[^\/]*\/</code> | ✓ | Processed files produced by the Akoya software or alternative software. | | | ||
| <code>raw\/cyc[^\/]*_reg[^\/]*\/.*</code> | ✓ | Intermediary directory | | | ||
| <code>raw\/src_[^\/]*\/</code> | ✓ | Intermediary directory | | | ||
| <code>raw\/cyc[^\/]*_reg[^\/]*\/[^\/]*_z[^\/]*_CH[^\/]*\.tif</code> | ✓ | TIFF files produced by the experiment. General folder format: Cycle(n)_Region(n)_date; General file format: name_tileNumber(n)_zplaneNumber(n)_channelNumber(n) | | | ||
| <code>raw\/src_[^\/]*\/cyc[^\/]*_reg[^\/]*_[^\/]*\/[^\/]+\.gci</code> | | Group Capture Information File (Keyence Microscope only) | | | ||
| <code>raw\/dataset\.json</code> (example: <code>raw/dataset.json</code>) | ✓ | Data processing parameters file. This will include additional CODEX specific metadata needed for the HIVE processing workflow. | | | ||
| <code>raw\/reg_[^\/]*\.png</code> (example: <code>raw/reg_00.png</code>) | | Region overviews | | | ||
| <code>raw\/experiment\.json</code> (example: <code>raw/experiment.json</code>) | | JSON file produced by the Akoya software which contains the metadata for the experiment, including the software version used, microscope parameters, channel names, pixel dimensions, etc. (required for HuBMAP pipeline) | | | ||
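
The patterns in the table above are regular expressions over paths relative to the upload directory. As a rough illustration of how such a table could be checked, here is a minimal sketch. It assumes each pattern must match a whole relative path (via `re.fullmatch`), and it covers only a handful of rows. The names `SCHEMA` and `check_paths` are hypothetical and are not taken from ingest-validation-tools, whose actual matching logic may differ.

```python
import re

# (pattern, required?) pairs copied from a few rows of the Version 2 table.
# This is a subset chosen for illustration, not the full schema.
SCHEMA = [
    (r"extras\/.*", True),
    (r"extras\/microscope_hardware\.json", True),
    (r"raw\/.*", True),
    (r"lab_processed\/images\/[^\/]+\.ome\.tiff", True),
    (r"lab_processed\/images\/[^\/]*ome-tiff\.channels\.csv", True),
    (r"lab_processed\/annotations\/[^\/]+\.segmentations\.ome\.tiff", False),
    (r"raw\/dataset\.json", True),
]


def check_paths(paths, schema=SCHEMA):
    """Return (unmatched_paths, missing_required_patterns).

    Assumption: a path is allowed if it fully matches at least one pattern,
    and a required pattern is satisfied if at least one path fully matches it.
    """
    compiled = [(re.compile(pattern), required) for pattern, required in schema]
    unmatched = [
        path for path in paths
        if not any(rx.fullmatch(path) for rx, _ in compiled)
    ]
    missing = [
        rx.pattern for rx, required in compiled
        if required and not any(rx.fullmatch(path) for path in paths)
    ]
    return unmatched, missing


if __name__ == "__main__":
    example = [
        "extras/microscope_hardware.json",
        "raw/dataset.json",
        "lab_processed/images/sample.ome.tiff",
        "lab_processed/images/sample.ome-tiff.channels.csv",
        "notes.txt",  # matches none of the patterns in this subset
    ]
    print(check_paths(example))
```

Using full matches rather than prefix matches keeps a pattern such as `raw\/dataset\.json` from accepting `raw/dataset.json.bak`; whether the real validator behaves this way is an assumption of this sketch.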
|
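
Two of the annotation rules in the Version 2 table lend themselves to a quick consistency check: `segmentation-masks.csv` should have one row per mask channel, and each mask uses 0 for background and positive integers for individual objects. The sketch below illustrates both checks under simplifying assumptions. A mask channel is represented as a plain 2-D list of integers rather than read from an OME TIFF, the CSV is assumed to have a single header row, and the column names in the usage example (`channel_number`, `mask_name`) are hypothetical. All function names are illustrative only.

```python
import csv
import io


def count_mask_rows(csv_text):
    """Count data rows in segmentation-masks.csv content.

    Assumes one header row; fully blank lines are ignored.
    """
    reader = csv.reader(io.StringIO(csv_text))
    rows = [row for row in reader if any(cell.strip() for cell in row)]
    return max(len(rows) - 1, 0)


def rows_match_channels(csv_text, n_channels):
    """True if the CSV has exactly one data row per mask channel."""
    return count_mask_rows(csv_text) == n_channels


def object_ids(mask):
    """Return the sorted object IDs found in one mask channel.

    `mask` is a 2-D list of ints; 0 is background and is excluded.
    Negative values are rejected because IDs start at 1.
    """
    values = {value for row in mask for value in row}
    if any(value < 0 for value in values):
        raise ValueError("mask values must be non-negative integers")
    return sorted(values - {0})


if __name__ == "__main__":
    # Hypothetical CSV content describing two masks.
    masks_csv = "channel_number,mask_name\n1,cells\n2,nuclei\n"
    print(rows_match_channels(masks_csv, n_channels=2))

    # A tiny mask with two objects (IDs 1 and 2) on a background of 0.
    print(object_ids([[0, 1, 1], [2, 0, 2]]))
```

In practice the channel count and pixel data would come from the segmentations OME TIFF itself; that step is left out here to keep the sketch dependency-free.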