Tkakar/cat 673 create builder for seg mask (#92)
* Created a working epic builder

* Updated readme

* Adjusted arguments and minor fixes

* Fixed linting and test issues, updated Readme

* Removed epic_builder tests

* Fixed linting

* Removed epic-related coverage for testing

* Minor fixes of readme and args

* Fixed the epic builder

* Fixed reqs, defined tests for epics

* Working segmentation mask

* Test implemented for EPICs

* Undo changes to client

* Updated readme

* Linting

* Fixed test-fixture

* Addressed test coverage

* Adjusted requirements

* Added error handling for epic_entity

* Minor fixes to epic-builders

* Removed req-prod
tkakar authored Oct 28, 2024
1 parent 60fc83e commit 00913fa
Showing 21 changed files with 760 additions and 115 deletions.
4 changes: 3 additions & 1 deletion .gitignore
@@ -129,4 +129,6 @@ dmypy.json
.pyre/

# VSCode
.VSCode
.VSCode

.DS_Store
37 changes: 31 additions & 6 deletions README.md
@@ -1,6 +1,6 @@
# portal-visualization

Given HuBMAP Dataset JSON, creates a Vitessce configuration.
Given HuBMAP Dataset JSON (e.g. https://portal.hubmapconsortium.org/browse/dataset/004d4f157df4ba07356cd805131dfc04.json), creates a Vitessce configuration.

## Release process

@@ -23,22 +23,45 @@ $ pip install .
$ src/vis-preview.py --help
usage: vis-preview.py [-h] (--url URL | --json JSON) [--assaytypes_url URL]
[--assets_url URL] [--token TOKEN] [--marker MARKER]
[--to_json]
[--to_json] [--epic_uuid UUID] [--parent_uuid UUID]
[--epic_url EPIC_URL] [--epic_json EPIC_JSON]
Given HuBMAP Dataset JSON, generate a Vitessce viewconf, and load vitessce.io.
optional arguments:
-h, --help show this help message and exit
--url URL URL which returns Dataset JSON
--json JSON File containing Dataset JSON
--assaytypes_url URL AssayType service; default:
https://ingest.api.hubmapconsortium.org/assaytype/
--assaytypes_url URL AssayType service; default: https://ingest-
api.dev.hubmapconsortium.org/assaytype/
--assets_url URL Assets endpoint; default:
https://assets.hubmapconsortium.org
https://assets.dev.hubmapconsortium.org
--token TOKEN Globus groups token; Only needed if data is not public
--marker MARKER Marker to highlight in visualization; Only used in
some visualizations.
--to_json Output viewconf, rather than open in browser.
--epic_uuid UUID uuid of the EPIC dataset.
--parent_uuid UUID Parent uuid - Only needed for an image-pyramid support
dataset.
--epic_url EPIC_URL URL which returns Dataset JSON for the EPIC dataset
--epic_json EPIC_JSON
File containing Dataset JSON for the EPIC dataset
```
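The option set in the help text above can be approximated with a small `argparse` sketch. This is an illustration only, not the contents of `vis-preview.py`: the function name `build_parser` is hypothetical, only the dataset source options and the four EPIC-related options are modeled, and all values are kept as plain strings, which the real script may not do.

```python
import argparse


def build_parser():
    # Hypothetical helper: mirrors part of the usage text shown above.
    parser = argparse.ArgumentParser(
        description="Given HuBMAP Dataset JSON, generate a Vitessce viewconf."
    )
    # The usage line shows (--url URL | --json JSON): exactly one is required.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--url", help="URL which returns Dataset JSON")
    source.add_argument("--json", help="File containing Dataset JSON")
    # Options added by this commit, as listed in the help text.
    parser.add_argument("--epic_uuid", metavar="UUID",
                        help="uuid of the EPIC dataset.")
    parser.add_argument("--parent_uuid", metavar="UUID",
                        help="Parent uuid - Only needed for an image-pyramid support dataset.")
    parser.add_argument("--epic_url",
                        help="URL which returns Dataset JSON for the EPIC dataset")
    parser.add_argument("--epic_json",
                        help="File containing Dataset JSON for the EPIC dataset")
    return parser


# Example invocation with placeholder values.
args = build_parser().parse_args(["--json", "dataset.json", "--epic_uuid", "abc123"])
```

Options that are not supplied default to `None`, which is how a caller could tell whether an EPIC dataset was requested at all.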


```
Notes:
1. The token can be found in the browser's developer tools: while logged in and browsing a dataset in the portal, open the Network tab, find a `search-api` call, and copy the long string that follows `Authorization: Bearer`. The token is necessary to access non-public datasets, such as those in QA.
2. The documentation for the `vis-preview.py` script must match the contents of the readme. When a script argument is added or modified, the README must be updated to match the output of `./vis-preview.py --help`.
```


## Build & Testing
```
To build: `python -m build`
To run the tests: `./test.sh`. Install the `flake8` and `autopep8` packages.
```

## Background
@@ -47,7 +70,9 @@ optional arguments:

Data for the Vitessce visualization almost always comes via raw data that is processed by [ingest-pipeline](https://github.com/hubmapconsortium/ingest-pipeline) airflow dags.
Harvard often contributes our own custom pipelines to these dags that can be found in [portal-containers](https://github.com/hubmapconsortium/portal-containers).
The outputs of these pipelines are then converted into view configurations for Vitessce by the [portal backend](https://github.com/hubmapconsortium/portal-ui/blob/0b43a468fff0256a466a3bf928a83893321ea1d9/context/app/api/client.py#L165),

The outputs of these pipelines are then converted into view configurations for Vitessce by the [portal backend](https://github.com/hubmapconsortium/portal-visualization/blob/main/src/portal_visualization/client.py),
using code in this repo, when a `Dataset` that should be visualized is requested in the client.

The `vis-preview.py` script mimics the invocation of `get_view_config_builder` so that builders can be developed and tested independently, i.e., without using the [portal backend](https://github.com/hubmapconsortium/portal-ui/blob/main/context/app/routes_browse.py#L126).
The view configurations are built using the [Vitessce-Python API](https://vitessce.github.io/vitessce-python/).

6 changes: 3 additions & 3 deletions requirements-dev.txt
@@ -1,6 +1,6 @@
pytest==5.2.1
flake8==7.0.0
autopep8==1.4.4
autopep8==2.0.4
pytest-mock==3.7.0
coverage==6.3.1
pyyaml==6.0
coverage==7.6.4
pyyaml==6.0.2
5 changes: 3 additions & 2 deletions src/defaults.json
@@ -1,4 +1,5 @@
{
"assets_url": "https://assets.hubmapconsortium.org",
"assaytypes_url": "https://ingest.api.hubmapconsortium.org/assaytype/"
"assets_url": "https://assets.dev.hubmapconsortium.org",
"assaytypes_url": "https://ingest-api.dev.hubmapconsortium.org/assaytype/",
"dataset_url":"https://portal.dev.hubmapconsortium.org/browse/dataset/"
}
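As a rough illustration of how the new `dataset_url` default could be used, the sketch below builds the URL of a dataset's JSON from it. The helper name `dataset_json_url` is hypothetical, the defaults are inlined rather than read from `src/defaults.json`, and the `<uuid>.json` pattern is taken from the example URL in the README, on the assumption that the dev portal follows the same pattern.

```python
import json

# Inlined copy of the new defaults, so the sketch runs without the repo checkout.
DEFAULTS_JSON = """
{
  "assets_url": "https://assets.dev.hubmapconsortium.org",
  "assaytypes_url": "https://ingest-api.dev.hubmapconsortium.org/assaytype/",
  "dataset_url": "https://portal.dev.hubmapconsortium.org/browse/dataset/"
}
"""


def dataset_json_url(defaults, uuid):
    # Hypothetical helper. "dataset_url" already ends with a slash, so strip it
    # before joining to avoid a double slash in the result.
    base = defaults["dataset_url"].rstrip("/")
    return f"{base}/{uuid}.json"


defaults = json.loads(DEFAULTS_JSON)
url = dataset_json_url(defaults, "004d4f157df4ba07356cd805131dfc04")
```

Stripping the trailing slash keeps the helper correct whether or not the configured value ends with one.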
16 changes: 12 additions & 4 deletions src/portal_visualization/builder_factory.py
@@ -8,6 +8,7 @@
SeqFISHViewConfBuilder,
IMSViewConfBuilder,
ImagePyramidViewConfBuilder,
SegImagePyramidViewConfBuilder,
NanoDESIViewConfBuilder,
)
from .builders.anndata_builders import (
@@ -33,6 +34,7 @@ def process_hints(hints):
is_json = "json_based" in hints
is_spatial = "spatial" in hints
is_support = "is_support" in hints
is_segmentation_base = "segmentation_base" in hints

return (
is_image,
@@ -44,6 +46,7 @@ def process_hints(hints):
is_json,
is_spatial,
is_support,
is_segmentation_base,
)


@@ -53,10 +56,10 @@ def process_hints(hints):
# The entity is a dict that contains the entity UUID and metadata.
# `get_assaytype` is a function which takes an entity UUID and returns
# a dict containing the assaytype and vitessce-hints for that entity.
def get_view_config_builder(entity, get_assaytype, parent=None):
def get_view_config_builder(entity, get_assaytype, parent=None, epic_uuid=None):
if entity.get("uuid") is None:
raise ValueError("Provided entity does not have a uuid")
assay = get_assaytype(entity)
assay = get_assaytype(entity.get('uuid'))
assay_name = assay.get("assaytype")
hints = assay.get("vitessce-hints", [])
(
@@ -69,11 +72,16 @@ def get_view_config_builder(entity, get_assaytype, parent=None):
is_json,
is_spatial,
is_support,
is_segmentation_base
) = process_hints(hints)

# vis-lifted image pyramids
if parent is not None:
if is_support and is_image:
# TODO: For now, epic (base image's) support datasets don't have any hints
if epic_uuid is not None or is_segmentation_base:
return SegImagePyramidViewConfBuilder

elif is_support and is_image:
ancestor_assaytype = get_assaytype(parent).get("assaytype")
if SEQFISH == ancestor_assaytype:
# e.g. parent = c6a254b2dc2ed46b002500ade163a7cc
@@ -134,5 +142,5 @@ def get_view_config_builder(entity, get_assaytype, parent=None):


def has_visualization(entity, get_assaytype, parent=None):
builder = get_view_config_builder(entity, get_assaytype, parent)
builder = get_view_config_builder(entity, get_assaytype, parent, epic_uuid=None)
return builder != NullViewConfBuilder
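The branch order this commit introduces for entities with a parent can be sketched in isolation. This is a simplified stand-in, not the repository's code: the builder classes are empty placeholders, `pick_builder` is a hypothetical name, the `"is_image"` hint string is an assumption, and every other branch of `get_view_config_builder` (including the SeqFISH ancestor check) is collapsed into a single result.

```python
# Placeholder classes standing in for the real builders.
class NullViewConfBuilder:
    pass


class ImagePyramidViewConfBuilder:
    pass


class SegImagePyramidViewConfBuilder:
    pass


def pick_builder(hints, parent=None, epic_uuid=None):
    # Hypothetical, reduced version of the selection logic.
    is_image = "is_image" in hints  # assumed hint string
    is_support = "is_support" in hints
    is_segmentation_base = "segmentation_base" in hints

    if parent is not None:
        # The segmentation check runs first, so an EPIC uuid or the
        # segmentation_base hint wins over the plain image-pyramid branch.
        if epic_uuid is not None or is_segmentation_base:
            return SegImagePyramidViewConfBuilder
        elif is_support and is_image:
            # The real code inspects the ancestor assay type here.
            return ImagePyramidViewConfBuilder
    return NullViewConfBuilder
```

Because `has_visualization` passes `epic_uuid=None`, in this sketch only the `segmentation_base` hint could route a parented entity to the segmentation builder on that path.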
4 changes: 2 additions & 2 deletions src/portal_visualization/builders/anndata_builders.py
@@ -125,7 +125,7 @@ def _set_up_marker_gene(self, marker):
if (marker_index >= 0):
marker = ensembl_ids[marker_index]
else:
pass
pass # pragma: no cover
# Encoding Version 0.2.0
# https://anndata.readthedocs.io/en/latest/fileformat-prose.html#categorical-arrays
# Our pipeline currently does not use this encoding version
@@ -243,7 +243,7 @@ def _setup_anndata_view_config(self, vc, dataset):
# This ensures that the view config is valid for datasets with and without a spatial view
spatial = self._add_spatial_view(dataset, vc)

views = list(filter(lambda v: v is not None, [
views = list(filter(lambda v: v is not None, [ # pragma: no cover
cell_sets, gene_list, scatterplot, cell_sets_expr, heatmap, spatial]))

self._views = views
26 changes: 24 additions & 2 deletions src/portal_visualization/builders/base_builders.py
@@ -1,7 +1,9 @@
import urllib
from collections import namedtuple
from abc import ABC, abstractmethod

# import json
# import requests
# from pathlib import Path

ConfCells = namedtuple('ConfCells', ['conf', 'cells'])

@@ -74,7 +76,10 @@ def _build_assets_url(self, rel_path, use_token=True):
'https://example.com/uuid/rel_path/to/clusters.ome.tiff?token=groups_token'
"""
base_url = urllib.parse.urljoin(self._assets_endpoint, f"{self._uuid}/{rel_path}")
uuid = self._uuid
if hasattr(self, "_epic_uuid"): # pragma: no cover
uuid = self._epic_uuid
base_url = urllib.parse.urljoin(self._assets_endpoint, f"{uuid}/{rel_path}")
token_param = urllib.parse.urlencode({"token": self._groups_token})
return f"{base_url}?{token_param}" if use_token else base_url
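The effect of the `_epic_uuid` override on the assets URL can be shown with a standalone function. This is a sketch under assumptions, not the method itself: `build_assets_url` is a hypothetical free function, and the instance attributes (`_assets_endpoint`, `_uuid`, `_epic_uuid`, `_groups_token`) are turned into explicit parameters.

```python
from urllib.parse import urlencode, urljoin


def build_assets_url(assets_endpoint, uuid, rel_path, groups_token,
                     epic_uuid=None, use_token=True):
    # When an EPIC uuid is supplied it replaces the entity uuid in the path,
    # mirroring the hasattr(self, "_epic_uuid") check in the diff.
    effective_uuid = epic_uuid if epic_uuid is not None else uuid
    base_url = urljoin(assets_endpoint, f"{effective_uuid}/{rel_path}")
    token_param = urlencode({"token": groups_token})
    return f"{base_url}?{token_param}" if use_token else base_url


plain = build_assets_url(
    "https://example.com", "uuid", "rel_path/to/clusters.ome.tiff", "groups_token"
)
epic = build_assets_url(
    "https://example.com", "uuid", "rel_path/to/clusters.ome.tiff", "groups_token",
    epic_uuid="epic_uuid", use_token=False,
)
```

With the override in place, every asset path built by the builder resolves under the EPIC dataset's directory rather than the base entity's.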

@@ -118,6 +123,23 @@ def _get_file_paths(self):
"""
return [file["rel_path"] for file in self._entity["files"]]

# def _get_epic_entity(self):
# TODO: might need this if we decide to read the epic_entity on run time
# request_init = self._get_request_init()
# file_path = Path(__file__).resolve().parent.parent.parent / 'defaults.json'
# print(file_path)
# defaults = json.load(file_path.open())
# # headers = {"headers": {"Authorization": f"Bearer {self._groups_token}"}}
# url = f"{defaults['dataset_url']}/{self._epic_uuid}.json"
# print(url)
# response = requests.get(url, request_init)
# if response.status_code == 403:
# raise Exception('Protected data: Download JSON via browser; Redo with --json')
# response.raise_for_status()
# json_str = response.text
# entity = json.loads(json_str)
# return entity


class _DocTestBuilder(ViewConfBuilder): # pragma: no cover
# The doctests on the methods in this file need a concrete class to instantiate: