Custom dataloader registry support #2932

Open
wants to merge 83 commits into
base: main

Conversation

ori-kron-wis
Collaborator

No description provided.

ori-kron-wis added this to the scvi-tools 1.2 milestone Aug 7, 2024
ori-kron-wis self-assigned this Aug 7, 2024
ori-kron-wis linked an issue Aug 7, 2024 that may be closed by this pull request

codecov bot commented Aug 11, 2024

Codecov Report

Attention: Patch coverage is 50.43860% with 113 lines in your changes missing coverage. Please review.

Project coverage is 82.50%. Comparing base (835d17a) to head (31e1d44).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/scvi/model/base/_base_model.py | 37.60% | 73 Missing ⚠️ |
| src/scvi/model/_scanvi.py | 58.13% | 18 Missing ⚠️ |
| src/scvi/model/_scvi.py | 47.82% | 12 Missing ⚠️ |
| src/scvi/model/base/_archesmixin.py | 75.00% | 8 Missing ⚠️ |
| src/scvi/model/base/_save_load.py | 75.00% | 1 Missing ⚠️ |
| src/scvi/model/base/_training_mixin.py | 50.00% | 1 Missing ⚠️ |

❗ There is a different number of reports uploaded between BASE (835d17a) and HEAD (31e1d44).

HEAD has 13 fewer uploads than BASE.

| Flag | BASE (835d17a) | HEAD (31e1d44) |
|---|---|---|
| | 16 | 3 |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2932      +/-   ##
==========================================
- Coverage   90.14%   82.50%   -7.65%     
==========================================
  Files         181      181              
  Lines       15644    15548      -96     
==========================================
- Hits        14103    12828    -1275     
- Misses       1541     2720    +1179     
| Files with missing lines | Coverage Δ |
|---|---|
| src/scvi/data/_utils.py | 87.57% <100.00%> (+0.53%) ⬆️ |
| src/scvi/external/stereoscope/_model.py | 92.40% <ø> (ø) |
| src/scvi/external/stereoscope/_module.py | 96.33% <ø> (ø) |
| src/scvi/model/_amortizedlda.py | 94.11% <ø> (ø) |
| src/scvi/model/_autozi.py | 95.40% <ø> (ø) |
| src/scvi/model/_condscvi.py | 95.74% <ø> (ø) |
| src/scvi/model/_jaxscvi.py | 92.30% <ø> (ø) |
| src/scvi/model/_linear_scvi.py | 94.87% <ø> (ø) |
| src/scvi/model/_multivi.py | 75.08% <ø> (ø) |
| src/scvi/model/_peakvi.py | 87.09% <ø> (ø) |

... and 7 more

... and 28 files with indirect coverage changes

adata.obs["batch"] = adata.obs[batch_keys].agg("".join, axis=1).astype("category")

scvi.model.SCVI.prepare_query_anndata(adata, save_path)
scvi.model.SCVI.load_query_data(registry=datamodule.registry, reference_model=save_path)
Member

We should have more negative tests, i.e. tests for cases that are expected to fail: using different genes without calling prepare_query_anndata, and using different batch categories. Assert that each case fails.
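
A minimal sketch of the shape such negative tests could take, mirroring the two failure cases named above. The validator `check_query_compatible` and the helper `fails_with_value_error` are hypothetical stand-ins written for this illustration; they are not part of scvi-tools, and the real checks in this PR may raise a different exception type or live elsewhere.

```python
# Hypothetical stand-in for the compatibility checks under discussion.
# This is NOT the scvi-tools API; it only illustrates the test shape.
def check_query_compatible(ref_genes, query_genes, ref_batches, query_batches):
    """Raise ValueError if the query does not match the reference registry."""
    if list(query_genes) != list(ref_genes):
        raise ValueError("query genes do not match the reference genes")
    unseen = set(query_batches) - set(ref_batches)
    if unseen:
        raise ValueError(f"unseen batch categories: {sorted(unseen)}")


def fails_with_value_error(fn, *args):
    """Return True if fn(*args) raises ValueError, False otherwise."""
    try:
        fn(*args)
    except ValueError:
        return True
    return False


ref_genes = ["g1", "g2", "g3"]
ref_batches = ["a", "b"]

# Positive control: matching genes and known batches must pass.
ok_case = fails_with_value_error(
    check_query_compatible, ref_genes, ["g1", "g2", "g3"], ref_batches, ["a"]
)
# Negative case 1: different genes, with no preparation step applied.
gene_case = fails_with_value_error(
    check_query_compatible, ref_genes, ["g1", "g4"], ref_batches, ["a"]
)
# Negative case 2: a batch category the reference has never seen.
batch_case = fails_with_value_error(
    check_query_compatible, ref_genes, ref_genes, ref_batches, ["c"]
)

assert not ok_case
assert gene_case
assert batch_case
```

In a pytest suite the same cases would more naturally be written with `pytest.raises`, keeping one positive control next to the negative cases so a validator that always raises does not pass unnoticed.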


scvi.model.SCVI.prepare_query_anndata(adata, model_census2)

scvi.model.SCVI.setup_anndata(adata, batch_key="batch") # needed?
Member

This checks that an AnnData model can be trained using a datamodule. Do we really want that?


user_attributes_model_census3 = model_census3._get_user_attributes()
pprint(user_attributes_model_census3)
_ = model_census3.get_elbo()
Member

Does this use AnnData for inference?

scvi.model.SCVI.prepare_query_anndata(adata, model_census3)
scvi.model.SCVI.load_query_data(adata, model_census3)

datamodule_inference = CensusSCVIDataModule(
Member

Check here that using different genes and different batches fails. You can use far fewer cells here, e.g. 1000.

# Create a dataloader of a CZI module
datapipe = datamodule_inference.datapipe
dataloader = experiment_dataloader(datapipe, num_workers=0, persistent_workers=False)
mapped_dataloader = (
Member

What's this?


model = SCVI(adata, n_latent=n_latent)
model.train(max_epochs=1)
dataloader = model._make_data_loader(adata)
Member

Does model._make_data_loader exist for all models? If so, should we add this test to the other models as well?
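
One way to answer the first question mechanically is to check each model class for the method. The sketch below is only an illustration: `classes_missing_method`, `WithLoader`, and `WithoutLoader` are hypothetical names introduced here, and the stand-in classes would need to be replaced by the actual model classes of interest.

```python
def classes_missing_method(classes, method_name):
    """Return the names of classes that lack a callable attribute `method_name`."""
    return [
        cls.__name__
        for cls in classes
        if not callable(getattr(cls, method_name, None))
    ]


# Hypothetical stand-ins for model classes, used only to exercise the helper.
class WithLoader:
    def _make_data_loader(self, adata):
        return adata


class WithoutLoader:
    pass


missing = classes_missing_method([WithLoader, WithoutLoader], "_make_data_loader")
all_present = classes_missing_method([WithLoader], "_make_data_loader")
```

An empty result for the real model classes would suggest the same test could be parametrized over all of them.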

Member

Is the dataloader sufficient to also set up the model, and does setup_datamodule work for it?


Successfully merging this pull request may close these issues.

Fix custom dataloader registry
2 participants