We present the Mouse Kidney Atlas (MKA), a comprehensive atlas of cellular heterogeneity in the healthy mouse kidney, generated by carefully integrating data from eight publicly available studies. The datasets were integrated using scVI and scANVI. To overcome annotation inconsistencies, we learned the relationships between cell type transcriptomic profiles across datasets using scHPL. The resulting model can automatically label unseen cell populations with unprecedented resolution and accuracy. We demonstrate the significance of the atlas by obtaining robust and novel markers for poorly described cell types.
The MKA is publicly available to download, visualize and interact with at cellxgene.
For more details, refer to: A comprehensive mouse kidney atlas enables rare cell population characterization and robust marker discovery.
- `models`: files containing the trained models used in the manuscript
- `notebooks`: notebooks used to generate the figures presented in the manuscript
  - `QC_scVI_scANVI`: Figure 1
  - `scHPL_ManualReannotation`: Figures 2 and 3; Supplementary Figures 1, 2 and 3
  - `scHPL_Evaluation`: Figure 4; Supplementary Figures 4 and 5
  - `Downstream_analyses`: Figure 5; Supplementary Figure 6
- `MKA_Metamarkers.xlsx`: Excel file with the identified metamarkers for each cell type label in the MKA (a short pandas sketch for exploring it follows this list).
  - Rank: overall ranking of the gene within a cell type. The higher the ranking, the better the gene serves as a marker for the given population, accounting for batch differences and the number of datasets in which the gene is detected.
  - AUROC: area under the receiver operating characteristic curve, indicating how well the gene performs in a classification scenario. For example, Podxl has an AUROC of 0.9, meaning this gene is very good at classifying Podocytes as such.
- `functions.py`: helper functions used across the code
- `hyper_tune.py`: Ray Tune implementation to optimize scVI model hyperparameters
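The metamarker table referenced above can be loaded and filtered with pandas. This is a minimal sketch only: the column names `cell_type` and `gene` are assumptions (only the Rank and AUROC columns are described above), so adjust them to the actual sheet.

```python
import pandas as pd

# Load the metamarker table shipped with the repository
markers = pd.read_excel("MKA_Metamarkers.xlsx")

# Column names "cell_type" and "gene" are assumed; "Rank" and "AUROC" are
# described above. A higher Rank means a better marker, so sort descending
# and keep the ten best-ranked genes per cell type label.
top_markers = (
    markers.sort_values("Rank", ascending=False)
           .groupby("cell_type")
           .head(10)
)
print(top_markers[["cell_type", "gene", "Rank", "AUROC"]])
```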
If you want to use the models for your own research, you will need the HVG-filtered matrix they were trained on. You can find the AnnData object on Zenodo. Once downloaded, you can:
```python
import os

import scanpy as sc
import scvi

os.chdir("MKA")

# HVG-filtered AnnData object downloaded from Zenodo
adata = sc.read_h5ad("adata.h5ad")

# Trained scANVI model, loaded against the matching AnnData object
atlas_model = scvi.model.SCANVI.load("models/scANVI_model_full", adata=adata)
```
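Continuing from the snippet above, the loaded model can then be used, for instance, to embed the cells in the integrated latent space or to predict cell type labels. The `.obsm`/`.obs` key names below are arbitrary choices, not fixed by the atlas:

```python
# Integrated latent representation learned by scANVI,
# useful for neighbours/UMAP and other downstream analyses
adata.obsm["X_scANVI"] = atlas_model.get_latent_representation(adata)

# Cell type labels predicted by the scANVI classifier
adata.obs["predicted_cell_type"] = atlas_model.predict(adata)
```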
Ray Tune was used to train 1000 different hyperparameter and model configurations. The metrics tracked at each training epoch were `elbo_validation`, `reconstruction_loss` and `silhouette_score`. Batch and cell type silhouette scores computed on the latent space were used as objective functions to maximize during training.
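As an illustration of how such objectives can be computed from a latent embedding (a sketch only; the exact scoring in `hyper_tune.py` may differ, and the `1 - |ASW|` convention for the batch term is an assumption borrowed from common integration benchmarks):

```python
from sklearn.metrics import silhouette_score

def latent_silhouette_scores(latent, cell_types, batches):
    """Cell type and batch silhouette scores on a latent embedding.

    A high cell type ASW rewards well-separated cell types; for the batch
    term, 1 - |ASW| is high when batches are well mixed (assumed convention).
    """
    cell_type_asw = silhouette_score(latent, cell_types)
    batch_asw = 1 - abs(silhouette_score(latent, batches))
    return cell_type_asw, batch_asw
```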
The search space was defined as follows:
- model configuration
  - dropout rate: loguniform distribution between `1e-4` and `1e-1`
  - number of layers: random integer between `1` and `3`
  - number of latent dimensions: random integer between `20` and `31`
- plan configuration
  - learning rate: loguniform distribution between `1e-4` and `1e-1`
- atlas architecture
  - subset: random boolean (`True`/`False`). The purpose of this parameter is to test the effect of filtering the feature space
  - number of hvgs: random choice between `2000` and `8000` in increments of `1000`
  - continuous_covariates: random choice between `'pct_counts_mt'` and `None`
  - categorical_covariates: random choice between `'Source'` and `None`. 'Source' in this case refers to either nuclei or cells as the starting material
- number of epochs: random integer between `100` and `201`
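For reference, the search space above could be written with Ray Tune roughly as follows. This is a sketch only: the dictionary keys are illustrative and need not match the actual configuration in `hyper_tune.py`.

```python
from ray import tune

# Illustrative search space mirroring the list above; key names are assumed.
search_space = {
    # model configuration
    "dropout_rate": tune.loguniform(1e-4, 1e-1),
    "n_layers": tune.randint(1, 3),
    "n_latent": tune.randint(20, 31),
    # plan configuration
    "lr": tune.loguniform(1e-4, 1e-1),
    # atlas architecture
    "subset": tune.choice([True, False]),
    "n_hvgs": tune.choice([2000, 3000, 4000, 5000, 6000, 7000, 8000]),
    "continuous_covariates": tune.choice(["pct_counts_mt", None]),
    "categorical_covariates": tune.choice(["Source", None]),
    # training length
    "max_epochs": tune.randint(100, 201),
}
```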
The following table lists all studies included in the MKA:
Publication | Abbreviation | Accession number |
---|---|---|
Wu et al., 2019 | Wu19 | GSE119531 |
Miao et al., 2021 | Miao21 | GSE157079 |
Park et al., 2018 | Park18 | GSE107585 |
Kirita et al., 2020 | Kirita20 | GSE139107 |
Dumas et al., 2020 | Dumas20 | E-MTAB-8145 |
Conway et al., 2020 | Conway20 | GSE140023 |
Hinze et al., 2021 | Hinze21 | GSE145690 |
Janosevic et al., 2021 | Janosevic21 | GSE151658 |