Cellphone DB (and other database versions) #60

Open
SimonDMurray opened this issue Sep 22, 2023 · 5 comments
Labels
enhancement New feature or request

Comments

@SimonDMurray

Hi,

Sorry if this information is available somewhere, but I cannot find it. I would like to know which version of CellPhoneDB (and of the other databases) is included in the liana-py package.

I am aware you import your databases from OmniPathR (unless I am mistaken) and have checked their version of cellphoneDB: saezlab/OmnipathR#85

I wanted to double check that your version matches theirs and if not could it be updated?

Thanks,
Simon

@dbdimitrov
Collaborator

Hi @SimonDMurray,

Currently, the CellPhoneDB resource accessible via LIANA is CellPhoneDBv2: although it is imported via OmniPath, the database in LIANA+ is versioned independently. If you wish to use CellPhoneDBv4, you can obtain it directly via OmniPath's Python (https://github.com/saezlab/omnipath) or R API and then feed it to LIANA as a dataframe.

I plan to update it also in LIANA, but this also includes some infrastructural changes which we're currently working on (#9).

dbdimitrov added the enhancement label on Oct 27, 2023
@dbdimitrov
Collaborator

Hi @SimonDMurray,

A bit delayed, but I checked the new CPDB resource, and it can be obtained the following way:

import io

import pandas as pd
import requests

# raw CSV behind
# https://github.com/ventolab/cellphonedb-data/blob/master/data/interaction_input.csv
url = 'https://raw.githubusercontent.com/ventolab/cellphonedb-data/master/data/interaction_input.csv'
response = requests.get(url)
response.raise_for_status()  # fail early on a bad download
resource = pd.read_csv(io.StringIO(response.content.decode('utf-8')), sep=',')
# keep only PPIs; copy so the column assignment below does not write to a view
resource = resource[resource['is_ppi']][['interactors']].copy()
# replace + with _ (the separator for complex subunits)
resource['interactors'] = resource['interactors'].str.replace('+', '_', regex=False)
# if interactors contains two '-', replace the first one with '&'
resource['interactors'] = resource['interactors'].apply(lambda x: x.replace('-', '&', 1) if x.count('-') == 2 else x)
# split by - and expand
resource = resource['interactors'].str.split('-', expand=True)
# replace & with - in the first column
resource[0] = resource[0].str.replace('&', '-', regex=False)
resource.columns = ['ligand', 'receptor']

Then it's as simple as passing this dataframe to the resource parameter of any liana function you would like to use.
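
To make the string handling in the snippet above easier to follow, here is a minimal pure-Python sketch of the same steps applied to single strings. The function name `parse_interactors` and the three example strings are hypothetical placeholders, not entries taken from the CellPhoneDB file. As in the snippet, a string with exactly two hyphens is assumed to have the extra hyphen in the first partner's name.

```python
from typing import List, Tuple


def parse_interactors(interactors: str) -> Tuple[str, str]:
    """Split an 'A-B' interactor string into a (ligand, receptor) pair."""
    # '+' joins complex subunits in the input; '_' is used instead downstream.
    s = interactors.replace('+', '_')
    # With exactly two hyphens, assume the first one is part of the
    # first partner's name (the same assumption the pandas snippet makes).
    if s.count('-') == 2:
        first, second, receptor = s.split('-')
        return f'{first}-{second}', receptor
    # Any other hyphen count than one raises ValueError here,
    # rather than silently producing a wrong pair.
    ligand, receptor = s.split('-')
    return ligand, receptor


# Hypothetical placeholder strings, for illustration only.
examples: List[str] = ['LIG1-REC1', 'LIG2-REC2+REC3', 'HLA-X-REC4']
interactions = [parse_interactors(s) for s in examples]
print(interactions)
# [('LIG1', 'REC1'), ('LIG2', 'REC2_REC3'), ('HLA-X', 'REC4')]
```

Strings with no hyphen or with three or more hyphens are not handled by either version and would need a separate rule.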

@feriobrahmana

Hi @dbdimitrov,

I’m new here and just came across your LIANA+ paper—it seems incredibly useful for my work!

However, I still have a few questions, even after reading this thread:

  1. Based on your tutorial and your answer in this discussion, it seems possible to feed LIANA+ with a specific database (e.g., CellChatv2) that isn’t included in OmniPath. Is that correct?
  2. If I understand correctly, the key requirement is ensuring the format of the dataframe being fed into LIANA+. Could you clarify how the dataframe should be formatted? I tried searching for this but couldn’t quite figure out what to look for.
  3. Lastly, if I use a custom database, can I still utilize the rank-aggregation method to analyze results across several methods in parallel?

Thank you so much!

Best,
Ferio

@dbdimitrov
Collaborator

dbdimitrov commented Dec 12, 2024

Hi @feriobrahmana,

Yes, you can feed any database to liana. To see the format, you can call liana.resource.select_resource(), which returns a dataframe with the columns that liana expects. You can then pass a DataFrame with the same set of columns ('ligand', 'receptor') to the resource parameter of any method. Likewise, you can provide a list of tuples to the interactions parameter, e.g. [('ligand1', 'receptor1_receptor3'), ('ligand2', 'receptor2'), ...].

Note that '_' designates the separation of complex subunits.

Any method in liana will work with these parameters, including rank-aggregation.
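
As a rough illustration of the format described above, the following pure-Python sketch builds the list of tuples from per-interaction subunit lists, joining complex subunits with '_'. The helper names `to_interactions` and `subunits` are hypothetical and not part of liana; the gene names are the placeholders used in this thread.

```python
def to_interactions(pairs):
    """Build (ligand, receptor) tuples, joining complex subunits with '_'."""
    interactions = []
    for ligand_units, receptor_units in pairs:
        interactions.append(('_'.join(ligand_units), '_'.join(receptor_units)))
    return interactions


def subunits(interactions):
    """Return the sorted individual names behind all interactions."""
    genes = set()
    for ligand, receptor in interactions:
        genes.update(ligand.split('_'))
        genes.update(receptor.split('_'))
    return sorted(genes)


# Each entry: (ligand subunits, receptor subunits); placeholder names only.
pairs = [
    (['ligand1'], ['receptor1', 'receptor3']),
    (['ligand2'], ['receptor2']),
]
interactions = to_interactions(pairs)
print(interactions)
# [('ligand1', 'receptor1_receptor3'), ('ligand2', 'receptor2')]
print(subunits(interactions))
# ['ligand1', 'ligand2', 'receptor1', 'receptor2', 'receptor3']
```

Wrapping the same list in `pandas.DataFrame(interactions, columns=['ligand', 'receptor'])` should give the two-column dataframe form for the resource parameter. Listing the subunits can help check that every name is actually present in your expression data before running a method.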

Hope this helps.

Daniel

@feriobrahmana

Ah, okay! I see...

Thank you so much for the swift and clear response, @dbdimitrov!

And, yes, I think your answer has helped a lot!

Will start to try and find a great utilization of LIANA+ then, hehe!

Best,
Ferio
