
Zack's Braindump


Mongo

Some MongoDB basics for checking Genome Nexus data:
Login: mongo -u genome_nexus -p --authenticationDatabase admin annotator
Query: db.vep.annotation.find().limit(4);
Query specific: db.vep.annotation.find({"_id":"3:g.178916920_178916922delGAA"})
Delete: db.vep.annotation.deleteOne({"_id":"3:g.178916920_178916922delGAA"});
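If you would rather poke at this from Python, here is a rough pymongo sketch of the same operations. The host/port and password handling are assumptions; fill in wherever the annotator database actually lives.

```python
# Sketch of the mongo shell commands above using pymongo.
# Host, port, and password are assumptions -- adjust to the real server.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017",
                     username="genome_nexus",
                     password="<PASSWORD>",
                     authSource="admin")
db = client["annotator"]
coll = db["vep.annotation"]

# First few cached annotations
for doc in coll.find().limit(4):
    print(doc["_id"])

# Look up / delete a specific variant by _id
variant = "3:g.178916920_178916922delGAA"
print(coll.find_one({"_id": variant}))
coll.delete_one({"_id": variant})
```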

Requeue

To requeue samples to CVR manually, put the sample IDs in a file, one ID per line. Use the script located in the pipelines repo: pipelines/importer/src/main/scripts/requeue.py

python requeue.py --properties-file [portal properties file] --samples-file [list of samples file]
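For reference, the samples file is just a plain-text list, one sample ID per line. The IDs below are made up, purely to show the format:

P-0000001-T01-IM5
P-0000002-T02-IM6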

Check data from CVR for a specific sample

If you need to find the JSON for a sample sent by CVR, you can either requeue it and pull it from the webservice with wget:

wget "http://draco.mskcc.org:9770/create_session/<USER>/<PASS>/0"
Get the session id from the response and use it here:
wget "http://draco.mskcc.org:9770/cbio_retrieve_variants/<SESSION>/0"

OR

You can use this Python script, which reverts the cvr_data.json file revision by revision until it finds the sample. This lets you see what we last received from CVR. It can easily be modified to report every revision in which the sample appears.

python cmo-pipelines/import-scripts/find_sample.py --directory <path to top-level msk-impact repo> --sample <SAMPLE_ID>

When using this, make sure the path points to the top-level repo, not the leaf directory (cbio-portal-data/msk-impact, not cbio-portal-data/msk-impact/msk-impact).
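This is not the real find_sample.py, just a sketch of the underlying idea: walk back through the history of cvr_data.json and report every revision that mentions the sample. It assumes the data repo is git and that the file lives at msk-impact/cvr_data.json; if the repo is mercurial, the same idea applies with hg log / hg cat.

```python
# Sketch only: scan the git history of cvr_data.json for a sample id.
# Paths and repo layout are assumptions, not the actual script's behavior.
import subprocess
import sys

repo = sys.argv[1]      # top-level repo, e.g. cbio-portal-data/msk-impact
sample = sys.argv[2]    # e.g. P-0000001-T01-IM5
rel_path = "msk-impact/cvr_data.json"   # assumed location inside the repo

revs = subprocess.check_output(
    ["git", "-C", repo, "log", "--format=%H", "--", rel_path],
    text=True).split()

for rev in revs:
    blob = subprocess.check_output(
        ["git", "-C", repo, "show", f"{rev}:{rel_path}"], text=True)
    if sample in blob:
        print(f"{sample} present in revision {rev}")
```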

Darwin DB2 direct query

To use XQuartz for DB2 Darwin (ssh with X forwarding):
ssh -l heinsz -Y dashi-dev.cbio.mskcc.org
~/local/dsdriver/clpplus/bin/clpplus DVCBPAPS/[email protected]:51013/DVPDB01

Basic queries:
set current schema DVCBIO;
select count(*) from DVCBIO.DEMOGRAPHICS_V;
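If you want to hit Darwin DB2 from Python instead of clpplus, something along these lines should work with the ibm_db driver (pip install ibm_db). The database name, port, and user are taken from the clpplus command above; the host and password are placeholders you need to fill in.

```python
# Sketch: same count query against Darwin DB2 via the ibm_db driver.
import ibm_db

conn = ibm_db.connect(
    "DATABASE=DVPDB01;"
    "HOSTNAME=<darwin-db2-host>;"   # same host as in the clpplus command
    "PORT=51013;"
    "PROTOCOL=TCPIP;"
    "UID=DVCBPAPS;"
    "PWD=<PASSWORD>;", "", "")

stmt = ibm_db.exec_immediate(conn, "select count(*) from DVCBIO.DEMOGRAPHICS_V")
row = ibm_db.fetch_tuple(stmt)
print(row[0])
```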

Synapse Upload

To upload data to Synapse for GENIE, there is a script located at cbio-portal-data/genie/msk_submission_to_genie/upload-all.sh. You just pass your Synapse username and password as command-line arguments. You must run the script from the same directory as the data.
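The invocation presumably looks something like this (the argument order is an assumption, check the script itself):

./upload-all.sh <SYNAPSE_USER> <SYNAPSE_PASS>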

Synapse Download

At the moment, we do not automatically download data from Synapse for GENIE. However, it is very simple to do if you have the Synapse ID of the project directory you need to download.

On the website, you can navigate to the project data you need and click the "How to Download" button near the top of the page, which brings up a box with a single command line operation that will download the data. Pretty much you just do something like this on the command line:

synapse get -r syn5521835

Automating this process would be fairly straightforward as the synapse id for the directories where the data lives should not change.

Dashi-dev already has the Synapse client installed, and it is easy to install locally if you need to: pip install synapseclient should work. Synapse documents the client fairly well.
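If you ever do automate the download, the Python client can do the same recursive fetch as "synapse get -r". A minimal sketch, assuming the target directory name and password-based login are acceptable:

```python
# Sketch: recursively download a Synapse project directory.
import synapseclient
import synapseutils

syn = synapseclient.Synapse()
syn.login("<SYNAPSE_USER>", "<SYNAPSE_PASS>")

# syncFromSynapse walks the folder tree under the given Synapse ID and
# downloads every file into the local path.
files = synapseutils.syncFromSynapse(syn, "syn5521835", path="genie_download")
for f in files:
    print(f.path)
```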

CouchDB Flask app

If this is ever revived, I have code here, with some documentation on the wiki: https://github.com/zheins/portalFlask