[WIP] Create JSON files for frontend consumption #29

dhimmel · 2016-10-05T16:58:48Z

Work in progress (WIP).

dhimmel · 2016-10-05T17:16:06Z

This pull request creates a mapping from gene (as Entrez GeneIDs) to the list of mutated samples (as TCGA sample IDs). This dictionary/JSON object is called gene_to_mutated_samples. As a JSON text file, it is 20.68 MB uncompressed and 2.97 MB when gzip compressed.

This pull request also creates disease_to_samples, a dictionary mapping each disease acronym to its sample IDs. This file is small (0.17 MB) and thus not a concern.
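For anyone wanting to reproduce this kind of size comparison, here is a minimal sketch using only the standard library. The sample IDs and the `json_sizes` helper are made up for illustration and are not the code in this pull request.

```python
import gzip
import json

# Toy stand-in for the real mapping (Entrez GeneID -> TCGA sample IDs).
# These sample IDs are invented for illustration only.
gene_to_mutated_samples = {
    2641: ["TCGA-02-0001-01", "TCGA-02-0003-01"],
    340024: ["TCGA-02-0003-01", "TCGA-AA-3502-01"],
}


def json_sizes(obj):
    """Return (raw_bytes, gzipped_bytes) for the JSON encoding of obj."""
    raw = json.dumps(obj, sort_keys=True).encode("utf-8")
    return len(raw), len(gzip.compress(raw))


raw_size, gz_size = json_sizes(gene_to_mutated_samples)
print(f"raw: {raw_size} bytes, gzipped: {gz_size} bytes")
```

On a payload this tiny, gzip overhead can outweigh the savings; the ratio only becomes meaningful on the full mapping.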

The goal of disease_to_samples and gene_to_mutated_samples was to allow the frontend to load these entire objects and then perform efficient set operations to get sample/positive/negative counts. For example, the user may have selected diseases = {'GBM', 'COAD', 'LUNG'} and mutations = {2641, 340024}.

The frontend would do the equivalent of this Python in JavaScript:

mutated_samples = set()
for mutation in mutations:
    mutated_samples |= gene_to_mutated_samples[mutation]

selected_samples = set()
for disease in diseases:
    selected_samples |= disease_to_samples[disease]

# counts
n_samples = len(selected_samples)
n_positives = len(selected_samples & mutated_samples)
n_negatives = n_samples - n_positives
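One detail to keep in mind: after a round trip through JSON, object keys are strings and values are lists, so the set operations above need a conversion step first. A self-contained sketch with toy data follows; the `load_sample_sets` helper and the `s1`..`s4` sample IDs are hypothetical.

```python
import json


def load_sample_sets(json_text, int_keys=False):
    """Parse a JSON object of key -> list and return a dict of key -> set."""
    raw = json.loads(json_text)
    return {
        (int(key) if int_keys else key): set(values)
        for key, values in raw.items()
    }


# Toy payloads standing in for the real JSON files.
gene_json = '{"2641": ["s1", "s2"], "340024": ["s2", "s3"]}'
disease_json = '{"GBM": ["s1", "s2"], "COAD": ["s3", "s4"]}'

gene_to_mutated_samples = load_sample_sets(gene_json, int_keys=True)
disease_to_samples = load_sample_sets(disease_json)

diseases = {"GBM", "COAD"}
mutations = {2641, 340024}

# Union of samples across the selected genes and diseases.
mutated_samples = set().union(*(gene_to_mutated_samples[m] for m in mutations))
selected_samples = set().union(*(disease_to_samples[d] for d in diseases))

n_samples = len(selected_samples)
n_positives = len(selected_samples & mutated_samples)
n_negatives = n_samples - n_positives
print(n_samples, n_positives, n_negatives)  # prints: 4 3 1
```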

Alerting @bdolly, @awm33, @cgreene for discussion on how to proceed.

My questions are:

Is 20.68 MB too big to pass to a browser?
Will the payload be compressed in transit?
Will the payload be cached?
Will this consume too much browser memory (RAM)?
Should we switch to an int ID for samples to cut down this size?
Or should we just have the frontend query the backend for these stats?
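To make the integer ID question concrete, here is one possible encoding, sketched under assumptions: each sample ID string is stored once in a shared list and the per-gene lists hold indexes into it. The function names and payload layout are hypothetical, not something this pull request implements.

```python
import json


def encode_with_int_ids(gene_to_samples):
    """Replace each sample ID string with an index into a shared list."""
    sample_ids = sorted({s for samples in gene_to_samples.values() for s in samples})
    index = {sample_id: i for i, sample_id in enumerate(sample_ids)}
    encoded = {
        gene: sorted(index[s] for s in samples)
        for gene, samples in gene_to_samples.items()
    }
    return {"sample_ids": sample_ids, "gene_to_sample_index": encoded}


def decode_int_ids(payload):
    """Invert encode_with_int_ids, returning gene -> set of sample ID strings."""
    ids = payload["sample_ids"]
    return {
        gene: {ids[i] for i in indexes}
        for gene, indexes in payload["gene_to_sample_index"].items()
    }


# Toy data with invented sample IDs.
original = {
    2641: {"TCGA-02-0001-01", "TCGA-02-0003-01"},
    340024: {"TCGA-02-0003-01", "TCGA-AA-3502-01"},
}
payload = encode_with_int_ids(original)
assert decode_int_ids(payload) == original

# Compare encoded sizes; the saving grows as each sample appears under more genes.
as_strings = json.dumps({gene: sorted(s) for gene, s in original.items()})
as_ints = json.dumps(payload)
print(len(as_strings), len(as_ints))
```

Whether this helps after gzip is a separate question, since gzip already compresses repeated ID strings well.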

awm33 · 2016-10-06T01:25:22Z

Is 20.68 MB too big to pass to a browser?

It depends. I assume most people will be using this from a desktop with Wi-Fi or a wired connection, so from a pure byte-transfer standpoint alone, no.

Will the payload be compressed in transit?

We can and should set up gzip compression on the server.

Will the payload be cached?

If the correct headers are set by the server, yes. Other methods could be used as well, beyond HTTP caching, like localStorage.
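As a rough illustration of what "the correct headers" could look like together with gzip, here is a framework-independent sketch. The `build_cached_gzip_response` helper and the one-day `max_age` are assumptions; in practice this is usually handled by web server configuration rather than application code.

```python
import gzip
import hashlib


def build_cached_gzip_response(body: bytes, max_age: int = 86400):
    """Return (headers, compressed_body) for a static JSON payload."""
    compressed = gzip.compress(body)
    # Content-derived ETag so the browser can revalidate cheaply.
    etag = '"' + hashlib.sha256(compressed).hexdigest()[:16] + '"'
    headers = {
        "Content-Type": "application/json",
        "Content-Encoding": "gzip",
        "Content-Length": str(len(compressed)),
        "Cache-Control": f"public, max-age={max_age}",
        "ETag": etag,
        "Vary": "Accept-Encoding",
    }
    return headers, compressed


headers, compressed = build_cached_gzip_response(b'{"2641": ["s1", "s2"]}')
print(headers["Cache-Control"])  # prints: public, max-age=86400
```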

Will this consume too much browser memory (RAM)?

Maybe. I'd be more worried about the access time. JavaScript is single-threaded, so if we were to calculate something like this client-side, I would use a web worker.

Should we switch to an int ID for samples to cut down this size?

What matters is the access performance, and these objects should be hashmaps in JS either way, so I don't think that would buy you much, if anything.

Or should we just have the frontend query the backend for these stats?

I would lean towards this for performance and API reasons. If we are also thinking of others using our API, this would make it easier for them. We're already using the django filter plugin which allows for querying on related model fields. This would be added to the /samples endpoint. We may want to use the field selection plugin to limit how much data is returned, assuming you just need the ids.
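To give a feel for what such a request might look like from a client, here is a sketch that only builds a query URL. The host, the `disease` and `mutations__gene` filter names, and the `fields` parameter are all hypothetical placeholders; the real names depend on how the filter and field selection plugins end up being configured on /samples.

```python
from urllib.parse import urlencode


def build_samples_query(base_url, disease, gene_id, fields=("sample_id",)):
    """Build a /samples URL with hypothetical filter and field-selection params."""
    params = {
        "disease": disease,            # hypothetical filter name
        "mutations__gene": gene_id,    # hypothetical related-field filter
        "fields": ",".join(fields),    # hypothetical field-selection parameter
    }
    return f"{base_url}/samples?{urlencode(params)}"


url = build_samples_query("https://api.example.org", "GBM", 2641)
print(url)
# prints: https://api.example.org/samples?disease=GBM&mutations__gene=2641&fields=sample_id
```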

bdolly · 2016-10-07T14:26:22Z

@awm33 I like the idea of using the field selection plugin for this rather than loading a large JSON file on app load. I think firing off small requests on user keystrokes during search will be efficient, as the plugin will return smaller, faster responses.

awm33 · 2016-10-09T20:11:50Z

@bdolly Cool

I created an issue/task for this cognoma/core-service#33

Create sample lookup JSON objects for frontend

0ac37c7

dhimmel force-pushed the json branch from 94c42aa to 0ac37c7 Compare October 5, 2016 19:00

awm33 mentioned this pull request Oct 9, 2016

Add filter to and field selection to /samples cognoma/core-service#33

Closed
