Ideas to discuss #38

Open
teresa-m opened this issue Feb 17, 2022 · 2 comments
Labels
help wanted Extra attention is needed question Further information is requested

Comments

@teresa-m
Member

teresa-m commented Feb 17, 2022

Important:

  • Which model should be our core model?
  • Which figures/tables should go in the main text and which in the supplement?
  • Should we store the models/datasets somewhere, e.g. on Zenodo?
  • Should we give a possible use case for Cherri, i.e. where it can help to solve a problem or close a gap?

For later:

  • parallelization of IntaRNA calls in pos/neg data
  • is there a standard file format for RRIs we can use for Eval mode?
  • a new method from the Hutter group for deep learning on tabular data could be used to build a model
  • build in cross-validation for the model to also report an F1 score
  • check memory consumption of feature selection + optimization and see whether the parallelization is an issue here
  • homologs: Find homologes in human and mouse training data #32
  • move away from genome-only sequences with context, e.g. allow an mRNA to be given directly
  • update IntaRNA to the genomic version of IntaRNA
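
For the parallelization item above, one possible approach is a thread pool that launches the external calls concurrently. This is only a minimal sketch: `run_command` and `run_parallel` are hypothetical helper names, not part of Cherri, and the actual IntaRNA command lines are deliberately not shown. The caller is assumed to build each call as an argument list.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_command(cmd):
    """Run one external command and return (return code, stdout)."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode, result.stdout


def run_parallel(commands, max_workers=4):
    """Run independent commands concurrently, keeping the input order.

    Threads are sufficient here because each worker only waits for an
    external process; the heavy computation happens outside Python.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_command, commands))


if __name__ == "__main__":
    # Placeholder commands: in practice each entry would be one IntaRNA
    # call for a positive or negative instance (arguments not shown here).
    demo = [[sys.executable, "-c", f"print({i} * {i})"] for i in range(4)]
    for code, out in run_parallel(demo, max_workers=2):
        print(code, out.strip())
```

Capping `max_workers` also bounds how many external processes run at once, which may help when checking whether parallel execution drives the memory consumption.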
@teresa-m
Member Author

Possible applications

  • postprocessing of sRNA, miRNA, gRNA
  • off-target predictions
  • way to pre-filter RRIs for experiments

@teresa-m
Member Author

teresa-m commented Oct 24, 2022

Tasks:

  • get conda package running
  • Galaxy wrapper?
  • update documentation + paper on how to change IntaRNA parameters
  • Make the training data merge easier to use
  • Make more generic input data possible (no need for Chira, but it needs a header)
  • Get functional testing/GitHub Actions running
  • Test the train and eval test calls (maybe by someone without prior knowledge)
  • Affiliations of @martin and Egg
  • Maybe: train models for single classes, e.g. miRNA-mRNA, snoRNA-rRNA
  • Maybe: Get LIGR-seq, SPLASH running. Or we take a CLIP-like method as an additional validation set.
  • Maybe: Report feature importance
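
For the generic-input item above, a header check could be a first step towards accepting tables that do not come from Chira. The sketch below is an assumption-laden illustration: `missing_columns` is a hypothetical helper, and the column names in `REQUIRED_COLUMNS` are made up, since the columns Cherri actually needs are not listed in this issue.

```python
import csv
import io

# Hypothetical column names, chosen only for illustration; the columns
# Cherri actually requires are not specified in this issue.
REQUIRED_COLUMNS = {"seq1_id", "seq2_id"}


def missing_columns(handle, required=REQUIRED_COLUMNS, delimiter="\t"):
    """Return the required columns absent from the file's header line."""
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader, [])  # an empty file yields an empty header
    return sorted(set(required) - set(header))


if __name__ == "__main__":
    table = io.StringIO("seq1_id\tscore\nA\t0.9\n")
    print(missing_columns(table))
```

Reporting the missing names, rather than just failing, would tell users exactly which header fields to add to their own tables.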
