Background
Data
Models
Timeline
Repo Structure
Logistics
Resources
Contact
Provide a clearly defined goal for the project and what this repository contains. Outline what the final deliverable is and how it can be accessed.
Provide detailed background information about the project. Readers should come away with a clear understanding of the goals and the methods used. This section should also tell the reader how to navigate the repository.
Mention what data is being used for the project and where it is stored. Mention any data privacy concerns.
Mention what models are being used or developed as part of the project. Provide information on how to use these models (for example, links to HuggingFace).
Outline an estimated timeline for the project.
Provide details on how readers can use the repository to replicate your results and run any final deliverables, such as apps or inference models that you have trained or fine-tuned. Ensure that the repository is well structured and that folders are named appropriately. Follow best practices, including the notebook naming convention: 00s for templates, 10s for data loading and prep, 20s for exploratory analysis, 30s for feature engineering, 40s for model development/training, and 50s for evaluation and inference.
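As an illustration of the notebook naming convention, the following minimal sketch maps a notebook file name to its stage. The file name pattern (a two-digit prefix followed by a hyphen or underscore, as in `10-load-data.ipynb`) and the helper name `notebook_stage` are assumptions made for this example; adapt them to whatever pattern your repository actually uses.

```python
import re

# Stage ranges taken from the notebook naming convention described above.
STAGES = {
    0: "templates",
    1: "data loading and prep",
    2: "exploratory analysis",
    3: "feature engineering",
    4: "model development/training",
    5: "evaluation and inference",
}


def notebook_stage(filename):
    """Return the stage for a notebook file name, or None if it does not fit.

    Assumes (hypothetically) that names start with a two-digit number
    followed by '-' or '_', e.g. '42_train-model.ipynb'.
    """
    match = re.match(r"^(\d{2})[-_]", filename)
    if match is None:
        return None
    # The tens digit selects the stage: 10-19 -> 1, 20-29 -> 2, and so on.
    return STAGES.get(int(match.group(1)) // 10)
```

A helper like this could be used in a quick check that every notebook in the repository carries a prefix from one of the defined ranges.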
Sprint Planning:
Backlog Grooming:
Coding Meeting:
Sprint Retrospective:
Demo:
Data location:
Slack channel:
Provide any useful resources to get readers up to speed with the project here.
- Python usage: Whirlwind Tour of Python, Jake VanderPlas (Book, Notebooks)
- Data science packages in Python: Python Data Science Handbook, Jake VanderPlas
- HuggingFace: Website, Course/Training, Inference using pipelines, Fine tuning models
- fast.ai: Course, Quick start
- h2o: Resources, documentation, and API links
- nbdev: Overview, Tutorial
- Git tutorials: Simple Guide, Learn Git Branching
- ACCRE how-to guides: DSI How-tos
Provide contact information of Project Lead, Principal Investigators, and Team Members.