togethercomputer · bparanj · Dec 26, 2024
diff --git a/faq.md b/faq.md
@@ -0,0 +1,104 @@
+## What problem does the OpenChatKit solve?
+
+Based on the README, OpenChatKit solves several key problems in the open-source AI space by providing:
+
+1. A flexible foundation for building both specialized and general-purpose language models. It allows developers and researchers to create custom AI applications by providing pre-trained base models that can be adapted for specific uses.
+
+2. A comprehensive toolkit that includes:
+- Instruction-tuned language models (including a 20B parameter chat model and a 7B parameter model)
+- A moderation model for safety
+- An extensible retrieval system that can incorporate up-to-date information from custom repositories
+
+3. Access to open-source models trained on the OIG-43M dataset, which was created through collaboration between Together, LAION, and Ontocord.ai. This provides an alternative to closed, proprietary models.
+
+4. Tools for both training and inference, including:
+- Code for training and fine-tuning various models (GPT-NeoXT-Chat-Base-20B, Llama-2-7B-32K-beta, Pythia-Chat-Base-7B)
+- Testing infrastructure for inference
+- Capability to augment models with additional context through a retrieval index
+
+The project essentially addresses the need for accessible, customizable, and comprehensive open-source language models that can be adapted for various applications while providing the necessary tools and infrastructure for both development and deployment.
+
+## What are the advantages of using the OpenChatKit?
+
+Based on the documentation, here are the key advantages of using OpenChatKit:
+
+1. Open Source Accessibility
+- Released under Apache 2.0 license, making it freely available for use and modification
+- Complete transparency in the codebase and training process
+- Ability to customize and extend the models for specific needs
+
+2. Comprehensive Toolset
+- Provides multiple pre-trained models of different sizes (7B and 20B parameters)
+- Includes both training and inference code
+- Built-in moderation model for safety considerations
+- Extensible retrieval system for incorporating up-to-date information
+
+3. Flexibility and Scalability
+- Can be used for both specialized and general-purpose applications
+- Supports multiple model architectures (GPT-NeoXT, Llama-2, Pythia)
+- Allows fine-tuning on custom datasets
+- Can run on various hardware configurations, including consumer hardware
+
+4. Development Features
+- Built-in monitoring tools (Loguru and Weights & Biases integration)
+- Command-line test harness for easy model interaction
+- Support for multiple GPUs
+- Conversion tools for different model formats
+
+5. Community and Support
+- Backed by collaboration between multiple organizations (Together, LAION, Ontocord.ai)
+- Built on established models from Eleuther AI
+- Open for community contributions through the OpenDataHub
+- Evaluated on HELM by Stanford's Center for Research on Foundation Models
+
+6. Technical Capabilities
+- Long context support (32K tokens with Llama-2-7B-32K-beta)
+- Experimental retrieval-augmented model support for Wikipedia integration
+- 8-bit Adam optimization support for efficient training
+- Checkpoint saving and model conversion utilities
+
+These advantages make OpenChatKit particularly useful for researchers, developers, and organizations looking to build and customize their own AI applications with open-source tools and models.
+
+## Is OpenChatKit the Right Choice?
+
+What questions do I need to ask myself to check whether OpenChatKit is the right choice for my use case?
+
+Based on the documentation, here are the key questions you should ask yourself to evaluate if OpenChatKit is suitable for your use case:
+
+1. Resource Requirements
+- Do I have the necessary computational resources (GPUs, memory) to run these models?
+- Can my infrastructure handle models ranging from 7B to 20B parameters?
+- Do I need to run this on consumer hardware or do I have access to more powerful computing resources?
+
+2. Technical Compatibility
+- Am I comfortable working with PyTorch and Python?
+- Do I have the technical capability to handle model fine-tuning and deployment?
+- Can I meet the basic requirements (Miniconda, Git LFS, etc.)?
+
+3. Model Characteristics
+- Do I need a general-purpose chat model or a specialized one?
+- Is the context length (up to 32K tokens with Llama-2) sufficient for my use case?
+- Will I need the experimental retrieval-augmented capabilities?
+- Is the model's performance (based on HELM evaluations) adequate for my needs?
+
+4. Development Requirements
+- Do I need to fine-tune the model on custom data?
+- Will I need to modify or extend the base models?
+- How important is monitoring and logging (through Loguru or Weights & Biases) for my use case?
+
+5. Legal and Licensing
+- Is the Apache 2.0 license compatible with my intended use?
+- Am I comfortable with using open-source models?
+- Do I need proprietary or commercial support?
+
+6. Integration Needs
+- How will this integrate with my existing systems?
+- Do I need the moderation model component?
+- Will I need to use the retrieval system for up-to-date information?
+
+7. Support and Maintenance
+- Am I prepared to handle updates and maintenance myself?
+- Do I have the expertise to troubleshoot issues?
+- Is community support sufficient for my needs?
+
+Considering these questions will help determine if OpenChatKit aligns with your technical requirements, resources, and project goals.