Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create faq.md #181

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
104 changes: 104 additions & 0 deletions faq.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,104 @@
## What problem does the OpenChatKit solve?

Based on the README, OpenChatKit solves several key problems in the open-source AI space by providing:

1. A flexible foundation for building both specialized and general-purpose language models. It allows developers and researchers to create custom AI applications by providing pre-trained base models that can be adapted for specific uses.

2. A comprehensive toolkit that includes:
- Instruction-tuned language models (including a 20B parameter chat model and a 7B parameter model)
- A moderation model for safety
- An extensible retrieval system that can incorporate up-to-date information from custom repositories

3. Access to open-source models trained on the OIG-43M dataset, which was created through collaboration between Together, LAION, and Ontocord.ai. This provides an alternative to closed, proprietary models.

4. Tools for both training and inference, including:
- Code for training and fine-tuning various models (GPT-NeoXT-Chat-Base-20B, Llama-2-7B-32K-beta, Pythia-Chat-Base-7B)
- Testing infrastructure for inference
- Capability to augment models with additional context through a retrieval index

The project essentially addresses the need for accessible, customizable, and comprehensive open-source language models that can be adapted for various applications while providing the necessary tools and infrastructure for both development and deployment.

## What are the advantages of using the OpenChatKit?

Based on the documentation, here are the key advantages of using OpenChatKit:

1. Open Source Accessibility
- Released under Apache 2.0 license, making it freely available for use and modification
- Complete transparency in the codebase and training process
- Ability to customize and extend the models for specific needs

2. Comprehensive Toolset
- Provides multiple pre-trained models of different sizes (7B and 20B parameters)
- Includes both training and inference code
- Built-in moderation model for safety considerations
- Extensible retrieval system for incorporating up-to-date information

3. Flexibility and Scalability
- Can be used for both specialized and general-purpose applications
- Supports multiple model architectures (GPT-NeoXT, Llama-2, Pythia)
- Allows fine-tuning on custom datasets
- Can run on various hardware configurations, including consumer hardware

4. Development Features
- Built-in monitoring tools (Loguru and Weights & Biases integration)
- Command-line test harness for easy model interaction
- Support for multiple GPUs
- Conversion tools for different model formats

5. Community and Support
- Backed by collaboration between multiple organizations (Together, LAION, Ontocord.ai)
- Built on established models from Eleuther AI
- Open for community contributions through the OpenDataHub
- Evaluated on HELM by Stanford's Center for Research on Foundation Models

6. Technical Capabilities
- Long context support (32K tokens with Llama-2-7B-32K-beta)
- Experimental retrieval-augmented model support for Wikipedia integration
- 8-bit Adam optimization support for efficient training
- Checkpoint saving and model conversion utilities

These advantages make OpenChatKit particularly useful for researchers, developers, and organizations looking to build and customize their own AI applications with open-source tools and models.

## Is OpenChatKit the Right Choice?

What questions do I need to ask myself to check whether OpenChatKit is the right choice for my use case?

Based on the documentation, here are the key questions you should ask yourself to evaluate if OpenChatKit is suitable for your use case:

1. Resource Requirements
- Do I have the necessary computational resources (GPUs, memory) to run these models?
- Can my infrastructure handle models ranging from 7B to 20B parameters?
- Do I need to run this on consumer hardware or do I have access to more powerful computing resources?

2. Technical Compatibility
- Am I comfortable working with PyTorch and Python?
- Do I have the technical capability to handle model fine-tuning and deployment?
- Can I meet the basic requirements (Miniconda, Git LFS, etc.)?

3. Model Characteristics
- Do I need a general-purpose chat model or a specialized one?
- Is the context length (up to 32K tokens with Llama-2) sufficient for my use case?
- Will I need the experimental retrieval-augmented capabilities?
- Is the model's performance (based on HELM evaluations) adequate for my needs?

4. Development Requirements
- Do I need to fine-tune the model on custom data?
- Will I need to modify or extend the base models?
- How important is monitoring and logging (through Loguru or Weights & Biases) for my use case?

5. Legal and Licensing
- Is the Apache 2.0 license compatible with my intended use?
- Am I comfortable with using open-source models?
- Do I need proprietary or commercial support?

6. Integration Needs
- How will this integrate with my existing systems?
- Do I need the moderation model component?
- Will I need to use the retrieval system for up-to-date information?

7. Support and Maintenance
- Am I prepared to handle updates and maintenance myself?
- Do I have the expertise to troubleshoot issues?
- Is community support sufficient for my needs?

Considering these questions will help determine if OpenChatKit aligns with your technical requirements, resources, and project goals.