
Chatbot with large language models

toncho11 edited this page Apr 5, 2023 · 17 revisions

If you want to build a chatbot using the currently very popular large language models (LLMs), there are several important factors to consider.

Limitations of the model

  • Is the model downloadable and executable locally?
    • Sometimes the model is downloadable but not its weights (you need to apply for them), so you may have to use alternative pre-trained weights.
  • Is your machine actually capable of running it?
  • Is it only available as a remote API (free or paid)?
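Whether your machine can run a model largely comes down to memory. A common rule of thumb (a rough sketch, ignoring activations and runtime overhead) is parameter count times bytes per parameter:

```python
def model_memory_gb(num_params_billion, bytes_per_param):
    """Rough memory needed just to hold the weights.

    Ignores activations, KV cache, and framework overhead, so treat
    the result as a lower bound.
    """
    return num_params_billion * 1e9 * bytes_per_param / (1024 ** 3)

# A 7B model in float16 (2 bytes per parameter) needs roughly 13 GB
# just for the weights, which is why 16 GB figures come up below.
print(round(model_memory_gb(7, 2), 1))
```

Quantized formats (8-bit or 4-bit) shrink this proportionally, which is how some of the models below fit on consumer hardware.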

Type of the model

  • generative
  • generative instruct model (instruction-tuned)
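The practical difference shows up in how you phrase the prompt. A plain generative model continues the text it is given, while an instruct model expects the task stated as an instruction. As a minimal sketch, the Alpaca-style template below is one common convention (an assumption; not every instruct model uses this exact format):

```python
# Plain generative models simply continue the given text:
plain_prompt = "The three most common uses of chatbots are"

def make_instruct_prompt(instruction):
    """Build an Alpaca-style instruct prompt (one common template)."""
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )

print(make_instruct_prompt("List the three most common uses of chatbots."))
```

Feeding an instruction-style prompt to a plain generative model (or vice versa) usually degrades the output, so match the prompt format to the model type.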

Which model?

The following models have been selected because they are downloadable, (reasonably) executable on a local computer, and require no API, while offering good generative quality. They were also selected because they can potentially be used to build a more general chatbot, one that is not strongly domain- or task-oriented. One should also consider the number of parameters and experiment with every model of interest.

  • GODEL
  • LLAMA - chatbot example
  • LLAMA with Alpaca-LoRA
    • it is a 7B model that can run with 16 GB of RAM
      • on GPU it requires 16 GB of VRAM; a hybrid/offload mode can potentially use less, but loading the model in this mode is difficult with limited resources
      • inference on CPU is very slow; there are known problems when running on Windows and with a 4 GB GPU
    • chatbot example
    • example console code
    • example notebook that can be used on Google Colab with a GPU
  • GPT4ALL
  • GPT-J - chatbot example
    • requires API keys
  • FastChat - an open platform for training, serving, and evaluating large language model based chatbots
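Whichever model you pick, the surrounding chatbot logic looks much the same. A minimal sketch of a chat loop is shown below; the placeholder generate() function (an assumption for illustration) stands in for a real call to any of the models above, e.g. via Hugging Face transformers:

```python
def generate(prompt):
    # Placeholder standing in for a real model call
    # (e.g. a transformers text-generation pipeline).
    return "This is a placeholder reply."

def chat_turn(history, user_message, max_turns=5):
    """Append the user message, build a prompt from recent turns, get a reply.

    history is a list of (speaker, text) tuples, mutated in place.
    """
    history.append(("User", user_message))
    # Keep only the last few turns so the prompt stays within
    # the model's context window.
    recent = history[-max_turns:]
    prompt = "\n".join(f"{speaker}: {text}" for speaker, text in recent) + "\nBot:"
    reply = generate(prompt)
    history.append(("Bot", reply))
    return reply

history = []
print(chat_turn(history, "Hello!"))
```

Truncating the history is the simplest strategy; real chatbots often summarize older turns instead, since the context window limit applies to the whole prompt.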

Full list of LLMs