Create a .env file in the root of the repo and set the following environment variables:
OPENAI_API_KEY=<your openai api key>
GROQ_API_KEY=<your groq api key> # required if you are running the prompt evals, as the evals use a Mixtral model from the Groq API to evaluate the prompts against OpenAI's gpt-3.5-turbo model
The app reads the API keys from this .env file when calling the OpenAI and Groq APIs; you do not need to export them manually.
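For reference, this is roughly how the keys get picked up at startup (a minimal sketch assuming the python-dotenv package; the app's exact loading code may differ):

import os
from dotenv import load_dotenv

load_dotenv()  # reads the .env file in the repo root into the process environment
openai_api_key = os.environ["OPENAI_API_KEY"]
groq_api_key = os.environ.get("GROQ_API_KEY")  # only needed for the prompt evals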
The easiest way to interact with the chat assistant is via the Streamlit app. To run the app without worrying about dependencies, you can use Docker.
To build and run the app in a Docker container, we use Docker Compose.
From the root of the repo run the following command:
docker compose up --build
The command above will build the container and run the Streamlit app. The app will be available at http://localhost:8501 in your browser.
Check the terminal for logs and any errors.
NOTE: It will take time to build the container and for the Streamlit app to start. Because the embedding model is large, loading the model, creating the index, and starting the app all take a while. (To save time and resources, the vector database is not hosted in the cloud; it is created on the fly when the app starts. This is not best practice and should be avoided in production.)
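As a rough illustration of why the first start is slow, the heavy objects are typically built once at app start and cached for later reruns; a minimal sketch (the embedding model name, data folder, and use of llama-index here are assumptions, not the app's exact code):

import streamlit as st
from llama_index.core import Settings, VectorStoreIndex, SimpleDirectoryReader
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

@st.cache_resource  # build the index once per container, not on every Streamlit rerun
def load_index():
    # hypothetical embedding model and data folder
    Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-large-en-v1.5")
    documents = SimpleDirectoryReader("data").load_data()
    return VectorStoreIndex.from_documents(documents)

index = load_index()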
First, install the dependencies:
$ poetry install
$ poetry shell #activate the virtual environment
$ python chat_assistant.py
$ streamlit run app.py
Please refer to the escrow_data_retriever.ipynb notebook for the data preparation steps. The notebook explains how the data was retrieved from the Escrow 1024.17 website and how it was preprocessed to create the dataset for RAG indexing.
Final RAG evaluation results
I ran experiments on direct RAG along with advanced retrieval methods such as Sentence Window and Auto-Merging Retrieval. The experiments also varied parameters to find the best configuration for retrieving Escrow 1024.17 documents. The evaluations were run on these queries. The results are as follows:
The best configuration (good balance of answer and context relevance, and groundedness) was found to be:
- Sentence retrieval window: 1
- Chunk size: 128
- Effective retrieved context length (node): 384 characters
Note: although not the cheapest configuration, it was the most effective in terms of groundedness, answer relevance, and context relevance.
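As a rough illustration of that configuration, here is a minimal sentence-window retrieval setup (a sketch assuming llama-index, which provides the Sentence Window and Auto-Merging retrievers; the data folder and top-k value are hypothetical, and the chunk-size setting used in the experiments is not shown):

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core.node_parser import SentenceWindowNodeParser
from llama_index.core.postprocessor import MetadataReplacementPostProcessor

# split documents into sentences, attaching a 1-sentence window of surrounding context
node_parser = SentenceWindowNodeParser.from_defaults(
    window_size=1,  # sentence retrieval window of 1 (the best configuration above)
    window_metadata_key="window",
    original_text_metadata_key="original_text",
)

documents = SimpleDirectoryReader("data").load_data()  # hypothetical data folder
nodes = node_parser.get_nodes_from_documents(documents)
index = VectorStoreIndex(nodes)

# at query time, replace each retrieved sentence with its surrounding window
query_engine = index.as_query_engine(
    similarity_top_k=2,
    node_postprocessors=[MetadataReplacementPostProcessor(target_metadata_key="window")],
)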
The prompt template is structured as follows (drawing on the papers listed below):
# Role
(Role-play prompting is an effective strategy where we assign the model a specific role to play during the interaction. This helps the model "immerse" itself in the role and provide more accurate and relevant answers) ref 1
# Task
(A direct description of what we want the model to do. One technique that works well is chain-of-thought prompting ref 2 to guide the model through the task)
# Specifics
(The most important notes regarding the task. Integrating emotional stimuli ref 3 has been shown to increase response quality and accuracy)
# Context
(The environment in which the task is to be performed. Fairness-guided Few-shot Prompting ref 4 has shown that providing context helps the model understand the task better)
# Examples
(Giving a few Q/A pairs of example questions and answers helps the model understand the task better, and is a good practice to follow. Rethinking the Role of Demonstrations ref 6 explains this in detail)
# Notes
(Additional and repeated notes that can help the model do the task better. The Lost in the Middle paper ref 7 shows that LLMs remember the start and end of the context better than the middle, so it is important to briefly repeat the task and the context in the notes section. Though newer models are better at finding a needle in a haystack, this is still a good practice to follow. A sketch of how these sections come together follows the reference list below.)
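For illustration, the sections above might be assembled into a single template string along these lines (a sketch only; the section names come from this README, but the wording and the {context}/{question} placeholders are hypothetical):

PROMPT_TEMPLATE = """
# Role
You are an expert assistant for questions about the Escrow 1024.17 regulation.

# Task
Answer the user's question step by step, using only the provided context.

# Specifics
Getting this right really matters to the user, so be precise and point to the context you relied on.

# Context
{context}

# Examples
Q: <example question>
A: <example answer>

# Notes
Remember: answer only from the context above, and say so if the context is insufficient.

Question: {question}
"""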
- Better Zero-Shot Reasoning with Role-Play Prompting
- Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
- Large Language Models Understand and Can be Enhanced by Emotional Stimuli
- Fairness-guided Few-shot Prompting for Large Language Models
- Language Models are Few-Shot Learners
- Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?
- Lost in the Middle: How Language Models Use Long Contexts
We are using promptfoo to evaluate our prompts. To run the evaluation, first install promptfoo:
$ bun add promptfoo # or npm install -g promptfoo
Then run the eval from the root folder of this repo as follows:
$ cd prompt_eval_cloud
$ promptfoo eval
Make sure GROQ_API_KEY and OPENAI_API_KEY are set in the .env file, as this eval uses models from the OpenAI API (gpt-3.5-turbo) and the Groq API (Mixtral).
Note: To save time and resources, the evaluation is not thorough and only a few prompts are evaluated.
To get a detailed view of the evaluation, run the following command:
$ promptfoo view -y
A new tab with the following view will open in your browser:
Refer to the following notebook to see how the dataset for the model was generated.
Refer to the following notebook to see how the model was fine-tuned on the Escrow 1024.17 documents.
(Note: this is a Colab notebook, which made it easy to run the experiments on Google Cloud with powerful GPUs.)
The fine-tuned Gemma model is available on Hugging Face.
Download the model and place it in the fine_tuned_model folder in the repo, then from the root of the repo run the following command to create the Ollama model:
$ ollama create escrow_gemma -f ./ModelfileGemma
To interact with the model, run the following command:
$ ollama run escrow_gemma:latest
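You can also query the created model programmatically; a minimal sketch, assuming the ollama Python client package is installed and the escrow_gemma model was created with the command above (the question text is just an example):

import ollama

response = ollama.chat(
    model="escrow_gemma:latest",
    messages=[{"role": "user", "content": "What does Escrow 1024.17 cover?"}],
)
print(response["message"]["content"])  # the assistant's reply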
NOTE: The fine-tuning dataset consisted only of positive Q/A pairs, with no "no relevant context" Q/A examples; to get better performance we need to include negative Q/A pairs as well, along with some chat data. This will help the model understand the context better and provide more accurate responses, as intended for this application.
Please check out the link to see the evaluation of the fine-tuned model. The model was evaluated on these tests and compared to the open-source models 'llama3-8b' and 'gemma-8b'.
In the evaluation, the fine-tuned model is named 'escrow_gemma:latest'.
import json
import sagemaker
import boto3
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

try:
    role = sagemaker.get_execution_role()
except ValueError:
    iam = boto3.client('iam')
    role = iam.get_role(RoleName='sagemaker_execution_role')['Role']['Arn']

# Hub Model configuration. https://huggingface.co/models
hub = {
    'HF_MODEL_ID': 'pyrotank41/gemma-7b-it-escrow-merged-gguf',
    'SM_NUM_GPUS': json.dumps(1)
}

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    image_uri=get_huggingface_llm_image_uri("huggingface", version="1.4.2"),
    env=hub,
    role=role,
)

# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",
    container_startup_health_check_timeout=300,
)

# send request
predictor.predict({
    "inputs": "What is the escrow 1024.17 document?",
})
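When you are done testing, you will likely want to tear the endpoint down so it does not keep accruing charges (standard SageMaker Predictor cleanup calls, not shown in the snippet above):

# clean up: delete the deployed endpoint and the model
predictor.delete_model()
predictor.delete_endpoint()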