Checked other resources
I added a very descriptive title to this question.
I searched the LangChain documentation with the integrated search.
I used the GitHub search to find a similar question and didn't find it.
Commit to Help
I commit to help with one of those options 👆
Example Code
import os
import asyncio

import nest_asyncio
from langchain import hub
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain.schema import SystemMessage, HumanMessage, AIMessage
from langchain.tools import StructuredTool
from langchain_openai import AzureChatOpenAI

# Step 1: Define a dummy RAG tool and a dummy quick-assessment tool
async def dummy_rag_tool(query: str) -> str:
    return f"Dummy RAG response for query: {query}"

async def dummy_quick_assessment_tool(query: str) -> str:
    return f"Quick assessment generated for query: {query}"

rag_response_tool = StructuredTool.from_function(
    coroutine=dummy_rag_tool,
    name="RAGTool",
    description="A dummy RAG tool for generating responses.",
)
quick_assessment_tool = StructuredTool.from_function(
    coroutine=dummy_quick_assessment_tool,
    name="QuickAssessmentTool",
    description="A dummy tool for generating quick assessments.",
)

# Step 2: Initialize the LLM
llm = AzureChatOpenAI(
    openai_api_version=os.getenv("AZURE_OPENAI_API_VERSION"),
    azure_deployment=os.getenv("AZURE_OPENAI_CHAT_DEPLOYMENT_NAME_GPT4o"),
    max_tokens=1024,
)

prompt = hub.pull("hwchase17/openai-tools-agent")

# Step 3: Define the agent prompt
agent_prompt = "You are a multimodal assistant capable of processing text and images."

# Step 4: Define the multimodal messages
url = "https://www.gstatic.com/webp/gallery3/1.sm.png"
messages = [
    SystemMessage(content=agent_prompt),
    HumanMessage(
        content=[
            {"type": "text", "text": "what is this image about."},
            # Add the image part only if `url` is provided. Note: a plain
            # http(s) URL is passed directly; the "data:image/png;base64,"
            # prefix is only for base64-encoded image bytes, not for URLs.
            *([{"type": "image_url", "image_url": {"url": url}}] if url else []),
        ]
    ),
]

# Step 5: Create a dummy chat history
chat_history = [
    HumanMessage(content="create an assessment on machine learning."),
    AIMessage(content="your assessment is generated and saved."),
]

# Step 6: Create the agent and executor
tools = [rag_response_tool, quick_assessment_tool]
agent = create_tool_calling_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

# Step 7: Execute the agent
async def main():
    response = await agent_executor.ainvoke({"input": messages, "chat_history": chat_history})
    print("Agent Response:", response)

nest_asyncio.apply()
asyncio.run(main())
Description
Aim: I am building a LangChain-based, multimodal, multi-turn, chat-history-aware agent with multiple tools.
Issue: after the initial turn, tool calling is biased by the earlier chat history, and the agent keeps selecting the tool it used in the first turn. This worked fine when the model was not multimodal; the problem started when I integrated multimodal chat.
I would also appreciate suggestions on two minor points:
How can I dynamically include an image in the prompt only when one is provided? (See the first sketch below.)
How can I pass configs/variables (e.g. user_id) to the tools directly, instead of relying on the model to extract them from the prompt? (See the second sketch below.)
Thanks for helping. I am fairly new to LangChain, so I may have missed some details; please feel free to ask for whatever you need. If the model is unable to fetch the image, please use another publicly available image URL.
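For the first point, what I have in mind is something like the following minimal sketch of a helper that appends the image part only when a URL is supplied (build_user_message is a name I made up; the content-list format is the same one used in the example code above):

from typing import Optional
from langchain.schema import HumanMessage

def build_user_message(text: str, image_url: Optional[str] = None) -> HumanMessage:
    # Always include the text part; append the image part only when a URL is given.
    content = [{"type": "text", "text": text}]
    if image_url:
        content.append({"type": "image_url", "image_url": {"url": image_url}})
    return HumanMessage(content=content)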
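For the second point, the closest I have found (this is my assumption about LangChain's runtime-config injection; please correct me if it is wrong) is to declare a RunnableConfig parameter on the tool coroutine and supply the values via config={"configurable": {...}} at invocation time. LangChain injects that parameter at runtime, so the model never sees or fills it:

from langchain_core.runnables import RunnableConfig
from langchain_core.tools import tool

@tool
async def rag_tool(query: str, config: RunnableConfig) -> str:
    """A dummy RAG tool. `config` is injected by LangChain, not generated by the model."""
    user_id = config["configurable"].get("user_id")  # hypothetical key, set at invoke time
    return f"Dummy RAG response for query: {query} (user: {user_id})"

# At call time, pass the values alongside the input, e.g.:
# await agent_executor.ainvoke(
#     {"input": messages, "chat_history": chat_history},
#     config={"configurable": {"user_id": "user-123"}},
# )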
System Info
pip install langchain==0.3.4
pip install nest-asyncio==1.6.0
(os is part of the Python standard library and does not need to be installed.)