docs[patch]: Update integration docs for AzureOpenAIEmbeddings (#25311)
#24856

---------

Co-authored-by: Isaac Francisco <[email protected]>
Co-authored-by: isaac hershenson <[email protected]>
3 people authored Aug 14, 2024
1 parent b4e3bdb commit 27def6b
Showing 1 changed file with 156 additions and 90 deletions.
246 changes: 156 additions & 90 deletions docs/docs/integrations/text_embedding/azureopenai.ipynb
@@ -2,195 +2,261 @@
"cells": [
{
"cell_type": "raw",
"id": "0aed0743",
"id": "afaf8039",
"metadata": {},
"source": [
"---\n",
"keywords: [AzureOpenAIEmbeddings]\n",
"sidebar_label: AzureOpenAI\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "c3852491",
"id": "9a3d6f34",
"metadata": {},
"source": [
"# Azure OpenAI\n",
"# AzureOpenAIEmbeddings\n",
"\n",
"Let's load the Azure OpenAI Embedding class with environment variables set to indicate to use Azure endpoints."
"This will help you get started with AzureOpenAI embedding models using LangChain. For detailed documentation on `AzureOpenAIEmbeddings` features and configuration options, please refer to the [API reference](https://api.python.langchain.com/en/latest/embeddings/langchain_openai.embeddings.azure.AzureOpenAIEmbeddings.html).\n",
"\n",
"## Overview\n",
"### Integration details\n",
"\n",
"import { ItemTable } from \"@theme/FeatureTables\";\n",
"\n",
"<ItemTable category=\"text_embedding\" item=\"AzureOpenAI\" />\n",
"\n",
"## Setup\n",
"\n",
"To access AzureOpenAI embedding models you'll need to create an Azure account, get an API key, and install the `langchain-openai` integration package.\n",
"\n",
"### Credentials\n",
"\n",
"You’ll need to have an Azure OpenAI instance deployed. You can deploy a version on Azure Portal following this [guide](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/create-resource?pivots=web-portal).\n",
"\n",
"Once you have your instance running, make sure you have the name of your instance and key. You can find the key in the Azure Portal, under the “Keys and Endpoint” section of your instance.\n",
"\n",
"```bash\n",
"AZURE_OPENAI_ENDPOINT=<YOUR API ENDPOINT>\n",
"AZURE_OPENAI_API_KEY=<YOUR_KEY>\n",
"AZURE_OPENAI_API_VERSION=\"2024-02-01\"\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "228faf0c",
"execution_count": 8,
"id": "36521c2a",
"metadata": {},
"outputs": [],
"source": [
"%pip install --upgrade --quiet langchain-openai"
"import getpass\n",
"import os\n",
"\n",
"if not os.getenv(\"OPENAI_API_KEY\"):\n",
" os.environ[\"OPENAI_API_KEY\"] = getpass.getpass(\"Enter your AzureOpenAI API key: \")"
]
},
{
"cell_type": "markdown",
"id": "c84fb993",
"metadata": {},
"source": [
"If you want to get automated tracing of your model calls you can also set your [LangSmith](https://docs.smith.langchain.com/) API key by uncommenting below:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "8a6ed30d-806f-4800-b5fd-d04126be9060",
"execution_count": 9,
"id": "39a4953b",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ[\"AZURE_OPENAI_API_KEY\"] = \"...\"\n",
"os.environ[\"AZURE_OPENAI_ENDPOINT\"] = \"https://<your-endpoint>.openai.azure.com/\""
"# os.environ[\"LANGCHAIN_TRACING_V2\"] = \"true\"\n",
"# os.environ[\"LANGCHAIN_API_KEY\"] = getpass.getpass(\"Enter your LangSmith API key: \")"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "20179bc7-3f71-4909-be12-d38bce009b18",
"cell_type": "markdown",
"id": "d9664366",
"metadata": {},
"outputs": [],
"source": [
"from langchain_openai import AzureOpenAIEmbeddings\n",
"### Installation\n",
"\n",
"embeddings = AzureOpenAIEmbeddings(\n",
" azure_deployment=\"<your-embeddings-deployment-name>\",\n",
" openai_api_version=\"2023-05-15\",\n",
")"
"The LangChain AzureOpenAI integration lives in the `langchain-openai` package:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "f8cb9dca-738b-450f-9986-5c3efd3c6eb3",
"execution_count": null,
"id": "64853226",
"metadata": {},
"outputs": [],
"source": [
"text = \"this is a test document\""
"%pip install -qU langchain-openai"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "0fae0295-b117-4a5a-8b98-500c79306551",
"cell_type": "markdown",
"id": "45dd1724",
"metadata": {},
"outputs": [],
"source": [
"query_result = embeddings.embed_query(text)"
"## Instantiation\n",
"\n",
"Now we can instantiate our model object and generate chat completions:"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "65a01ddd-0bbf-444f-a87f-93af25ef902c",
"execution_count": 11,
"id": "9ea7a09b",
"metadata": {},
"outputs": [],
"source": [
"doc_result = embeddings.embed_documents([text])"
"from langchain_openai import AzureOpenAIEmbeddings\n",
"\n",
"embeddings = AzureOpenAIEmbeddings(\n",
" model=\"text-embedding-3-large\",\n",
" # dimensions: Optional[int] = None, # Can specify dimensions with new text-embedding-3 models\n",
" # azure_endpoint=\"https://<your-endpoint>.openai.azure.com/\", If not provided, will read env variable AZURE_OPENAI_ENDPOINT\n",
" # api_key=... # Can provide an API key directly. If missing read env variable AZURE_OPENAI_API_KEY\n",
" # openai_api_version=..., # If not provided, will read env variable AZURE_OPENAI_API_VERSION\n",
")"
]
},
{
"cell_type": "markdown",
"id": "77d271b6",
"metadata": {},
"source": [
"## Indexing and Retrieval\n",
"\n",
"Embedding models are often used in retrieval-augmented generation (RAG) flows, both as part of indexing data as well as later retrieving it. For more detailed instructions, please see our RAG tutorials under the [working with external knowledge tutorials](/docs/tutorials/#working-with-external-knowledge).\n",
"\n",
"Below, see how to index and retrieve data using the `embeddings` object we initialized above. In this example, we will index and retrieve a sample document in the `InMemoryVectorStore`."
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "45771052-68ca-4e03-9c4f-a0c7796d9442",
"execution_count": 5,
"id": "d817716b",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[-0.012222584727053133,\n",
" 0.0072103982392216145,\n",
" -0.014818063280923775,\n",
" -0.026444746872933557,\n",
" -0.0034330499700826883]"
"'LangChain is the framework for building context-aware reasoning applications'"
]
},
"execution_count": 6,
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"doc_result[0][:5]"
"# Create a vector store with a sample text\n",
"from langchain_core.vectorstores import InMemoryVectorStore\n",
"\n",
"text = \"LangChain is the framework for building context-aware reasoning applications\"\n",
"\n",
"vectorstore = InMemoryVectorStore.from_texts(\n",
" [text],\n",
" embedding=embeddings,\n",
")\n",
"\n",
"# Use the vectorstore as a retriever\n",
"retriever = vectorstore.as_retriever()\n",
"\n",
"# Retrieve the most similar text\n",
"retrieved_documents = retriever.invoke(\"What is LangChain?\")\n",
"\n",
"# show the retrieved document's content\n",
"retrieved_documents[0].page_content"
]
},
{
"cell_type": "markdown",
"id": "e66ec1f2-6768-4ee5-84bf-a2d76adc20c8",
"metadata": {},
"source": [
"## [Legacy] When using `openai<1`"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1b40f827",
"id": "e02b9855",
"metadata": {},
"outputs": [],
"source": [
"# set the environment variables needed for openai package to know to reach out to azure\n",
"import os\n",
"## Direct Usage\n",
"\n",
"Under the hood, the vectorstore and retriever implementations are calling `embeddings.embed_documents(...)` and `embeddings.embed_query(...)` to create embeddings for the text(s) used in `from_texts` and retrieval `invoke` operations, respectively.\n",
"\n",
"You can directly call these methods to get embeddings for your own use cases.\n",
"\n",
"os.environ[\"OPENAI_API_TYPE\"] = \"azure\"\n",
"os.environ[\"OPENAI_API_BASE\"] = \"https://<your-endpoint.openai.azure.com/\"\n",
"os.environ[\"OPENAI_API_KEY\"] = \"your AzureOpenAI key\"\n",
"os.environ[\"OPENAI_API_VERSION\"] = \"2023-05-15\""
"### Embed single texts\n",
"\n",
"You can embed single texts or documents with `embed_query`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "bb36d16c",
"execution_count": 6,
"id": "0d2befcd",
"metadata": {},
"outputs": [],
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[-0.0011676070280373096, 0.007125577889382839, -0.014674457721412182, -0.034061674028635025, 0.01128\n"
]
}
],
"source": [
"from langchain_openai import OpenAIEmbeddings\n",
"\n",
"embeddings = OpenAIEmbeddings(deployment=\"your-embeddings-deployment-name\")"
"single_vector = embeddings.embed_query(text)\n",
"print(str(single_vector)[:100]) # Show the first 100 characters of the vector"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "228abcbb",
"cell_type": "markdown",
"id": "1b5a7d03",
"metadata": {},
"outputs": [],
"source": [
"text = \"This is a test document.\""
"### Embed multiple texts\n",
"\n",
"You can embed multiple texts with `embed_documents`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "60dd7fad",
"execution_count": 7,
"id": "2f4d6e97",
"metadata": {},
"outputs": [],
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[-0.0011966148158535361, 0.007160289213061333, -0.014659193344414234, -0.03403077274560928, 0.011280\n",
"[-0.005595256108790636, 0.016757294535636902, -0.011055258102715015, -0.031094247475266457, -0.00363\n"
]
}
],
"source": [
"query_result = embeddings.embed_query(text)"
"text2 = (\n",
" \"LangGraph is a library for building stateful, multi-actor applications with LLMs\"\n",
")\n",
"two_vectors = embeddings.embed_documents([text, text2])\n",
"for vector in two_vectors:\n",
" print(str(vector)[:100]) # Show the first 100 characters of the vector"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "83bc1a72",
"cell_type": "markdown",
"id": "98785c12",
"metadata": {},
"outputs": [],
"source": [
"doc_result = embeddings.embed_documents([text])"
"## API Reference\n",
"\n",
"For detailed documentation on `AzureOpenAIEmbeddings` features and configuration options, please refer to the [API reference](https://api.python.langchain.com/en/latest/embeddings/langchain_openai.embeddings.azure.AzureOpenAIEmbeddings.html).\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "aaad49f8",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
@@ -204,7 +270,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.5"
"version": "3.9.6"
}
},
"nbformat": 4,