[Partner] Gemini Embeddings (#14690)
Add support for Gemini embeddings in the langchain-google-genai package
hinthornw authored Dec 14, 2023
1 parent 3449fce commit 1e21a3f
Showing 13 changed files with 606 additions and 55 deletions.
220 changes: 220 additions & 0 deletions docs/docs/integrations/text_embedding/google_generative_ai.ipynb
@@ -0,0 +1,220 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "afab8b36-10bb-4795-bc98-75ab2d2081bb",
"metadata": {},
"source": [
"# Google Generative AI Embeddings\n",
"\n",
"Connect to Google's generative AI embeddings service using the `GoogleGenerativeAIEmbeddings` class, found in the [langchain-google-genai](https://pypi.org/project/langchain-google-genai/) package."
]
},
{
"cell_type": "markdown",
"id": "63545b38-9d56-4312-8f61-8d4f1e7a3b1b",
"metadata": {},
"source": [
"## Installation"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d2f6a3cd-379f-4dff-a449-d3a9f3196f2a",
"metadata": {},
"outputs": [],
"source": [
"%pip install -U langchain-google-genai"
]
},
{
"cell_type": "markdown",
"id": "25f3f88e-164e-400d-b371-9fa488baba19",
"metadata": {},
"source": [
"## Credentials"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ec89153f-8999-4aab-a21b-0bfba1cc3893",
"metadata": {},
"outputs": [],
"source": [
"import getpass\n",
"import os\n",
"\n",
"if \"GOOGLE_API_KEY\" not in os.environ:\n",
"    os.environ[\"GOOGLE_API_KEY\"] = getpass.getpass(\"Provide your Google API key here\")"
]
},
{
"cell_type": "markdown",
"id": "f2437b22-e364-418a-8c13-490a026cb7b5",
"metadata": {},
"source": [
"## Usage"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "eedc551e-a1f3-4fd8-8d65-4e0784c4441b",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[0.05636945, 0.0048285457, -0.0762591, -0.023642512, 0.05329321]"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain_google_genai import GoogleGenerativeAIEmbeddings\n",
"\n",
"embeddings = GoogleGenerativeAIEmbeddings(model=\"models/embedding-001\")\n",
"vector = embeddings.embed_query(\"hello, world!\")\n",
"vector[:5]"
]
},
{
"cell_type": "markdown",
"id": "2b2bed60-e7bd-4e48-83d6-1c87001f98bd",
"metadata": {},
"source": [
"## Batch\n",
"\n",
"You can also embed multiple strings at once for a processing speedup:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "6ec53aba-404f-4778-acd9-5d6664e79ed2",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(3, 768)"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"vectors = embeddings.embed_documents(\n",
" [\n",
" \"Today is Monday\",\n",
" \"Today is Tuesday\",\n",
" \"Today is April Fools day\",\n",
" ]\n",
")\n",
"len(vectors), len(vectors[0])"
]
},
{
"cell_type": "markdown",
"id": "1482486f-5617-498a-8a44-1974d3212dda",
"metadata": {},
"source": [
"## Task type\n",
"`GoogleGenerativeAIEmbeddings` optionally supports a `task_type`, which currently must be one of:\n",
"\n",
"- task_type_unspecified\n",
"- retrieval_query\n",
"- retrieval_document\n",
"- semantic_similarity\n",
"- classification\n",
"- clustering\n",
"\n",
"By default, we use `retrieval_document` in the `embed_documents` method and `retrieval_query` in the `embed_query` method. If you provide a task type, we will use that for all methods."
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "a223bb25-2b1b-418e-a570-2f543083132e",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Note: you may need to restart the kernel to use updated packages.\n"
]
}
],
"source": [
"%pip install --quiet matplotlib scikit-learn"
]
},
{
"cell_type": "code",
"execution_count": 33,
"id": "f1f077db-8eb4-49f7-8866-471a8528dcdb",
"metadata": {},
"outputs": [],
"source": [
"query_embeddings = GoogleGenerativeAIEmbeddings(\n",
" model=\"models/embedding-001\", task_type=\"retrieval_query\"\n",
")\n",
"doc_embeddings = GoogleGenerativeAIEmbeddings(\n",
" model=\"models/embedding-001\", task_type=\"retrieval_document\"\n",
")"
]
},
{
"cell_type": "markdown",
"id": "79bd4a5e-75ba-413c-befa-86167c938caf",
"metadata": {},
"source": [
"All of these will be embedded with the `retrieval_query` task type:\n",
"```python\n",
"query_vecs = [query_embeddings.embed_query(q) for q in [query, query_2, answer_1]]\n",
"```\n",
"All of these will be embedded with the `retrieval_document` task type:\n",
"```python\n",
"doc_vecs = [doc_embeddings.embed_query(q) for q in [query, query_2, answer_1]]\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "9e1fae5e-0f84-4812-89f5-7d4d71affbc1",
"metadata": {},
"source": [
"In retrieval, relative distance matters. In the image above, you can see the difference in similarity scores between the \"relevant doc\" and \"simil[...]\" [...] stronger delta between the similar query and the relevant doc in the latter case."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
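The task-type defaults described in the notebook's "Task type" section can be sketched in plain Python. This is a minimal illustration, not the package's actual implementation: `resolve_task_type` and `VALID_TASK_TYPES` are hypothetical names, and rejecting unknown values with `ValueError` is an assumption based on the "must be one of" wording.

```python
from typing import Optional

# The six task types listed in the notebook's "Task type" section.
VALID_TASK_TYPES = {
    "task_type_unspecified",
    "retrieval_query",
    "retrieval_document",
    "semantic_similarity",
    "classification",
    "clustering",
}


def resolve_task_type(method: str, task_type: Optional[str] = None) -> str:
    """Hypothetical helper: pick the task type a given method would use."""
    if task_type is not None:
        # Assumption: unknown task types are rejected up front.
        if task_type not in VALID_TASK_TYPES:
            raise ValueError(f"Unsupported task_type: {task_type!r}")
        # An explicitly provided task type applies to every method.
        return task_type
    # Otherwise fall back to the per-method defaults from the notebook.
    defaults = {
        "embed_documents": "retrieval_document",
        "embed_query": "retrieval_query",
    }
    return defaults[method]


print(resolve_task_type("embed_query"))
print(resolve_task_type("embed_documents"))
print(resolve_task_type("embed_query", "clustering"))
```

The explicit-override branch mirrors why the notebook builds two separate instances (`query_embeddings` and `doc_embeddings`): once a task type is set on an instance, both methods use it.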
13 changes: 13 additions & 0 deletions libs/partners/google-genai/README.md
@@ -56,3 +56,16 @@ The value of `image_url` can be any of the following:
- A local file path
- A base64 encoded image (e.g., ``)
- A PIL image



## Embeddings

This package also adds support for Google's embedding models.

```python
from langchain_google_genai import GoogleGenerativeAIEmbeddings
embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")
embeddings.embed_query("hello, world!")
```
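
Embedding vectors are typically compared with cosine similarity. The sketch below is an illustration only: `cosine_similarity` is a hypothetical helper, and the three short vectors are made-up stand-ins for the lists of floats that `embed_query` returns, so the example runs without an API key.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("Cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Stand-in vectors (made up for illustration); in practice these would be
# the lists returned by `embeddings.embed_query(...)`.
query_vec = [1.0, 0.0, 1.0]
doc_vec = [1.0, 0.0, 1.0]
other_vec = [0.0, 1.0, 0.0]

print(cosine_similarity(query_vec, doc_vec))    # identical direction, close to 1
print(cosine_similarity(query_vec, other_vec))  # orthogonal vectors
```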
45 changes: 44 additions & 1 deletion libs/partners/google-genai/langchain_google_genai/__init__.py
@@ -1,3 +1,46 @@
"""**LangChain Google Generative AI Integration**
This module integrates Google's Generative AI models, specifically the Gemini series, with the LangChain framework. It provides classes for interacting with chat models and generating embeddings, leveraging Google's advanced AI capabilities.
**Chat Models**
The `ChatGoogleGenerativeAI` class is the primary interface for interacting with Google's Gemini chat models. It allows users to send and receive messages using a specified Gemini model, suitable for various conversational AI applications.
**Embeddings**
The `GoogleGenerativeAIEmbeddings` class provides functionalities to generate embeddings using Google's models.
These embeddings can be used for a range of NLP tasks, including semantic analysis, similarity comparisons, and more.
**Installation**
To install the package, use pip:
```bash
pip install -U langchain-google-genai
```
## Using Chat Models
After setting up your environment with the required API key, you can interact with the Google Gemini models.
```python
from langchain_google_genai import ChatGoogleGenerativeAI
llm = ChatGoogleGenerativeAI(model="gemini-pro")
llm.invoke("Sing a ballad of LangChain.")
```
## Embedding Generation
The package also supports creating embeddings with Google's models, useful for textual similarity and other NLP applications.
```python
from langchain_google_genai import GoogleGenerativeAIEmbeddings
embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")
embeddings.embed_query("hello, world!")
```
""" # noqa: E501
from langchain_google_genai.chat_models import ChatGoogleGenerativeAI
from langchain_google_genai.embeddings import GoogleGenerativeAIEmbeddings

__all__ = ["ChatGoogleGenerativeAI", "GoogleGenerativeAIEmbeddings"]
4 changes: 4 additions & 0 deletions libs/partners/google-genai/langchain_google_genai/_common.py
@@ -0,0 +1,4 @@
class GoogleGenerativeAIError(Exception):
    """
    Custom exception class for errors associated with the `Google GenAI` API.
    """

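One way an exception like `GoogleGenerativeAIError` might be used is to wrap lower-level failures so that callers can catch a single type. The following is a sketch under assumptions, not code from this commit: `call_with_wrapped_errors` and `failing_request` are hypothetical, and the class is re-declared here only so the example is self-contained.

```python
class GoogleGenerativeAIError(Exception):
    """Stand-in copy of the exception defined in `_common.py`."""


def call_with_wrapped_errors(fn, *args, **kwargs):
    """Hypothetical helper: re-raise any failure as GoogleGenerativeAIError."""
    try:
        return fn(*args, **kwargs)
    except Exception as exc:
        # `from exc` keeps the original error available as `__cause__`.
        raise GoogleGenerativeAIError(f"Error embedding content: {exc}") from exc


def failing_request(text):
    # Simulates a failed API call; no network access is involved.
    raise RuntimeError("quota exceeded")


try:
    call_with_wrapped_errors(failing_request, "hello, world!")
except GoogleGenerativeAIError as err:
    print(f"Caught: {err}")
```

Chaining with `from exc` preserves the underlying traceback, which helps when the wrapped error comes from a network or quota failure.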