Add Amazon Bedrock Knowledge base as a RAG Engine (retriever) #427

Merged 50 commits on Aug 7, 2024

Commits
7ef5741
feat(kb): initial
massi-ang Mar 4, 2024
8c33443
feat(bedrock_kb): initial
massi-ang Mar 7, 2024
9767716
Merge remote-tracking branch 'origin/main' into feat_bedrock_kb
massi-ang Mar 7, 2024
e4af35c
Merge remote-tracking branch 'origin/main' into feat_bedrock_kb
massi-ang Mar 7, 2024
e92cd0b
feat(bedrock_kb): knowledge base support
massi-ang Mar 7, 2024
17c38de
Merge remote-tracking branch 'origin/main' into feat_bedrock_kb
massi-ang Mar 8, 2024
32142d5
Merge remote-tracking branch 'origin/main' into feat_bedrock_kb
massi-ang Mar 11, 2024
242489b
feat(bedrock_kb): delete workspaces
massi-ang Mar 13, 2024
c1aac6e
feat(bedrock_kb): config
massi-ang Mar 13, 2024
dc10d8d
Merge remote-tracking branch 'origin/main' into feat_bedrock_kb
massi-ang Mar 13, 2024
e95cecf
feat(bedrock_kb): hybrid search
massi-ang Mar 13, 2024
b038f22
feat(bedrock_kb): upgrade boto3
massi-ang Mar 13, 2024
c8a0fba
Merge branch 'main' of github.com:aws-samples/aws-genai-llm-chatbot i…
massi-ang Mar 15, 2024
065227d
fix(cohere_embeddings): correct set the `input_type`
massi-ang Mar 21, 2024
020b2d4
Merge branch 'fix_cohere_embeddings' into feat_bedrock_kb
massi-ang Mar 21, 2024
2a1645f
Merge branch 'main' of github.com:aws-samples/aws-genai-llm-chatbot i…
massi-ang Mar 26, 2024
ec0eb7c
feat(kb): add Bedrock KB to the welcome page
massi-ang Mar 27, 2024
a4e10b5
fix(no unused var): correct ignore
massi-ang Mar 29, 2024
78ff010
Merge branch 'main' of github.com:aws-samples/aws-genai-llm-chatbot i…
massi-ang Mar 29, 2024
e8640b1
Merge branch 'main' into feat_bedrock_kb
massi-ang Mar 29, 2024
f7a0137
fix: correct enum values
massi-ang Mar 29, 2024
0848760
Merge branch 'main' of github.com:aws-samples/aws-genai-llm-chatbot i…
massi-ang Apr 10, 2024
815490e
feat(bedrock_kb): metadata filters
massi-ang Apr 19, 2024
2ba5efb
feat(bedrock_kb): dedup
massi-ang Apr 23, 2024
147142e
feat(llama3): adapter
massi-ang Apr 23, 2024
2ad2191
Merge branch 'main' into feat_bedrock_kb
massi-ang Apr 23, 2024
765236a
Merge branch 'feat_llama3' into feat_bedrock_kb
massi-ang Apr 23, 2024
d8aead6
feat(bedrock_kb): merge to latest
massi-ang Apr 23, 2024
7a5cfcd
chore: updating code to use langchain-community
massi-ang May 7, 2024
d36c5ea
Merge branch 'chore-langchain-upgrade' into feat_bedrock_kb
massi-ang May 7, 2024
5ba2164
Merge branch 'main' of github.com:aws-samples/aws-genai-llm-chatbot i…
massi-ang May 14, 2024
409dd3f
Merge branch 'main' of github.com:aws-samples/aws-genai-llm-chatbot i…
massi-ang Jun 13, 2024
746d4dd
Merge branch 'main' of github.com:aws-samples/aws-genai-llm-chatbot i…
massi-ang Jun 28, 2024
bac229c
Merge branch 'main' of github.com:aws-samples/aws-genai-llm-chatbot i…
massi-ang Jun 28, 2024
b143fa3
feat(bedrock): fix error
massi-ang Jun 28, 2024
8a5ef88
Merge branch 'feat_bedrock_kb' of github.com:aws-samples/aws-genai-ll…
massi-ang Jun 28, 2024
6a36795
fix(bedrock_kb): magic cli
massi-ang Jun 28, 2024
8436514
Merge branch 'main' into feat_bedrock_kb
massi-ang Jul 9, 2024
92796c7
fix: correct path to compiled files
massi-ang Aug 5, 2024
ff66d30
Merge branch 'fix_config_path' into feat_bedrock_kb
massi-ang Aug 5, 2024
3a31dc7
Merge branch 'feat_bedrock_kb' of github.com:aws-samples/aws-genai-ll…
massi-ang Aug 5, 2024
382e112
chore: removed wrongly tracked file
massi-ang Aug 6, 2024
208b976
fix: fix review feedback
massi-ang Aug 6, 2024
d4b62d4
fix: remove feature under development
massi-ang Aug 6, 2024
00f976e
Merge branch 'main' of github.com:aws-samples/aws-genai-llm-chatbot i…
massi-ang Aug 6, 2024
6d6149a
fix: typescript errors
massi-ang Aug 6, 2024
293e5de
fix: ignore code un `dist` to avoid recursive builds
massi-ang Aug 6, 2024
871dcc1
Merge branch 'fix_tsconfig_error' into feat_bedrock_kb
massi-ang Aug 6, 2024
d36f8d9
tests: update test snapshot and fix call order
massi-ang Aug 6, 2024
5058bb9
test: fix rag tests
massi-ang Aug 6, 2024
3 changes: 3 additions & 0 deletions bin/config.ts
Expand Up @@ -39,6 +39,9 @@ export function getConfig(): SystemConfig {
createIndex: false,
enterprise: false,
},
knowledgeBase: {

[Review comment, Collaborator] nit: suggest `bedrock` or `bedrockKnowledgeBase`. Other properties name the service.

[Reply, Collaborator Author] Do they? I see aurora, kendra, ...


enabled: false,
},
},
embeddingsModels: [
{
Expand Down
87 changes: 87 additions & 0 deletions cli/magic-config.ts
Expand Up @@ -156,6 +156,8 @@ const embeddingModels = [
? config.llms?.sagemaker.length > 0
: false;
options.huggingfaceApiSecretArn = config.llms?.huggingfaceApiSecretArn;
options.enableSagemakerModelsSchedule =
config.llms?.sagemakerSchedule?.enabled;
options.enableSagemakerModelsSchedule =
config.llms?.sagemakerSchedule?.enabled;
options.timezonePicker = config.llms?.sagemakerSchedule?.timezonePicker;
Expand Down Expand Up @@ -194,6 +196,7 @@ const embeddingModels = [
(m) => m.default
)[0].name;
options.kendraExternal = config.rag.engines.kendra.external;
options.kbExternal = config.rag.engines.knowledgeBase?.external ?? [];
options.kendraEnterprise = config.rag.engines.kendra.enterprise;

// Advanced settings
Expand Down Expand Up @@ -586,6 +589,7 @@ async function processCreateOptions(options: any): Promise<void> {
{ message: "Aurora", name: "aurora" },
{ message: "OpenSearch", name: "opensearch" },
{ message: "Kendra (managed)", name: "kendra" },
{ message: "Bedrock KnowledgeBase", name: "knowledgeBase" },
],
validate(choices: any) {
return (this as any).skipped || choices.length > 0
Expand Down Expand Up @@ -694,6 +698,82 @@ async function processCreateOptions(options: any): Promise<void> {
});
newKendra = kendraInstance.newKendra;
}

// Knowledge Bases
let newKB =
answers.enableRag && answers.ragsToEnable.includes("knowledgeBase");
const kbExternal: any[] = [];
const existingKBIndices = Array.from(options.kbExternal || []);
while (newKB === true) {
const existingIndex: any = existingKBIndices.pop();
const kbQ = [
{
type: "input",
name: "name",
message: "KnowledgeBase source name",
validate(v: string) {
return RegExp(/^\w[\w-_]*\w$/).test(v);
},
initial: existingIndex?.name,
},
{
type: "autocomplete",
limit: 8,
name: "region",
choices: ["us-east-1", "us-west-2"],
message: `Region of the Bedrock Knowledge Base index${
existingIndex?.region ? " (" + existingIndex?.region + ")" : ""
}`,
initial: ["us-east-1", "us-west-2"].indexOf(existingIndex?.region),
},
{
type: "input",
name: "roleArn",
message:
"Cross account role Arn to assume to call the Bedrock KnowledgeBase, leave empty if not needed",
validate: (v: string) => {
const valid = iamRoleRegExp.test(v);
return v.length === 0 || valid;
},
initial: existingIndex?.roleArn ?? "",
},
{
type: "input",
name: "knowledgeBaseId",
message: "Bedrock KnowledgeBase ID",
validate(v: string) {
return /^[A-Z0-9]{10}$/.test(v);
},
initial: existingIndex?.knowledgeBaseId,
},
{
type: "confirm",
name: "enabled",
message: "Enable this knowledge base",
initial: existingIndex?.enabled ?? true,
},
{
type: "confirm",
name: "newKB",
message: "Do you want to add another Bedrock KnowledgeBase source",
initial: false,
},
];
const kbInstance: any = await enquirer.prompt(kbQ);
const ext = (({ enabled, name, roleArn, knowledgeBaseId, region }) => ({
enabled,
name,
roleArn,
knowledgeBaseId,
region,
}))(kbInstance);
if (ext.roleArn === "") ext.roleArn = undefined;
kbExternal.push({
...ext,
});
newKB = kbInstance.newKB;
}

const modelsPrompts = [
{
type: "select",
Expand Down Expand Up @@ -1078,6 +1158,10 @@ async function processCreateOptions(options: any): Promise<void> {
external: [{}],
enterprise: false,
},
knowledgeBase: {
enabled: false,
external: [{}],
},
},
embeddingsModels: [{}],
crossEncoderModels: [{}],
Expand Down Expand Up @@ -1107,6 +1191,9 @@ async function processCreateOptions(options: any): Promise<void> {
config.rag.engines.kendra.createIndex || kendraExternal.length > 0;
config.rag.engines.kendra.external = [...kendraExternal];
config.rag.engines.kendra.enterprise = answers.kendraEnterprise;
config.rag.engines.knowledgeBase.external = [...kbExternal];
config.rag.engines.knowledgeBase.enabled =
config.rag.engines.knowledgeBase.external.length > 0;

console.log("\n✨ This is the chosen configuration:\n");
console.log(JSON.stringify(config, undefined, 2));
Expand Down
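The prompts in the knowledge base loop above validate each answer before it is stored in `kbExternal`. A minimal Python sketch of the same checks follows, for illustration only. `validate_kb_source` is a hypothetical helper, `ROLE_ARN_RE` is an assumed stand-in for `iamRoleRegExp` (whose definition is not shown in this diff), and the knowledge base ID check is anchored to exactly ten characters, which is an assumption of this sketch.

```python
import re

# Source name: at least two characters, word characters or hyphens inside.
NAME_RE = re.compile(r"^\w[\w-]*\w$", re.ASCII)
# Knowledge base ID: ten upper-case letters or digits (anchoring is assumed).
KB_ID_RE = re.compile(r"[A-Z0-9]{10}")
# Hypothetical stand-in for iamRoleRegExp, which is not shown in the diff.
ROLE_ARN_RE = re.compile(r"^arn:aws[a-z-]*:iam::\d{12}:role/[\w+=,.@/-]+$")


def validate_kb_source(entry: dict) -> list:
    """Return the names of the fields that fail validation, in prompt order."""
    errors = []
    if not NAME_RE.fullmatch(entry.get("name", "")):
        errors.append("name")
    if not KB_ID_RE.fullmatch(entry.get("knowledgeBaseId", "")):
        errors.append("knowledgeBaseId")
    # An empty role ARN means "no cross-account role", as in the CLI prompt.
    role_arn = entry.get("roleArn") or ""
    if role_arn and not ROLE_ARN_RE.fullmatch(role_arn):
        errors.append("roleArn")
    return errors
```

An entry with no failing fields yields an empty list, so a caller could treat a non-empty result as a reason to re-prompt.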
2 changes: 2 additions & 0 deletions lib/chatbot-api/functions/api-handler/index.py
Expand Up @@ -15,6 +15,7 @@
from routes.documents import router as documents_router
from routes.kendra import router as kendra_router
from routes.user_feedback import router as user_feedback_router
from routes.bedrock_kb import router as bedrock_kb_router

tracer = Tracer()
logger = Logger()
Expand All @@ -32,6 +33,7 @@
app.include_router(documents_router)
app.include_router(kendra_router)
app.include_router(user_feedback_router)
app.include_router(bedrock_kb_router)


@logger.inject_lambda_context(
Expand Down
21 changes: 21 additions & 0 deletions lib/chatbot-api/functions/api-handler/routes/bedrock_kb.py
@@ -0,0 +1,21 @@
import genai_core.parameters
import genai_core.bedrock_kb
from pydantic import BaseModel
from aws_lambda_powertools import Logger, Tracer
from aws_lambda_powertools.event_handler.appsync import Router

tracer = Tracer()
router = Router()
logger = Logger()


class KendraDataSynchRequest(BaseModel):
workspaceId: str


@router.resolver(field_name="listBedrockKnowledgeBases")
@tracer.capture_method
def list_bedrock_kbs():
indexes = genai_core.bedrock_kb.list_bedrock_kbs()

return indexes
5 changes: 5 additions & 0 deletions lib/chatbot-api/functions/api-handler/routes/rag.py
Expand Up @@ -35,6 +35,11 @@ def engines():
"name": "Amazon Kendra",
"enabled": engines.get("kendra", {}).get("enabled", False) == True,
},
{
"id": "bedrock_kb",
"name": "Bedrock Knowledge Bases",
"enabled": engines.get("knowledgeBase", {}).get("enabled", False) == True,
},
]

return ret_value
53 changes: 53 additions & 0 deletions lib/chatbot-api/functions/api-handler/routes/workspaces.py
@@ -1,6 +1,7 @@
import re
import genai_core.types
import genai_core.kendra
import genai_core.bedrock_kb
import genai_core.parameters
import genai_core.workspaces
from pydantic import BaseModel
Expand Down Expand Up @@ -55,6 +56,13 @@ class CreateWorkspaceKendraRequest(BaseModel):
useAllData: bool


class CreateWorkspaceBedrockKBRequest(BaseModel):
kind: str
name: str
knowledgeBaseId: str
hybridSearch: bool


@router.resolver(field_name="listWorkspaces")
@tracer.capture_method
def list_workspaces():
Expand Down Expand Up @@ -115,6 +123,16 @@ def create_kendra_workspace(input: dict):
return ret_value


@router.resolver(field_name="createBedrockKBWorkspace")
@tracer.capture_method
def create_bedrock_kb_workspace(input: dict):
config = genai_core.parameters.get_config()

request = CreateWorkspaceBedrockKBRequest(**input)
ret_value = _create_workspace_bedrock_kb(request, config)
return ret_value


def _create_workspace_aurora(request: CreateWorkspaceAuroraRequest, config: dict):
workspace_name = request.name.strip()
embedding_models = config["rag"]["embeddingsModels"]
Expand Down Expand Up @@ -291,6 +309,39 @@ def _create_workspace_kendra(request: CreateWorkspaceKendraRequest, config: dict
)


def _create_workspace_bedrock_kb(
request: CreateWorkspaceBedrockKBRequest, config: dict
):
workspace_name = request.name.strip()
kbs = genai_core.bedrock_kb.list_bedrock_kbs()

workspace_name_match = name_regex.match(workspace_name)
workspace_name_is_match = bool(workspace_name_match)
if (
len(workspace_name) == 0
or len(workspace_name) > 100
or not workspace_name_is_match
):
raise genai_core.types.CommonError("Invalid workspace name")

knowledge_base = None
for current in kbs:
if current["id"] == request.knowledgeBaseId:
knowledge_base = current
break

if knowledge_base is None:
raise genai_core.types.CommonError("Knowledge Base id not found")

return _convert_workspace(
genai_core.workspaces.create_workspace_bedrock_kb(
workspace_name=workspace_name,
knowledge_base=knowledge_base,
hybrid_search=request.hybridSearch,
)
)


def _convert_workspace(workspace: dict):
kendra_index_external = workspace.get("kendra_index_external")

Expand Down Expand Up @@ -320,6 +371,8 @@ def _convert_workspace(workspace: dict):
"kendraIndexId": workspace.get("kendra_index_id"),
"kendraIndexExternal": kendra_index_external,
"kendraUseAllData": workspace.get("kendra_use_all_data", kendra_index_external),
"knowledgeBaseId": workspace.get("knowledge_base_id"),
"knowledgeBaseExternal": workspace.get("knowledge_base_external"),
"createdAt": workspace.get("created_at"),
"updatedAt": workspace.get("updated_at"),
}
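The workspace created above stores a `hybrid_search` flag, but this diff does not show how the flag is used at query time. One plausible reading, sketched below as an assumption rather than the PR's actual retriever code, is that it selects the search type passed to the Bedrock `Retrieve` API. `build_retrieve_kwargs` and its `limit` parameter are hypothetical names.

```python
def build_retrieve_kwargs(
    knowledge_base_id: str, query: str, hybrid_search: bool, limit: int = 5
) -> dict:
    """Build keyword arguments for the bedrock-agent-runtime `retrieve` call."""
    # Assumption: the workspace flag maps directly onto overrideSearchType.
    search_type = "HYBRID" if hybrid_search else "SEMANTIC"
    return {
        "knowledgeBaseId": knowledge_base_id,
        "retrievalQuery": {"text": query},
        "retrievalConfiguration": {
            "vectorSearchConfiguration": {
                "numberOfResults": limit,
                "overrideSearchType": search_type,
            }
        },
    }


# Usage with boto3 (needs AWS credentials, so it is left commented out):
# import boto3
# client = boto3.client("bedrock-agent-runtime")
# response = client.retrieve(**build_retrieve_kwargs("ABCDEFGH12", "What is RAG?", True))
```

Keeping the argument construction in a pure function makes the search-type mapping testable without calling AWS.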
27 changes: 27 additions & 0 deletions lib/chatbot-api/rest-api.ts
Expand Up @@ -188,6 +188,33 @@ export class ApiResolvers extends Construct {
);
}

if (props.config.rag.engines.knowledgeBase.enabled) {
for (const item of props.config.rag.engines.knowledgeBase.external ||
[]) {
if (item.roleArn) {
apiHandler.addToRolePolicy(
new iam.PolicyStatement({
actions: ["sts:AssumeRole"],
resources: [item.roleArn],
})
);
} else {
apiHandler.addToRolePolicy(
new iam.PolicyStatement({
actions: ["bedrock:Retrieve"],
resources: [
`arn:${cdk.Aws.PARTITION}:bedrock:${
item.region ?? cdk.Aws.REGION
}:${cdk.Aws.ACCOUNT_ID}:knowledge-base/${
item.knowledgeBaseId
}`,
],
})
);
}
}
}

for (const item of props.config.rag.engines.kendra.external ?? []) {
if (item.roleArn) {
apiHandler.addToRolePolicy(
Expand Down
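The knowledge base branch in this hunk grants one of two permissions per configured source: `sts:AssumeRole` on the cross-account role when `roleArn` is set, otherwise `bedrock:Retrieve` on the knowledge base ARN in the deployment account. The decision can be sketched as a plain Python function. `kb_policy_statement` is a hypothetical helper that returns a dictionary instead of a CDK `PolicyStatement`, and the partition, region, and account values are passed in explicitly rather than read from `cdk.Aws`.

```python
def kb_policy_statement(
    item: dict, partition: str, default_region: str, account_id: str
) -> dict:
    """Return the actions and resources granted for one knowledge base source."""
    if item.get("roleArn"):
        # Cross-account access: the handler only needs to assume the role.
        return {"actions": ["sts:AssumeRole"], "resources": [item["roleArn"]]}

    # Same-account access: fall back to the stack region when none is set,
    # mirroring the `item.region ?? cdk.Aws.REGION` expression in the diff.
    region = item.get("region")
    if region is None:
        region = default_region
    arn = (
        f"arn:{partition}:bedrock:{region}:{account_id}"
        f":knowledge-base/{item['knowledgeBaseId']}"
    )
    return {"actions": ["bedrock:Retrieve"], "resources": [arn]}
```

With a role configured, the retrieve permission itself has to be granted on the role in the other account, which is outside what this stack manages.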
18 changes: 18 additions & 0 deletions lib/chatbot-api/schema/schema.graphql
Expand Up @@ -23,6 +23,13 @@ input CreateWorkspaceKendraInput {
useAllData: Boolean!
}

input CreateWorkspaceBedrockKBInput {
name: String!
kind: String!
knowledgeBaseId: String!
hybridSearch: Boolean!
}

input CreateWorkspaceOpenSearchInput {
name: String!
kind: String!
Expand Down Expand Up @@ -150,6 +157,12 @@ type KendraIndex @aws_cognito_user_pools {
external: Boolean!
}

type BedrockKB @aws_cognito_user_pools {
id: String!
name: String!
external: Boolean!
}

input ListDocumentsInput {
workspaceId: String!
documentType: String!
Expand Down Expand Up @@ -302,6 +315,8 @@ type Workspace @aws_cognito_user_pools {
kendraIndexId: String
kendraIndexExternal: Boolean
kendraUseAllData: Boolean
knowledgeBaseId: String
knowledgeBaseExternal: Boolean
createdAt: AWSDateTime!
updatedAt: AWSDateTime!
}
Expand All @@ -315,6 +330,8 @@ type Channel @aws_iam @aws_cognito_user_pools {
type Mutation {
createKendraWorkspace(input: CreateWorkspaceKendraInput!): Workspace!
@aws_cognito_user_pools
createBedrockKBWorkspace(input: CreateWorkspaceBedrockKBInput!): Workspace!
@aws_cognito_user_pools
createOpenSearchWorkspace(input: CreateWorkspaceOpenSearchInput!): Workspace!
@aws_cognito_user_pools
createAuroraWorkspace(input: CreateWorkspaceAuroraInput!): Workspace!
Expand Down Expand Up @@ -358,6 +375,7 @@ type Query {
@aws_cognito_user_pools
getSession(id: String!): Session @aws_cognito_user_pools
listKendraIndexes: [KendraIndex!]! @aws_cognito_user_pools
listBedrockKnowledgeBases: [BedrockKB!]! @aws_cognito_user_pools
isKendraDataSynching(workspaceId: String!): Boolean @aws_cognito_user_pools
listDocuments(input: ListDocumentsInput!): DocumentsResult!
@aws_cognito_user_pools
Expand Down
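The schema changes above add a `createBedrockKBWorkspace` mutation that takes a `CreateWorkspaceBedrockKBInput`. A rough sketch of the request body a client might send is shown below. `build_create_workspace_request` is a hypothetical helper, the selected fields are limited to those visible in this diff, and the `kind` value `"bedrock_kb"` is an assumption borrowed from the engine id in `routes/rag.py`.

```python
import json

CREATE_BEDROCK_KB_WORKSPACE = """
mutation CreateBedrockKBWorkspace($input: CreateWorkspaceBedrockKBInput!) {
  createBedrockKBWorkspace(input: $input) {
    knowledgeBaseId
    knowledgeBaseExternal
    createdAt
  }
}
"""


def build_create_workspace_request(
    name: str, knowledge_base_id: str, hybrid_search: bool, kind: str = "bedrock_kb"
) -> str:
    """Serialize the GraphQL query and variables as a JSON request body."""
    variables = {
        "input": {
            "name": name,
            "kind": kind,  # assumed value, not confirmed by the diff
            "knowledgeBaseId": knowledge_base_id,
            "hybridSearch": hybrid_search,
        }
    }
    return json.dumps({"query": CREATE_BEDROCK_KB_WORKSPACE, "variables": variables})
```

All four input fields are non-nullable in the schema, so the helper requires each of them and only defaults `kind`.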
Original file line number Diff line number Diff line change
Expand Up @@ -14,7 +14,7 @@

You are an helpful assistant that provides concise answers to user questions with as little sentences as possible and at maximum 3 sentences. You do not repeat yourself. You avoid bulleted list or emojis.{EOD}{{chat_history}}{USER_HEADER}

- {{input}}{EOD}{ASSISTANT_HEADER}"""
+ Context: {{input}}{EOD}{ASSISTANT_HEADER}"""

Llama3QAPrompt = f"""{BEGIN_OF_TEXT}{SYSTEM_HEADER}

Expand Down