Add argument completion and sampling specifications
The commit adds two new specification documents to MCP: argument completion for prompts/resources and sampling for LLM text generation. Argument completion enables servers to provide contextual suggestions for argument values, while sampling defines how servers can request LLM generations via clients with user approval and permission controls.
Showing 2 changed files with 395 additions and 0 deletions.
@@ -0,0 +1,190 @@
---
title: Argument Completion
type: docs
weight: 6
---

Argument completion enables servers to provide completion suggestions for prompt arguments and resource URI arguments. Clients can request completion options for a specific argument, and servers return ranked suggestions. This allows clients to build rich user interfaces with intelligent completion of argument values.

> **_NOTE:_** Argument completion in MCP is similar to traditional IDE completion - it provides contextual suggestions based on available options, rather than AI-powered completion. The server maintains a fixed set of valid values for each argument and returns matching suggestions based on partial input.

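To make this concrete, a server might derive its suggestions by prefix-matching against its fixed list of valid values. The following is a minimal sketch; the helper name and candidate list are illustrative, not part of the specification:

```typescript
// Hypothetical helper: the server keeps a fixed list of valid values per argument
// and returns the ones that match the partial input.
const VALID_LANGUAGES = ["python", "pytorch", "pyside", "pyyaml", "javascript"];

function matchArgument(partial: string): string[] {
  return VALID_LANGUAGES.filter((value) => value.startsWith(partial.toLowerCase()));
}

// matchArgument("py") -> ["python", "pytorch", "pyside", "pyyaml"]
```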
## Capabilities

Support for argument completion is not indicated by a dedicated capability - servers that expose prompts or resources with arguments implicitly support argument completion for those arguments. Clients may attempt argument completion requests for any prompt or resource argument.

## Concepts

### Completion References

When requesting completions, clients must specify what is being completed using a reference type:

- `ref/prompt`: References a prompt by name
- `ref/resource`: References a resource by URI

The reference identifies the context for completion suggestions.

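Expressed as types, the two reference shapes might look like the following sketch, which mirrors the `PromptReference` and `ResourceReference` names used later in this document (the exact definitions here are illustrative):

```typescript
// Illustrative shapes of the two reference types used in completion requests.
interface PromptReference {
  type: "ref/prompt";
  name: string; // prompt name, e.g. "code_review"
}

interface ResourceReference {
  type: "ref/resource";
  uri: string; // resource URI or URI template, e.g. "file://{path}"
}

type CompletionReference = PromptReference | ResourceReference;
```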
### Completion Results

Servers return an array of completion values ranked by relevance, with a maximum of 100 items per response. If more results are available beyond the first 100, servers MUST set `hasMore: true` in the response to indicate that additional results can be retrieved with subsequent requests. The optional `total` field allows servers to specify the complete number of matches available, even if not all are returned in a single response.

> **_NOTE:_** MCP does not currently support pagination of completion results - clients that need more than the first 100 matches must issue new completion requests with more specific argument values to narrow down the results.

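As a minimal sketch of how a server might assemble the `completion` payload while respecting the 100-item cap (assuming a TypeScript implementation; the helper is illustrative):

```typescript
// Sketch: cap a match list at 100 values and flag whether more exist.
interface Completion {
  values: string[];
  total?: number;    // complete number of matches, if known
  hasMore?: boolean; // true when results beyond the first 100 exist
}

function toCompletion(matches: string[]): Completion {
  const values = matches.slice(0, 100);
  return {
    values,
    total: matches.length,
    hasMore: matches.length > values.length,
  };
}
```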
## Use Cases

Common use cases for argument completion include:

### Prompt Argument Completion

A client requesting completion options for a prompt argument:

```json
{
  "ref": {
    "type": "ref/prompt",
    "name": "code_review"
  },
  "argument": {
    "name": "language",
    "value": "py"
  }
}
```

The server might respond with language suggestions:

```json
{
  "completion": {
    "values": ["python", "pytorch", "pyside", "pyyaml"],
    "total": 10,
    "hasMore": true
  }
}
```

### Resource URI Completion

A client completing a path in a resource URI template:

```json
{
  "ref": {
    "type": "ref/resource",
    "uri": "file://{path}"
  },
  "argument": {
    "name": "path",
    "value": "/home/user/doc"
  }
}
```

The server could respond with matching paths:

```json
{
  "completion": {
    "values": [
      "/home/user/documents",
      "/home/user/docker",
      "/home/user/downloads"
    ],
    "hasMore": false
  }
}
```
## Diagram

The following diagram visualizes a typical argument completion interaction between client and server:

```mermaid
sequenceDiagram
    participant Client
    participant Server

    Note over Client,Server: Client requests completion options
    Client->>Server: completion/complete
    Server-->>Client: CompleteResult

    Note over Client,Server: Client may request more specific results
    opt New completions requested
        Client->>Server: completion/complete
        Server-->>Client: CompleteResult
    end
```

## Messages

This section defines the protocol messages for argument completion in the Model Context Protocol (MCP).

### Requesting Completions

#### Request

To get completion suggestions, the client MUST send a `completion/complete` request.

Method: `completion/complete`

Params:
- `ref`: A `PromptReference` or `ResourceReference` indicating what is being completed
- `argument`: Object containing:
  - `name`: The name of the argument being completed
  - `value`: The current value to get completions for
Example:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "completion/complete",
  "params": {
    "ref": {
      "type": "ref/prompt",
      "name": "code_review"
    },
    "argument": {
      "name": "language",
      "value": "py"
    }
  }
}
```

#### Response

The server MUST respond with a `CompleteResult` containing:

- `completion`: Object containing:
  - `values`: Array of completion suggestions (maximum 100)
  - `total`: Optional total number of matches available
  - `hasMore`: Optional boolean indicating if additional results exist

Example:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "completion": {
      "values": ["python", "pytorch", "pyside"],
      "total": 10,
      "hasMore": true
    }
  }
}
```
## Error Handling

Servers MUST return appropriate errors if:

- The referenced prompt or resource does not exist
- The argument name is invalid
- Completion cannot be provided for other reasons

Clients SHOULD be prepared to handle cases where completion is temporarily unavailable or returns errors.

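For example, a server might answer a request that references an unknown prompt with a JSON-RPC error like the one below; the error code and message are illustrative, as the specification only requires an appropriate error:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "error": {
    "code": -32602,
    "message": "Unknown prompt: nonexistent_prompt"
  }
}
```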
## Security Considerations

Implementations MUST carefully consider:

- Rate limiting completion requests to prevent abuse (see the sketch below)
- Access control for sensitive completion suggestions
- Validation of completion inputs to prevent injection attacks

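As a minimal illustration of the first point, a server could apply a simple fixed-window limit per client before answering completion requests. The window size, threshold, and helper below are arbitrary assumptions, not part of the specification:

```typescript
// Hypothetical fixed-window rate limiter for completion/complete requests.
const WINDOW_MS = 1_000;  // 1 second window (assumption)
const MAX_REQUESTS = 20;  // per client per window (assumption)

const windows = new Map<string, { start: number; count: number }>();

function allowCompletionRequest(clientId: string, now = Date.now()): boolean {
  const w = windows.get(clientId);
  if (!w || now - w.start >= WINDOW_MS) {
    windows.set(clientId, { start: now, count: 1 });
    return true;
  }
  w.count += 1;
  return w.count <= MAX_REQUESTS;
}
```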
@@ -0,0 +1,205 @@
---
title: Sampling
type: docs
weight: 7
description: "MCP protocol specification for language model sampling and text generation"
draft: false
params:
  author: Anthropic
keywords: ["mcp", "sampling", "llm", "protocols"]
---

Sampling enables servers to request generations from a language model via the client, giving clients control over which model to use and which prompts are accepted. Clients can approve or reject incoming sampling requests, and control permissions around which servers can access which models. Servers can optionally request context from other MCP servers to be included in prompts. Sampling requests flow from server to client, making this the only request type in MCP that travels in that direction.
## Capabilities

Clients indicate support for sampling by including a `sampling` capability in their `ClientCapabilities` during initialization. The `sampling` capability SHOULD be an empty object:

```json
{
  "capabilities": {
    "sampling": {}
  }
}
```

Servers SHOULD check for this capability before attempting to use sampling functionality.

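A minimal sketch of such a check, assuming the server has kept the capabilities object the client sent during initialization (the surrounding type is illustrative):

```typescript
// Illustrative guard: only issue sampling/createMessage if the client declared the capability.
interface ClientCapabilities {
  sampling?: Record<string, unknown>;
  [key: string]: unknown;
}

function supportsSampling(caps: ClientCapabilities): boolean {
  return caps.sampling !== undefined;
}

// if (!supportsSampling(clientCapabilities)) { /* skip sampling-based features */ }
```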
## Concepts

### Sampling Request

A Sampling Request in the Model Context Protocol (MCP) represents a request from a server to generate text from a language model via the client. Each request contains messages to send to the model, optional system prompts, and sampling parameters like temperature and maximum tokens. The client has full discretion over which model to use and whether to approve the request.

### Message Content

Message content can be either text or images, allowing for multimodal interactions where supported by the model. Text content is provided directly as strings, while image content must be base64 encoded with an appropriate MIME type.

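Shaped as types, the message and content structures might look like the following sketch, which mirrors the JSON examples below and the `SamplingMessage` name used later in this document (the definitions are illustrative rather than normative):

```typescript
// Illustrative content shapes for sampling messages.
interface TextContent {
  type: "text";
  text: string;
}

interface ImageContent {
  type: "image";
  data: string;     // base64-encoded image bytes
  mimeType: string; // e.g. "image/jpeg"
}

interface SamplingMessage {
  role: "user" | "assistant";
  content: TextContent | ImageContent;
}
```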
## Use Cases

Common use cases for sampling include generating responses in chat interfaces, code completion, and content generation. Here are some example sampling scenarios:

### Chat Response

A server requesting a chat response:

```json
{
  "messages": [
    {
      "role": "user",
      "content": {
        "type": "text",
        "text": "What is the capital of France?"
      }
    }
  ],
  "maxTokens": 100,
  "temperature": 0.7
}
```

### Image Analysis

A server requesting analysis of an image:

```json
{
  "messages": [
    {
      "role": "user",
      "content": {
        "type": "image",
        "data": "base64_encoded_image_data",
        "mimeType": "image/jpeg"
      }
    },
    {
      "role": "user",
      "content": {
        "type": "text",
        "text": "Describe what you see in this image."
      }
    }
  ],
  "maxTokens": 200
}
```
## Diagram

The following diagram visualizes the sampling request flow between server and client:

```mermaid
sequenceDiagram
    participant Server
    participant Client
    participant User
    participant LLM

    Note over Server,Client: Server requests sampling
    Server->>Client: sampling/createMessage

    opt User approval
        Client->>User: Request approval
        User-->>Client: Approve request
    end

    Client->>LLM: Forward request
    LLM-->>Client: Generated response

    opt User approval
        Client->>User: Review response
        User-->>Client: Approve response
    end

    Client-->>Server: CreateMessageResult
```

## Messages

This section defines the protocol messages for sampling in the Model Context Protocol (MCP).

### Creating a Message

#### Request

To request sampling from an LLM via the client, the server MUST send a `sampling/createMessage` request.

Method: `sampling/createMessage`

Params:
- `messages`: Array of `SamplingMessage` objects representing the conversation history
- `systemPrompt`: Optional system prompt to use
- `includeContext`: Optional request to include context from MCP servers
- `temperature`: Optional sampling temperature
- `maxTokens`: Maximum tokens to generate
- `stopSequences`: Optional array of sequences that will stop generation
- `metadata`: Optional provider-specific metadata
Example:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "sampling/createMessage",
  "params": {
    "messages": [
      {
        "role": "user",
        "content": {
          "type": "text",
          "text": "What is the capital of France?"
        }
      }
    ],
    "systemPrompt": "You are a helpful assistant.",
    "maxTokens": 100,
    "temperature": 0.7,
    "includeContext": "none"
  }
}
```
#### Response

The client MUST respond with a `CreateMessageResult` containing:

- `role`: The role of the message (always "assistant")
- `content`: The generated content
- `model`: The name of the model used
- `stopReason`: Why generation stopped

Example:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "role": "assistant",
    "content": {
      "type": "text",
      "text": "The capital of France is Paris."
    },
    "model": "gpt-4",
    "stopReason": "endTurn"
  }
}
```
## Error Handling

Clients MUST be prepared to handle both user rejection of sampling requests and model API errors. Common error scenarios include:

- User denies the sampling request
- Model API is unavailable
- Invalid sampling parameters
- Context length exceeded

The client SHOULD return appropriate error responses to the server in these cases.

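For instance, a client declining a request on the user's behalf might return a JSON-RPC error like the following; the specific code and message are illustrative, as the specification does not mandate particular values:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "error": {
    "code": -32000,
    "message": "User rejected the sampling request"
  }
}
```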
## Security Considerations

Implementations MUST carefully consider the security implications of allowing servers to request model generations, including:

- User consent and approval of sampling requests
- Permissions around which servers can access which models
- Content filtering and moderation
- Rate limiting to prevent abuse
- Privacy considerations around included context