Add argument completion and sampling specifications
This commit adds two new specification documents to MCP: argument completion for prompts and resources, and sampling for LLM text generation. Argument completion enables servers to provide contextual suggestions for argument values, while sampling defines how servers can request LLM generations via clients, with user approval and permission controls.
dsp-ant committed Oct 3, 2024
1 parent 28ef483 commit acb2479
Showing 2 changed files with 395 additions and 0 deletions.
190 changes: 190 additions & 0 deletions docs/spec/completion.md
@@ -0,0 +1,190 @@
---
title: Argument Completion
type: docs
weight: 6
---
Argument Completion enables servers to suggest values for prompt arguments and resource URI template arguments. Clients can request completion options for a specific argument, and servers return ranked suggestions. This allows clients to build rich user interfaces with intelligent autocompletion for argument values.

> **_NOTE:_** Argument Completion in MCP is similar to traditional IDE autocompletion: it provides contextual suggestions based on available options, rather than AI-powered completion. The server maintains a fixed set of valid values for each argument and returns matching suggestions based on partial input.

## Capabilities

Support for argument completion is not indicated by a dedicated capability - servers that expose prompts or resources with arguments implicitly support argument completion for those arguments. Clients may attempt argument completion requests for any prompt or resource argument.

## Concepts

### Completion References

When requesting completions, clients MUST specify what is being completed using a reference type:

- `ref/prompt`: References a prompt by name
- `ref/resource`: References a resource by URI

The reference identifies the context for completion suggestions.
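
On the wire, each reference is a plain JSON object distinguished by its `type` field. The values below are taken from the examples later in this document:

```json
{ "type": "ref/prompt", "name": "code_review" }
```

```json
{ "type": "ref/resource", "uri": "file://{path}" }
```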

### Completion Results

Servers return an array of completion values ranked by relevance, with a maximum of 100 items per response. If more results are available beyond the first 100, servers MUST set `hasMore: true` in the response to indicate that additional results can be retrieved with subsequent requests. The optional `total` field allows servers to specify the complete number of matches available, even if not all are returned in a single response.

> **_NOTE:_** MCP does not currently support pagination of completion results - clients that need more than the first 100 matches must issue new completion requests with more specific argument values to narrow down the results.
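
For example, a client that received `hasMore: true` for the partial value `py` could narrow the results by issuing a fresh request with a longer prefix. This follow-up is hypothetical, built from the `code_review` prompt used in the use cases below:

```json
{
  "ref": {
    "type": "ref/prompt",
    "name": "code_review"
  },
  "argument": {
    "name": "language",
    "value": "pyth"
  }
}
```
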
## Use Cases

Common use cases for argument completion include:

### Prompt Argument Completion

A client requesting completion options for a prompt argument:

```json
{
  "ref": {
    "type": "ref/prompt",
    "name": "code_review"
  },
  "argument": {
    "name": "language",
    "value": "py"
  }
}
```

The server might respond with language suggestions:

```json
{
  "completion": {
    "values": ["python", "pytorch", "pyside", "pyyaml"],
    "total": 10,
    "hasMore": true
  }
}
```

### Resource URI Completion

A client completing a path in a resource URI template:

```json
{
  "ref": {
    "type": "ref/resource",
    "uri": "file://{path}"
  },
  "argument": {
    "name": "path",
    "value": "/home/user/doc"
  }
}
```

The server could respond with matching paths:

```json
{
  "completion": {
    "values": [
      "/home/user/documents",
      "/home/user/docker",
      "/home/user/downloads"
    ],
    "hasMore": false
  }
}
```

## Diagram

The following diagram visualizes a typical argument completion interaction between client and server:

```mermaid
sequenceDiagram
    participant Client
    participant Server

    Note over Client,Server: Client requests completion options
    Client->>Server: completion/complete
    Server-->>Client: CompleteResult

    Note over Client,Server: Client may request more specific results
    opt New completions requested
        Client->>Server: completion/complete
        Server-->>Client: CompleteResult
    end
```

## Messages

This section defines the protocol messages for argument completion in the Model Context Protocol (MCP).

### Requesting Completions

#### Request

To get completion suggestions, the client MUST send a `completion/complete` request.

Method: `completion/complete`

Params:
- `ref`: A `PromptReference` or `ResourceReference` indicating what is being completed
- `argument`: Object containing:
- `name`: The name of the argument being completed
- `value`: The current value to get completions for

Example:
```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "completion/complete",
  "params": {
    "ref": {
      "type": "ref/prompt",
      "name": "code_review"
    },
    "argument": {
      "name": "language",
      "value": "py"
    }
  }
}
```

#### Response

The server MUST respond with a `CompleteResult` containing:

- `completion`: Object containing:
- `values`: Array of completion suggestions (maximum 100)
- `total`: Optional total number of matches available
- `hasMore`: Optional boolean indicating if additional results exist

Example:
```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "completion": {
      "values": ["python", "pytorch", "pyside"],
      "total": 10,
      "hasMore": true
    }
  }
}
```

## Error Handling

Servers MUST return appropriate errors if:
- The referenced prompt or resource does not exist
- The argument name is invalid
- Completion cannot be provided for other reasons

Clients SHOULD be prepared to handle cases where completion is temporarily unavailable or returns errors.
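
For illustration, such a failure could surface as a standard JSON-RPC error response. The error code and message below are hypothetical, as this specification does not mandate particular codes for these cases:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "error": {
    "code": -32602,
    "message": "Prompt not found"
  }
}
```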

## Security Considerations

Implementations MUST carefully consider:
- Rate limiting completion requests to prevent abuse
- Access control for sensitive completion suggestions
- Validation of completion inputs to prevent injection attacks
205 changes: 205 additions & 0 deletions docs/spec/sampling.md
@@ -0,0 +1,205 @@
---
title: Sampling
type: docs
weight: 7
description: "MCP protocol specification for language model sampling and text generation"
draft: false
params:
author: Anthropic
keywords: ["mcp", "sampling", "llm", "protocols"]
---

Sampling enables servers to request generations from a language model via the client, giving clients control over which model is used and which prompts are accepted. Clients can approve or reject incoming sampling requests and control permissions around which servers can access which models. Servers can optionally request that context from other MCP servers be included in prompts. Because sampling requests flow from server to client, this is the only request type in MCP that travels in that direction.

## Capabilities

Clients indicate support for sampling by including a `sampling` capability in their `ClientCapabilities` during initialization. The `sampling` capability SHOULD be an empty object:

```json
{
  "capabilities": {
    "sampling": {}
  }
}
```

Servers SHOULD check for this capability before attempting to use sampling functionality.
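
For illustration, a minimal sketch of where this capability appears in an `initialize` request; other initialization parameters are omitted here for brevity:

```json
{
  "jsonrpc": "2.0",
  "id": 0,
  "method": "initialize",
  "params": {
    "capabilities": {
      "sampling": {}
    }
  }
}
```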

## Concepts

### Sampling Request

A Sampling Request in the Model Context Protocol (MCP) represents a request from a server to generate text from a language model via the client. Each request contains messages to send to the model, optional system prompts, and sampling parameters like temperature and maximum tokens. The client has full discretion over which model to use and whether to approve the request.

### Message Content

Message content can be either text or images, allowing for multimodal interactions where supported by the model. Text content is provided directly as strings, while image content must be base64-encoded with an appropriate MIME type.
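
Concretely, the two content shapes are distinguished by a `type` field; the values below are taken from the use cases that follow:

```json
{ "type": "text", "text": "What is the capital of France?" }
```

```json
{ "type": "image", "data": "base64_encoded_image_data", "mimeType": "image/jpeg" }
```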

## Use Cases

Common use cases for sampling include generating responses in chat interfaces, code completion, and content generation. Here are some example sampling scenarios:

### Chat Response

A server requesting a chat response:

```json
{
  "messages": [
    {
      "role": "user",
      "content": {
        "type": "text",
        "text": "What is the capital of France?"
      }
    }
  ],
  "maxTokens": 100,
  "temperature": 0.7
}
```

### Image Analysis

A server requesting analysis of an image:

```json
{
  "messages": [
    {
      "role": "user",
      "content": {
        "type": "image",
        "data": "base64_encoded_image_data",
        "mimeType": "image/jpeg"
      }
    },
    {
      "role": "user",
      "content": {
        "type": "text",
        "text": "Describe what you see in this image."
      }
    }
  ],
  "maxTokens": 200
}
```

## Diagram

The following diagram visualizes the sampling request flow between server and client:

```mermaid
sequenceDiagram
    participant Server
    participant Client
    participant User
    participant LLM

    Note over Server,Client: Server requests sampling
    Server->>Client: sampling/createMessage

    opt User approval
        Client->>User: Request approval
        User-->>Client: Approve request
    end

    Client->>LLM: Forward request
    LLM-->>Client: Generated response

    opt User approval
        Client->>User: Review response
        User-->>Client: Approve response
    end

    Client-->>Server: CreateMessageResult
```

## Messages

This section defines the protocol messages for sampling in the Model Context Protocol (MCP).

### Creating a Message

#### Request

To request sampling from an LLM via the client, the server MUST send a `sampling/createMessage` request.

Method: `sampling/createMessage`

Params:
- `messages`: Array of `SamplingMessage` objects representing the conversation history
- `systemPrompt`: Optional system prompt to use
- `includeContext`: Optional request to include context from MCP servers
- `temperature`: Optional sampling temperature
- `maxTokens`: Maximum tokens to generate
- `stopSequences`: Optional array of sequences that will stop generation
- `metadata`: Optional provider-specific metadata

Example:
```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "sampling/createMessage",
  "params": {
    "messages": [
      {
        "role": "user",
        "content": {
          "type": "text",
          "text": "What is the capital of France?"
        }
      }
    ],
    "systemPrompt": "You are a helpful assistant.",
    "maxTokens": 100,
    "temperature": 0.7,
    "includeContext": "none"
  }
}
```

#### Response

The client MUST respond with a `CreateMessageResult` containing:

- `role`: The role of the message (always "assistant")
- `content`: The generated content
- `model`: The name of the model used
- `stopReason`: Why generation stopped

Example:
```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "role": "assistant",
    "content": {
      "type": "text",
      "text": "The capital of France is Paris."
    },
    "model": "gpt-4",
    "stopReason": "endTurn"
  }
}
```

## Error Handling

Clients MUST be prepared to handle both user rejection of sampling requests and model API errors. Common error scenarios include:

- User denies the sampling request
- Model API is unavailable
- Invalid sampling parameters
- Context length exceeded

The client SHOULD return appropriate error responses to the server in these cases.
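
For example, a user's rejection might be reported to the server as a JSON-RPC error. The code and message here are hypothetical, since the specification does not prescribe specific error codes for these cases:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "error": {
    "code": -1,
    "message": "User rejected sampling request"
  }
}
```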

## Security Considerations

Implementations MUST carefully consider the security implications of allowing servers to request model generations, including:

- User consent and approval of sampling requests
- Permissions around which servers can access which models
- Content filtering and moderation
- Rate limiting to prevent abuse
- Privacy considerations around included context
