Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC: Authentication support #133
base: main
Are you sure you want to change the base?
RFC: Authentication support #133
Changes from 2 commits
f68e207
90309c2
1dadaa3
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To double-check my understanding: does the client provide the
redirect_uri
? (I.e., a local client might host a temporary server available atlocalhost:NNNN
to receive the callback, whereas a client running on a HTTP server might providehttps://my-mcp-host.com/
?)I ask because we do (something similar to) a OAuth2.0 device flow on mcp.run that might sidestep the need to have the callback land back at a local server. (The MCP server requests a code/authorization url from mcp.run, then polls for the code to become valid while presenting the authorization url to the user.) I'm not sure if it's in-scope for this proposal, though!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, exactly. Often times "public" desktop apps embed a web server, bind to localhost, and use the resulting port for their callback URL. Server-side apps would just use their domain and standard port.
One benefit of decoupling the MCP Server and OAuth server, is that authorization flows (and introducing new ones, such as the device flow) don't impact the MCP server itself. That functionality is delegated to the OAuth server, which can be extended as necessary to support new forms of authentication, new flows, etc - without impacting the MCP server itself (just as is the case with typical HTTP/REST APIs).
I suspect the reason to show the redirect-based authorization code flow is simply because they are more common. We should avoid making assumptions that limit interactions to just that flow, however.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The details about redirection are a bit scant here… does this mean literally HTTP 30x?
And should the
GET /authorize
be happening in a web browser context?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, I'l add more comments. GET /authorize happens either in a browser or outside of a browser, up to client. The redirected URl needs to happen in the browser. I think clients will do /authorize in the browser already and rely on redirect mechanics.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly something to consider here, depending on whether keeping the SSE transport browser-compatible is a goal:
EventSource
doesn't support setting additional headers, at least in the browser. (It does support sendingCookie
headers automatically using the{ withCredentials }
EventSourceInit
option, however.)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking about this more - maybe it wasn't clear from #101 and the comments, but whats the rationale between implementing these endpoints directly on the MCP server versus advertising it (#101 (comment))? Is it because it would keep the spec simpler for Clients to implement with
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it would be easier. The only, standard compliant way to advertise would be through well-known OpenID/OAuth discovery, which significantly blows up the scope. Every other mechanism would be unusual and non-standard.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where would the
<mcp_server>/token
endpoint be used?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did you land on this instead of returning the URLs in the 401? This feels a bit more constraining for servers, while making things easier for clients—not convinced that's the right tradeoff, though I don't feel too strongly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to stay within HTTP verbs that libraries can easily understand. Providing URLs as part of the body is just something I've never seen and will require custom handling on the client side. The body cannot be introspected by proxies and other standard HTTP infrastructure. I decided I want to stay within the standard HTTP paradigm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is written here seems fine, but i am not aware that there is already a strict definition of an MCP server url structure. I don't see URL anywhere mentioned in the spec website. Is this actually defined yet?
As an example when implementing a client using the SSE client library, my impression was I would need to give a base url to hand out was directly the endpoint e.g.
https://example.com/some-path/see
and everything in relation is undefined (e.g. the post endpoint is only discovered based on the response, or by convention, but not actually part of the spec)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're correct about the current state. It may be the case that we want to make this stricter/more standardized going forward, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely liking how this becomes more intuitive as we leverage HTTP standards more, but I think we could go a bit further in leveraging OAuth standards.
If I'm understanding correctly, the current implementation requires that the MCP server performs some of the roles that an OAuth server could be responsible for. I'd like to leave room for us to decouple the auth server and the MCP server down the road.
In some cases, I do think MCP servers should store intermediary access and refresh tokens (e.g. when the MCP server owns its own resources, or when the MCP server does not want to give the client a token that can be used directly with some 3rd party API). Even in this case, I think the MCP server should still return some access token and some refresh token, in line with the OAuth standard. This is (A) more secure, (B) more expected, and (C) leaves us the option later to separate the auth flow from the MCP server entirely.
I think we could benefit from removing the concept of "session token" and replace it with "client's OAuth2 access token". I'm thinking that will be more familiar to most developers. This would leave the door open for us to generalize the OAuth flow to allow clients to get their tokens from centralized servers instead of through the MCP server, if the demand for that arises. I think demand for this is quite possible down the road. For example, in a world where a client talks to hundreds of MCP servers, it may become valuable to have a centralized MCP OAuth server.
Maybe we could take a middleground approach:
This will:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your point about reusing OAuth access tokens and refresh tokens down to the client makes sense to me (and would address the other thread wherein we're a bit confused about how/when the client refreshes its token).
FWIW: From my reading, I don't think anything here requires that the MCP server specifically hosts the OAuth flow. The URLs issued could point directly to an external auth server AFAICT.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, separation of concerns and keeping the MCP server more simpler I feel might be a better approach. #133 (comment) might be a much simpler way if the server just advertises it 🤔
Agreed, that was what I was thinking as well and what I am personally looking into
Loving this as well, keeps it closer to the standard and gives the flexibility / freedom for the MCP server to handle its own session by creating its own OAuth tokens (session-based auth) or pass through OAuth tokens from another provider (SSO, social login)
In the case of a spec, we could mention that an
accessToken
is always returned and anrefreshToken
is optionally returned - this will also give the freedom for MCP servers and/or a decoupled authentication server to either pass a singular access token and optionally a way for the client to refresh the tokenThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be Servers instead of Clients, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clients don't control this, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need to think more about this and would love to get input. I think this actualy might be correct but the refresh flow is not declared. The question here is, should session tokens be refreshed? You are right that SERVERS should always refresh since they hold oauth tokens. There is an additional questions if the session token for the client should also be refreshed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my mind there are two options -
My opinion is that the second option (fully managed by the client) would be much more simpler and provides more freedom for the spec - it keeps it simple without needing to include more complex session management in the MCP server implementation, which I feel like is a separate concern and gives developers for Clients the freedom to do what they need with the OAuth tokens (ie. implement their own session management)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ultimately, at some point, the client will need to update the auth token that it passes to the MCP server in the request header.
If the client is not doing it's own refreshes, then that implies the server needs to return back the token to the client. I don't think that is right, otherwise why would the client pass anything at all....
I'm wondering if the client actually needs to do /all/ the steps here, and the server not do any of the exchange/refresh. I realize the motivation behind the current proposal since the common case is the server is acting as a client to a third party.
Edit: This comment was made with a misunderstanding of part of the problem statement. Please resolve.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahhh wait, correct me if I am wrong, but is the motivation for this spec for authentication for the MCP server to an external resource or is it authentication for the MCP client to MCP server 🤔?
Reading this spec more, it seems to be more of the former (MCP Server -> External Resource), since the MCP server would be managing the refresh token
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To the specific question(s) above: my understanding is that session tokens don't refresh, and will eventually expire. If that happens, the client should restart the auth flow—at which point, the server could utilize its OAuth refresh token.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, I now have the complete mental model. Not sure if this is appropriate to include in the spec but it could make a nice intro on the problem being solved here or assumptions made.
Not to anchor on home assistant too much here, but i realize it is fairly similar in that regard. I wrote up a quick TL;DR of how it works. This also touches on Nick's comment about using OAuth for the client/server auth too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jspahrsummers - your formulation of the problem statement here is quite helpful and agreed that it would be good to add it to the spec / documentation as an introduction (cc @dsp-ant)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jspahrsummers Makes sense - agreed that clients will not have knowledge so there needs to be some sort of way of retrieving those, either through the MCP server itself or a separate auth service that the MCP server potentially advertises. Given the other discussions below, I'm feeling it should be a separate auth service while keeping the MCP server as a "resource" server according the the OAuth spec.
This also opens up the case where MCP servers that require third-party service resources can either have their own separate auth service or even at a simpler level, advertise the third-party's auth service to retrieve the tokens for the client to make authenticated calls with (sso, social login)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure I fully follow. Can you write up a proposal or a diagram?
One thing I want to highlight. If we add proper token exchange and refresh between client and server to this proposal, we get a nice property: An MCP server can just be it's own oauth service OR it can choose to forward to another OAuth service. Both have the same flow if we get this correct and hence it becomes easy to chain as well as simple to the client.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does a server do this validation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
It's not clear what's the mechanism for MCP Clients to register their redirect URIs in the MCP Server.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also share this concern - we talked at one point about using the Origin header, however I'm fuzzy on the details on how this would work. I can imagine some potential stateful ways of solving it but haven't thought it through properly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thoughts about representing this as an error instead? It's a bit out of the normal initialization flow, and missing some fields that we normally require (e.g., capabilities), which will make parsing more annoying.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not an error. Its a legitimate correct respond. We can say that the respond of initialize is EITHER an upgrade or the standard initialize response, e.g.
InitializeResponse = UpgradeResponse | ServerInitializeResponse
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to standardize a list of transport strings if we pursue this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clients should be allowed to reject transports they don't support, somehow.