Allow ContainerRegistryProvider to skip Match ContainerRegistryUrls when query the token #793
base: master
This issue is currently awaiting triage. If the repository maintainers determine this is a relevant issue, they will accept it by applying the appropriate triage label. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Hi @cy-google. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with the appropriate command. Once the patch is verified, the new status will be reflected by a label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
one nit on the variable name, lgtm otherwise
/ok-to-test
/lgtm
/retest
/assign sergeykanzhelev
if g.UseRegistryFromImage {
	if registry, _, found := strings.Cut(image, "/"); found {
		cfg[registry] = entry
	}
If cutting the URL by "/" fails, we will have no indication of an issue anywhere. Perhaps a warning log could be added or, even better, an error response returned.
Done. No error response can be returned here, so the failure is only logged with klog.Errorf.
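To illustrate the reviewer's suggestion, here is a minimal sketch of extracting the registry with strings.Cut and surfacing a failure as an error instead of silently skipping it. The helper name registryFromImage and the sample image names are hypothetical and are not part of the PR.

```go
package main

import (
	"fmt"
	"strings"
)

// registryFromImage is a hypothetical helper (not from the PR). It returns
// everything before the first "/" in an image reference, or an error when
// the reference has no "/" or an empty registry part.
func registryFromImage(image string) (string, error) {
	registry, _, found := strings.Cut(image, "/")
	if !found || registry == "" {
		return "", fmt.Errorf("image %q has no registry component before a '/'", image)
	}
	return registry, nil
}

func main() {
	// Sample image references are made up for illustration.
	for _, image := range []string{"gcr.io/my-project/app:v1", "busybox"} {
		registry, err := registryFromImage(image)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("registry:", registry)
	}
}
```

Whether the caller can propagate such an error depends on the signature of the surrounding function, which is what the reply above is pointing out.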
}
return cfg
}

// Add our entry for each of the supported container registry URLs
for _, k := range containerRegistryUrls {
I don't remember the logic too well, but here is the question. The old logic added all wildcarded URLs. My thinking: if an image URL redirects from one gcr.io blob to another, we would still use the auth token. With the new approach, no redirects will have the auth token.
Is this a real issue that may happen? Who would be the best person to review it from a security perspective?
> if image url will redirect from one gcr.io blob to another, we will still use the auth token. With the new approach, no redirects will have the auth token
I think the change in the new logic is that we no longer have allowed endpoints for the registry (line 55); instead, we will get the token no matter what the registry in the image URL is. I'm not fully sure what the redirect is in this case.
If the redirect happens within the container runtime pull attempt, it doesn't have a chance to go get another credential for another registry... I don't actually know what happens if the container runtime is asked to pull an image from registry foo using a credential config like {"foo":{...}} and the registry redirects to registry bar... does the container runtime look for an entry in the credential matching bar or reuse the entry for foo?
@liggitt you are right, it will be on the container runtime side. Auth will only be reused for the same domain.
is there a way to return a wildcard to indicate whatever registry the request gets redirected to can use the credential? Is that a reasonable thing to do?
> does the container runtime look for an entry in the credential matching bar or reuse the entry for foo
My understanding is that only the registry in the key of the credential config can be pulled (otherwise, it will try to pull the image from bar with no credential).
> is there a way to return a wildcard to indicate whatever registry the request gets redirected to can use the credential? Is that a reasonable thing to do?
I don't know if there is a way to do so without changing the logic in kubelet. Not sure who can tell if it's reasonable. @SergeyKanzhelev may know.
This thread got marked as outdated, continued it at #793 (comment)
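The foo/bar question in this thread can be pictured with a toy model. The sketch below only encodes the understanding stated by the participants (a credential is looked up by the exact registry key); it is an assumption for illustration, not verified container runtime behavior. The type authEntry and the function lookup are hypothetical names.

```go
package main

import "fmt"

// authEntry is a hypothetical stand-in for one credential entry.
type authEntry struct {
	Username string
	Password string
}

// lookup models an exact-match lookup by registry host. This mirrors the
// understanding voiced in the thread; real runtimes may behave differently.
func lookup(cfg map[string]authEntry, registry string) (authEntry, bool) {
	entry, ok := cfg[registry]
	return entry, ok
}

func main() {
	// Credential config keyed only by "foo", like {"foo":{...}} in the thread.
	cfg := map[string]authEntry{"foo": {Username: "_token", Password: "xxx"}}
	for _, host := range []string{"foo", "bar"} {
		if _, ok := lookup(cfg, host); ok {
			fmt.Printf("%s: credential found\n", host)
		} else {
			fmt.Printf("%s: no credential, the pull would be unauthenticated\n", host)
		}
	}
}
```

Under this model a redirect from foo to bar would proceed without credentials, which is the concern the thread continues to discuss.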
New changes are detected. LGTM label has been removed.
pkg/gcpcredential/gcpcredential.go
Outdated
if registry, _, found := strings.Cut(image, "/"); found {
	cfg[registry] = entry
} else {
	klog.Errorf("Invalid image registry URL: %s. The URL must contain a '/' character to separate the registry domain from the image path. Please check the URL and ensure it's correctly formatted.", image)
this goes to stderr, right? that only gets surfaced / logged in the kubelet if the plugin invocation errors
You are right. I did this because other errors are also printed to stderr using klog.Errorf. I wonder if there is a reason for us to change the Provide function to return this specific error instead?
pkg/gcpcredential/gcpcredential.go
Outdated
	}
} else {
	// Add our entry for each of the supported container registry URLs
	for _, k := range containerRegistryUrls {
So the two potential downsides of this change compared to the existing logic:
- Caching and reading of credentials will be done for each subdomain separately, instead of reading once and using it from the cache via the wildcard. How expensive is the read of credentials? Was this slowdown considered and validated?
- Each subdomain will now keep its own entry in the kubelet. I don't think this will lead to a memory leak; at least I am not aware of any "dynamic" subdomains that may grow uncontrollably. So this risk is minimal.
I think the cache level for this was already per image. There's no slow-down or leak increase if that is the case.
As Jordan said, I think every time the kubelet pulls an image, we use the domain of the registry, e.g. {"kind":"CredentialProviderResponse","apiVersion":"credentialprovider.kubelet.k8s.io/v1","cacheKeyType":"Image","cacheDuration":"0s","auth":{"gcr.io":{"username":"_token","password":"xxx"}}}, so AFAIK we don't have "using it from the cache by the wildcard" for now; instead, we query the credential config every time.
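For readers who want to see how a response of that shape could be assembled, here is a rough sketch. The struct types are local stand-ins that mirror only the JSON fields quoted in the comment above; they are not the real kubelet API types, and buildResponse plus the sample image are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// authConfig and response are local stand-ins that mirror the JSON quoted
// in the comment; they are not the actual kubelet API types.
type authConfig struct {
	Username string `json:"username"`
	Password string `json:"password"`
}

type response struct {
	Kind          string                `json:"kind"`
	APIVersion    string                `json:"apiVersion"`
	CacheKeyType  string                `json:"cacheKeyType"`
	CacheDuration string                `json:"cacheDuration"`
	Auth          map[string]authConfig `json:"auth"`
}

// buildResponse (hypothetical) keys the credential by the registry part of
// the requested image, i.e. the text before the first "/".
func buildResponse(image, token string) response {
	registry, _, _ := strings.Cut(image, "/")
	return response{
		Kind:          "CredentialProviderResponse",
		APIVersion:    "credentialprovider.kubelet.k8s.io/v1",
		CacheKeyType:  "Image",
		CacheDuration: "0s",
		Auth:          map[string]authConfig{registry: {Username: "_token", Password: token}},
	}
}

func main() {
	out, err := json.Marshal(buildResponse("gcr.io/my-project/app:v1", "xxx"))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

With a cacheDuration of "0s", as in the quoted response, nothing would be reused between pulls, which matches the commenter's point that the credential config is queried every time.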
I think kubelet has this function: https://github.com/kubernetes/kubernetes/blob/4114a9b4e45a4df96f0383d87b2649640a6ffbf1/pkg/credentialprovider/keyring.go#L205C6-L205C15 and the old logic was returning the cache entry that included globs. So tokens were reused.
I am not implying it is a huge overhead to call this thing multiple times. I just want to make sure we are making this change very intentionally and understand the cost of it
This is at least my recollection of how it is supposed to work. I didn't find a good test for it from a quick glance.
> I think kubelet has this function: https://github.com/kubernetes/kubernetes/blob/4114a9b4e45a4df96f0383d87b2649640a6ffbf1/pkg/credentialprovider/keyring.go#L205C6-L205C15 and the old logic was returning the cache entry that included globs. So tokens were reused.
That is used to decide whether a given configured plugin should even be called to provide a credential for an image, here:
// isImageAllowed returns true if the image matches against the list of allowed matches by the plugin.
func (p *pluginProvider) isImageAllowed(image string) bool {
	for _, matchImage := range p.matchImages {
		if matched, _ := credentialprovider.URLsMatchStr(matchImage, image); matched {
			return true
		}
	}
	return false
}
It is not used to decide the cache key.
For credentials returned from the plugin with a cache key level of "image", which this defaults to:
cloud-provider-gcp/cmd/auth-provider-gcp/provider/provider.go
Lines 93 to 96 in 041c3f4
keyType := os.Getenv(cacheTypeKey)
if keyType == "" {
	return credentialproviderapi.ImagePluginCacheKeyType, nil
}
... the credentials are only cached under the exact image:
As long as this plugin is being used with image-level credential caching (which is the default) or registry-level caching, there is no difference in the number of times the plugin would be invoked to get credentials by the kubelet for images from a given registry
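A simplified model may help show why the invocation count does not depend on which keys the plugin returns. The sketch below assumes the cache key is derived from the requested image and the cache key type only, and that cached entries have not expired; cacheKey and countInvocations are hypothetical names, and the real kubelet derives registry keys with more care than a plain split on "/".

```go
package main

import (
	"fmt"
	"strings"
)

// cacheKey is a simplified, hypothetical model of how a cache key could be
// derived from the requested image for each cache key type.
func cacheKey(keyType, image string) string {
	switch keyType {
	case "Image":
		return image
	case "Registry":
		registry, _, _ := strings.Cut(image, "/")
		return registry
	default:
		return "global"
	}
}

// countInvocations counts how often the plugin would be called for a series
// of pulls, assuming cached entries never expire during the series.
func countInvocations(keyType string, images []string) int {
	seen := map[string]bool{}
	calls := 0
	for _, image := range images {
		key := cacheKey(keyType, image)
		if !seen[key] {
			seen[key] = true
			calls++
		}
	}
	return calls
}

func main() {
	// Sample image names are made up for illustration.
	images := []string{"gcr.io/p/a:v1", "gcr.io/p/b:v1", "gcr.io/p/a:v1", "us.gcr.io/p/c:v1"}
	fmt.Println("Image:", countInvocations("Image", images))
	fmt.Println("Registry:", countInvocations("Registry", images))
}
```

Note that the keys inside the returned auth map (wildcards or an exact registry) never enter cacheKey in this model, so switching between them leaves the counts unchanged.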
ok, sorry for the noise
// Currently, this is only used by auth-provider-gcp.
if g.UseRegistryFromImage {
	if registry, _, found := strings.Cut(image, "/"); found {
		cfg[registry] = entry
continuing the thread from #793 (comment)
By using only the exact registry from the image (instead of partially wildcarded registries), if the registry forwards the image pull request to a different domain or a subdomain, the container runtime will not use the credential when re-attempting the pull against the redirected location.
Previously, this returned credentials usable against "container.cloud.google.com", "gcr.io", "*.gcr.io", "*.pkg.dev", so any container registry forwards among those domains could use the credentials.
I don't really know what forwards to expect, so I have a hard time knowing whether this is an issue or not.
Options I see:
1. Expect no forwards, return exactly the registry requested. If forwards to a different domain happen, no credentials will be used and the pull will fail if authentication is required.
2. Add the registry for the requested image alongside the existing wildcard domains. This ensures the credential returned is usable for the exact domain of the image, plus all existing possible forwarded domains, but not any other forwarded domains outside the hard-coded set.
3. Return the credential with a global wildcard (e.g. "*") that allows using it against any forwarded domain. I'm not 100% sure "*" works that way in the container registry or not for credential matching, and I'm not 100% sure we want credentials to be usable against any forwarded domain.
1 is most strict / secure, but risks breaking existing forwarding.
2 ensures we won't regress anything that is currently working, but may not work for forwarding to new domains not in the hard-coded set.
3 (if "*" works as a global wildcard) is most permissive, most likely to work with any forwarding, but could allow credentials to be sent to unintended registries if the original registry forwarded out to them.
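The three options differ only in which keys the returned credential is attached to. The sketch below is one way to picture that; authKeys is a hypothetical function, the hard-coded list is copied from the comment above, and option 2 here compares registry strings exactly rather than evaluating the wildcard patterns, which is a simplifying assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// Hard-coded set as quoted in the review comment.
var containerRegistryUrls = []string{"container.cloud.google.com", "gcr.io", "*.gcr.io", "*.pkg.dev"}

// authKeys (hypothetical) returns the registry keys the credential would be
// attached to under each of the three options discussed in the review.
func authKeys(option int, image string) []string {
	registry, _, _ := strings.Cut(image, "/")
	switch option {
	case 1:
		// Option 1: exactly the requested registry.
		return []string{registry}
	case 2:
		// Option 2: the existing hard-coded set, plus the requested registry
		// when it is not already listed verbatim (no glob evaluation here).
		keys := append([]string{}, containerRegistryUrls...)
		for _, k := range keys {
			if k == registry {
				return keys
			}
		}
		return append(keys, registry)
	default:
		// Option 3: a global wildcard; whether "*" is honored this way is
		// left open in the review.
		return []string{"*"}
	}
}

func main() {
	// The image name is made up for illustration.
	image := "registry.example.com/team/app:v1"
	for option := 1; option <= 3; option++ {
		fmt.Printf("option %d: %v\n", option, authKeys(option, image))
	}
}
```

The trade-off described above shows up directly in the size of the returned key set: the narrower the set, the fewer places a forwarded request could carry the credential.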
cc: @samuelkarp
> Return the credential with a global wildcard (e.g. "*") that allows using it against any forwarded domain. I'm not 100% sure "*" works that way in the container registry or not for credential matching, and I'm not 100% sure we want credentials to be usable against any forwarded domain
This is not ideal and may lead to information disclosure, similar to this CVE: GHSA-742w-89gc-8m9c. Yes, it is less critical since the redirect will be done by reasonably trusted domains, but it is still concerning.
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: cy-google
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
change looks good to me, please squash to a single commit and we can merge
…f when query the token
Update the new field name from SkipContainerRegistryUrlsMatching to UseRegistryFromImage
Update the logic in Provide function to handle wrong format image and use else for old logic
Adopt option 2 mentioned in https://github.com/kubernetes/cloud-provider-gcp/pull/793/files#r1904560074
This will only append the registry of the requested image if it's not hardcoded in containerRegistryUrls
Stop logging error when we can't cut image by "/"
Also not removing the check for container.cloud.google.com
Done, thanks. @SergeyKanzhelev, can you please take a look too?
We will have the credential provider unconditionally attach the access token to the registry in the image request, and then rely on the kubelet config to restrict which registries the access token will be sent to.