Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable Prometheus metrics exporter for ingress-nginx controller. #297

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

achton
Copy link
Contributor

@achton achton commented May 2, 2024

What does this PR do?

To better track how the ingress-nginx controller is doing, we should export metrics from it to Prometheus/Grafana.

Any specific requests for how the PR should be reviewed?

This is not yet rolled out to the cluster, so eyeball it (and the docs) and do a Task-based rollout while monitoring the effects.

There are some dashboards available which could be imported for easy checking of metrics.

Available metrics are listed here.

@achton achton requested review from hypesystem and kasperg May 2, 2024 14:42
Copy link
Contributor

@hypesystem hypesystem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, rul det gerne ud i platformen og let's get it merged!

@hypesystem
Copy link
Contributor

Du må gerne køre denne ud at your convenience - men sig lige til på Zulip så jeg kan reenable alerts til Zulip :-)

@hypesystem
Copy link
Contributor

hypesystem commented Jun 26, 2024

This PR was in a bit of an unclear state because we talked about it on Zulip. Basically we'd like to scale up prometheus and loki to be confident in their ability to handle more data ingress before we execute and run this. So it's still waiting a bit until we get over the hump of getting all the libraries live.

@ITViking
Copy link
Contributor

are we in a good enough place to get this in now? @hypesystem and @achton

@achton
Copy link
Contributor Author

achton commented Sep 19, 2024

My impression is "yes", but I think you guys are better suited to answer that question.
I've resolved conflicts for ya :-)

@ITViking
Copy link
Contributor

ITViking commented Jan 9, 2025

This PR was in a bit of an unclear state because we talked about it on Zulip. Basically we'd like to scale up prometheus and loki to be confident in their ability to handle more data ingress before we execute and run this. So it's still waiting a bit until we get over the hump of getting all the libraries live.

@hypesystem So what needs doing is to give loki and prometheus more resources or replicas or both - have I understood this correctly?

@hypesystem
Copy link
Contributor

hypesystem commented Jan 14, 2025

@ITViking It kind of depends on how the resource use and storage of Loki is looking at the moment. We were waiting to monitor with all sites live to have an idea.

Look at the resource use over time for the loki and prometheus pods to see how well-provisioned they are.

If they don't spike too much over their resource requests we should be fine.

Next step is looking at their storage, to make sure we are likely to be able to handle a significant amount of more data.

If we also look far from our limits there, it should be fine to merge this. Alternatively, we need bigger disks for Loki and/or Prometheus which is a bit more difficult without throwing away data.


A final question is if there's a good driver for adding this extra data export/whether it will be worth the potential extra cost of moving data around.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants