Enable Prometheus metrics exporter for ingress-nginx controller. #297

achton · 2024-05-02T14:37:44Z

What does this PR do?

To better track how the ingress-nginx controller is doing, we should export metrics from it to Prometheus/Grafana.

Any specific requests for how the PR should be reviewed?

This is not yet rolled out to the cluster, so eyeball it (and the docs) and do a Task-based rollout while monitoring the effects.

There are some dashboards available which could be imported for easy checking of metrics.

Available metrics are listed here.
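For a quick check of what the controller actually exposes once the exporter is on, something like the following could be used. It is only a sketch: the sample text and its values are made up for illustration, the function name is hypothetical, and it assumes plain Prometheus text exposition lines without trailing timestamps.

```python
import re

# Matches one label pair such as status="200"; escaped quotes inside values are allowed.
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')

# Made-up sample of scraped output, for illustration only.
SAMPLE = """\
# HELP nginx_ingress_controller_requests The total number of client requests
# TYPE nginx_ingress_controller_requests counter
nginx_ingress_controller_requests{ingress="site-a",status="200"} 120
nginx_ingress_controller_requests{ingress="site-a",status="500"} 3
nginx_ingress_controller_nginx_process_connections{state="active"} 7
"""


def parse_metric_samples(text, metric_name):
    """Return (labels, value) pairs for one metric from text exposition output."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the HELP/TYPE comment lines.
        if not line or line.startswith("#"):
            continue
        # The value is the last space-separated field (assumes no timestamp).
        series, _, value = line.rpartition(" ")
        name, _, label_part = series.partition("{")
        if name != metric_name:
            continue
        labels = dict(LABEL_RE.findall(label_part))
        samples.append((labels, float(value)))
    return samples
```

Calling `parse_metric_samples(SAMPLE, "nginx_ingress_controller_requests")` returns the two request samples with their `ingress` and `status` labels, which is enough to eyeball whether the expected series show up.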

hypesystem

Nice, feel free to roll it out to the platform and let's get it merged!

hypesystem · 2024-05-08T10:00:33Z

Feel free to roll this out at your convenience, but let me know on Zulip so I can re-enable alerts to Zulip :-)

hypesystem · 2024-06-26T11:59:24Z

This PR was in a bit of an unclear state because we talked about it on Zulip. Basically, we'd like to scale up Prometheus and Loki to be confident in their ability to handle more data ingress before we roll this out. So it's still waiting a bit until we get over the hump of getting all the libraries live.

ITViking · 2024-09-19T11:12:54Z

Are we in a good enough place to get this in now, @hypesystem and @achton?

achton · 2024-09-19T11:18:01Z

My impression is "yes", but I think you guys are better suited to answer that question.
I've resolved conflicts for ya :-)

ITViking · 2025-01-09T13:27:09Z

> This PR was in a bit of an unclear state because we talked about it on Zulip. Basically, we'd like to scale up Prometheus and Loki to be confident in their ability to handle more data ingress before we roll this out. So it's still waiting a bit until we get over the hump of getting all the libraries live.

@hypesystem So what needs doing is to give Loki and Prometheus more resources, more replicas, or both. Have I understood this correctly?

hypesystem · 2025-01-14T15:23:49Z

@ITViking It kind of depends on how the resource use and storage of Loki are looking at the moment. We were waiting to monitor with all sites live to get an idea.

Look at the resource use over time for the Loki and Prometheus pods to see how well-provisioned they are.

If they don't spike too much over their resource requests we should be fine.
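As a rough sketch of that check: compare the observed usage samples for a pod against its resource request and see how far the peaks go. The sample numbers, the 1.5x tolerance, and the function names here are assumptions for illustration, not values from the cluster.

```python
def peak_ratio(usage_samples, request):
    """Return the highest observed usage as a multiple of the resource request."""
    return max(usage_samples) / request


def spikes_over_request(usage_samples, request, tolerance=1.5):
    """Return the samples that exceed request * tolerance.

    The tolerance is an arbitrary illustrative threshold, not a recommendation.
    """
    limit = request * tolerance
    return [u for u in usage_samples if u > limit]


# Hypothetical memory usage samples in MiB against a 500 MiB request.
samples = [300, 450, 900, 480]
print(peak_ratio(samples, 500))
print(spikes_over_request(samples, 500))
```

If the list of spikes is empty, or short and close to the tolerance, the pods are probably provisioned well enough for this purpose.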

The next step is looking at their storage, to make sure we are likely to be able to handle significantly more data.

If we are also far from our limits there, it should be fine to merge this. Otherwise, we need bigger disks for Loki and/or Prometheus, which is a bit more difficult to do without throwing away data.
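A back-of-the-envelope way to judge the storage headroom could look like this. It is a sketch only: it assumes linear growth, ignores retention and compaction, and all numbers and names are hypothetical.

```python
def days_until_full(capacity_gib, used_gib, daily_growth_gib):
    """Estimate how many days remain before a volume fills up.

    Assumes linear growth; returns infinity when the volume is not growing.
    """
    if daily_growth_gib <= 0:
        return float("inf")
    free_gib = max(capacity_gib - used_gib, 0)
    return free_gib / daily_growth_gib


# Hypothetical volume: 100 GiB disk, 40 GiB used, growing 2 GiB per day today.
print(days_until_full(100, 40, 2))
# Same volume if the extra metrics pushed growth to 3 GiB per day.
print(days_until_full(100, 40, 3))
```

Comparing the estimate before and after the assumed increase gives a feel for whether the current disks are likely to cope or whether resizing needs to happen first.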

A final question is whether there is a good driver for adding this extra data export, and whether it will be worth the potential extra cost of moving data around.

Enable Prometheus metrics exporter for ingress-nginx controller.

f5e5a98

achton requested review from hypesystem and kasperg May 2, 2024 14:42

hypesystem approved these changes May 3, 2024

Merge branch 'main' into feature/ingress-nginx-metrics

a2eae7d
