Add k8s.namespace.phase metric and attribute #1668
model/k8s/metrics.yaml
stability: experimental
brief: "Operational status: `1` (true) or `0` (false) for each of the possible phases"
instrument: updowncounter
unit: "1"
Shouldn't this be an empty unit?
"1" is for utilization:
Instruments for utilization metrics (that measure the fraction out of a total) are dimensionless and SHOULD use the default unit 1 (the unity).
Ref: https://opentelemetry.io/docs/specs/semconv/general/metrics/#instrument-units
I followed what is done in semantic-conventions/model/hardware/common-metrics.yaml (lines 49 to 65 in a596f19):
- id: metric.hw.status
  type: metric
  metric_name: hw.status
  stability: experimental
  brief: "Operational status: `1` (true) or `0` (false) for each of the possible states"
  instrument: updowncounter
  unit: "1"
  extends: metric.hw.attributes
  note: >
    `hw.status` is currently specified as an *UpDownCounter* but would ideally be represented using a
    [*StateSet* as defined in OpenMetrics](https://github.com/prometheus/OpenMetrics/blob/v1.0.0/specification/OpenMetrics.md#stateset).
    This semantic convention will be updated once *StateSet* is specified in OpenTelemetry. This planned change
    is not expected to have any consequence on the way users query their timeseries backend to retrieve the
    values of `hw.status` over time.
  attributes:
    - ref: hw.state
      requirement_level: required
@open-telemetry/specs-semconv-maintainers any suggestion here?
I remember we had a discussion in collector-contrib about this: open-telemetry/opentelemetry-collector-contrib#10553 (comment)
According to the general rules for semantic conventions, unit 1 is used for utilization metrics, i.e. percentages, ratios, and fractions.
When there are 2 states only (true or false), it's simple: you push 1 for true, and 0 for false, which works well with k8s.container.ready. IMPORTANT: Unit must be empty, and metric type must be UpDownCounter (users will typically use sum(k8s.container.ready) to count the number of containers that are ready).
Also these comments:
Thanks for the context here!
If I remove the unit completely, then `make fix` fails:
× Invalid metric definition in "/home/weaver/source/k8s/metrics.yaml".
│ group_id=`metric.k8s.namespace.phase`. This group contains a metric type
│ but the unit is not set.
Should that be an empty string instead (`unit: ""`), or something else?
ping @open-telemetry/specs-semconv-approvers for validation here.
I think it should be "{status}"
To some extent, this metric measures a count of success/failure statuses (it's just capped at 1)
Instruments that measure an integer count of something SHOULD only use annotations with curly braces to give additional meaning without the leading default unit (1). For example, use {packet}, {error}, {fault}, etc.
(from https://opentelemetry.io/docs/specs/semconv/general/metrics/#instrument-units)
Shouldn't it be `{namespace}`, since it represents a count of namespaces with the given phase? If I aggregated it across a cluster, it would give me a count of namespaces broken down by phase.
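To make the aggregation argument concrete, here is a minimal Python sketch of the state-set style modeling discussed in this thread: each namespace reports `1` for its current phase and `0` for every other phase, and summing across namespaces gives a count per phase. The lowercase attribute values (`active`, `terminating`), the namespace names, and the helper functions are assumptions made for illustration; they are not taken from the PR or from the Collector.

```python
from collections import Counter

# Assumed attribute values, modeled on the two Kubernetes namespace phases
# (Active and Terminating); the exact spelling in the convention may differ.
PHASES = ("active", "terminating")


def phase_points(namespace, current_phase):
    """Return one data point per possible phase: 1 for the current phase, 0 otherwise."""
    return [
        {
            "k8s.namespace.name": namespace,
            "k8s.namespace.phase": phase,
            "value": int(phase == current_phase),
        }
        for phase in PHASES
    ]


def count_by_phase(points):
    """Sum the 0/1 values across namespaces, grouped by the phase attribute."""
    totals = Counter()
    for point in points:
        totals[point["k8s.namespace.phase"]] += point["value"]
    return dict(totals)


# Hypothetical cluster state: namespace name -> current phase.
observed = {"default": "active", "kube-system": "active", "old-team": "terminating"}
points = [pt for ns, phase in observed.items() for pt in phase_points(ns, phase)]
totals = count_by_phase(points)
print(totals)  # {'active': 2, 'terminating': 1}
```

Because every summed value is a number of namespaces, the aggregate reads naturally with a `{namespace}` annotation rather than `1` or `{status}`.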
Oh, you're absolutely right @dashpole, I reacted to the wrong metric: #1668 (comment)
Thank you all for the feedback here! I changed the unit to `{namespace}`.
This PR was marked stale due to lack of activity. It will be closed in 7 days.
Signed-off-by: ChrsMark <[email protected]>
Part of #1032
Changes
This PR adds `k8s.namespace.phase` as a metric along with its respective attribute `k8s.namespace.phase`. This metric is already in use by the Collector, specifically by the `k8scluster` receiver.
The introduced attribute is new in order to follow the modeling that is already in use by the `hw` metrics and `jmx` metrics mentioned at #1032 (comment) (see #1554).
Collector implementation: https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/9dfa2f7813b11500d001622f3d8c1dd8d9ac58fd/receiver/k8sclusterreceiver/internal/namespace/namespaces.go#L14
K8s API ref: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.30/#namespacestatus-v1-core
Merge requirement checklist
[chore]