-
Notifications
You must be signed in to change notification settings - Fork 251
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add basic exponential histogram #1736
base: main
Are you sure you want to change the base?
Conversation
👋 This pull request has been marked as stale because it has been open with no activity. You can: comment on the issue or remove the stale label to hold stale off for a while, add the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, @xuan-cao-swi!
# Contains the implementation of the ExponentialBucketHistogram aggregation | ||
# https://opentelemetry.io/docs/specs/otel/metrics/data-model/#exponentialhistogram |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line might be too long, but the link might look better in the YARD doc UI...
# Contains the implementation of the ExponentialBucketHistogram aggregation | |
# https://opentelemetry.io/docs/specs/otel/metrics/data-model/#exponentialhistogram | |
# Contains the implementation of the {https://opentelemetry.io/docs/specs/otel/metrics/data-model/#exponentialhistogram ExponentialBucketHistogram} aggregation |
MIN_SCALE = -10 | ||
MAX_SIZE = 160 | ||
|
||
# The default boundaries is calculated based on default max_size and max_scale value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# The default boundaries is calculated based on default max_size and max_scale value | |
# The default boundaries are calculated based on default max_size and max_scale values |
|
||
# The default boundaries is calculated based on default max_size and max_scale value | ||
def initialize( | ||
aggregation_temporality: ENV.fetch('OTEL_EXPORTER_OTLP_METRICS_TEMPORALITY_PREFERENCE', :delta), # TODO: aggregation_temporality should be renamed to collect_aggregation_temporality for clear definition |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand why this should be renamed. Can you tell me more? Is there a spec reference?
record_min_max: true, | ||
zero_threshold: 0 | ||
) | ||
@data_points = {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe @data_points
was refactored out of the other aggregations when views were merged in. Why save them as an instance variable in this aggregation?
end | ||
|
||
def validate_scale(scale) | ||
return scale unless scale > MAX_SCALE || scale < MIN_SCALE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure how performant it is, but another way to implement this could be:
(MIN_SCALE..MAX_SCALE).cover?(scale)
|
||
def downscale(change, positive, negative) | ||
return if change == 0 | ||
raise 'Invalid change of scale' if change.negative? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this get caught by the error handler? I'm concerned if we raise
, we might crash a user's app.
module Aggregation | ||
module ExponentialHistogram | ||
# Buckets is the fundamental building block of exponential histogram that store bucket/boundary value | ||
class Buckets |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this class have test coverage?
let(:start_time) { (Time.now.to_r * 1_000_000_000).to_i } | ||
let(:end_time) { ((Time.now + 60).to_r * 1_000_000_000).to_i } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about using Process.clock_gettime(Process::CLOCK_REALTIME, :nanosecond)
?
# The corresponding Go test is TestAlternatingGrowth1 where: | ||
# agg := NewFloat64(NewConfig(WithMaxSize(4))) | ||
# agg is an instance of github.com/lightstep/otel-launcher-go/lightstep/sdk/metric/aggregator/histogram/structure.Histogram[float64] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm... I can't get this link to load. Is there a permalink option?
_(exphdps[1].zero_threshold).must_equal(0) | ||
end | ||
|
||
it 'rescale_with_alternating_growth_0' do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it 'rescale_with_alternating_growth_0' do | |
it 'rescales with alternating growth 0' do |
correct_boundary = right_boundary(scale, correct_min_index) | ||
|
||
# truffleruby will fail because truffleruby `must_equal` check the exact precision of float point | ||
_(correct_boundary).must_equal(boundary) if RUBY_ENGINE != 'truffleruby' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Truffleruby will compare the exact float point when use must_equal
.
For example, for scale =-10, boundary is 5.562684646268003e-309
; correct_boundary is 1/179769313486231590772930519078902473361797697894230657273430081157732675805500963132708477322407536021120113879871393357658789768814416622492847430639474124377767893424865485276302219601246094119453082952085005768838150682342462881473913110540827237163350510684586298239947245938479716304835356329624224137216
,
Since 5.562684646268003e-309
will assume all number after 3 will be 0, so it's different compare to correct_boundary. Truffleruby will treat them as not equal but other ruby variation will accept it.
Let me know if I miss something.
Description
This PR introduces basic functionality for exponential histograms, similar to Python and JavaScript. Most of the code is adapted/copied from Python. The only missing feature compared to other languages is merging.
Exponential histogram merging aims to make bucket sizes consistent. With the current otel-ruby metrics collection mechanism, raw data is sent unmodified to the collector (e.g., from
@data_point
).I think it's best to implement the merge feature on the collector side to reduce duplicate work and computation on the client side. Downscaling is implemented on the language side to adjust bucket sizes, avoiding out-of-range indices for very large or small numbers.
Test
Most test cases are copied from Python/JavaScript to ensure the correctness of mapping and downscaling.
I didn't add all test case from Python as I think some of the test cases are redundant (e.g. 0 increment, etc.) but please correct me if I am wrong.