perf improvement - Summary metric #729

@mdebjit mdebjit commented Dec 26, 2024

CKMS Quantiles/Summary Performance Improvements

  • Optimized insertion: Replaced linear search with binary search in the insertBatch algorithm, improving insertion speed.

  • Improved readability and maintainability: Streamlined index management in both the compress and insertBatch algorithms.

  • Prevented undefined behavior: The compress method no longer erases vector elements while iterating over the vector, which avoids iterator invalidation and out-of-bounds access.

  • Pre-allocation: Reserved vector capacity upfront when the size is known, which avoids repeated reallocations as the vector grows.
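
The three techniques listed above can be sketched as follows. This is a minimal illustration, not the code from this PR: the `Item` struct, the function names `InsertSorted`, `Compress`, and `MakeSample`, and the fixed `budget` threshold are all hypothetical, and the real CKMS merge condition depends on the allowable error at each rank rather than on a constant.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sample record; the actual CKMS item type is not shown in this PR.
struct Item {
  double value;
  int g;      // rank gap to the previous item
  int delta;  // rank uncertainty
};

// Binary search: find the insertion point in a vector sorted by `value`
// with std::upper_bound instead of scanning linearly.
void InsertSorted(std::vector<Item>& sample, double v) {
  auto pos = std::upper_bound(
      sample.begin(), sample.end(), v,
      [](double lhs, const Item& rhs) { return lhs < rhs.value; });
  sample.insert(pos, Item{v, 1, 0});
}

// No erase inside the loop: walk backwards with a read index, merge a
// candidate into its kept right neighbour when the combined weight fits
// within `budget` (a simplified, assumed condition), otherwise copy it to
// the write index. A single range erase after the loop drops the leftovers.
void Compress(std::vector<Item>& sample, int budget) {
  if (sample.size() < 2) return;
  std::size_t write = sample.size() - 1;  // the last item is always kept
  for (std::size_t read = sample.size() - 1; read-- > 0;) {
    Item& kept = sample[write];
    const Item& cand = sample[read];
    if (cand.g + kept.g + kept.delta <= budget) {
      kept.g += cand.g;  // absorb the candidate into its neighbour
    } else {
      --write;
      sample[write] = cand;  // keep the candidate
    }
  }
  // Kept items now occupy [write, size); remove the stale prefix once.
  sample.erase(sample.begin(),
               sample.begin() + static_cast<std::ptrdiff_t>(write));
}

// Pre-allocation: the final size is known, so reserve once up front.
std::vector<Item> MakeSample(const std::vector<double>& values) {
  std::vector<Item> sample;
  sample.reserve(values.size());
  for (double v : values) InsertSorted(sample, v);
  return sample;
}
```

Note that the binary search only shortens the lookup. `std::vector::insert` still shifts the elements after the insertion point, so each insert remains linear in the worst case.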

| Benchmark | Before Change (ns) | After Change (ns) | Change in Time (ns) | Change in Speed |
|---|---|---|---|---|
| BM_Summary_Observe/0/iterations:262144 | 8074 | 1129 | -6945 | 7.14x Faster |
| BM_Summary_Observe/1/iterations:262144 | 8032 | 1244 | -6788 | 6.46x Faster |
| BM_Summary_Observe/8/iterations:262144 | 10122 | 8410 | -1712 | 1.20x Faster |
| BM_Summary_Observe/64/iterations:262144 | 15786 | 17444 | +1658 | 0.90x Slower |
| BM_Summary_Collect/0/0 | 41.8 | 42.7 | +0.9 | 1.02x Slower |
| BM_Summary_Collect/1/0 | 96.2 | 110 | +13.8 | 1.14x Slower |
| BM_Summary_Collect/8/0 | 338 | 429 | +91 | 1.27x Slower |
| BM_Summary_Collect/64/0 | 2355 | 2788 | +433 | 1.18x Slower |
| BM_Summary_Collect/0/1 | 44.1 | 42.4 | -1.7 | 1.04x Faster |
| BM_Summary_Collect/1/1 | 101 | 111 | +10 | 1.10x Slower |
| BM_Summary_Collect/8/1 | 496 | 521 | +25 | 1.05x Slower |
| BM_Summary_Collect/64/1 | 8301 | 9060 | +759 | 1.09x Slower |
| BM_Summary_Collect/0/8 | 44.9 | 35.5 | -9.4 | 1.26x Faster |
| BM_Summary_Collect/1/8 | 149 | 137 | -12 | 1.09x Faster |
| BM_Summary_Collect/8/8 | 1384 | 1787 | +403 | 1.29x Slower |
| BM_Summary_Collect/64/8 | 45587 | 50081 | +4494 | 1.10x Slower |
| BM_Summary_Collect/0/64 | 31.4 | 35.7 | +4.3 | 1.14x Slower |
| BM_Summary_Collect/1/64 | 283 | 132 | -151 | 2.14x Faster |
| BM_Summary_Collect/8/64 | 7899 | 9547 | +1648 | 1.21x Slower |
| BM_Summary_Collect/64/64 | 405506 | 420493 | +14987 | 1.04x Slower |
| BM_Summary_Collect/0/512 | 34.5 | 33.6 | -0.9 | 1.03x Faster |
| BM_Summary_Collect/1/512 | 1534 | 157 | -1377 | 9.77x Faster |
| BM_Summary_Collect/8/512 | 39361 | 9249 | -30112 | 4.25x Faster |
| BM_Summary_Collect/64/512 | 2029824 | 432225 | -1597600 | 4.70x Faster |
| BM_Summary_Collect/0/4096 | 35.3 | 34.5 | -0.8 | 1.02x Faster |
| BM_Summary_Collect/1/4096 | 3086 | 273 | -2813 | 11.32x Faster |
| BM_Summary_Collect/8/4096 | 117444 | 116648 | -796 | 1.01x Faster |
| BM_Summary_Collect/64/4096 | 5433854 | 5666187 | +232333 | 1.04x Slower |
| BM_Summary_Collect/0/32768 | 33.2 | 39.3 | +6.1 | 1.18x Slower |
| BM_Summary_Collect/1/32768 | 4012 | 2411 | -1601 | 1.66x Faster |
| BM_Summary_Collect/8/32768 | 297063 | 511257 | +214194 | 1.72x Slower |
| BM_Summary_Collect/64/32768 | 24117380 | 21697663 | -2419717 | 1.11x Faster |
| BM_Summary_Collect/0/262144 | 40.7 | 34.9 | -5.8 | 1.17x Faster |
| BM_Summary_Collect/1/262144 | 13560 | 11292 | -2268 | 1.20x Faster |
| BM_Summary_Collect/8/262144 | 869739 | 1540760 | +671021 | 1.77x Slower |
| BM_Summary_Collect/64/262144 | 36734543 | 100230906 | +63496363 | 2.73x Slower |
| BM_Summary_Observe_Common/iterations:262144 | 4580 | 2480 | -2100 | 1.85x Faster |
| BM_Summary_Collect_Common/0 | 196 | 310 | +114 | 1.58x Slower |
| BM_Summary_Collect_Common/1 | 250 | 352 | +102 | 1.41x Slower |
| BM_Summary_Collect_Common/8 | 579 | 899 | +320 | 1.55x Slower |
| BM_Summary_Collect_Common/64 | 1810 | 816 | -994 | 2.22x Faster |
| BM_Summary_Collect_Common/512 | 5108 | 1828 | -3280 | 2.80x Faster |
| BM_Summary_Collect_Common/4096 | 32956 | 5892 | -27064 | 5.60x Faster |
| BM_Summary_Collect_Common/32768 | 183060 | 28329 | -154731 | 6.46x Faster |
| BM_Summary_Collect_Common/262144 | 159511 | 166397 | +6886 | 1.04x Slower |
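
For readers who want to check the two derived columns, here is a small sketch of how they appear to be computed. The convention is inferred from the rows themselves rather than stated in the PR: the time change is after minus before, and the speed factor is the larger time divided by the smaller one, labelled by direction. The function names `TimeChange` and `SpeedChange` are hypothetical.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Change in time: negative means the benchmark got faster.
double TimeChange(double before_ns, double after_ns) {
  return after_ns - before_ns;
}

// Change in speed, formatted like the table, e.g. "1.85x Faster".
// Assumed convention: ratio = slower time / faster time.
// Equal times are reported as "1.00x Slower" in this sketch.
std::string SpeedChange(double before_ns, double after_ns) {
  const bool faster = after_ns < before_ns;
  const double ratio =
      faster ? before_ns / after_ns : after_ns / before_ns;
  char buf[32];
  std::snprintf(buf, sizeof(buf), "%.2fx %s", ratio,
                faster ? "Faster" : "Slower");
  return buf;
}
```

Applied to the BM_Summary_Observe_Common row (4580 ns before, 2480 ns after), this gives a change of -2100 ns and "1.85x Faster". Some published factors differ from this calculation in the last digit, presumably because they were computed from unrounded timings.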

For a small number of quantiles (up to about 8), which is the most common use case, Observe is substantially faster: 7.14x and 6.46x with 0 and 1 quantiles, and 1.20x with 8. With 64 quantiles, Observe is about 10% slower. Collect results are mixed, ranging from 11.32x faster (BM_Summary_Collect/1/4096) to 2.73x slower (BM_Summary_Collect/64/262144), with the largest regressions at 8 and 64 quantiles.
