Refresh cagg uses min value for dimension when start_time is NULL #7546

Draft
wants to merge 3 commits into
base: main

gayyappan
Contributor

When the start of the refresh_continuous_aggregate window is NULL, use the minimum value present in the hypertable as the beginning of the range, instead of the minimum value of the partition column's type.

@gayyappan gayyappan marked this pull request as draft December 18, 2024 19:03

codecov bot commented Dec 18, 2024

Codecov Report

Attention: Patch coverage is 10.81081% with 33 lines in your changes missing coverage. Please review.

Project coverage is 82.15%. Comparing base (59f50f2) to head (8c32688).
Report is 660 commits behind head on main.

Files with missing lines             Patch %   Lines
src/hypertable.c                     0.00%     25 Missing ⚠️
tsl/src/continuous_aggs/refresh.c    33.33%    4 Missing and 4 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7546      +/-   ##
==========================================
+ Coverage   80.06%   82.15%   +2.09%     
==========================================
  Files         190      230      +40     
  Lines       37181    43360    +6179     
  Branches     9450    10912    +1462     
==========================================
+ Hits        29770    35624    +5854     
- Misses       2997     3412     +415     
+ Partials     4414     4324      -90     

@gayyappan gayyappan added this to the TimescaleDB 2.18.0 milestone Jan 10, 2025
Contributor

@mkindahl mkindahl left a comment

The description of the PR explains what is done, but it is more important to explain why it is done. In particular:

The old behavior takes the minimum value of the type when NULL is provided, which creates an open-ended range. What situation breaks under that behavior and prompted you to take this approach instead?
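
The difference between the two behaviors can be sketched in standalone C. This is a minimal illustration, not TimescaleDB code: both helper names are hypothetical, the dimension is assumed to be a 64-bit integer, and the fallback for an empty table is an assumption rather than something this PR states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper modelling the old behavior: a NULL start becomes the
 * smallest value of the (assumed int64) dimension type, so the window is
 * open-ended on the left no matter what data exists. */
static int64_t
window_start_type_min(const int64_t *start)
{
	return start != NULL ? *start : INT64_MIN;
}

/* Hypothetical helper modelling the proposed behavior: a NULL start becomes
 * the smallest value actually present in the data. Finding it requires
 * looking at the data, here a linear pass over an in-memory array. */
static int64_t
window_start_data_min(const int64_t *start, const int64_t *values, size_t n)
{
	int64_t min;

	assert(values != NULL || n == 0);

	if (start != NULL)
		return *start;
	if (n == 0)
		return INT64_MIN; /* assumption: fall back to the type minimum */

	min = values[0];
	for (size_t i = 1; i < n; i++)
	{
		if (values[i] < min)
			min = values[i];
	}
	return min;
}
```

With the values 300, 100 and 200 and a NULL start, the first helper returns INT64_MIN and the second returns 100. An explicit start is returned unchanged by both.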

Comment on lines +2430 to +2434
appendStringInfo(command,
                 "SELECT pg_catalog.min(%s) FROM %s.%s",
                 quote_identifier(NameStr(dim->fd.column_name)),
                 quote_identifier(NameStr(ht->fd.schema_name)),
                 quote_identifier(NameStr(ht->fd.table_name)));
Contributor

This does a full table scan of the entire hypertable. On a large hypertable this is going to be a significant problem, especially if tiering is involved.
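
To see exactly what statement the quoted lines produce, the string construction can be approximated without the PostgreSQL headers. The sketch below is an assumption-laden stand-in: `quote_ident_simple` and `build_min_query` are hypothetical names, and the quoting is simplified to always wrap the identifier in double quotes, whereas PostgreSQL's `quote_identifier` only adds quotes when they are needed.

```c
#include <stddef.h>
#include <stdio.h>

/* Simplified stand-in for quote_identifier(): always wraps the name in
 * double quotes and doubles any embedded double quote.
 * Returns 0 on success, -1 if the output buffer is too small. */
static int
quote_ident_simple(const char *ident, char *out, size_t outsz)
{
	size_t n = 0;

	if (outsz < 3)
		return -1;
	out[n++] = '"';
	for (const char *p = ident; *p != '\0'; p++)
	{
		/* Room for up to two characters, the closing quote and the NUL. */
		if (n + 4 > outsz)
			return -1;
		if (*p == '"')
			out[n++] = '"';
		out[n++] = *p;
	}
	out[n++] = '"';
	out[n] = '\0';
	return 0;
}

/* Builds the same statement shape as the reviewed lines.
 * Returns 0 on success, -1 on truncation or an over-long identifier. */
static int
build_min_query(const char *column, const char *schema, const char *table,
				char *out, size_t outsz)
{
	/* Assumption: names are at most 63 bytes, so 136 bytes fit the quoted form. */
	char col[136], sch[136], tbl[136];
	int len;

	if (quote_ident_simple(column, col, sizeof(col)) != 0 ||
		quote_ident_simple(schema, sch, sizeof(sch)) != 0 ||
		quote_ident_simple(table, tbl, sizeof(tbl)) != 0)
		return -1;

	len = snprintf(out, outsz, "SELECT pg_catalog.min(%s) FROM %s.%s", col, sch, tbl);
	return (len < 0 || (size_t) len >= outsz) ? -1 : 0;
}
```

For a column `time` in a table `public.conditions` (made-up names), this yields `SELECT pg_catalog.min("time") FROM "public"."conditions"`. The statement has no WHERE clause, so it is evaluated against the whole hypertable, which is the cost the comment above is concerned with.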
