Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add metadata_file_last_modified for snowpiped tables #239

Merged
merged 3 commits into from
Apr 10, 2024

Conversation

Catisyf
Copy link
Contributor

@Catisyf Catisyf commented Nov 8, 2023

Description & motivation

resolves: #276

dbt currently supports the following metadata columns for snowpiped table.

    #   `metadata_filename`: the file from which this row was loaded
    #   `metadata_file_row_number`: the numbered row this was in that file
    #   `_dbt_copied_at`: the current_timestamp when this row was loaded (backfilled or piped)

This PR adds a new metadata column metadata_file_last_modified (metadata$file_last_modified), which returns the last modified timestamp of the staged file the current row belongs to. This metadata is useful for a use case where we'd like to filter staged files from an S3 bucket after a certain cutoff date, and we can't easily infer such date from the file name.

This is a simple modification and I have tested the changes in our dbt project. It works as expected.

Checklist

  • I have verified that these changes work locally
  • I have updated the README.md (if applicable)
  • I have added an integration test for my fix/feature (if applicable)

@Catisyf Catisyf requested a review from jeremyyeo as a code owner November 8, 2023 08:34
@jtmcn
Copy link

jtmcn commented Jan 12, 2024

Please do add this functionality. Although I'd suggest replacing the current _dbt_copied_at field which uses current_timestamp since Snowflake recommends against using that function in a Snowpipe.

https://community.snowflake.com/s/article/Incorrect-timestamp-in-Snowpipe-while-using-current-timestampfunction

https://docs.snowflake.com/en/user-guide/data-load-snowpipe-ts#load-times-inserted-using-current-timestamp-earlier-than-load-time-values-in-copy-history-view

@dataders
Copy link
Collaborator

@jtmcn see #281 for your suggestion. I'd welcome PR for this!

Copy link
Collaborator

@dataders dataders left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @Catisyf!

@dataders dataders merged commit e5359d8 into dbt-labs:main Apr 10, 2024
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[feature] additional metadata for snowpipe backed external tables
3 participants