Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug] dbt-snowflake raises an error when caching less than the maximum number of objects in a schema #1234

Closed
2 tasks done
mikealfare opened this issue Nov 7, 2024 · 0 comments · Fixed by #1235
Closed
2 tasks done
Assignees
Labels
type:bug Something isn't working

Comments

@mikealfare
Copy link
Contributor

mikealfare commented Nov 7, 2024

Is this a new bug in dbt-snowflake?

  • I believe this is a new bug in dbt-snowflake
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

I'm seeing this error when running dbt-snowflake on a schema with 12,500 objects:

16:39:51  Encountered an error:
Compilation Error in macro list_relations_without_caching (macros/adapters/metadata.sql)
  dbt will list a maximum of 100000 objects in schema <DATABASE>.<SCHEMA>.
             Your schema exceeds this limit. Please contact [email protected] for troubleshooting tips,
             or review and reduce the number of objects contained.
  
  > in macro snowflake__get_paginated_relations_array (macros/adapters.sql)
  > called by macro snowflake__list_relations_without_caching (macros/adapters.sql)
  > called by macro list_relations_without_caching (macros/adapters/metadata.sql)
  > called by macro list_relations_without_caching (macros/adapters/metadata.sql)

I can confirm that there are less than 100,000 objects in this schema.

-- use dbt service account role
show objects in <DATABASE>.<SCHEMA>;
result: 12,500

Expected Behavior

I should be able to run dbt-snowflake on a schema with 12,500 objects and the cache should reflect all 12,500 objects with no error.

Steps To Reproduce

  1. Create a project with 12,500 objects in a single schema (or more than 10K really)
  2. Run dbt run on the full project

Relevant log output

No response

Environment

- OS: dbt Cloud
- Python: 3.9
- dbt-core: versionless
- dbt-snowflake: versionless

Additional Context

Initial analysis suggests this is the result of the 2024_07 Snowflake bundle. In particular, it looks like part of this change. We use the object name as a watermark when paginating. This is no longer deterministic, meaning that we could (and very likely do) get duplicates of relations and miss relations because we're picking a random starting point in a randomized list. In the past, this return recordset was ordered, making the pagination deterministic. We need to figure out a new watermark method.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type:bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant