[Feature] support copying multiple tables in parallel using copy_partitions #559
Labels
feature:python-models (Issues related to python models)
pkg:dbt-bigquery (Issue affects dbt-bigquery)
type:enhancement (New feature request)
Is this your first time submitting a feature request?
Describe the feature
The Python BigQuery client supports asynchronous copy jobs, but the dbt-bigquery adapter sends BigQuery copy requests one at a time when using `incremental_strategy = 'insert_overwrite'` with `copy_partitions = true`. We could achieve better performance by sending requests in small batches of partitions.
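For context, a minimal model config that exercises this code path might look like the following; the partition field, granularity, and upstream ref are illustrative:

```sql
-- Illustrative model config that triggers the copy-partitions path
{{
  config(
    materialized='incremental',
    incremental_strategy='insert_overwrite',
    partition_by={
      'field': 'event_date',
      'data_type': 'date',
      'granularity': 'day',
      'copy_partitions': true
    }
  )
}}

select * from {{ ref('events') }}
```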
dbt-bigquery already supports parallel execution in the `copy_bq_table` function, but in the `bq_copy_partitions` macro, partitions are sent one at a time.
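To illustrate the capability the proposal relies on, here is a minimal sketch (not the adapter's actual code) of submitting several partition copy jobs with the Python client before waiting on any of them; the table names and partition decorators are illustrative:

```python
from google.cloud import bigquery

client = bigquery.Client()

# Illustrative names: a temp table produced by the incremental run and
# the target table it should overwrite, partition by partition.
source = "my_project.my_dataset.events__dbt_tmp"
destination = "my_project.my_dataset.events"
partitions = ["20240101", "20240102", "20240103"]

job_config = bigquery.CopyJobConfig(write_disposition="WRITE_TRUNCATE")

# copy_table() returns a CopyJob immediately; BigQuery runs the jobs
# server-side, so submitting all of them before waiting lets them
# execute in parallel.
jobs = [
    client.copy_table(f"{source}${p}", f"{destination}${p}", job_config=job_config)
    for p in partitions
]

for job in jobs:
    job.result()  # block until each job finishes; raises on failure
```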
We can probably implement this feature by introducing a `batch_size` argument to the configs. The default value will be 1, and the `bq_copy_partitions` macro will send a list of partitions to `copy_bq_table`, where the size of the list equals `batch_size`.
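A minimal sketch of the proposed batching, using a hypothetical `copy_partitions_in_batches` helper (the real change would live in the `bq_copy_partitions` macro and `copy_bq_table`):

```python
from google.cloud import bigquery


def copy_partitions_in_batches(client, source, destination, partitions, batch_size=1):
    """Copy partition decorators from source to destination in batches.

    batch_size=1 reproduces today's one-at-a-time behavior; larger
    values submit batch_size copy jobs at once and wait for the whole
    batch to finish before starting the next one.
    """
    job_config = bigquery.CopyJobConfig(write_disposition="WRITE_TRUNCATE")
    for i in range(0, len(partitions), batch_size):
        batch = partitions[i : i + batch_size]
        jobs = [
            client.copy_table(
                f"{source}${p}", f"{destination}${p}", job_config=job_config
            )
            for p in batch
        ]
        for job in jobs:
            job.result()  # wait for this batch before submitting the next
```

With `batch_size = 1` this reproduces today's behavior, and capping the number of in-flight jobs at `batch_size` helps stay within BigQuery's copy job quotas.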
Describe alternatives you've considered
No response
Who will this benefit?
Anyone who has a large number of heavy BigQuery partitions.
Are you interested in contributing this feature?
Definitely, just need a green light to proceed
Anything else?
No response