diff --git a/website/docs/docs/build/snapshots.md b/website/docs/docs/build/snapshots.md index a592056ae4..b754639c01 100644 --- a/website/docs/docs/build/snapshots.md +++ b/website/docs/docs/build/snapshots.md @@ -437,103 +437,100 @@ Snapshot tables will be created as a clone of your sourc In dbt Core v1.9+ (or available sooner in [the "Latest" release track in dbt Cloud](/docs/dbt-versions/cloud-release-tracks)): - These column names can be customized to your team or organizational conventions using the [`snapshot_meta_column_names`](/reference/resource-configs/snapshot_meta_column_names) config. -ess) - Use the [`dbt_valid_to_current` config](/reference/resource-configs/dbt_valid_to_current) to set a custom indicator for the value of `dbt_valid_to` in current snapshot records (like a future date such as `9999-12-31`). By default, this value is `NULL`. When set, dbt will use this specified value instead of `NULL` for `dbt_valid_to` for current records in the snapshot table. - Use the [`hard_deletes`](/reference/resource-configs/hard-deletes) config to track deleted records as new rows with the `dbt_is_deleted` meta field when using the `hard_deletes='new_record'` field. -| Field | Meaning | Usage | -| -------------- | ------- | ----- | -| dbt_valid_from | The timestamp when this snapshot row was first inserted | This column can be used to order the different "versions" of a record. | -| dbt_valid_to | The timestamp when this row became invalidated.
For current records, this is `NULL` by default or the value specified in `dbt_valid_to_current`. | The most recent snapshot record will have `dbt_valid_to` set to `NULL` or the specified value. |
-| dbt_scd_id | A unique key generated for each snapshotted record. | This is used internally by dbt |
-| dbt_updated_at | The updated_at timestamp of the source record when this snapshot row was inserted. | This is used internally by dbt |
-| dbt_is_deleted | A boolean value indicating if the record has been deleted. `True` if deleted, `False` otherwise. | Added when `hard_deletes='new_record'` is configured. This is used internally by dbt |
+| Field | Meaning | Notes | Example |
+| -------------- | ------- | ----- | ------- |
+| `dbt_valid_from` | The timestamp when this snapshot row was first inserted and became valid. | This column can be used to order the different "versions" of a record. | `snapshot_meta_column_names: {dbt_valid_from: start_date}` |
+| `dbt_valid_to` | The timestamp when this row became invalidated. For current records, this is `NULL` by default or the value specified in `dbt_valid_to_current`. | The most recent snapshot record will have `dbt_valid_to` set to `NULL` or the specified value. | `snapshot_meta_column_names: {dbt_valid_to: end_date}` |
+| `dbt_scd_id` | A unique key generated for each snapshot row. | This is used internally by dbt. | `snapshot_meta_column_names: {dbt_scd_id: scd_id}` |
+| `dbt_updated_at` | The `updated_at` timestamp of the source record when this snapshot row was inserted. | This is used internally by dbt. | `snapshot_meta_column_names: {dbt_updated_at: modified_date}` |
+| `dbt_is_deleted` | A string value indicating if the record has been deleted (`True` if deleted, `False` if not deleted). | Added when `hard_deletes='new_record'` is configured. | `snapshot_meta_column_names: {dbt_is_deleted: is_deleted}` |
-*The timestamps used for each column are subtly different depending on the strategy you use:
+All of these column names can be customized using the `snapshot_meta_column_names` config. Refer to this [example](/reference/resource-configs/snapshot_meta_column_names#example) for more details.
-For the `timestamp` strategy, the configured `updated_at` column is used to populate the `dbt_valid_from`, `dbt_valid_to` and `dbt_updated_at` columns.
+*The timestamps used for each column are subtly different depending on the strategy you use:
-
- Details for the timestamp strategy +- For the `timestamp` strategy, the configured `updated_at` column is used to populate the `dbt_valid_from`, `dbt_valid_to` and `dbt_updated_at` columns. -Snapshot query results at `2024-01-01 11:00` + -| id | status | updated_at | -| -- | ------- | ---------------- | -| 1 | pending | 2024-01-01 10:47 | + Snapshot query results at `2024-01-01 11:00` -Snapshot results (note that `11:00` is not used anywhere): + | id | status | updated_at | + | -- | ------- | ---------------- | + | 1 | pending | 2024-01-01 10:47 | -| id | status | updated_at | dbt_valid_from | dbt_valid_to | dbt_updated_at | -| -- | ------- | ---------------- | ---------------- | ---------------- | ---------------- | -| 1 | pending | 2024-01-01 10:47 | 2024-01-01 10:47 | | 2024-01-01 10:47 | + Snapshot results (note that `11:00` is not used anywhere): -Query results at `2024-01-01 11:30`: + | id | status | updated_at | dbt_valid_from | dbt_valid_to | dbt_updated_at | + | -- | ------- | ---------------- | ---------------- | ---------------- | ---------------- | + | 1 | pending | 2024-01-01 10:47 | 2024-01-01 10:47 | | 2024-01-01 10:47 | -| id | status | updated_at | -| -- | ------- | ---------------- | -| 1 | shipped | 2024-01-01 11:05 | + Query results at `2024-01-01 11:30`: -Snapshot results (note that `11:30` is not used anywhere): + | id | status | updated_at | + | -- | ------- | ---------------- | + | 1 | shipped | 2024-01-01 11:05 | -| id | status | updated_at | dbt_valid_from | dbt_valid_to | dbt_updated_at | -| -- | ------- | ---------------- | ---------------- | ---------------- | ---------------- | -| 1 | pending | 2024-01-01 10:47 | 2024-01-01 10:47 | 2024-01-01 11:05 | 2024-01-01 10:47 | -| 1 | shipped | 2024-01-01 11:05 | 2024-01-01 11:05 | | 2024-01-01 11:05 | + Snapshot results (note that `11:30` is not used anywhere): -Snapshot results with `hard_deletes='new_record'`: + | id | status | updated_at | dbt_valid_from | dbt_valid_to | dbt_updated_at | 
+ | -- | ------- | ---------------- | ---------------- | ---------------- | ---------------- | + | 1 | pending | 2024-01-01 10:47 | 2024-01-01 10:47 | 2024-01-01 11:05 | 2024-01-01 10:47 | + | 1 | shipped | 2024-01-01 11:05 | 2024-01-01 11:05 | | 2024-01-01 11:05 | -| id | status | updated_at | dbt_valid_from | dbt_valid_to | dbt_updated_at | dbt_is_deleted | -|----|---------|------------------|------------------|------------------|------------------|----------------| -| 1 | pending | 2024-01-01 10:47 | 2024-01-01 10:47 | 2024-01-01 11:05 | 2024-01-01 10:47 | False | -| 1 | shipped | 2024-01-01 11:05 | 2024-01-01 11:05 | 2024-01-01 11:20 | 2024-01-01 11:05 | False | -| 1 | deleted | 2024-01-01 11:20 | 2024-01-01 11:20 | | 2024-01-01 11:20 | True | + Snapshot results with `hard_deletes='new_record'`: + | id | status | updated_at | dbt_valid_from | dbt_valid_to | dbt_updated_at | dbt_is_deleted | + |----|---------|------------------|------------------|------------------|------------------|----------------| + | 1 | pending | 2024-01-01 10:47 | 2024-01-01 10:47 | 2024-01-01 11:05 | 2024-01-01 10:47 | False | + | 1 | shipped | 2024-01-01 11:05 | 2024-01-01 11:05 | 2024-01-01 11:20 | 2024-01-01 11:05 | False | + | 1 | deleted | 2024-01-01 11:20 | 2024-01-01 11:20 | | 2024-01-01 11:20 | True | -
-
+ -For the `check` strategy, the current timestamp is used to populate each column. If configured, the `check` strategy uses the `updated_at` column instead, as with the timestamp strategy. +- For the `check` strategy, the current timestamp is used to populate each column. If configured, the `check` strategy uses the `updated_at` column instead, as with the timestamp strategy. -
- Details for the check strategy + -Snapshot query results at `2024-01-01 11:00` + Snapshot query results at `2024-01-01 11:00` -| id | status | -| -- | ------- | -| 1 | pending | + | id | status | + | -- | ------- | + | 1 | pending | -Snapshot results: + Snapshot results: -| id | status | dbt_valid_from | dbt_valid_to | dbt_updated_at | -| -- | ------- | ---------------- | ---------------- | ---------------- | -| 1 | pending | 2024-01-01 11:00 | | 2024-01-01 11:00 | + | id | status | dbt_valid_from | dbt_valid_to | dbt_updated_at | + | -- | ------- | ---------------- | ---------------- | ---------------- | + | 1 | pending | 2024-01-01 11:00 | | 2024-01-01 11:00 | -Query results at `2024-01-01 11:30`: + Query results at `2024-01-01 11:30`: -| id | status | -| -- | ------- | -| 1 | shipped | + | id | status | + | -- | ------- | + | 1 | shipped | -Snapshot results: + Snapshot results: -| id | status | dbt_valid_from | dbt_valid_to | dbt_updated_at | -| --- | ------- | ---------------- | ---------------- | ---------------- | -| 1 | pending | 2024-01-01 11:00 | 2024-01-01 11:30 | 2024-01-01 11:00 | -| 1 | shipped | 2024-01-01 11:30 | | 2024-01-01 11:30 | + | id | status | dbt_valid_from | dbt_valid_to | dbt_updated_at | + | --- | ------- | ---------------- | ---------------- | ---------------- | + | 1 | pending | 2024-01-01 11:00 | 2024-01-01 11:30 | 2024-01-01 11:00 | + | 1 | shipped | 2024-01-01 11:30 | | 2024-01-01 11:30 | -Snapshot results with `hard_deletes='new_record'`: + Snapshot results with `hard_deletes='new_record'`: -| id | status | dbt_valid_from | dbt_valid_to | dbt_updated_at | dbt_is_deleted | -|----|---------|------------------|------------------|------------------|----------------| -| 1 | pending | 2024-01-01 11:00 | 2024-01-01 11:30 | 2024-01-01 11:00 | False | -| 1 | shipped | 2024-01-01 11:30 | 2024-01-01 11:40 | 2024-01-01 11:30 | False | -| 1 | deleted | 2024-01-01 11:40 | | 2024-01-01 11:40 | True | + | id | status | dbt_valid_from | 
dbt_valid_to | dbt_updated_at | dbt_is_deleted | + |----|---------|------------------|------------------|------------------|----------------| + | 1 | pending | 2024-01-01 11:00 | 2024-01-01 11:30 | 2024-01-01 11:00 | False | + | 1 | shipped | 2024-01-01 11:30 | 2024-01-01 11:40 | 2024-01-01 11:30 | False | + | 1 | deleted | 2024-01-01 11:40 | | 2024-01-01 11:40 | True | -
+ ## Configure snapshots in versions 1.8 and earlier diff --git a/website/docs/reference/resource-configs/snapshot_meta_column_names.md b/website/docs/reference/resource-configs/snapshot_meta_column_names.md index 24e4c8ca57..59d63374de 100644 --- a/website/docs/reference/resource-configs/snapshot_meta_column_names.md +++ b/website/docs/reference/resource-configs/snapshot_meta_column_names.md @@ -19,7 +19,7 @@ snapshots: dbt_valid_to: dbt_scd_id: dbt_updated_at: - dbt_is_deleted: + dbt_is_deleted: ``` @@ -35,7 +35,7 @@ snapshots: "dbt_valid_to": "", "dbt_scd_id": "", "dbt_updated_at": "", - "dbt_is_deleted": "", + "dbt_is_deleted": "", } ) }} @@ -54,7 +54,7 @@ snapshots: dbt_valid_to: dbt_scd_id: dbt_updated_at: - dbt_is_deleted: + dbt_is_deleted: ``` @@ -67,17 +67,17 @@ In order to align with an organization's naming conventions, the `snapshot_meta_ By default, dbt snapshots use the following column names to track change history using [Type 2 slowly changing dimension](https://en.wikipedia.org/wiki/Slowly_changing_dimension#Type_2:_add_new_row) records: -| Field | Meaning | Notes | -| -------------- | ------- | ----- | -| `dbt_valid_from` | The timestamp when this snapshot row was first inserted and became valid. | The value is affected by the [`strategy`](/reference/resource-configs/strategy). | -| `dbt_valid_to` | The timestamp when this row is no longer valid. | | -| `dbt_scd_id` | A unique key generated for each snapshot row. | This is used internally by dbt. | -| `dbt_updated_at` | The `updated_at` timestamp of the source record when this snapshot row was inserted. | This is used internally by dbt. | -| `dbt_is_deleted` | A boolean value indicating if the record has been deleted. `True` if deleted, `False` otherwise. | Added when `hard_deletes='new_record'` is configured. | +| Field |
Meaning
| Notes | Example |
+| -------------- | ------- | ----- | ------- |
+| `dbt_valid_from` | The timestamp when this snapshot row was first inserted and became valid. | The value is affected by the [`strategy`](/reference/resource-configs/strategy). | `snapshot_meta_column_names: {dbt_valid_from: start_date}` |
+| `dbt_valid_to` | The timestamp when this row is no longer valid. | | `snapshot_meta_column_names: {dbt_valid_to: end_date}` |
+| `dbt_scd_id` | A unique key generated for each snapshot row. | This is used internally by dbt. | `snapshot_meta_column_names: {dbt_scd_id: scd_id}` |
+| `dbt_updated_at` | The `updated_at` timestamp of the source record when this snapshot row was inserted. | This is used internally by dbt. | `snapshot_meta_column_names: {dbt_updated_at: modified_date}` |
+| `dbt_is_deleted` | A string value indicating if the record has been deleted (`True` if deleted, `False` if not deleted). | Added when `hard_deletes='new_record'` is configured. | `snapshot_meta_column_names: {dbt_is_deleted: is_deleted}` |
-However, these column names can be customized using the `snapshot_meta_column_names` config.
+All of these column names can be customized using the `snapshot_meta_column_names` config. Refer to the [Example](#example) for more details.
-:::warning
+:::warning
 To avoid any unintentional data modification, dbt will **not** automatically apply any column renames. So if a user applies `snapshot_meta_column_names` config for a snapshot without updating the pre-existing table, they will get an error. We recommend either only using these settings for net-new snapshots, or arranging an update of pre-existing tables prior to committing a column name change.