Updating old source freshness language #6657

Merged 4 commits on Dec 13, 2024
2 changes: 1 addition & 1 deletion website/docs/docs/build/dbt-tips.md
@@ -40,7 +40,7 @@ Leverage these dbt packages to streamline your workflow:
- Set `vars` in your `dbt_project.yml` to define global defaults for certain conditions, which you can then override using the `--vars` flag in your commands.
- Use [for loops](/guides/using-jinja?step=3) in Jinja to <Term id="dry">DRY</Term> up repetitive logic, such as selecting a series of columns that all require the same transformations and naming patterns to be applied.
- Instead of relying on post-hooks, use the [grants config](/reference/resource-configs/grants) to apply permission grants in the warehouse resiliently.
-- Define [source-freshness](/docs/build/sources#snapshotting-source-data-freshness) thresholds on your sources to avoid running transformations on data that has already been processed.
+- Define [source-freshness](/docs/build/sources#source-data-freshness) thresholds on your sources to avoid running transformations on data that has already been processed.
- Use the `+` operator on the left of a model `dbt build --select +model_name` to run a model and all of its upstream dependencies. Use the `+` operator on the right of the model `dbt build --select model_name+` to run a model and everything downstream that depends on it.
- Use `dir_name` to run all models in a package or directory.
- Use the `@` operator on the left of a model in a non-state-aware CI setup to test it. This operator runs all of a selection’s parents and children, and also runs the parents of its children, which in a fresh CI schema will likely not exist yet.
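
The `vars` tip in the first bullet above can be sketched as follows. This is an illustration, not part of the diff; the variable name `start_date` and its value are hypothetical.

```yaml
# dbt_project.yml (fragment)
# `start_date` is a hypothetical variable name used for illustration only.
vars:
  start_date: "2024-01-01"  # project-wide default
```

A model can read the value with `{{ var('start_date') }}`, and a single invocation can override the default with, for example, `dbt run --vars '{"start_date": "2024-06-01"}'`.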
14 changes: 7 additions & 7 deletions website/docs/docs/build/sources.md
@@ -130,11 +130,11 @@ You can find more details on the available properties for sources in the [refere
<FAQ path="Tests/testing-sources" />
<FAQ path="Runs/running-models-downstream-of-source" />

-## Snapshotting source data freshness
-With a couple of extra configs, dbt can optionally snapshot the "freshness" of the data in your source tables. This is useful for understanding if your data pipelines are in a healthy state, and is a critical component of defining SLAs for your warehouse.
+## Source data freshness
+With a couple of extra configs, dbt can optionally capture the "freshness" of the data in your source tables. This is useful for understanding if your data pipelines are in a healthy state, and is a critical component of defining SLAs for your warehouse.

### Declaring source freshness
-To configure sources to snapshot freshness information, add a `freshness` block to your source and `loaded_at_field` to your table declaration:
+To configure source freshness information, add a `freshness` block to your source and `loaded_at_field` to your table declaration:

<File name='models/<filename>.yml'>

@@ -164,14 +164,14 @@ sources:

</File>

-In the `freshness` block, one or both of `warn_after` and `error_after` can be provided. If neither is provided, then dbt will not calculate freshness snapshots for the tables in this source.
+In the `freshness` block, one or both of `warn_after` and `error_after` can be provided. If neither is provided, then dbt will not calculate freshness for the tables in this source.

Additionally, the `loaded_at_field` is required to calculate freshness for a table. If a `loaded_at_field` is not provided, then dbt will not calculate freshness for the table.

These configs are applied hierarchically, so `freshness` and `loaded_at_field` values specified for a `source` will flow through to all of the `tables` defined in that source. This is useful when all of the tables in a source have the same `loaded_at_field`, as the config can just be specified once in the top-level source definition.
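
A minimal sketch of this hierarchy, assuming a source named `jaffle_shop` in a `raw` database whose tables share an `_etl_loaded_at` column (names borrowed from the query later in this file). The thresholds and the `customers` and `product_skus` tables are illustrative assumptions, not values taken from this diff:

```yaml
version: 2

sources:
  - name: jaffle_shop
    database: raw
    # Defaults inherited by every table in this source
    freshness:
      warn_after: {count: 12, period: hour}
      error_after: {count: 24, period: hour}
    loaded_at_field: _etl_loaded_at

    tables:
      - name: orders
        # Table-level values override the source-level defaults
        freshness:
          warn_after: {count: 6, period: hour}
          error_after: {count: 12, period: hour}

      - name: customers  # inherits the source-level freshness and loaded_at_field

      - name: product_skus
        freshness: null  # opt this table out of freshness checks
```

With this layout, `orders` warns after 6 hours without new rows, `customers` warns after 12, and `product_skus` is skipped.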

### Checking source freshness
-To snapshot freshness information for your sources, use the `dbt source freshness` command ([reference docs](/reference/commands/source)):
+To obtain freshness information for your sources, use the `dbt source freshness` command ([reference docs](/reference/commands/source)):

```
$ dbt source freshness
@@ -182,7 +182,7 @@ Behind the scenes, dbt uses the freshness properties to construct a `select` que
```sql
select
max(_etl_loaded_at) as max_loaded_at,
-convert_timezone('UTC', current_timestamp()) as snapshotted_at
+convert_timezone('UTC', current_timestamp()) as calculated_at
from raw.jaffle_shop.orders

```
@@ -198,7 +198,7 @@ Some databases can have tables where a filter over certain columns are required,
```sql
select
max(_etl_loaded_at) as max_loaded_at,
-convert_timezone('UTC', current_timestamp()) as snapshotted_at
+convert_timezone('UTC', current_timestamp()) as calculated_at
from raw.jaffle_shop.orders
where _etl_loaded_at >= date_sub(current_date(), interval 1 day)
```
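
The `where` clause in the query above corresponds to a `filter` set inside the `freshness` block. A minimal sketch, assuming the same `jaffle_shop` source and illustrative thresholds. The filter expression is copied from the query, and its SQL syntax is warehouse-specific:

```yaml
sources:
  - name: jaffle_shop
    database: raw
    loaded_at_field: _etl_loaded_at
    tables:
      - name: orders
        freshness:
          warn_after: {count: 12, period: hour}
          error_after: {count: 24, period: hour}
          # Added to the generated freshness query as a WHERE clause
          filter: _etl_loaded_at >= date_sub(current_date(), interval 1 day)
```

The filter only narrows the rows scanned by the freshness query; it is not applied to models that select from the source.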
@@ -495,7 +495,7 @@ Graph example:

### Are my data sources fresh?

-Checking [source freshness](/docs/build/sources#snapshotting-source-data-freshness) allows you to ensure that sources loaded and used in your dbt project are compliant with expectations. The API provides the latest metadata about source loading and information about the freshness check criteria.
+Checking [source freshness](/docs/build/sources#source-data-freshness) allows you to ensure that sources loaded and used in your dbt project are compliant with expectations. The API provides the latest metadata about source loading and information about the freshness check criteria.

<Lightbox src="/img/docs/dbt-cloud/discovery-api/source-freshness-page.png" width="75%" title="Source freshness page in dbt Cloud"/>

2 changes: 1 addition & 1 deletion website/docs/docs/deploy/artifacts.md
@@ -4,7 +4,7 @@ id: "artifacts"
description: "Use artifacts to power your automated docs site and source freshness data."
---

-When running dbt jobs, dbt Cloud generates and saves *artifacts*. You can use these artifacts, like `manifest.json`, `catalog.json`, and `sources.json` to power different aspects of dbt Cloud, namely: [dbt Explorer](/docs/collaborate/explore-projects), [dbt Docs](/docs/collaborate/build-and-view-your-docs#dbt-docs), and [source freshness reporting](/docs/build/sources#snapshotting-source-data-freshness).
+When running dbt jobs, dbt Cloud generates and saves *artifacts*. You can use these artifacts, like `manifest.json`, `catalog.json`, and `sources.json` to power different aspects of dbt Cloud, namely: [dbt Explorer](/docs/collaborate/explore-projects), [dbt Docs](/docs/collaborate/build-and-view-your-docs#dbt-docs), and [source freshness reporting](/docs/build/sources#source-data-freshness).

## Create dbt Cloud Artifacts

4 changes: 2 additions & 2 deletions website/docs/docs/deploy/source-freshness.md
@@ -4,7 +4,7 @@ id: "source-freshness"
description: "Validate that data freshness meets expectations and alert if stale."
---

-dbt Cloud provides a helpful interface around dbt's [source data freshness](/docs/build/sources#snapshotting-source-data-freshness) calculations. When a dbt Cloud job is configured to snapshot source data freshness, dbt Cloud will render a user interface showing you the state of the most recent snapshot. This interface is intended to help you determine if your source data freshness is meeting the service level agreement (SLA) that you've defined for your organization.
+dbt Cloud provides a helpful interface around dbt's [source data freshness](/docs/build/sources#source-data-freshness) calculations. When a dbt Cloud job is configured to snapshot source data freshness, dbt Cloud will render a user interface showing you the state of the most recent snapshot. This interface is intended to help you determine if your source data freshness is meeting the service level agreement (SLA) that you've defined for your organization.

<Lightbox src="/img/docs/dbt-cloud/using-dbt-cloud/data-sources-next.png" title="Data Sources in dbt Cloud"/>

@@ -17,7 +17,7 @@ dbt Cloud provides a helpful interface around dbt's [source data freshness](/doc

<Lightbox src="/img/docs/dbt-cloud/select-source-freshness.png" title="Selecting source freshness"/>

-To enable source freshness snapshots, firstly make sure to configure your sources to [snapshot freshness information](/docs/build/sources#snapshotting-source-data-freshness). You can add source freshness to the list of commands in the job run steps or enable the checkbox. However, you can expect different outcomes when you configure a job by selecting the **Run source freshness** checkbox compared to adding the command to the run steps.
+To enable source freshness snapshots, firstly make sure to configure your sources to [snapshot freshness information](/docs/build/sources#source-data-freshness). You can add source freshness to the list of commands in the job run steps or enable the checkbox. However, you can expect different outcomes when you configure a job by selecting the **Run source freshness** checkbox compared to adding the command to the run steps.

Review the following options and outcomes:

2 changes: 1 addition & 1 deletion website/docs/faqs/Models/insert-records.md
@@ -9,4 +9,4 @@
For those coming from an <Term id="etl" /> (Extract Transform Load) paradigm, there's often a desire to write transformations as `insert` and `update` statements. In comparison, dbt will wrap your `select` query in a `create table as` statement, which can feel counter-productive.

* If you wish to use `insert` statements for performance reasons (i.e. to reduce data that is processed), consider [incremental models](/docs/build/incremental-models)
-* If you wish to use `insert` statements since your source data is constantly changing (e.g. to create "Type 2 Slowly Changing Dimensions"), consider [snapshotting your source data](/docs/build/sources#snapshotting-source-data-freshness), and building models on top of your snaphots.
+* If you wish to use `insert` statements since your source data is constantly changing (e.g. to create "Type 2 Slowly Changing Dimensions"), consider [snapshotting your source data](/docs/build/sources#source-data-freshness), and building models on top of your snaphots.

Check warning on line 12 in website/docs/faqs/Models/insert-records.md

GitHub Actions / vale

[vale] website/docs/faqs/Models/insert-records.md#L12

[custom.LatinAbbreviations] Avoid Latin abbreviations: 'for example'. Consider using 'e.g' instead.
Raw output
{"message": "[custom.LatinAbbreviations] Avoid Latin abbreviations: 'for example'. Consider using 'e.g' instead.", "location": {"path": "website/docs/faqs/Models/insert-records.md", "range": {"start": {"line": 12, "column": 89}}}, "severity": "WARNING"}

Check warning on line 12 in website/docs/faqs/Models/insert-records.md

GitHub Actions / vale

[vale] website/docs/faqs/Models/insert-records.md#L12

[custom.Typos] Oops there's a typo -- did you really mean 'e.g.'?
Raw output
{"message": "[custom.Typos] Oops there's a typo -- did you really mean 'e.g.'? ", "location": {"path": "website/docs/faqs/Models/insert-records.md", "range": {"start": {"line": 12, "column": 89}}}, "severity": "WARNING"}
@@ -57,7 +57,7 @@ Let’s [create a job](/docs/deploy/deploy-jobs#create-and-schedule-jobs) in dbt
- This will allow the job to inherit the catalog, schema, credentials, and environment variables defined in [Set up your dbt project with Databricks](/guides/set-up-your-databricks-dbt-project).
4. Under **Execution Settings**
- Check the **Generate docs on run** checkbox to configure the job to automatically generate project docs each time this job runs. This will ensure your documentation stays evergreen as models are added and modified.
-- Select the **Run on source freshness** checkbox to configure dbt [source freshness](/docs/deploy/source-freshness) as the first step of this job. Your sources will need to be configured to [snapshot freshness information](/docs/build/sources#snapshotting-source-data-freshness) for this to drive meaningful insights.
+- Select the **Run on source freshness** checkbox to configure dbt [source freshness](/docs/deploy/source-freshness) as the first step of this job. Your sources will need to be configured to [snapshot freshness information](/docs/build/sources#source-data-freshness) for this to drive meaningful insights.

Add the following three **Commands:**
- `dbt source freshness`
2 changes: 1 addition & 1 deletion website/docs/guides/refactoring-legacy-sql.md
@@ -71,7 +71,7 @@ This allows you to call the same table in multiple places with `{{ src('my_sourc
We start here for several reasons:

#### Source freshness reporting
-Using sources unlocks the ability to run [source freshness reporting](/docs/build/sources#snapshotting-source-data-freshness) to make sure your raw data isn't stale.
+Using sources unlocks the ability to run [source freshness reporting](/docs/build/sources#source-data-freshness) to make sure your raw data isn't stale.

#### Easy dependency tracing
If you're migrating multiple stored procedures into dbt, with sources you can see which queries depend on the same raw tables.
2 changes: 1 addition & 1 deletion website/docs/reference/artifacts/dbt-artifacts.md
@@ -7,7 +7,7 @@ With every invocation, dbt generates and saves one or more *artifacts*. Several

- [documentation](/docs/collaborate/build-and-view-your-docs)
- [state](/reference/node-selection/syntax#about-node-selection)
-- [visualizing source freshness](/docs/build/sources#snapshotting-source-data-freshness)
+- [visualizing source freshness](/docs/build/sources#source-data-freshness)

They could also be used to:

2 changes: 1 addition & 1 deletion website/docs/reference/artifacts/sources-json.md
@@ -7,7 +7,7 @@ sidebar_label: "Sources"

**Produced by:** [`source freshness`](/reference/commands/source)

-This file contains information about [sources with freshness checks](/docs/build/sources#checking-source-freshness). Today, dbt Cloud uses this file to power its [Source Freshness visualization](/docs/build/sources#snapshotting-source-data-freshness).
+This file contains information about [sources with freshness checks](/docs/build/sources#checking-source-freshness). Today, dbt Cloud uses this file to power its [Source Freshness visualization](/docs/build/sources#source-data-freshness).

### Top-level keys

2 changes: 1 addition & 1 deletion website/docs/reference/commands/source.md
@@ -75,4 +75,4 @@ Snapshots of source freshness can be used to understand:

This command can be run manually to determine the state of your source data freshness at any time. It is also recommended that you run this command on a schedule, storing the results of the freshness snapshot at regular intervals. These longitudinal snapshots will make it possible to be alerted when source data freshness SLAs are violated, as well as understand the trend of freshness over time.

-dbt Cloud makes it easy to snapshot source freshness on a schedule, and provides a dashboard out of the box indicating the state of freshness for all of the sources defined in your project. For more information on snapshotting freshness in dbt Cloud, check out the [docs](/docs/build/sources#snapshotting-source-data-freshness).
+dbt Cloud makes it easy to snapshot source freshness on a schedule, and provides a dashboard out of the box indicating the state of freshness for all of the sources defined in your project. For more information on snapshotting freshness in dbt Cloud, check out the [docs](/docs/build/sources#source-data-freshness).
2 changes: 1 addition & 1 deletion website/docs/reference/source-configs.md
@@ -121,7 +121,7 @@ sources:

Sources can be configured via a `config:` block within their `.yml` definitions, or from the `dbt_project.yml` file under the `sources:` key. This configuration is most useful for configuring sources imported from [a package](/docs/build/packages).

-You can disable sources imported from a package to prevent them from rendering in the documentation, or to prevent [source freshness checks](/docs/build/sources#snapshotting-source-data-freshness) from running on source tables imported from packages.
+You can disable sources imported from a package to prevent them from rendering in the documentation, or to prevent [source freshness checks](/docs/build/sources#source-data-freshness) from running on source tables imported from packages.

- **Note**: To disable a source table nested in a YAML file in a subfolder, you will need to supply the subfolder(s) within the path to that YAML file, as well as the source name and the table name in the `dbt_project.yml` file.<br /><br />
The following example shows how to disable a source table nested in a YAML file in a subfolder: