-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reorganize fixtures and implement a happy path test for semantic_manifest #9930
Conversation
Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the contributing guide. |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #9930 +/- ##
==========================================
- Coverage 88.13% 88.06% -0.07%
==========================================
Files 178 180 +2
Lines 22449 22546 +97
==========================================
+ Hits 19785 19856 +71
- Misses 2664 2690 +26
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
@@ -42,7 +42,7 @@ class NodeRelation(dbtClassMixin): | |||
alias: str | |||
schema_name: str # TODO: Could this be called simply "schema" so we could reuse StateRelation? | |||
database: Optional[str] = None | |||
relation_name: Optional[str] = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we using this field anywhere?
If not, might be OK to drop similar to #9794
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will look up but @QMalcolm if you know anything about it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's used in a couple places in Metricflow:
- convert_semantic_model.py::create_sql_source_data_set
- data_warehouse_model_validator.py::gen_semantic_model_tasks
Thus it is important for us to power it, and we should not drop it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like a big improvement! Good to see that manifest construction code moved out into a reusable fixture.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great! I had small question on the change in semantic_model.py
and some small cleanup requests 🙂
@@ -4,6 +4,9 @@ | |||
from dbt.artifacts.resources.types import NodeType | |||
from dbt.contracts.graph.nodes import SourceDefinition | |||
|
|||
# All manifest related fixtures. | |||
from tests.unit.utils.manifest import * # noqa |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚀
@@ -42,7 +42,7 @@ class NodeRelation(dbtClassMixin): | |||
alias: str | |||
schema_name: str # TODO: Could this be called simply "schema" so we could reuse StateRelation? | |||
database: Optional[str] = None | |||
relation_name: Optional[str] = None | |||
relation_name: Optional[str] = "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is focused on tests and fixtures for tests. Was the default value of relation_name
changed to make testing easier, or was there a functional purpose for the change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So after this got fixed we started seeing test failures that take the form of
______________________ TestSemanticManifest.test_validate ______________________
self = <tests.unit.contracts.graph.test_semantic_manifest.TestSemanticManifest object at 0x7f52df26c040>
manifest = Manifest(nodes={'model.test.metricflow_time_spine': ModelNode(database='dbt', schema='analytics', name='metricflow_tim...d_path_count=0, static_analysis_path_count=0), _lock=<Lock(owner=None)>, _macros_by_name=None, _macros_by_package=None)
def test_validate(self, manifest):
sm_manifest = SemanticManifest(manifest)
> assert sm_manifest.validate()
tests/unit/contracts/graph/test_semantic_manifest.py:28:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
core/dbt/contracts/graph/semantic_manifest.py:42: in validate
semantic_manifest = self._get_pydantic_semantic_manifest()
core/dbt/contracts/graph/semantic_manifest.py:69: in _get_pydantic_semantic_manifest
PydanticSemanticModel.parse_obj(semantic_model.to_dict())
.tox/unit/lib/python3.8/site-packages/pydantic/v1/main.py:526: in parse_obj
return cls(**obj)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
__pydantic_self__ = PydanticSemanticModel()
data = {'config': {'enabled': True, 'group': None, 'meta': {}}, 'created_at': 1713305259.7969077, 'defaults': None, 'depends_on': {'macros': [], 'nodes': []}, ...}
values = {'defaults': None, 'description': 'Customer entity', 'dimensions': [], 'entities': [], ...}
fields_set = {'defaults', 'description', 'dimensions', 'entities', 'label', 'measures', ...}
validation_error = ValidationError(model='PydanticSemanticModel', errors=[{'loc': ('node_relation', 'relation_name'), 'msg': 'none is not an allowed value', 'type': 'type_error.none.not_allowed'}])
def __init__(__pydantic_self__, **data: Any) -> None:
"""
Create a new model by parsing and validating input data from keyword arguments.
Raises ValidationError if the input data cannot be parsed to form a valid model.
"""
# Uses something other than `self` the first arg to allow "self" as a settable attribute
values, fields_set, validation_error = validate_model(__pydantic_self__.__class__, data)
if validation_error:
> raise validation_error
E pydantic.v1.error_wrappers.ValidationError: 1 validation error for PydanticSemanticModel
E node_relation -> relation_name
E none is not an allowed value (type=type_error.none.not_allowed)
.tox/unit/lib/python3.8/site-packages/pydantic/v1/main.py:341: ValidationError
=============================== warnings summary ===============================
tests/unit/test_query_headers.py:44
/home/runner/work/dbt-core/dbt-core/tests/unit/test_query_headers.py:44: DeprecationWarning: invalid escape sequence \/
self.assertTrue(re.match(f"^\/\*.*\*\/\n{self.query}$", sql)) # noqa: [W605]
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
---------- coverage: platform linux, python 3.8.18-final-0 -----------
Coverage XML written to file coverage.xml
=========================== short test summary info ============================
FAILED tests/unit/contracts/graph/test_semantic_manifest.py::TestSemanticManifest::test_validate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test that failed, TestSemanticManifest.test_validate
, is a new test added by this PR. The question becomes, is this an error in the code or the test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like the protocol for NodeRelation does have node_relation
as a required string. The PydanticNodeRelation has a specific pydantic validator for handling node_relation 🤔 So why do we in core have it as optional? I should be the person that knows this 🙃
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haha yes, maybe this does deserve a separate PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updating the default value of relation_name
to ""
in the class definition should be fine. The NodeRelation
class is only instantiated on a SemanticModel
in one place, and it passes in a value for relation_name
.
from dbt.contracts.graph.semantic_manifest import SemanticManifest | ||
|
||
# Request fixtures for manifest, this is simialr to import * in the file. | ||
# pytest_plugins = ("tests.unit.utils.manifest",) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that we've moved the import to conftest.py
, can we drop these comments?
# Request fixtures for manifest, this is simialr to import * in the file. | ||
pytest_plugins = ("tests.unit.utils.manifest",) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe with the update to conftest.py
this is no longer needed 🤔
@QMalcolm Done |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shweet! Let's get it shipped 🚢
More discussion on current test failure in my previous comment questioning the change |
resolves #9665
Still might move things around a bit but would love some eyes on it about fixtures structure etc.
This patch moved the existing fixtures to build a manifest around and make it easier to overwrite what nodes are included, while keeping the existing default.
We then used the fixtures to generate a semantic manifest that has both metrics and semantic models test more things in the validate function.
A screenshot of the new unit test coverage for Semantic manifest
Problem
Solution
Checklist