Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Custom relationship sorting #496

Draft
wants to merge 31 commits into
base: develop
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions changes/457.added
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
Added `sort_relationships()` helper function
Added tests for `sort_relationships()` helper function
Added call to `sort_relationships()` function in contrib `NautobotAdapter`
50 changes: 49 additions & 1 deletion docs/dev/jobs.md
Original file line number Diff line number Diff line change
Expand Up @@ -168,7 +168,7 @@ The methods [`calculate_diff`][nautobot_ssot.jobs.base.DataSyncBaseJob.calculate
Optionally, on your Job class, also implement the [`lookup_object`][nautobot_ssot.jobs.base.DataSyncBaseJob.lookup_object], [`data_mapping`][nautobot_ssot.jobs.base.DataSyncBaseJob.data_mappings], and/or [`config_information`][nautobot_ssot.jobs.base.DataSyncBaseJob.config_information] APIs (to provide more information to the end user about the details of this Job), as well as the various metadata properties on your Job's Meta inner class. Refer to the example Jobs provided in this Nautobot app for examples and further details.
Install your Job via any of the supported Nautobot methods (installation into the `JOBS_ROOT` directory, inclusion in a Git repository, or packaging as part of an app) and it should automatically become available!

### Extra Step: Implementing `create`, `update` and `delete`
### Extra Step 1: Implementing `create`, `update` and `delete`

If you are synchronizing data _to_ Nautobot and not _from_ Nautobot, you can entirely skip this step. The `nautobot_ssot.contrib.NautobotModel` class provides this functionality automatically.

Expand All @@ -179,3 +179,51 @@ If you need to perform the `create`, `update` and `delete` operations on the rem

!!! warning
Special care should be taken when synchronizing new Devices with children Interfaces into a Nautobot instance that also defines Device Types with Interface components of the same name. When the new Device is created in Nautobot, its Interfaces will also be created as defined in the respective Device Type. As a result, when SSoT will attempt to create the children Interfaces loaded by the remote adapter, these will already exist in the target Nautobot system. In this scenario, if not properly handled, the sync will fail! Possible remediation steps may vary depending on the specific use-case, therefore this is left as an exercise to the reader/developer to solve for their specific context.

### Extra Step 2: Sorting Many-to-Many and Many-to-One Relationships
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is still using the old approach, is that correct? Not a problem for now since its a draft MR, I just want to make sure I understand the functionality correct.


If you are not syncing any many-to-many relationships (M2M) or many-to-one (N:1) relationships from the many side, you can skip this step.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't it a proper noun, ie Many-to-Many and Many-to-One?


Loading M2M and N:1 relationship data from source and target destinations are typically not in the same order as each other. For example, the order of a device's interfaces from the source data may differ compared to the order Nautobot loads the data.

To resolve this, each relationships must be properly sorted before the source and target are compared against eachater. An additional attribute called `sorted_relationships` must be defined in both the source and target adapters. This attribute must be identical between both adapters.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
To resolve this, each relationships must be properly sorted before the source and target are compared against eachater. An additional attribute called `sorted_relationships` must be defined in both the source and target adapters. This attribute must be identical between both adapters.
To resolve this, each relationships must be properly sorted before the source and target are compared against each other. An additional attribute called `sorted_relationships` must be defined in both the source and target adapters. This attribute must be identical between both adapters.


M2M and N:1 relationships are stored in the DiffSync store as a list of dictionaries. To sort a list of dictionaries, we must specify a dictionary key to sort by and do so by using code similar to the following:

```
for obj in diffsync.get_all("model_name"):
sorted_data = sorted(
obj.attribute_name,
key=lamda x: x["sort_by_key_name"]
)
obj.attribute_name = sorted_data
diffsync.update(obj)
```

The `sorted_relationships` attribute was added is a tuple of tuples. Each entry must have three string values indicating:

1. Name of the model with attribute to be sorted
2. Attribute within the model to sort
3. Dictionary key name to sort by

The helper function `sort_relationships` has been added to contrib to assist in sorting relationships. The `NautobotAdapter` will automatically call this function and process any entries added to `sorted_relationships`.

For integrations other than the `NautobotAdapter`, you must also import and add the `sort_relationships` into into the `load()` method and simply pass the DiffSync/Adapter object through using `self`. This must be done after all other loading logic is completed.

Example:
```
from nautobot_ssot.contrib import sort_relationships

class SourceAdapter(DiffSync):

sorted_relationships = (
("tenant", "relationship", "name"),
)

def load(self):
...
# Primary load logic
...
sort_relationships(self)
```

48 changes: 48 additions & 0 deletions docs/user/modeling.md
Original file line number Diff line number Diff line change
Expand Up @@ -167,6 +167,54 @@ Through us defining the model, Nautobot will now be able to dynamically load IP
!!! note
Although `Interface.ip_addresses` is a generic relation, there is only one content type (i.e. `ipam.ipaddress`) that may be related through this relation, therefore we don't have to specific this in any way.

## Sorting Relationships
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks up-to-date, right? So it should be an accurate representation of how to use this feature?


One-to-many (from the one side) and many-to-many relationships are typically stored as lists in the DiffSync object. A problem with false updates during the diff arises when the data source and target load these lists in a different order from each other.

### The DiffSync Model

The first step is to add the appropriate metadata to the attributes in the DiffSync Model using annotations. The `FieldType` enum is used to identify what fields in the model are to be sorted.

```python
from typing import Annotated, List

from nautobot.dcim.models import Location
from nautobot_ssot.contrib import NautobotModel
from nautobot_ssot.contrib.types import FieldType


class TagDict(TypedDict):
name: str
color: str


class LocationModel(NautobotModel):
_model = Location
_modelname = "location"
_identifiers = ("name",)
_attributes("tags",)

name: str
tags: Annotated[List[TagDict], FieldType.SORTED_FIELD]
```

A model attribute will not be sorted without the `FieldType.SORTED_FIELD` in the annotation. For lists of simple objects (such as strings, integers, etc.), no additional configuration is required. The sorting process is done automatically.

### Sorting Dictionaries

Sorting lists of more complex objects requires an additional configuration to properly sort. In the case of dictionaries, you must specify which key the list of dictionaries should be sorted by.

The existing DiffSync process uses lists of `TypedDict` objects to represent some one-to-many and many-to-many relationships. Looking at the example from above, we can see that the `TagDict` class doesn't have any specification as to which key entry the list of `TagDict` objects should be sorted by. We'll update the class by adding the `FieldType.SORT_BY` marker as follows:

```python
...

class TagDict(TypedDict):
name: Annotated[str, FieldType.SORT_BY]
color: str

...
```

## Filtering Objects Loaded From Nautobot

Expand Down
7 changes: 5 additions & 2 deletions nautobot_ssot/contrib/adapter.py
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,7 @@
from typing import DefaultDict, Dict, FrozenSet, Hashable, Tuple, Type, get_args

import pydantic
from diffsync import Adapter
from diffsync import DiffSync
from diffsync.exceptions import ObjectCrudException
from django.contrib.contenttypes.models import ContentType
from django.db.models import Model
Expand All @@ -34,7 +34,7 @@
ParameterSet = FrozenSet[Tuple[str, Hashable]]


class NautobotAdapter(Adapter):
class NautobotAdapter(DiffSync):
"""
Adapter for loading data from Nautobot through the ORM.

Expand All @@ -44,6 +44,7 @@ class NautobotAdapter(Adapter):
# This dictionary acts as an ORM cache.
_cache: DefaultDict[str, Dict[ParameterSet, Model]]
_cache_hits: DefaultDict[str, int] = defaultdict(int)
sorted_relationships = ()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand the newest version correctly, this field is no longer necessary, is that correct?


def __init__(self, *args, job, sync=None, **kwargs):
"""Instantiate this class, but do not load data immediately from the local system."""
Expand Down Expand Up @@ -174,6 +175,8 @@ def load(self):
# for this specific model class as well as its children without returning anything.
self._load_objects(diffsync_model)

# sort_relationships(self)

def _get_diffsync_class(self, model_name):
"""Given a model name, return the diffsync class."""
try:
Expand Down
105 changes: 105 additions & 0 deletions nautobot_ssot/contrib/sorting.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,105 @@
"""Functions for sorting DiffSync model lists ensuring they are sorted to prevent false actions."""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a lot of this and of the existing base code should at some point be refactored to use annotationlib - this is not for now though, maybe you can just leave a comment?


from diffsync import Adapter, DiffSyncModel
from typing_extensions import get_type_hints

from nautobot_ssot.contrib.types import FieldType


def is_sortable_field(attribute_type_hints):
"""Checks type hints to verify if field labled as sortable or not."""
if attribute_type_hints.__name__ != "Annotated" or not hasattr(attribute_type_hints, "__metadata__"):
return False
for metadata in attribute_type_hints.__metadata__:
if metadata == FieldType.SORTED_FIELD:
return True
return False


def get_sortable_obj_type(attribute_type_hints):
"""Get the object type of a sortable list based on the type hints."""
if not hasattr(attribute_type_hints, "__args__"):
return None
if not attribute_type_hints.__args__:
return None
attr_type = attribute_type_hints.__args__[0]
if not hasattr(attr_type, "__args__"):
return None
attr_type_args = getattr(attr_type, "__args__")
if attr_type_args:
return attr_type_args[0]
return None


def get_sortable_obj_sort_key(sortable_obj_type):
"""Get the sort key from a TypedDict type if set in the metadata."""
content_obj_type_hints = get_type_hints(sortable_obj_type, include_extras=True)
for key, value in content_obj_type_hints.items():
if not value.__name__ == "Annotated":
continue
for metadata in getattr(value, "__metadata__", ()):
if metadata == FieldType.SORT_BY:
return key
return None


def get_sortable_fields_from_model(model: DiffSyncModel):
"""Get a list of sortable fields and their sort key from a DiffSync model."""
sortable_fields = []
model_type_hints = get_type_hints(model, include_extras=True)

for attribute_name in model._attributes: # pylint: disable=protected-access
attribute_type_hints = model_type_hints.get(attribute_name)
if not is_sortable_field(attribute_type_hints):
continue

sortable_obj_type = get_sortable_obj_type(attribute_type_hints)
sort_key = get_sortable_obj_sort_key(sortable_obj_type)

sortable_fields.append(
{
"attribute": attribute_name,
"sort_key": sort_key,
}
)
return sortable_fields


def sort_diffsync_object(obj, attribute, key):
"""Update the sortable attribute in a DiffSync object."""
sorted_data = None
if key:
sorted_data = sorted(
getattr(obj, attribute),
key=lambda x: x[key],
)
else:
sorted_data = sorted(getattr(obj, attribute))
if sorted_data:
setattr(obj, attribute, sorted_data)
return obj


def sort_relationships(source: Adapter, target: Adapter):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we loop over every single object here - have you done some tests on performance implications? I guess if you don't have any sortable fields (i.e. your integration was built prior to the introduction of this mechanism and you haven't annotated any fields) the performance difference should be negligible, right?

"""Sort relationships based on the metadata defined in the DiffSync model."""
if not source or not target:
return
# Loop through Top Level entries
for level in getattr(target, "top_level", []):
# Get the DiffSync Model
model = getattr(target, level)
if not model:
continue

# Get sortable fields from model
sortable_fields = get_sortable_fields_from_model(target)
if not sortable_fields:
continue

for sortable in sortable_fields:
attribute = sortable["attribute"]
key = sortable["sort_key"]

for adapter in (source, target):
for obj in adapter.get_all(attribute):
adapter.update(sort_diffsync_object(obj, attribute, key))
22 changes: 12 additions & 10 deletions nautobot_ssot/contrib/typeddicts.py
Original file line number Diff line number Diff line change
@@ -1,32 +1,34 @@
"""Common TypedDict definitions used in Many-to-Many relationships."""

from typing import TypedDict
from typing import Annotated, TypedDict

from nautobot_ssot.contrib.types import FieldType


class ContentTypeDict(TypedDict):
"""TypedDict for Django Content Types."""

app_label: str
model: str
model: Annotated[str, FieldType.SORT_BY]


class TagDict(TypedDict):
"""TypedDict for Nautobot Tags."""

name: str
name: Annotated[str, FieldType.SORT_BY]


class LocationDict(TypedDict):
"""TypedDict for DCIM Locations."""

name: str
name: Annotated[str, FieldType.SORT_BY]
parent__name: str


class DeviceDict(TypedDict):
"""TypedDict for DCIM Devices."""

name: str
name: Annotated[str, FieldType.SORT_BY]
location__name: str
tenant__name: str
rack__name: str
Expand All @@ -40,35 +42,35 @@ class DeviceDict(TypedDict):
class InterfaceDict(TypedDict):
"""TypedDict for DCIM INterfaces."""

name: str
name: Annotated[str, FieldType.SORT_BY]
device__name: str


class PrefixDict(TypedDict):
"""TypedDict for IPAM Prefixes."""

network: str
network: Annotated[str, FieldType.SORT_BY]
prefix_length: int
namespace__name: str


class VLANDict(TypedDict):
"""TypedDict for IPAM VLANs."""

vid: int
vid: Annotated[int, FieldType.SORT_BY]
vlan_group__name: str


class IPAddressDict(TypedDict):
"""TypedDict for IPAM IP Address."""

host: str
host: Annotated[str, FieldType.SORT_BY]
prefix_length: int


class VirtualMachineDict(TypedDict):
"""TypedDict for IPAM ."""

name: str
name: Annotated[str, FieldType.SORT_BY]
cluster__name: str
tenant__name: str
10 changes: 10 additions & 0 deletions nautobot_ssot/contrib/types.py
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,16 @@
from typing import Optional


class FieldType(Enum):
"""Enum to specify field types for DiffSynModels and TypedDicts."""

# Indicates if a model field is to be sorted
SORTED_FIELD = 1

# Indicates what field to sort by for lists of dictionaries
SORT_BY = 2


class RelationshipSideEnum(Enum):
"""This details which side of a custom relationship the model it's defined on is on."""

Expand Down
4 changes: 4 additions & 0 deletions nautobot_ssot/jobs/base.py
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@
from nautobot.extras.jobs import BooleanVar, DryRunVar, Job

from nautobot_ssot.choices import SyncLogEntryActionChoices
from nautobot_ssot.contrib.sorting import sort_relationships
from nautobot_ssot.models import BaseModel, Sync, SyncLogEntry

DataMapping = namedtuple("DataMapping", ["source_name", "source_url", "target_name", "target_url"])
Expand Down Expand Up @@ -180,6 +181,9 @@ def record_memory_trace(step: str):
if memory_profiling:
record_memory_trace("target_load")

# Sorting relationships must be done before calculating diffs.
sort_relationships(self.source_adapter, self.target_adapter)

self.logger.info("Calculating diffs...")
self.calculate_diff()
calculate_diff_time = datetime.now()
Expand Down
1 change: 1 addition & 0 deletions nautobot_ssot/tests/contrib/__init__.py
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
"""Unit tests for contrib."""
Loading
Loading