Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Custom relationship sorting #496

Draft
wants to merge 31 commits into
base: develop
Choose a base branch
from

Conversation

Renrut5
Copy link
Contributor

@Renrut5 Renrut5 commented Aug 1, 2024

Closes #457

@Renrut5 Renrut5 requested a review from a team as a code owner August 1, 2024 19:49
@Renrut5 Renrut5 requested a review from Kircheneer August 1, 2024 19:50
@Renrut5 Renrut5 marked this pull request as draft August 1, 2024 19:50
@Renrut5
Copy link
Contributor Author

Renrut5 commented Aug 1, 2024

@Kircheneer this is an idea of mine for sorting out relationships using contrib and avoiding false updates. The function can be used by both the Contrib NautobotAdapter and also by custom adapters for implementations.

The only thing that is needed for the NautobotAdapter is to specify the _sorted_relationships attribute, which should be the same value in both source and target adapters.

The custom adapters will need to be sure to import and call the function in load().

Thoughts?

@Renrut5 Renrut5 marked this pull request as ready for review August 5, 2024 13:24
@Kircheneer
Copy link
Contributor

What do you think of making this the default behaviour? I don't think to-many relationships are ordered ever, so maybe we can just default to this?

@Kircheneer
Copy link
Contributor

In this case though, I would probably like to do the sorting right in the loading

@jdrew82 jdrew82 added type: enhancement New feature or request integration: contrib Contrib related issues and PRs labels Aug 21, 2024
@Renrut5
Copy link
Contributor Author

Renrut5 commented Oct 10, 2024

@Kircheneer the issue is when the source and target load to-many relationships into an unordered list as is typically done. The problem is when that order varies between source and target. When that happens, the SSoT process gives a false update. All the items in the two comparing lists are exactly the same except for the order of the items themselves.

That's what this particular PR aims to resolve. The challenge is the to-many relationships must be ordered after everything is loaded into the DiffSync store and must be done on both source and target adapters to ensure they both match.

@Kircheneer
Copy link
Contributor

So I have been thinking about this again, and I don't see a scenario in which we would not want to do this. Do you think we can omit defining sorted_relationships on the model and just automatically sort anything that is a list?

Also, since you have tuples in here, what kind of use case is there for using tuples with contrib? I don't think I have hit this before, hence the asking.

Thanks!

@Renrut5
Copy link
Contributor Author

Renrut5 commented Oct 29, 2024

@Kircheneer Have been thinking about this. Not sure if omitting the sorted_relationships is the best idea.

The challenge comes when sorting the actual list of dictionaries, you need to define what key of the dictionaries to sort by. Not every dictionary has a name key, and so each relationship would need it's on configuration.

Unless there is a different way to sort the relationships that would be better?

@Kircheneer
Copy link
Contributor

You're right, I totally missed that. I want to try and have less implicit configuration (like the first element of the tuples in this list matching the field name on the model) and move towards explicit configuration. Do you think it would be possible that we somehow encode this information into the TypedDict we have to define anyway for the relationship? Say for example

class TenantDict(TypedDict):
    name: typing.Annotated[str, FieldType.SortBy]

where Fieldtype is a new Enum?

Let me know what you think of the approach, I know it makes the implementation a bit more complicated but the user interface is cleaner in my opinion.

@Renrut5
Copy link
Contributor Author

Renrut5 commented Nov 12, 2024

@Kircheneer what if we did the annotation in the model creation? I think I have a working case on this.

@dataclass
class SortedRelationship:
    sort_key: str

class MyModel(NautobotModel):
    ...
    relationship_field: typing.Annotated[list[dict], SortedRelationship("name")]

@Kircheneer
Copy link
Contributor

I thought about that option, too, but didn't like how you'd end up having to define two data structures (TypedDict + Dataclass) for the same relationship

@Renrut5
Copy link
Contributor Author

Renrut5 commented Nov 13, 2024

@Kircheneer my main issue with including the annotation into the TypedDict is that every entry in that list will have that same metadata (so it'll be repeated a significant number of times), and more importantly is that you'd have to pull the first item from the list, then search it each key in the dictionary for the appropriate annotation to sort by it. That would make the code more complex and difficult to follow than putting the annotation into the model, even if it is two data structures in the annotation.

@Kircheneer
Copy link
Contributor

@Kircheneer my main issue with including the annotation into the TypedDict is that every entry in that list will have that same metadata (so it'll be repeated a significant number of times)

The actual instances of the typed dict wouldn't have the metadata, you would only define it once on the definition of the typed dict, or am I missing something?

and more importantly is that you'd have to pull the first item from the list, then search it each key in the dictionary for the appropriate annotation to sort by it

I think you would only have to look at X in m2m_field: List[X] where class X(TypedDict), would you?

@Renrut5
Copy link
Contributor Author

Renrut5 commented Dec 30, 2024

@Kircheneer what are your thoughts on the changes made since last review?

Basically, I have it set to where sort_relationships will loop throug hthe top_level attribute of the target adapter. It will then check each entry for attributes that have the FieldType.SORTED_FIELD enum in the metadata.

If it finds it, it'll sort then look for a key inside the dictionary, if applicable, and look for FieldType.SORT_KEY enum. This will sort the list of dictionaries by that key.

For example:

class TagsDict(TypedDict):
   name: Annotated[str, FieldType.SORT_KEY]

class NautobotTenantModel(NautobotModel):
   _attributes = ("name", "tags",)
   tags: Annotated[List[str], FieldType.SORT_KEY] = []

@Renrut5 Renrut5 marked this pull request as draft December 31, 2024 20:30
Copy link
Contributor

@Kircheneer Kircheneer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple clarifying comments, but the direction looks good, thanks!

@@ -44,6 +44,7 @@ class NautobotAdapter(Adapter):
# This dictionary acts as an ORM cache.
_cache: DefaultDict[str, Dict[ParameterSet, Model]]
_cache_hits: DefaultDict[str, int] = defaultdict(int)
sorted_relationships = ()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand the newest version correctly, this field is no longer necessary, is that correct?

@@ -0,0 +1,105 @@
"""Functions for sorting DiffSync model lists ensuring they are sorted to prevent false actions."""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a lot of this and of the existing base code should at some point be refactored to use annotationlib - this is not for now though, maybe you can just leave a comment?

return obj


def sort_relationships(source: Adapter, target: Adapter):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we loop over every single object here - have you done some tests on performance implications? I guess if you don't have any sortable fields (i.e. your integration was built prior to the introduction of this mechanism and you haven't annotated any fields) the performance difference should be negligible, right?

@@ -179,3 +179,51 @@ If you need to perform the `create`, `update` and `delete` operations on the rem

!!! warning
Special care should be taken when synchronizing new Devices with children Interfaces into a Nautobot instance that also defines Device Types with Interface components of the same name. When the new Device is created in Nautobot, its Interfaces will also be created as defined in the respective Device Type. As a result, when SSoT will attempt to create the children Interfaces loaded by the remote adapter, these will already exist in the target Nautobot system. In this scenario, if not properly handled, the sync will fail! Possible remediation steps may vary depending on the specific use-case, therefore this is left as an exercise to the reader/developer to solve for their specific context.

### Extra Step 2: Sorting Many-to-Many and Many-to-One Relationships
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is still using the old approach, is that correct? Not a problem for now since its a draft MR, I just want to make sure I understand the functionality correct.

@@ -167,6 +167,54 @@ Through us defining the model, Nautobot will now be able to dynamically load IP
!!! note
Although `Interface.ip_addresses` is a generic relation, there is only one content type (i.e. `ipam.ipaddress`) that may be related through this relation, therefore we don't have to specific this in any way.

## Sorting Relationships
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks up-to-date, right? So it should be an accurate representation of how to use this feature?

Copy link
Contributor

@jdrew82 jdrew82 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few things I noticed.


Loading M2M and N:1 relationship data from source and target destinations are typically not in the same order as each other. For example, the order of a device's interfaces from the source data may differ compared to the order Nautobot loads the data.

To resolve this, each relationships must be properly sorted before the source and target are compared against eachater. An additional attribute called `sorted_relationships` must be defined in both the source and target adapters. This attribute must be identical between both adapters.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
To resolve this, each relationships must be properly sorted before the source and target are compared against eachater. An additional attribute called `sorted_relationships` must be defined in both the source and target adapters. This attribute must be identical between both adapters.
To resolve this, each relationships must be properly sorted before the source and target are compared against each other. An additional attribute called `sorted_relationships` must be defined in both the source and target adapters. This attribute must be identical between both adapters.


### Extra Step 2: Sorting Many-to-Many and Many-to-One Relationships

If you are not syncing any many-to-many relationships (M2M) or many-to-one (N:1) relationships from the many side, you can skip this step.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't it a proper noun, ie Many-to-Many and Many-to-One?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
integration: contrib Contrib related issues and PRs type: enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Custom relationships items ordering is not ignored when calculating changes
3 participants