Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor common revision code to pkg #16269

Merged
merged 1 commit into from
Oct 10, 2023
Merged

Conversation

dusk125
Copy link
Contributor

@dusk125 dusk125 commented Jul 18, 2023

Addresses #16159.

Moves the revision structure definition and related functions from package mvcc in the server, to a subpackage in pkg.

@dusk125 dusk125 force-pushed the refactor-revision branch 3 times, most recently from b230d3b to fadbbbb Compare July 18, 2023 17:31
Copy link
Member

@fuweid fuweid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes look good.

It might help to #15979.

server/storage/mvcc/kvstore.go Outdated Show resolved Hide resolved
Copy link
Member

@serathius serathius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This refactor incorrectly mixes revision as a transaction identifier and revision serialization used for mvcc. In the first case the revision has 17 bytes in second has 18 bytes as it includes tombstone.

The original code is no better, but at least it's private and separated, so changes in logic will not break etcdserver.

@serathius
Copy link
Member

serathius commented Jul 19, 2023

I would propose to do the refactoring in different order. Proposed steps:

  • Move server/storage/mvcc/kvstore.go revision/tombstone logic into server/storage/mvcc/revision.go. Functions like isTombstone and appendMarkTombstone.
  • Create a RevisionTombstone struct that will include information about tombstone.
  • Change BytesToRev, NewRevBytes and RevToBytes to use RevisionTombstone
  • And only then make the revision package public.

@ahrtr
Copy link
Member

ahrtr commented Jul 19, 2023

  1. We do need to make revision and related common method & functions public (under the pkg/v3), as we already explicitly document it in The Implicit Kubernetes-ETCD Contract. It can also reduce the previously copy-pasted duplicated code
    • The Revision struct and the const RevBytesLen = 8 + 1 + 8
    • Method GreatThan
    • Function RevToBytes and BytesToRev
  2. Keep all others, including tomestome, unchanged. It's open to discussion on how to improve it, but they are not necessarily to be done in this PR.

@serathius
Copy link
Member

serathius commented Jul 19, 2023

We do need to make revision and related common method & functions public (under the pkg/v3), as we already explicitly document it in The Implicit Kubernetes-ETCD Contract. It can also reduce the previously copy-pasted duplicated code

  • The Revision struct and the const RevBytesLen = 8 + 1 + 8
  • Method GreatThan
  • Function RevToBytes and BytesToRev

We don't need to do anything if it increases risk of bug. Code duplication is not inherently bad, but exposing internals without proper care is.

The The Implicit Kubernetes-ETCD Contract is totally unrelated. Please understand that revision as a concept for versioning changes is separate from how etcd stores revision on disk. Method GreatThan belongs to the first one, RevBytesLen, RevToBytes and BytesToRev belong to the second one. Mixing them together will lead to subtle bugs.

@ahrtr
Copy link
Member

ahrtr commented Jul 19, 2023

We don't need to do anything if it increases risk of bug. Code duplication is not inherently bad, but exposing internals without proper care is.

The The Implicit Kubernetes-ETCD Contract is totally unrelated. Please understand that revision as a concept for versioning changes is separate from how etcd stores revision on disk. Method GreatThan belongs to the first one, RevBytesLen, RevToBytes and BytesToRev belong to the second one. Mixing them together will lead to subtle bugs.

It partially makes sense to me. The revision storage format is just etcd's internal mechanism, and users do not care about it. So from this perspective, it doesn't make sense to expose them. But it's a common mechanism inside etcd, so a modification based on my above comment #16269 (comment) is to get them included in pkg/v3/internal/revision.

@serathius
Copy link
Member

serathius commented Jul 19, 2023

It partially makes sense to me. The revision storage format is just etcd's internal mechanism, and users do not care about it. So from this perspective, it doesn't make sense to expose them. But it's a common mechanism inside etcd, so a modification based on my above comment #16269 (comment) is to get them included in pkg/v3/internal/revision.

Right, I'm not against exposing the etcd storage mechanism. I'm postulating code should:

  • Make clear separation between the concept revision for versioning and the revision storage representation. Both can be exposed, but each clearly distinguished.
  • Include the tombstone into the storage representation. If we are exposing the etcd internal representation of revision, we need to remember that tombstone is part of this representation. You cannot read the bbolt Keys bucket and forget the tombstone.

@ahrtr
Copy link
Member

ahrtr commented Jul 19, 2023

  • Include the tombstone into the storage representation. If we are exposing the etcd internal representation of revision, we need to remember that tombstone is part of this representation. You cannot read the bbolt Keys bucket and forget the tombstone.

Agreed.

@ahrtr
Copy link
Member

ahrtr commented Jul 19, 2023

Based on above discussion, the actions:

  • Get the revision struct and related method & functions included in package pkg/v3/internal/revision;
  • Get tombstone related definitions included in pkg/v3/internal/revision as well.
    • markedRevBytesLen
    • markBytePosition
    • markTombstone
    • IsTombstone()
    • AppendMarkTombstone()

@fuweid
Copy link
Member

fuweid commented Jul 19, 2023

Include the tombstone into the storage representation. If we are exposing the etcd internal representation of revision, we need to remember that tombstone is part of this representation. You cannot read the bbolt Keys bucket and forget the tombstone.

Good point !

@dusk125
Copy link
Contributor Author

dusk125 commented Aug 3, 2023

I would propose to do the refactoring in different order. Proposed steps:

  • Move server/storage/mvcc/kvstore.go revision/tombstone logic into server/storage/mvcc/revision.go. Functions like isTombstone and appendMarkTombstone.
  • Create a RevisionTombstone struct that will include information about tombstone.
  • Change BytesToRev, NewRevBytes and RevToBytes to use RevisionTombstone
  • And only then make the revision package public.

@serathius just so I make sure I'm understanding you properly. Are you suggesting to have both mvcc.RevisionTombstone (for internal storage usage in mvcc) and revision.Revision for external usage in things like etcdutl?

@serathius
Copy link
Member

Yes, etcdutl reads data stored in Key bucket that might include revisions with tombstone. It needs to properly read and store the tombstone.

@dusk125 dusk125 force-pushed the refactor-revision branch 2 times, most recently from c0f09b3 to bb765fd Compare September 28, 2023 17:46
tools/.golangci.yaml Outdated Show resolved Hide resolved
@serathius
Copy link
Member

serathius commented Sep 29, 2023

Please note that mvcc is the package where Revision and BucketKey meet. Requests to mvcc be about a specific revisions, so it should use Revision struct. However when mvcc writes to bbolt it should use it's key so BucketKey. Before writing a revision it needs to package the Revision struct into BucketKey and decide whether to set the tombstone.

Possibly it would be more reasonable to split this PR into two, first one introduces Revision and BucketKey and the other one migrates etcdutl to them.

Copy link
Member

@serathius serathius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @ahrtr

server/storage/mvcc/revision.go Outdated Show resolved Hide resolved
server/storage/mvcc/revision.go Outdated Show resolved Hide resolved
@dusk125 dusk125 force-pushed the refactor-revision branch from 620bf51 to fe3535b Compare October 9, 2023 16:23
@ahrtr
Copy link
Member

ahrtr commented Oct 9, 2023

I understand that the BucketKey is representing the format stored in disk, while Revision is representing the structure in memory.

But I find it's confusing when reviewing the PR, and sometimes there is no clear boundary when to use which one (Revision vs BucketKey) and which functions (RevToBytes vs RevToBucketKey, BytesToRev vs BytesToBucketKey). I believe other contributors may be confused as well when they read or update the related source code.

Actually the markTombstone is only used in a couple of places, and appendMarkTombstone is only used in delete.

So I suggest to remove BucketKey to get rid of the unnecessary confusion, and keep the original functions isTombstone and appendMarkTombstone from pragmatic perspective.

server/storage/mvcc/kvstore.go Outdated Show resolved Hide resolved
server/storage/mvcc/kvstore.go Outdated Show resolved Hide resolved
server/storage/mvcc/kvstore.go Outdated Show resolved Hide resolved
server/storage/mvcc/kvstore_compaction.go Outdated Show resolved Hide resolved
server/storage/mvcc/kvstore_compaction_test.go Outdated Show resolved Hide resolved
@dusk125 dusk125 force-pushed the refactor-revision branch from fe3535b to 5691307 Compare October 9, 2023 18:18
server/storage/mvcc/kvstore_txn.go Outdated Show resolved Hide resolved
server/storage/mvcc/store.go Outdated Show resolved Hide resolved
server/storage/mvcc/store.go Outdated Show resolved Hide resolved
server/storage/mvcc/store.go Outdated Show resolved Hide resolved
server/storage/mvcc/store.go Outdated Show resolved Hide resolved
server/storage/mvcc/watchable_store.go Outdated Show resolved Hide resolved
server/storage/mvcc/watchable_store.go Outdated Show resolved Hide resolved
@dusk125 dusk125 force-pushed the refactor-revision branch from 5691307 to 395376d Compare October 9, 2023 18:53
@ahrtr
Copy link
Member

ahrtr commented Oct 9, 2023

Thanks @dusk125 for the quick update.

Now the BucketKey is only used in method (*storeTxnWrite) delete, so a more pragmatic approach is to remove BucketKey and keep the original appendMarkTombstone (Note I notice that the function isTombstone isn't removed :)). But I won't insist on that if others think the current implementation is good.

Overall looks good to me. Leave to @serathius to take another look.

@serathius serathius merged commit 16e19a9 into etcd-io:main Oct 10, 2023
27 checks passed
@dusk125 dusk125 deleted the refactor-revision branch October 10, 2023 13:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants