Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(kad): use Bytes for Record::value #4753

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

joshuef
Copy link
Contributor

@joshuef joshuef commented Oct 28, 2023

Description

This should help avoid potentially costly clones over Record.

Notes & open questions

Change checklist

  • I have performed a self-review of my own code
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • A changelog entry has been made in the appropriate crates

@joshuef
Copy link
Contributor Author

joshuef commented Oct 28, 2023

Similar in scope to #4751 This should make end user handling of Records much lighter

@joshuef joshuef changed the title fix(gossip): convert kad record.value to Bytes. feat(kad): convert kad record.value to Bytes. Oct 28, 2023
Copy link
Contributor

@thomaseizinger thomaseizinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, one more suggestion :)

protocols/kad/src/behaviour/test.rs Outdated Show resolved Hide resolved
@thomaseizinger thomaseizinger changed the title feat(kad): convert kad record.value to Bytes. feat(kad): use Bytes for Record::value Oct 29, 2023
@joshuef
Copy link
Contributor Author

joshuef commented Oct 30, 2023

Updated 🙇

@@ -314,7 +314,7 @@ pub enum KadResponseMsg {
/// The key of the record.
key: record::Key,
/// Value of the record.
value: Vec<u8>,
value: Bytes,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just explored this a bit more and KadResponseMsg is actually never cloned so we can probably just remove the trait and this here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah sorry I think I posted this to the wrong thread here.

@@ -422,7 +422,7 @@ fn resp_msg_to_proto(kad_msg: KadResponseMsg) -> proto::Message {
key: key.to_vec(),
record: Some(proto::Record {
key: key.to_vec(),
value,
value: value.to_vec(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as with the gossipsub branch, I am a bit doubtful that this actually yields improvements because we are copying here again.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, the knock on is that we're reading from the Record which is and can be cloned in kad and user space. So we'll end up allocating to just have vecs in here ( if the Record contains Bytes for the value which I think it should...).

That comes in

HandlerIn::PutRecordRes {
                    key: record.key,
                    value: record.value, // <- we'd have to convert _from_ Bytes here,
                    request_id,
                },```

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry the above was for another comment response.

We are still having to convert to vec here, but that should become the only clone, vs a variety of cloneing that happens in the kad layers at the moment.

So this should still yield reduce mem.

Copy link
Contributor

mergify bot commented Nov 2, 2023

This pull request has merge conflicts. Could you please resolve them @joshuef? 🙏

@joshuef
Copy link
Contributor Author

joshuef commented Nov 3, 2023

Small update/rebase there.

Copy link
Contributor

mergify bot commented Nov 5, 2023

This pull request has merge conflicts. Could you please resolve them @joshuef? 🙏

@joshuef
Copy link
Contributor Author

joshuef commented Nov 8, 2023

What's the proper way w/r/t version bumping here? Should I be bumping in the cargo.toml already? Or just noting the unreleased nature in the changelog?

Copy link
Contributor

mergify bot commented Nov 10, 2023

This pull request has merge conflicts. Could you please resolve them @joshuef? 🙏

@joshuef
Copy link
Contributor Author

joshuef commented Nov 13, 2023

@thomaseizinger Are there any blockers to landing this at the moment? Are we looking at the protobuf Cow work as a prereq? (I'm a bit stalled there, not sure on a decent approach... I've commented on the issue for that)

@thomaseizinger
Copy link
Contributor

@thomaseizinger Are there any blockers to landing this at the moment? Are we looking at the protobuf Cow work as a prereq? (I'm a bit stalled there, not sure on a decent approach... I've commented on the issue for that)

Two things:

  1. It is a breaking change because Record and all of its fields are public.
  2. I am not convinced we are reducing the number of allocations. Doesn't this PR as is just delay them, assuming we ignore user-space? I'd prefer a more holistic solution that we can properly benchmark to be honest.

As a first step, we should probably prepare for making all fields of Record private and provide accessor / conversion functions. That will give us freedom down the line to make changes to the internals.

I understand that you probably don't want to wait for that to land as we've just cut a breaking release :)
If you don't mind, I would appreciate if you could explore within your fork on what changes you need to properly cut down on allocations within libp2p-kad. I'd assume that migrating the protobuf's to use borrowed data is definitely part of that solution.

@joshuef
Copy link
Contributor Author

joshuef commented Nov 14, 2023

That's fine, I can explore a bit more easily now that we're atop 0.53 in our repo anyway. I can understand wanting to know more concretely what's afoot and the benefits we'd get here. Do you have a recommended test setup/bench for allocations at the moment? (You can something in another thread, but I don't recall where that was).

Also makes sense w/r/t Record APIs 👍


Doesn't this PR as is just delay them

I don't think so? It converts every kad-space clone of Record into something much cheaper (I can see four points where we record.clone() in kad, eg.). So that's allocs avoided there, even if we still need to realloc at the moment for protobuf.

Plus any user space gains when handling Records?

I'll try and form a minimal change from our codebase (with the libp2p branch being the only diff) and get some numbers / repro steps if that's helpful 👍

@thomaseizinger
Copy link
Contributor

I'll try and form a minimal change from our codebase (with the libp2p branch being the only diff) and get some numbers / repro steps if that's helpful 👍

That would be great!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants