
Increase limit on light client response size #5461

Merged
merged 3 commits into paritytech:master from inc-light-client-sz-limit on Mar 31, 2020

Conversation

tomaka
Contributor

@tomaka tomaka commented Mar 30, 2020

I'm working on a pull request that switches to this new light client protocol, but I realized that none of the requests are going through on Polkadot because the maximum response size is too low.
Even the most basic responses are 1.7MiB in size, which is more than the default limit of 1MiB.

Unfortunately this also causes the connection to be force-closed (and immediately reopened afterwards) because of libp2p/rust-libp2p#1530

I'm opening this PR separately because we need it to be deployed on full nodes before the light client can switch to this new protocol.
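
For context, here is a minimal sketch of the failure mode. It is not the actual Substrate/libp2p networking code, and the names (read_length_prefixed, max_size) are made up for illustration: the receiving side reads a length prefix and rejects anything above its configured maximum, so with the default 1MiB limit a 1.7MiB response is never delivered.

use std::io::{self, Read};

/// Illustration only (not the real protocol code): read a 4-byte
/// big-endian length prefix, then refuse the payload if it exceeds
/// the configured maximum size.
fn read_length_prefixed(mut socket: impl Read, max_size: usize) -> io::Result<Vec<u8>> {
    let mut len_buf = [0u8; 4];
    socket.read_exact(&mut len_buf)?;
    let len = u32::from_be_bytes(len_buf) as usize;
    if len > max_size {
        // With max_size = 1024 * 1024, a 1.7MiB response ends up here,
        // which is why none of the light client requests go through.
        return Err(io::Error::new(io::ErrorKind::InvalidData, "response too large"));
    }
    let mut data = vec![0u8; len];
    socket.read_exact(&mut data)?;
    Ok(data)
}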

@tomaka tomaka added A2-insubstantial Pull request requires no code review (e.g., a sub-repository hash update). B0-silent Changes should not be mentioned in any release notes labels Mar 30, 2020
@tomaka tomaka added this to the 2.0 milestone Mar 30, 2020
@tomaka tomaka requested a review from twittner March 30, 2020 17:13
/// - max. pending requests = 128
/// - inactivity timeout = 15s
/// - request timeout = 15s
pub fn new(id: &ProtocolId) -> Self {
    let mut c = Config {
-       max_data_size: 1024 * 1024,
+       max_data_size: 64 * 1024 * 1024,
Member

Isn't 64MB a little too much? Any idea why the request is that big?

I would have assumed that #5179 would bring down the size.

Contributor

@twittner twittner left a comment

It would perhaps make sense to rename max_data_size to max_request_data_size and keep the 1 MiB limit (unless we expect requests to exceed this limit, which is quite unlikely I guess). Currently max_data_size is also used as a limit for the response data, and we could have a separate and larger limit here which is then used in line 1080.
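
A rough sketch of what that split could look like (field names and values here are only illustrative, not the actual code; the response limit reuses the 64MiB from the diff above):

/// Illustrative sketch of the suggested split: one limit for request
/// data, a separate and larger one applied where response data is read.
pub struct Config {
    pub max_request_data_size: usize,
    pub max_response_data_size: usize,
    // ... inactivity timeout, request timeout, max pending requests elided
}

impl Config {
    pub fn new() -> Self {
        Config {
            max_request_data_size: 1024 * 1024,        // keep 1 MiB for requests
            max_response_data_size: 64 * 1024 * 1024,  // larger limit for responses
        }
    }
}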

@tomaka
Contributor Author

tomaka commented Mar 31, 2020

For clarification: it's not the requests that are large, it's the responses.
I'm going to briefly print out how large requests and responses are and report back.

@tomaka
Contributor Author

tomaka commented Mar 31, 2020

So I have a little setup with "polkadotjs <-> light node <-> full node" on a dev chain.

The communications between the light and full nodes start with four identical RemoteCallRequest/RemoteCallResponses for Core_version (I assume there might be an issue with PolkadotJS here). The requests are 50 bytes, and responses 1766285 bytes (or 1.7MiB).

Then I see a series of RemoteReadRequests of sizes 70, 70, 110, 118, 185, 201, 259, 283, 333, 365, 407.
Their associated RemoteReadResponses have sizes 693, 637, 990, 1100, 990, 1100, 990, 1100, 990, 1100, 990.

Since the requests are mysteriously growing in size, here's the last read request before I shut down the node:

RemoteReadRequest { block: 0xc57a15d3ccbe287c280900683d54e56e4393725ff835dfc3133b1dfd8b86e1a1, header: Header { parent_hash: 0x480a81e0f042be3cbab310d7f6a0532a19c374ed4c5c5be20e92c546c948ac27, number: 36, state_root: 0x9e57b641a3e9e93e8654d0f8b06eb8122b82ad8dd7c769f50127f02165e2d6ee, extrinsics_root: 0xeb0941796e2bcbff3b5227680cef816a02e9425ba6281e1221c5bdc6ef72a325, digest: Digest { logs: [DigestItem::PreRuntime([66, 65, 66, 69], [2, 0, 0, 0, 0, 93, 7, 129, 31, 0, 0, 0, 0]), DigestItem::Consensus([66, 65, 66, 69], [1, 4, 212, 53, 147, 199, 21, 253, 211, 28, 97, 20, 26, 189, 4, 169, 159, 214, 130, 44, 133, 88, 133, 76, 205, 227, 154, 86, 132, 231, 165, 109, 162, 125, 1, 0, 0, 0, 0, 0, 0, 0, 2, 232, 2, 244, 166, 226, 138, 102, 132, 237, 9, 130, 42, 88, 216, 122, 74, 210, 211, 143, 83, 217, 31, 210, 129, 101, 20, 125, 168, 0, 36, 78]), DigestItem::Seal([66, 65, 66, 69], [22, 197, 137, 152, 234, 141, 156, 168, 104, 249, 238, 74, 35, 224, 132, 224, 81, 20, 225, 69, 220, 245, 37, 71, 144, 46, 198, 123, 178, 23, 41, 121, 227, 101, 224, 149, 117, 80, 87, 106, 64, 99, 103, 152, 19, 101, 222, 205, 193, 133, 229, 37, 86, 191, 99, 69, 209, 9, 203, 254, 163, 108, 96, 135])] } }, keys: [[95, 62, 73, 7, 247, 22, 172, 137, 182, 52, 125, 21, 236, 236, 237, 202, 62, 209, 75, 69, 237, 32, 208, 84, 240, 94, 55, 226, 84, 44, 254, 112, 81, 131, 102, 181, 177, 188, 124, 153, 212, 53, 147, 199, 21, 253, 211, 28, 97, 20, 26, 189, 4, 169, 159, 214, 130, 44, 133, 88, 133, 76, 205, 227, 154, 86, 132, 231, 165, 109, 162, 125], [95, 62, 73, 7, 247, 22, 172, 137, 182, 52, 125, 21, 236, 236, 237, 202, 62, 209, 75, 69, 237, 32, 208, 84, 240, 94, 55, 226, 84, 44, 254, 112, 229, 53, 38, 49, 72, 218, 175, 73, 190, 93, 219, 21, 121, 183, 46, 132, 82, 79, 194, 158, 120, 96, 158, 60, 175, 66, 232, 90, 161, 24, 235, 254, 11, 10, 212, 4, 181, 189, 210, 95], [95, 62, 73, 7, 247, 22, 172, 137, 182, 52, 125, 21, 236, 236, 237, 202, 62, 209, 75, 69, 237, 32, 208, 84, 240, 94, 55, 226, 84, 44, 254, 112, 14, 91, 224, 15, 188, 46, 21, 181, 254, 101, 113, 125, 173, 4, 71, 215, 21, 246, 96, 160, 165, 132, 17, 222, 80, 155, 66, 230, 239, 184, 55, 95, 86, 47, 88, 165, 84, 213, 134, 14], [95, 62, 73, 7, 247, 22, 172, 137, 182, 52, 125, 21, 236, 236, 237, 202, 62, 209, 75, 69, 237, 32, 208, 84, 240, 94, 55, 226, 84, 44, 254, 112, 166, 71, 231, 85, 195, 5, 33, 211, 142, 175, 4, 21, 22, 135, 115, 99, 38, 201, 254, 161, 126, 37, 252, 82, 135, 97, 54, 147, 201, 18, 144, 156, 178, 38, 170, 71, 148, 242, 106, 72], [95, 62, 73, 7, 247, 22, 172, 137, 182, 52, 125, 21, 236, 236, 237, 202, 62, 209, 75, 69, 237, 32, 208, 84, 240, 94, 55, 226, 84, 44, 254, 112, 221, 78, 63, 37, 245, 55, 138, 109, 144, 181, 171, 32, 92, 105, 116, 201, 234, 132, 27, 230, 136, 134, 70, 51, 220, 156, 168, 163, 87, 132, 62, 234, 207, 35, 20, 100, 153, 101, 254, 34]], retry_count: None }

After trying again, I don't see this increase anymore and I might have just been measuring the outcome of a normal PolkadotJS activity, it's kind of hard to tell.

@tomaka
Contributor Author

tomaka commented Mar 31, 2020

It seems that the limit for responses here is 1.7MiB, but keep in mind that this is a dev chain and not Kusama/Polkadot, and I'm not sure how they compare.

What about 16MiB for responses? This response size limit is mostly a way for light clients to know that they're not allocating an unreasonable amount of memory to store that response. Light clients can't really get DoSed, as they are not supposed to accept incoming connections and can respond to any anomaly by closing their connections. Any "reasonable memory size" would look good to me.

The limit for request sizes is probably more important and can probably be set at 4kiB, if that sounds reasonable?
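
As a sketch of that reasoning (hypothetical constants and helpers, not the merged values or code): the light client only needs to compare the announced response length against its limit before allocating a buffer, and treats anything larger as an anomaly that justifies closing the connection.

// Hypothetical constants reflecting the values floated in this comment.
const MAX_RESPONSE_SIZE: usize = 16 * 1024 * 1024; // 16MiB for responses
const MAX_REQUEST_SIZE: usize = 4 * 1024;          // 4kiB for requests

/// Memory guard on the light client side: returning None means
/// "anomaly detected, just close the connection" instead of allocating.
fn response_buffer(announced_len: usize) -> Option<Vec<u8>> {
    if announced_len > MAX_RESPONSE_SIZE {
        None
    } else {
        Some(Vec::with_capacity(announced_len))
    }
}

/// The equivalent check on the serving side for incoming requests.
fn request_is_acceptable(announced_len: usize) -> bool {
    announced_len <= MAX_REQUEST_SIZE
}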

@twittner
Contributor

What about 16MiB for responses? [...] The limit for request sizes is probably more important and can probably be set at 4kiB, if that sounds reasonable?

A 16MiB (or even bigger) response data limit looks good to me.
As for the request data upper bound I would just leave it at 1MiB.

@bkchr
Member

bkchr commented Mar 31, 2020

So I have a little setup with "polkadotjs <-> light node <-> full node" on a dev chain. [...] The requests are 50 bytes, and responses 1766285 bytes (or 1.7MiB). [...]

Ahh I'm an idiot. 1.7MB makes total sense, as we still always send the runtime inside of the proof to the light client. And the current Polkadot runtime is around 1.7MB. So, yeah, 16MB sounds reasonable.

@tomaka
Contributor Author

tomaka commented Mar 31, 2020

Made the change. Also included a small fix for an unused-import warning.

@tomaka tomaka added A0-please_review Pull request needs code review. and removed A2-insubstantial Pull request requires no code review (e.g., a sub-repository hash update). labels Mar 31, 2020
@tomaka
Contributor Author

tomaka commented Mar 31, 2020

I also realized that we don't need to deploy this to full nodes ahead of time, since the message size limit is not enforced on the sending side.
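
In other words (same illustrative style as the earlier sketch, not the actual code): the sending path writes the length prefix and payload unconditionally, and only the reader enforces the limit, so full nodes that still run the old configuration can already serve the larger responses.

use std::io::{self, Write};

/// Illustration only: the sender performs no size check, so the limit
/// matters only on the receiving side.
fn write_length_prefixed(mut socket: impl Write, data: &[u8]) -> io::Result<()> {
    socket.write_all(&(data.len() as u32).to_be_bytes())?;
    socket.write_all(data)
}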

I'll open the PR to switch to the new protocol shortly after this one is merged.

@gavofyork gavofyork merged commit dca30b2 into paritytech:master Mar 31, 2020
@tomaka tomaka deleted the inc-light-client-sz-limit branch March 31, 2020 19:05