Research task: Create a microbenchmark setup to test the efficiency of WebSockets vs HTTP2/3 + SSE #301
Comments
/bounty $2000 |
💎 $3,000 bounty • Garden Computing
Steps to solve:
Thank you for contributing to garden-co/jazz!
|
Hi @aeplay I see your project just joined Algora, so welcome! Different projects pick different styles of working so I'm curious, how do you want attempts at this bounty to shake out:
The risk with style 1. is that the assigned dev might take too long to show progress (if they are inexperienced, or experienced but busy with a day job). The risk with style 2. is that, since there is only 1 bounty reward, anyone willing to work on this risks getting blind-sided by other devs who open PRs to claim the bounty. |
Hey @ayewo good question! This is what I'm most curious about in this bounty model as well. For this task I would say first-come first-serve, since it is quite a detailed project and I would hate anyone to waste their effort. There is no super urgent deadline for it, so I would be happy to let the first serious contender iterate on it with my input. |
Excellent! In that case, I'd like to /attempt #301 this.
|
@ayewo yes! Let's gooo Please ask for clarifications here, I'm in GMT+1 and mostly available during normal work hours, but also during other times on my phone for quick answers |
@aeplay I'm also in GMT+1 :) and just joined your Discord. I take it you prefer clarifications happen here in the open, right? |
yes please, don't worry about making this issue noisy, that's what it's for |
Roger that. Would appreciate it if you can assign the issue to me, otherwise there will be drive-by attempts from other devs feigning ignorance of our conversation above. |
done, thanks for walking me through this |
Hey @DhairyaMajmudar I appreciate your offer but would like to keep this focused on one person attempting it. Thank you! |
That's fine! |
And @ayewo just clarifying: there is no CI/CD aspect to this project - it's all meant to be run manually. |
@aeplay Yes, understood. You want this to be single-threaded and launched locally. (I recently built a set of microbenchmarks using a combination of PowerShell (on Windows) and Bash (on Linux), but they were each executed remotely on EC2 instances using Terraform.) |
Hint: next time you may want to keep applications open a bit longer so you can evaluate a few applicants. It doesn't have to be first come, first served or battle royale. |
Makes sense, but this time I wanted to move quickly and @ayewo seemed eager and capable so I just went with him |
I'd like to share some progress on the research I've done so far and ask a few questions. I looked into the HTTP protocol versions supported by servers A, B, and C, and it seems that only Caddy supports all three versions of the HTTP protocol natively (i.e. HTTP/1.1, HTTP/2 and HTTP/3).
HTTP Versions Supported by Web Servers ¹
Node.js doesn't yet support HTTP/3 natively, but I came across a 3rd-party repo (https://github.com/endel/webtransport-nodejs) that claims to offer HTTP/3 support; I didn't look too closely at it. Since you also want to test against 3 different protocols:
I tried to map servers A, B, C to the 3 protocols to see what is possible:
Web Server to Web Protocol Mapping
Questions
For the actual test, does this imply that after each server is started, 10 clients will be spawned that will subscribe to a CoValue, then 1 client will mutate the CoValue triggering a notification by the server to those 10 clients? Footnotes |
Hey @ayewo, thanks for sharing your research results in such a well-structured format.
|
Some more clarifications:
So the full mapping would look like this:
Web Server to Web Protocol Mapping
|
Thanks for the confirmation.
Please note that I have updated the 2nd table to remove the port numbers. I imagine each of the servers A,B & C will be started as a standalone process so they could simply listen on the same port i.e. |
|
yeah makes sense re ports - we can run the different cases in succession. Just double checked re uWebSockets and HTTP2 - you're right, that's surprising. Remove that case then, but try HTTP1 + SSE, please |
This is exactly the case, correct |
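For reference, the scenario confirmed above (10 subscribers, 1 mutation, 10 notifications) can be sketched in memory. This is only an illustration of the fan-out shape, with no network or TLS involved; `CoValueHub`, `subscribe`, `mutate` and `runFanOut` are hypothetical names, not part of Jazz or cojson.

```typescript
// Hypothetical in-memory model of the fan-out scenario; not Jazz's API.
type Listener = (coValueId: string, payload: Uint8Array) => void;

class CoValueHub {
  private listeners = new Map<string, Set<Listener>>();

  // Register a client for updates to one CoValue; returns an unsubscribe function.
  subscribe(coValueId: string, listener: Listener): () => void {
    const set = this.listeners.get(coValueId) ?? new Set<Listener>();
    this.listeners.set(coValueId, set);
    set.add(listener);
    return () => {
      set.delete(listener);
    };
  }

  // Deliver a mutation to every subscriber; returns how many were notified.
  mutate(coValueId: string, payload: Uint8Array): number {
    const set = this.listeners.get(coValueId);
    if (!set) return 0;
    set.forEach((listener) => listener(coValueId, payload));
    return set.size;
  }
}

// Spawn `subscriberCount` subscribers, mutate once, and report how many
// notifications each subscriber received.
function runFanOut(subscriberCount: number): number[] {
  const hub = new CoValueHub();
  const received: number[] = Array.from({ length: subscriberCount }, () => 0);
  for (let i = 0; i < subscriberCount; i++) {
    hub.subscribe("covalue-1", () => {
      received[i] += 1;
    });
  }
  hub.mutate("covalue-1", new Uint8Array([1, 2, 3]));
  return received;
}
```

In the real benchmark each listener would be a WebSocket or SSE connection, and the interesting number is the time between the mutation and the last notification arriving.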
Another question: can I assume all protocol combinations will use TLS in the benchmarks? |
yes please assume and use TLS for everything (local certs are ok), because one thing I am interested in is how long it takes to bootstrap a connection - which is most noticed on interrupted connections. I'm expecting Websockets + TLS to be the longest and HTTP3 + SSE + TLS to be the fastest in this regard. |
Got it. |
More questions. The Simulation spec talks about simulating the transfer of structured and binary data. But looking at the main differences (source) between WebSockets and SSE in the table below:
|
|
Re: 1 & 2 Can you relax this so that base64 encoding is not necessary for loading/creating binary CoValues? In other words, base64 encoding will only be used for delivering subscription events over a WebSocket or SSE. It's much easier to split a 50MB binary file, as is, and stream it in 100KB chunks in either direction (server->client and client->server) than to do so with base64 encoding added to the mix. |
Hey @ayewo sorry for the late reply. Yes happy to relax this. Ideally (to be most similar to cojson) you could base64 encode the individual chunks - but if it's simpler to have them binary wherever possible just do that - it's not really relevant to the main concern. Thank you |
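A minimal sketch of the "base64 encode the individual chunks" option mentioned above. The function names are hypothetical, and the base64 encoder is written out by hand only to keep the example free of runtime dependencies (in Node.js, `Buffer.from(chunk).toString("base64")` does the same job). The chunk size is a parameter, so whether 100KB means 100,000 or 102,400 bytes is left to the caller.

```typescript
const BASE64_ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Standard base64 (RFC 4648) with "=" padding.
function toBase64(bytes: Uint8Array): string {
  let out = "";
  for (let i = 0; i < bytes.length; i += 3) {
    const b0 = bytes[i];
    const b1 = i + 1 < bytes.length ? bytes[i + 1] : 0;
    const b2 = i + 2 < bytes.length ? bytes[i + 2] : 0;
    out += BASE64_ALPHABET[b0 >> 2];
    out += BASE64_ALPHABET[((b0 & 0x03) << 4) | (b1 >> 4)];
    out += i + 1 < bytes.length ? BASE64_ALPHABET[((b1 & 0x0f) << 2) | (b2 >> 6)] : "=";
    out += i + 2 < bytes.length ? BASE64_ALPHABET[b2 & 0x3f] : "=";
  }
  return out;
}

// Split a buffer into fixed-size chunks; the last chunk may be shorter.
function splitIntoChunks(data: Uint8Array, chunkSize: number): Uint8Array[] {
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  const chunks: Uint8Array[] = [];
  for (let offset = 0; offset < data.length; offset += chunkSize) {
    // subarray creates a view on the same memory, so no bytes are copied here
    chunks.push(data.subarray(offset, offset + chunkSize));
  }
  return chunks;
}

// Chunk first, then encode each chunk independently, so every chunk can be
// sent and decoded on its own.
function encodeChunks(data: Uint8Array, chunkSize: number): string[] {
  return splitIntoChunks(data, chunkSize).map(toBase64);
}
```

Each encoded chunk is about a third larger than its binary form (4 output characters per 3 input bytes), which is worth keeping in mind when comparing text and binary transports.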
Hey @aeplay Brief status update: Right now, I have all 3 use cases working for text and binary CoValues over:
Still trying to finish the WebSocket implementation and hoping I can re-use most of the code from the SSE browser client to build the browser client that will interact with the WebSocket server. (PS: I’ve been really poor at sharing updates on my progress because I have been dealing with regular interruptions. So sorry about that.) |
Hey @ayewo no worries, thanks for your update - looking forward to it! |
Hey @ayewo any updates on this? :) |
@aeplay I should open a PR tomorrow or Wednesday, God willing. |
Turns out my estimate was off by a few days as there are parts of the benchmarks’ plumbing that are not yet finished. Sorry about that. When I started, I must have interpreted the Deliverable section as saying that a PR should only be opened when the code is close to done. But re-reading it now, I realize I should have simply asked for your preference:
Right now, I have most things working. The only requirement I haven't touched at all is simulating the various Network conditions: I, II, III, IV. |
That sounds wonderful, I would love to see a WIP draft PR. Thank you! |
💡 @ayewo submitted a pull request that claims the bounty. You can visit your bounty board to reward. |
I've opened a draft PR and included basic instructions at the end on how to set it up locally for testing. |
Any news @ayewo ? |
Hey @aeplay I've been AFK for a few weeks because I traveled. I should be returning home this weekend, God willing. |
Hey @aeplay I now have all web servers working (6 of them), and it is clear I really underestimated the amount of plumbing required to have everything working, end-to-end. As requested, I investigated using HTTP/3 with the uWebSockets project and it turned out to be unworkable, as I mentioned in my preliminary findings, because support was experimental and development had been paused. Right now, their experimental HTTP/3 implementation exposed via (Sorry for the long hiatus between updates.) |
Thanks for your continued work on this @ayewo. If you could share numerical results either here or on the PR, even if they are from a WIP stage, that would be super interesting for me to get a quick idea before running it myself. Please be mindful of your time budget considering the bounty amount and feel free to stop as soon as you have insightful results. Just ping me then! |
@aeplay thanks for finally responding. Ha, I'm already underwater with respect to Right now, I don't have much availability to run those numbers until this or next weekend. I hope that's fine? |
/bounty $1000 |
💎 A new bounty of $1,000 has been added by aeplay! 💰 Current prize pool: $3,000 |
@ayewo I hope this effectively increases the bounty to $3000. The timing is fine - I'd appreciate it if you focused on getting some numbers and doing the minimal extra work needed to reproduce them over any kind of polish, and then I'll mark it as accepted! |
@aeplay thanks for the kind gesture of upping the bounty, really appreciate it. You'll probably have to delete the "/bounty $1000" comment and then comment anew with "/bounty $3000" for the Algora bot to pick it up though.
OK. I might not be able to do anything this weekend because I have several things lined up already. Thanks for understanding.
Got it. |
In order to get an idea how best to proceed with #233, it would be good to have ballpark numbers of the performance characteristics of WebSockets vs HTTP requests + Server-Sent Events for our needs.
Setup
We can get this data - completely decoupled from the internals of Jazz - by creating some synthetic microbenchmarks.
Simulation details
Original data use that needs to be simulated
Currently, Jazz uses WebSockets to sync CoValue state between the client and syncing/persistence server.
The communication typically consists of three scenarios:
WebSockets vs Requests & SSE
Currently, 1., 2. and 3. happen over WebSockets, with 1 package per request/response/incoming update.
For using Requests and SSE instead, we would use Requests & Responses for 1. and 2., while for 3. we listen to incoming updates with Server-Sent Events and publish outgoing updates as a Request with no expected Response.
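As an illustration of the SSE half of this model, here is a minimal sketch of how an incoming update could be framed in the `text/event-stream` wire format and parsed back on the client. The `SseEvent` shape and both function names are hypothetical, and the parser is simplified: it assumes LF line endings only and handles just the `event`, `id` and `data` fields. In a browser, `EventSource` does the parsing itself.

```typescript
interface SseEvent {
  event?: string;
  id?: string;
  data: string;
}

// Serialize one event. Multi-line data becomes one "data:" line per line,
// and a blank line terminates the event.
function formatSseEvent(ev: SseEvent): string {
  let out = "";
  if (ev.event !== undefined) out += `event: ${ev.event}\n`;
  if (ev.id !== undefined) out += `id: ${ev.id}\n`;
  for (const line of ev.data.split("\n")) out += `data: ${line}\n`;
  return out + "\n";
}

// Parse all complete events from a buffer and return the unconsumed tail,
// so the caller can prepend it to the next network chunk.
function parseSseEvents(buffer: string): { events: SseEvent[]; rest: string } {
  const events: SseEvent[] = [];
  let rest = buffer;
  let end = rest.indexOf("\n\n");
  while (end !== -1) {
    const block = rest.slice(0, end);
    rest = rest.slice(end + 2);
    const ev: SseEvent = { data: "" };
    const dataLines: string[] = [];
    for (const line of block.split("\n")) {
      if (line === "" || line.startsWith(":")) continue; // blank or comment line
      const colon = line.indexOf(":");
      const field = colon === -1 ? line : line.slice(0, colon);
      let value = colon === -1 ? "" : line.slice(colon + 1);
      if (value.startsWith(" ")) value = value.slice(1); // strip one leading space
      if (field === "data") dataLines.push(value);
      else if (field === "event") ev.event = value;
      else if (field === "id") ev.id = value;
    }
    if (dataLines.length > 0) {
      ev.data = dataLines.join("\n");
      events.push(ev);
    }
    end = rest.indexOf("\n\n");
  }
  return { events, rest };
}
```

Because SSE carries text only, binary payloads would need a text encoding such as base64 before being placed in the `data` field, while the outgoing request for publishing an update can carry raw bytes in its body.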
Simulation spec
There are roughly two classes of CoValues: structured CoValues (thousands of <50 byte edits) and binary-data CoValues (few edits that are each 100kB).
Since we are only interested in the data transmission performance, we can model the scenarios using packets containing random data:
No extra HTTP headers should be set (other than what browsers set by default, and these should be minimised if possible).
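A minimal sketch of generating random-data packets for the two CoValue classes described above. The function names and default sizes are assumptions for illustration: structured edits are drawn at random from 1 to 49 bytes (to stay under 50), and a binary edit is taken as 100,000 bytes. Edit counts are left as parameters since "thousands" and "few" are not pinned down here.

```typescript
// Fill a buffer with pseudo-random bytes. Math.random is sufficient for
// synthetic load; it is not suitable for anything security-related.
function randomBytes(size: number): Uint8Array {
  const out = new Uint8Array(size);
  for (let i = 0; i < size; i++) {
    out[i] = Math.floor(Math.random() * 256);
  }
  return out;
}

// Structured CoValue: many small edits, each 1..maxEditBytes bytes long.
// The default of 49 keeps every edit under 50 bytes (assumed bound).
function structuredEdits(editCount: number, maxEditBytes = 49): Uint8Array[] {
  return Array.from({ length: editCount }, () =>
    randomBytes(1 + Math.floor(Math.random() * maxEditBytes))
  );
}

// Binary CoValue: a few large edits of a fixed size.
// 100 kB is assumed to mean 100_000 bytes here.
function binaryEdits(editCount: number, editBytes = 100_000): Uint8Array[] {
  return Array.from({ length: editCount }, () => randomBytes(editBytes));
}
```

For example, `structuredEdits(5000)` and `binaryEdits(10)` would give one payload set per class; generating them once up front keeps payload creation out of the timed section.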
Target metrics
The main variables we are interested in are
Variables
It would be good to get results for the metrics above assuming
Different network conditions
Different protocols
You don't need to actually deploy a server anywhere if you can simulate these conditions locally; just make sure to note down your hardware specs and use exactly one thread/core for the server.
Dimensions summary
So in total we have the following dimensions:
Deliverable
I realise this spec is a lot, so feel free to ask lots of clarifying questions before & after accepting the task!