[Investigation] Reported performance regression when updating wasmtime from 19 to 24. #2058
My next step is to try to update FVM v2 in forest to see what deps get updated along with it.
We've confirmed that having wasmtime v24 anywhere in the build tree, even if unused, causes this slowdown.
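A minimal sketch of what "anywhere in the build tree, even if unused" means, assuming a plain cargo project; the version requirement is taken from the report, everything else is illustrative:

```toml
# Hypothetical Cargo.toml fragment: wasmtime is declared but never used in
# code. Its default feature set still pulls the Cranelift compiler crates
# into the dependency graph, which is enough to trigger the slowdown
# described above.
[dependencies]
wasmtime = "24"
```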
Interesting, I'm seeing smallvec now depending on serde. I can't imagine how that might be relevant, but I'll need to check.
I've tracked it down to wasmtime's cranelift feature (does not appear to be feature unification, but I may be missing something). I'm now trying with just […]
cranelift-codegen reproduces, trying cranelift-control now.
cranelift-control does not. Trying cranelift-codegen-meta, cranelift-codegen-shared, and cranelift-isle.
Ok, it is […]
Er, trying with reduced features first. Only "std" and "unwind" because those are required.
Ok, still broken with those features. Now I'm trying with "all-arch" to see if it's some kind of ISLE issue (I think it is, but I'm not sure if that's the fix).
Enabling all architectures doesn't help, and I can't skip the ISLE build (that option only exists if it's pre-built). Now I'm bisecting cranelift-codegen, starting with 0.109.
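For reference, a bisection step might look like the following manifest sketch; the exact pin shown here is the starting point mentioned above, not the confirmed culprit, and the trimmed feature list matches the earlier comment:

```toml
# Illustrative bisection manifest: exact-pin one cranelift-codegen release
# at a time, with only the required features enabled.
[dependencies]
cranelift-codegen = { version = "=0.109.0", default-features = false, features = ["std", "unwind"] }
```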
Ok, it's cranelift-codegen 0.107.0 exactly. Now I'm testing a build with all the FVM crates fully updated to get us all on a single wasmtime version, just in case it's an issue with multiple wasmtime versions.
Ok, it is feature unification. Specifically, adding the […]
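For readers unfamiliar with Cargo feature unification: features are additive, so if any crate in the build graph requests a feature of a shared dependency, every crate compiled against that dependency gets it. A sketch, assuming (per the later comments and the upstream revert) that the offending feature is regalloc2's trace-log; the version numbers are illustrative:

```toml
# crate-a/Cargo.toml -- does not ask for trace-log
[dependencies]
regalloc2 = "0.9"

# crate-b/Cargo.toml -- somewhere else in the same build graph
[dependencies]
regalloc2 = { version = "0.9", features = ["trace-log"] }

# Cargo compiles regalloc2 once, with trace-log enabled; crate-a pays the
# logging overhead even though it never requested the feature.
```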
My hypothesis is that this is actually just a compilation slowdown, not an execution slowdown. We're only running through ~24 epochs here and will have to pause to lazily (IIRC?) compile actors every time we load a new one. This is my guess because I'm noticing that epochs 1 & 4 are taking a while. Additionally, if we're speeding through multiple network upgrades (especially if we're switching FVM versions), we'll have to re-compile the actors per network version.
@Stebalien This could be verified by running a longer benchmark without network upgrades. The .env file would need to be modified along the lines of:

```
LOTUS_IMAGE=ghcr.io/chainsafe/lotus-devnet:2024-10-10-600728e
FOREST_DATA_DIR=/forest_data
LOTUS_DATA_DIR=/lotus_data
FIL_PROOFS_PARAMETER_CACHE=/var/tmp/filecoin-proof-parameters
MINER_ACTOR_ADDRESS=f01000
LOTUS_RPC_PORT=1234
LOTUS_P2P_PORT=1235
MINER_RPC_PORT=2345
FOREST_RPC_PORT=3456
FOREST_OFFLINE_RPC_PORT=3457
F3_RPC_PORT=23456
F3_FINALITY=100000
GENESIS_NETWORK_VERSION=24
SHARK_HEIGHT=-10
HYGGE_HEIGHT=-9
LIGHTNING_HEIGHT=-8
THUNDER_HEIGHT=-7
WATERMELON_HEIGHT=-6
DRAGON_HEIGHT=-5
WAFFLE_HEIGHT=-4
TUKTUK_HEIGHT=-3
TARGET_HEIGHT=200
```

Note that the timeout would also need to be extended. Then, we compare the timings before and after the FVM upgrade. That way, we can confirm whether the slowdown is coming from re-compiling actors (which is not a big issue).
@Stebalien I believe your hypothesis holds. I tried the config above locally (on a machine that previously reported a 50% slowdown).
So I guess it's "okay-ish" in the sense that the slowdown only occurs under some synthetic conditions. That said, it might make sense to report it upstream.
Yep, I plan on reporting it upstream. I'm also looking into possibly disabling the offending feature (requires upstream help, but I'm guessing it's only needed for GC, which we don't use). Also, for forest, you can modify the "quick" compilation profile to optimize cranelift-codegen (or maybe just regalloc2). That didn't get it to be quite fast enough to pass the test in the 5m timeout, but it definitely improved the situation. I'm also going to write a quick "load/compile the actors" benchmark.
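A sketch of the per-package profile override mentioned above, assuming Forest's existing "quick" profile; Cargo supports overriding opt-level for individual dependencies without changing the rest of the profile:

```toml
# In Forest's Cargo.toml: keep the fast "quick" profile overall, but build
# the hot compilation crates with full optimizations.
[profile.quick.package.cranelift-codegen]
opt-level = 3

[profile.quick.package.regalloc2]
opt-level = 3
```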
Oh, lol, no. This trace-log is literally just for debugging. I don't think this was supposed to get shipped to production.
Nvm: it looks like they thought this wouldn't have a performance impact.
Ah, that comment is saying that trace-log is only enabled for […]. And... it's already fixed, but not in the version we're using: bytecodealliance/wasmtime#9128. Time to update to v25, I guess.
Importantly, this reverts a previous wasmtime change that enabled trace-logging in regalloc, massively slowing down compilation across all FVM versions. fixes #2058
We've seen a reported performance regression that appears to be associated with the wasmtime update that happened in FVM 4.4 (wasmtime 19 -> 24). This was an interesting performance regression because: […]

Things I haven't checked: […]

All I can think of now is... rustix? Or something like that?