Parallel ledger close #4543
Ok so my guide-level understanding of this follows. Could you confirm I've got it right and am not missing anything?
If my understanding is correct .. I think this should basically work. And I should express a tremendous congratulations for finding a cut-point that seems like it might work. This is no small feat! It's brilliant. That said, I remain quite nervous about the details, in I think 3 main ways:
All 3 of these are diffuse, vague worries I can't point to any specific code actually inhabiting. You've done great here, I would never have thought to make this cut point. But I remain worried. I wonder if there are ways we could audit, detect or mitigate any of those risks.
Generally great work. Handful of minor nits, lots of questions to make sure I'm understanding things, handful of potential clarifications to consider around naming / comments / logic. Plus I wrote a larger "overview question" in the PR comments.
But all that aside, congratulations on the accomplishment here!
Improve access to ledger state, to better support parallelization changes in #4543. Note that management of SorobanNetworkConfig is still not great, as LM currently manages multiple versions of the config. Ideally, the Soroban network config would live inside the state snapshot (either the BucketList snapshot or LedgerTxn), but this was too tricky to implement at this time due to how network config is currently implemented. We may need to clean this up later. This change also partially addresses #4318.
Some points that came up in conversation today which were, if not strictly new to me, nonetheless fresh enough feeling that I think it'd be good to get them into comments / ASCII-art / docs somewhere. I will put them in writing here so that at least one of us can come back to this later to try to transcribe into comments and/or invariants in the code. The following is all stuff that (AIUI) is true today and maintained by this change:
Whereas the following is stuff that is new as of this change:
@graydon your comment is almost exactly correct. The only thing I want to note is that buffered ledgers aren't abandoned when catchup starts. We continue buffering them, and trimming them based on checkpoint boundaries, but we won't apply them until normal catchup application is done. Here's how catchup schedules buffered ledger application: stellar-core/src/catchup/CatchupWork.cpp, line 482 in 2b6a40c.
LGTM, just a few typos/comment questions
Handful of minor changes that can wait for a followup.
Resolves #4317
Concludes #4128
The implementation of this proposal requires massive changes to the stellar-core codebase and touches almost every subsystem. There are some paradigm shifts in how the program executes, which I will discuss below for posterity. The same ideas are reflected in code comments as well, as they will be important for code maintenance and extensibility.
Database access
Currently, only the Postgres DB backend is supported, as it required minimal changes to how DB queries are structured (Postgres provides a fairly nice concurrency model).
SQLite concurrency support is a lot more rudimentary: only a single writer is allowed, and the whole database is locked during writing. Supporting it necessitates further changes in core (such as splitting the database into two). Given that most network infrastructure is on Postgres right now, SQLite support can be added later.
Reduced responsibilities of SQL
SQL tables have been trimmed as much as possible to avoid conflicts. Essentially, we only store persistent state such as the latest LCL and SCP history, as well as the legacy OFFER table.
Asynchronous externalize flow
There are three important subsystems in core that are in charge of tracking consensus, externalizing and applying ledgers, and advancing the state machine to catchup or synced state:
Prior to this change, externalization followed two different flows:
With the new changes, the flow that triggers ledger close moved to CatchupManager completely. Essentially, CatchupManager::processLedger became the centralized place to decide whether to apply a ledger or trigger catchup. Because ledger close happens in the background, the transition between externalize and “closeLedger→set synced” becomes asynchronous.
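The centralized decision described above can be sketched roughly as follows. This is a minimal illustration, not the stellar-core implementation: `LedgerGate`, `ProcessResult`, and the rule that any gap immediately triggers catchup are all assumptions made for the example, and the real `CatchupManager::processLedger` takes more state into account.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>

// Hypothetical outcomes of processing one externalized ledger.
enum class ProcessResult
{
    Ignored,        // already applied or already queued
    QueuedForApply, // next in sequence: hand off to background apply
    Buffered,       // kept for later while catchup is running
    TriggerCatchup  // gap detected: buffer it and start catchup
};

// Hypothetical, simplified stand-in for the single decision point;
// not the real CatchupManager class.
class LedgerGate
{
  public:
    explicit LedgerGate(uint32_t lastClosed) : mLastQueued(lastClosed)
    {
    }

    ProcessResult
    processLedger(uint32_t seq)
    {
        if (seq <= mLastQueued)
        {
            return ProcessResult::Ignored;
        }
        if (seq == mLastQueued + 1 && !mCatchupRunning)
        {
            // Apply happens asynchronously; here we only record the hand-off.
            mLastQueued = seq;
            return ProcessResult::QueuedForApply;
        }
        // Out-of-order ledger, or catchup in progress: keep buffering.
        mBuffered.insert(seq);
        if (!mCatchupRunning)
        {
            mCatchupRunning = true;
            return ProcessResult::TriggerCatchup;
        }
        return ProcessResult::Buffered;
    }

    std::size_t
    bufferedCount() const
    {
        return mBuffered.size();
    }

  private:
    uint32_t mLastQueued;
    bool mCatchupRunning{false};
    std::set<uint32_t> mBuffered;
};
```

The point of the sketch is that one function owns the "apply or catch up" decision, so the caller that externalizes a ledger no longer needs to know whether application has finished.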
Concurrent ledger close
The following core items moved to the background; each is followed by an explanation of why it is safe to do so:
Emitting meta
Ledger application is the only process that touches the meta pipe, so there are no conflicts with other subsystems.
Writing checkpoint files
Only the background thread writes in-progress checkpoint files. The main thread deals exclusively with “complete” checkpoints, which after completion must not be touched by any subsystem except publishing.
Updating ledger state
The rest of the system operates strictly on read-only BucketList snapshots, and is unaffected by changing state. Note: there are still some calls to LedgerTxn in the codebase, but those only appear on startup during setup (when the node is not operational) or in offline commands.
Incrementing current LCL
Because ledger close moved to the background, guarantees about ledger state and its staleness are now different. Previously, ledger state queried by subsystems outside of apply was always up-to-date. With this change, the snapshot used by the main thread may become slightly stale (if the background thread has just closed a new ledger, but the main thread hasn't refreshed its snapshot yet). There are different use cases of the main thread's ledger state, which must be treated with caution and evaluated individually:
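The staleness described above can be illustrated with a small sketch. It assumes a hypothetical `SnapshotPublisher` that hands out immutable snapshots through a shared pointer; the type names and the mutex-guarded pointer swap are illustrative choices, not the mechanism stellar-core necessarily uses.

```cpp
#include <cstdint>
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical immutable view of ledger state at one LCL.
struct LedgerSnapshot
{
    uint32_t ledgerSeq;
};

// Hypothetical holder: the apply thread publishes, other threads read.
class SnapshotPublisher
{
  public:
    explicit SnapshotPublisher(uint32_t seq)
        : mCurrent(std::make_shared<LedgerSnapshot>(LedgerSnapshot{seq}))
    {
    }

    // Called by the background apply thread after closing a ledger.
    void
    publish(uint32_t seq)
    {
        std::shared_ptr<const LedgerSnapshot> next =
            std::make_shared<LedgerSnapshot>(LedgerSnapshot{seq});
        std::lock_guard<std::mutex> guard(mMutex);
        mCurrent = std::move(next);
    }

    // Called by readers; the returned snapshot never changes, so it can
    // lag behind the latest published one until the reader asks again.
    std::shared_ptr<const LedgerSnapshot>
    latest() const
    {
        std::lock_guard<std::mutex> guard(mMutex);
        return mCurrent;
    }

  private:
    mutable std::mutex mMutex;
    std::shared_ptr<const LedgerSnapshot> mCurrent;
};
```

A reader that holds on to the result of `latest()` keeps a consistent but possibly outdated view; it only observes the new ledger after calling `latest()` again, which is the kind of "slightly stale" window each use case has to tolerate.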
Reflecting state change in the bucketlist
Ledger close is the only place in the code that updates the BucketList; other subsystems may only read it. One example is garbage collection, which queries the latest BucketList state to decide which buckets to delete. These accesses are protected with a mutex (the same LCL mutex used in LM, as the BucketList is conceptually a part of the LCL as well).
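A rough sketch of the single-writer, mutex-guarded pattern described above follows. `BucketStore`, its members, and the use of bucket names as plain strings are hypothetical simplifications for illustration; the real BucketList and garbage collection logic are considerably more involved.

```cpp
#include <cstddef>
#include <mutex>
#include <set>
#include <string>
#include <vector>

// Hypothetical store: tracks which buckets exist on disk and which are
// referenced by the BucketList as of the last closed ledger.
class BucketStore
{
  public:
    // Ledger close is the only writer of the referenced set.
    void
    closeLedger(std::set<std::string> const& referenced)
    {
        std::lock_guard<std::mutex> guard(mLCLMutex);
        mReferenced = referenced;
        mOnDisk.insert(referenced.begin(), referenced.end());
    }

    // Garbage collection reads the referenced set under the same mutex,
    // so it never deletes a bucket that a concurrent close just added.
    std::vector<std::string>
    collectGarbage()
    {
        std::lock_guard<std::mutex> guard(mLCLMutex);
        std::vector<std::string> deleted;
        for (auto it = mOnDisk.begin(); it != mOnDisk.end();)
        {
            if (mReferenced.count(*it) == 0)
            {
                deleted.push_back(*it);
                it = mOnDisk.erase(it);
            }
            else
            {
                ++it;
            }
        }
        return deleted;
    }

    std::size_t
    onDiskCount() const
    {
        std::lock_guard<std::mutex> guard(mLCLMutex);
        return mOnDisk.size();
    }

  private:
    mutable std::mutex mLCLMutex; // stands in for the shared LCL mutex
    std::set<std::string> mReferenced;
    std::set<std::string> mOnDisk;
};
```

Sharing one mutex between the writer and the reader keeps the "which buckets are live" decision consistent with a single LCL, at the cost of briefly blocking garbage collection while a ledger close updates the referenced set.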