Add Support for Hot Archive HAS to Stellar Archivist #4612

Open
SirTyson opened this issue Jan 8, 2025 · 2 comments
@SirTyson
Contributor

SirTyson commented Jan 8, 2025

With the introduction of State Archival in p23, we are changing the History Archive format. This will require corresponding changes in stellar-archivist. Given that there have been many issues with the tool and that it is an outdated Go library, this may be a good opportunity to rewrite it in something like Rust and have the core team maintain it more actively going forward.

@SirTyson SirTyson added the bug label Jan 8, 2025
@SirTyson SirTyson self-assigned this Jan 8, 2025
@ire-and-curses
Member

We got some feedback from Blockdaemon around specific limitations of the current implementation:

Issues

  1. In some modes, the tool reports corrupted files (e.g. zero-length files) as OK even though the archive is bad (SDF correctly flags the archive as bad).
  2. Replacing corrupted files with the archive tool is not intuitive and has to be done manually (Blockdaemon has a workaround: running an explicit repair, which forces existing files to be overwritten).
  3. Any upstream HTTP error (503, 429) makes the archive tool abort its run immediately. Blockdaemon also observed that errors and non-errors all go to the same output, so you have to dig through a ton of logs to work out where to resume and which files are missing or corrupted.
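To make issue 1 concrete, here is a minimal Python sketch of a per-file check that keeps missing, zero-length, corrupt, and readable files apart. It is not how stellar-archivist works today. `FileStatus` and `classify_file` are hypothetical names, and the sketch assumes the file under test is gzip-compressed and treats "fails to decompress" as corrupt. A real scan would presumably also verify contents against the expected hashes.

```python
import gzip
import zlib
from enum import Enum
from pathlib import Path


class FileStatus(Enum):
    """Hypothetical status values a scan could report for each file."""
    MISSING = "missing"
    ZERO_LENGTH = "zero-length"
    CORRUPT = "corrupt"
    OK = "ok"


def classify_file(path: Path) -> FileStatus:
    """Classify one local archive file (assumed to be gzip-compressed)."""
    if not path.is_file():
        return FileStatus.MISSING
    if path.stat().st_size == 0:
        return FileStatus.ZERO_LENGTH
    try:
        # Read the whole stream so truncation and CRC errors surface.
        with gzip.open(path, "rb") as stream:
            while stream.read(1 << 20):
                pass
    except (OSError, EOFError, zlib.error):
        # gzip.BadGzipFile is a subclass of OSError.
        return FileStatus.CORRUPT
    return FileStatus.OK
```

Reporting a distinct status per file, rather than a single pass/fail for the run, would also give a repair step an explicit list of files to overwrite.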

Suggestions

  1. The archive tool should retry (with backoff?) on errors such as 503 and 429. See go#2352, "support/historyarchive: Fails instead of retrying on http 503 response. It also panics on a 429".
  2. Both scan and repair should know the difference between zero-length files, corrupted files, and correct files.
  3. Ease of use: simplify the tool (which would make the documentation simpler); e.g., a scan should suggest the commands that need to be run to repair an archive.
  4. Improve ease of use by documenting, in simple terms, how to detect and fix a bad file or ledger in the archive. For example, it would be good to know that the archive is 500Tb, so operators should not keep it on their local disks. Explicitly document the workflow involved and add recommendations.
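As an illustration of suggestion 1, the following Python sketch retries on 429 and 503 with exponential backoff and jitter instead of aborting on the first error. It is only a sketch under stated assumptions: `fetch_with_backoff`, `http_get`, and every parameter and default are hypothetical, and the set of retryable status codes is taken from the feedback above rather than from any existing implementation.

```python
import random
import time
import urllib.error
import urllib.request
from typing import Callable

# Status codes named in the feedback above; the exact set is an assumption.
RETRYABLE_STATUSES = {429, 503}


def http_get(url: str) -> bytes:
    """Plain GET; urlopen raises urllib.error.HTTPError on 4xx/5xx."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


def fetch_with_backoff(
    url: str,
    fetch: Callable[[str], bytes] = http_get,
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    """Fetch url, retrying retryable HTTP errors with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return fetch(url)
        except urllib.error.HTTPError as err:
            out_of_attempts = attempt == max_attempts - 1
            if err.code not in RETRYABLE_STATUSES or out_of_attempts:
                raise  # permanent error, or retries exhausted
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Jitter spreads out retries from concurrent workers.
            sleep(delay + random.uniform(0, delay / 2))
    raise ValueError("max_attempts must be at least 1")
```

A caller could catch the final exception per file, record the failure separately from normal progress output, and continue with the next file, which would also address the point about errors and non-errors being mixed in one log.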

@ire-and-curses
Member

Here's a quick link to the issues/feature requests open for the current implementation, which should help inform your new design.

https://github.com/stellar/go/labels/archivist


2 participants