-
Notifications
You must be signed in to change notification settings - Fork 617
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(prover): support multi tasks #1585
Conversation
…ailing task to same prover
WalkthroughThe pull request introduces a comprehensive set of changes across multiple files in the coordinator and prover components, focusing on enhancing the flexibility of prover type handling and task assignment. The modifications include adding a new configuration parameter for external prover thresholds, updating methods to support multiple prover types, and improving task retrieval and assignment logic. These changes aim to make the system more adaptable to different prover configurations and external prover scenarios. Changes
Poem
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
close, move it here #1587 |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #1585 +/- ##
===========================================
- Coverage 52.01% 51.91% -0.11%
===========================================
Files 157 157
Lines 12476 12501 +25
===========================================
Hits 6490 6490
- Misses 5424 5449 +25
Partials 562 562
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
🧹 Nitpick comments (20)
coordinator/internal/logic/provertask/chunk_prover_task.go (1)
81-119
: Consider refactoring the assignment loop for clarity.
The loop merging assigned/unassigned tasks is repeated across multiple files and might be prone to duplication or errors. Extracting this logic into a shared helper or reusing existing patterns could improve maintainability and readability.coordinator/internal/orm/bundle.go (3)
57-69
: Prefer parameter placeholders in raw SQL to reduce potential risks.
While your limit is likely trusted data, using parameter placeholders (e.g.,?
) instead offmt.Sprintf
can reduce the chance of SQL injection and is often a best practice.
71-83
: Correct the docstring to avoid mentioning a non-existent limit parameter.
The comment “based on the specified limit” is misleading because there is no limit argument in this method.- // GetUnassignedBundleCount retrieves unassigned bundle count based on the specified limit. + // GetUnassignedBundleCount retrieves the unassigned bundle count up to the specified attempts criteria.
86-97
: Same raw SQL observations as in GetUnassignedBundles.
Using placeholder parameters is generally preferable to manual string concatenation for queries, even if the input is from config or code.coordinator/internal/logic/provertask/bundle_prover_task.go (1)
83-119
: There is potential duplication with chunk-related logic.
The loop and offset logic for assigned vs. unassigned tasks strongly mirrors other “task” classes. A shared helper function may simplify future maintenance and reduce code duplication.coordinator/internal/logic/provertask/batch_prover_task.go (1)
83-119
: Consider consolidating logic for assigned/unassigned retrieval.
A common function could handle your offset-based approach and reduce repetition across chunk, bundle, and batch files.coordinator/internal/orm/chunk.go (3)
76-86
: Favor placeholder parameters over string formatting in raw SQL.
Usingfmt.Sprintf
for query construction is workable but can be replaced with parameter binding to reduce risk of injection and improve clarity.
88-101
: Docstring references “limit” but no limit parameter is provided.
Align the comment with the method signature or add a limit param if intended.
104-115
: Apply the same SQL parameter binding approach where feasible.
Likewise, consider placeholders for building queries inGetAssignedChunks
to ensure usage consistency and best practices.coordinator/internal/orm/batch.go (2)
81-93
: Consider switching to parameterized queries to mitigate potential SQL injection.
WhilemaxActiveAttempts
,maxTotalAttempts
, andlimit
likely come from a trusted source, building the query string viafmt.Sprintf
is prone to accidental injection if these values are ever user-controlled. Leveraging GORM's chained methods for conditions and limits would be safer and more consistent.- sql := fmt.Sprintf("SELECT * FROM batch WHERE proving_status = %d AND total_attempts < %d AND active_attempts < %d AND chunk_proofs_status = %d AND batch.deleted_at IS NULL ORDER BY batch.index LIMIT %d;", - int(types.ProvingTaskUnassigned), maxTotalAttempts, maxActiveAttempts, int(types.ChunkProofsStatusReady), limit) - err := db.Raw(sql).Scan(&batch).Error + err := db. + Where("proving_status = ? AND total_attempts < ? AND active_attempts < ? AND chunk_proofs_status = ? AND batch.deleted_at IS NULL", + types.ProvingTaskUnassigned, maxTotalAttempts, maxActiveAttempts, types.ChunkProofsStatusReady). + Order("batch.index"). + Limit(int(limit)). + Find(&batch).Error
111-122
: Unify the approach for assigned batches query.
Similar toGetUnassignedBatches
, consider using GORM's chaining methods instead offmt.Sprintf
to reduce injection risks and maintain consistency.- sql := fmt.Sprintf("SELECT * FROM batch WHERE proving_status = %d AND total_attempts < %d AND active_attempts < %d AND chunk_proofs_status = %d AND batch.deleted_at IS NULL ORDER BY batch.index LIMIT %d;", - int(types.ProvingTaskAssigned), maxTotalAttempts, maxActiveAttempts, int(types.ChunkProofsStatusReady), limit) - err := db.Raw(sql).Scan(&batch).Error + err := db. + Where("proving_status = ? AND total_attempts < ? AND active_attempts < ? AND chunk_proofs_status = ? AND batch.deleted_at IS NULL", + types.ProvingTaskAssigned, maxTotalAttempts, maxActiveAttempts, types.ChunkProofsStatusReady). + Order("batch.index"). + Limit(int(limit)). + Find(&batch).Errorcoordinator/internal/orm/prover_task.go (1)
151-168
: Check for missing record scenarios.
Usingdb.Find(&proverTask)
withLimit(1)
will not differentiate between "record not found" and other errors. Consider usingdb.First(&proverTask)
or explicitly checkingerrors.Is(err, gorm.ErrRecordNotFound)
to ensure precise error handling.prover/src/prover.rs (2)
74-83
: Consider deduplicating potentially repeated TaskTypes.This fold operation accumulates task types for each prover type, but if a user mistakenly repeats a
ProverType
, duplicate task types may appear. A quick dedup might prevent unintended behavior.- let task_types: Vec<TaskType> = ... - .fold(Vec::new(), |mut acc, prover_type| { - acc.extend(get_task_types(*prover_type)); - acc - }); + let mut seen = std::collections::HashSet::new(); + let task_types: Vec<TaskType> = self.config + .prover_types + .iter() + .flat_map(|pt| get_task_types(*pt)) + .filter(|tt| seen.insert(*tt)) + .collect();
89-94
: Repeated chunk check could be centralized.This block replicates the same condition used for
geth_client
creation. Consider extracting a helper to avoid redundancy.prover/src/utils.rs (2)
27-39
: Remove commented-out code for clarity.The old
get_task_types
function is now obsolete. Consider deleting rather than commenting it out to avoid confusion.- // pub fn get_task_types(prover_types: Vec<ProverType>) -> Vec<TaskType> { - // ... - // }
47-54
: MappingTaskType::Bundle
toProverType::Batch
may cause confusion.While this might be intentional for your specific workflow, be mindful that grouping both
Batch
andBundle
under one prover type can be a source of logical ambiguity if these tasks diverge in the future.prover/src/zk_circuits_handler.rs (1)
152-176
: Possible performance impact from repeatedly building handlers.The nested
.flat_map
callsbuild(*prover_type, ...)
multiple times. If building the handler is expensive, caching each(hard_fork_name, prover_type)
pair could save resources.prover/config.json (2)
6-6
: Document the prover type values.The value
2
inprover_types
array appears to be a magic number. Consider:
- Adding a comment explaining what type
2
represents- Using an enum or constants to define valid prover types
- Including validation for allowed values
Line range hint
1-24
: Consider adding JSON schema validation.To ensure configuration integrity and provide better developer experience, consider adding a JSON schema that:
- Documents all valid fields and their purposes
- Validates the types and allowed values
- Enforces required fields
prover/src/coordinator_client.rs (1)
Line range hint
71-79
: Update LoginMessage documentation.Since
prover_types
now supports multiple values, consider updating the struct documentation to reflect:
- The purpose of supporting multiple prover types
- The impact on task assignment
- Any constraints on the number or combination of types
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (17)
coordinator/conf/config.json
(1 hunks)coordinator/internal/config/config.go
(1 hunks)coordinator/internal/logic/provertask/batch_prover_task.go
(2 hunks)coordinator/internal/logic/provertask/bundle_prover_task.go
(2 hunks)coordinator/internal/logic/provertask/chunk_prover_task.go
(2 hunks)coordinator/internal/logic/provertask/prover_task.go
(1 hunks)coordinator/internal/orm/batch.go
(1 hunks)coordinator/internal/orm/bundle.go
(1 hunks)coordinator/internal/orm/chunk.go
(1 hunks)coordinator/internal/orm/prover_task.go
(1 hunks)prover/config.json
(1 hunks)prover/src/config.rs
(1 hunks)prover/src/coordinator_client.rs
(1 hunks)prover/src/main.rs
(1 hunks)prover/src/prover.rs
(5 hunks)prover/src/utils.rs
(1 hunks)prover/src/zk_circuits_handler.rs
(5 hunks)
✅ Files skipped from review due to trivial changes (1)
- coordinator/internal/logic/provertask/prover_task.go
🔇 Additional comments (23)
coordinator/internal/logic/provertask/chunk_prover_task.go (2)
7-7
: No issues with the newly added import.
This import is necessary forstrings.HasPrefix
usage.
76-76
: No functional change.
This line appears to be a blank or spacing adjustment; no further action needed.coordinator/internal/logic/provertask/bundle_prover_task.go (1)
7-7
: Import usage is fine.
Approved for handling prefix checks withstrings.HasPrefix
.coordinator/internal/logic/provertask/batch_prover_task.go (1)
7-7
: No issues with the added import.
This is necessary for checking external prover prefixes.coordinator/internal/orm/batch.go (1)
95-108
: Query logic looks consistent and clear.
This function appropriately uses GORM's chaining methods, preventing accidental query issues. No further concerns here.coordinator/internal/config/config.go (1)
19-20
: New config field aligns with external prover feature.
DefiningExternalProverThreshold
is a straightforward addition to facilitate external prover activation. Looks good.coordinator/conf/config.json (1)
5-5
: Configuration change appears correct.
Theexternal_prover_threshold
setting of 32 matches the newly introduced logic in the code, and no further concerns stand out.prover/src/prover.rs (7)
11-12
: Imports look good for multi-prover handling.These imports introduce the newly added utilities and enumerations needed for multiple prover types. Nothing appears problematic here.
31-35
: Potential panic ifl2geth
is missing whileProverType::Chunk
is present.Because
l2geth.as_ref().unwrap()
is called whenProverType::Chunk
is found, a missing orNone
l2geth
config would cause a panic. Consider validating thatl2geth
exists ifChunk
is part ofprover_types
.
47-47
: Provider creation logic looks consistent.Initialization of
CircuitsHandlerProvider
aligns with the new multi-prover setup. No issues noted.
50-50
: Initializing multiple verification keys.Passing
config.prover_types.clone()
updates the logic to handle multiple prover types seamlessly. Looks good.
85-85
: Struct field assignment is straightforward.No functional changes beyond referencing the new
task_types
vector.
117-122
: Graceful handling of unknown task types.Bailing on
TaskType::Undefined
is a clean way to prevent incompatible tasks. Nicely done.
127-127
: Correct usage ofget_circuits_handler
.Passing the newly resolved
prover_type
is consistent with the refactored multi-prover logic.prover/src/main.rs (1)
69-69
: Logging multiple prover types is correct.Referencing
config.prover_types
better reflects the new vector-based configuration. No issues here.prover/src/config.rs (1)
33-33
: Shifting from a singleProverType
to a vector is a robust improvement.Ensure that
prover_types
is not empty or invalid in production configurations. Adding a small validation step infrom_file
may prevent edge cases.prover/src/zk_circuits_handler.rs (6)
42-42
: Storingcurrent_prover_type
asOption
is flexible.This design fits a scenario with no active type until first usage.
47-47
: Constructor update aligns with multi-prover logic.Removing the single
prover_type
parameter is coherent with the new design.
102-102
: Initializingcurrent_prover_type
toNone
.This default state prevents inadvertent usage before a type is selected.
112-112
: Explicitly receivingprover_type
inget_circuits_handler
.This makes the method more flexible, removing hidden dependencies on an internal field.
132-132
: Tracking the selectedprover_type
.Storing
current_prover_type
for reuse is consistent and logical.
145-145
: Refactoringinit_vks
to accept multipleprover_types
.Good adaptation to the new flexible design.
prover/src/coordinator_client.rs (1)
76-76
: Add validation for prover_types.Consider adding validation to ensure
prover_types
is not empty before cloning. This would prevent potential runtime issues if the configuration is misconfigured.Example validation:
if self.config.prover_types.is_empty() { bail!("prover_types cannot be empty"); }
if strings.HasPrefix(taskCtx.ProverName, ExternalProverNamePrefix) { | ||
unassignedChunkCount, getCountError := cp.chunkOrm.GetUnassignedChunkCount(ctx.Copy(), maxActiveAttempts, maxTotalAttempts, getTaskParameter.ProverHeight) | ||
if getCountError != nil { | ||
log.Error("failed to get unassigned chunk proving tasks count", "height", getTaskParameter.ProverHeight, "err", err) | ||
return nil, ErrCoordinatorInternalFailure | ||
} | ||
// Assign external prover if unassigned task number exceeds threshold | ||
if unassignedChunkCount < cp.cfg.ProverManager.ExternalProverThreshold { | ||
return nil, nil | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix logging variable mismatch.
At line 68, you are logging err
instead of getCountError
, which can cause confusion in error diagnosis.
- log.Error("failed to get unassigned chunk proving tasks count", "height", getTaskParameter.ProverHeight, "err", err)
+ log.Error("failed to get unassigned chunk proving tasks count", "height", getTaskParameter.ProverHeight, "err", getCountError)
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
if strings.HasPrefix(taskCtx.ProverName, ExternalProverNamePrefix) { | |
unassignedChunkCount, getCountError := cp.chunkOrm.GetUnassignedChunkCount(ctx.Copy(), maxActiveAttempts, maxTotalAttempts, getTaskParameter.ProverHeight) | |
if getCountError != nil { | |
log.Error("failed to get unassigned chunk proving tasks count", "height", getTaskParameter.ProverHeight, "err", err) | |
return nil, ErrCoordinatorInternalFailure | |
} | |
// Assign external prover if unassigned task number exceeds threshold | |
if unassignedChunkCount < cp.cfg.ProverManager.ExternalProverThreshold { | |
return nil, nil | |
} | |
} | |
if strings.HasPrefix(taskCtx.ProverName, ExternalProverNamePrefix) { | |
unassignedChunkCount, getCountError := cp.chunkOrm.GetUnassignedChunkCount(ctx.Copy(), maxActiveAttempts, maxTotalAttempts, getTaskParameter.ProverHeight) | |
if getCountError != nil { | |
log.Error("failed to get unassigned chunk proving tasks count", "height", getTaskParameter.ProverHeight, "err", getCountError) | |
return nil, ErrCoordinatorInternalFailure | |
} | |
// Assign external prover if unassigned task number exceeds threshold | |
if unassignedChunkCount < cp.cfg.ProverManager.ExternalProverThreshold { | |
return nil, nil | |
} | |
} |
if strings.HasPrefix(taskCtx.ProverName, ExternalProverNamePrefix) { | ||
unassignedBundleCount, getCountError := bp.bundleOrm.GetUnassignedBundleCount(ctx.Copy(), maxActiveAttempts, maxTotalAttempts) | ||
if getCountError != nil { | ||
log.Error("failed to get unassigned batch proving tasks count", "height", getTaskParameter.ProverHeight, "err", err) | ||
return nil, ErrCoordinatorInternalFailure | ||
} | ||
// Assign external prover if unassigned task number exceeds threshold | ||
if unassignedBundleCount < bp.cfg.ProverManager.ExternalProverThreshold { | ||
return nil, nil | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix logging variable mismatch.
At line 70, you log err
instead of getCountError
, which can complicate debugging.
- log.Error("failed to get unassigned batch proving tasks count", "height", getTaskParameter.ProverHeight, "err", err)
+ log.Error("failed to get unassigned batch proving tasks count", "height", getTaskParameter.ProverHeight, "err", getCountError)
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
if strings.HasPrefix(taskCtx.ProverName, ExternalProverNamePrefix) { | |
unassignedBundleCount, getCountError := bp.bundleOrm.GetUnassignedBundleCount(ctx.Copy(), maxActiveAttempts, maxTotalAttempts) | |
if getCountError != nil { | |
log.Error("failed to get unassigned batch proving tasks count", "height", getTaskParameter.ProverHeight, "err", err) | |
return nil, ErrCoordinatorInternalFailure | |
} | |
// Assign external prover if unassigned task number exceeds threshold | |
if unassignedBundleCount < bp.cfg.ProverManager.ExternalProverThreshold { | |
return nil, nil | |
} | |
} | |
if strings.HasPrefix(taskCtx.ProverName, ExternalProverNamePrefix) { | |
unassignedBundleCount, getCountError := bp.bundleOrm.GetUnassignedBundleCount(ctx.Copy(), maxActiveAttempts, maxTotalAttempts) | |
if getCountError != nil { | |
log.Error("failed to get unassigned batch proving tasks count", "height", getTaskParameter.ProverHeight, "err", getCountError) | |
return nil, ErrCoordinatorInternalFailure | |
} | |
// Assign external prover if unassigned task number exceeds threshold | |
if unassignedBundleCount < bp.cfg.ProverManager.ExternalProverThreshold { | |
return nil, nil | |
} | |
} |
if strings.HasPrefix(taskCtx.ProverName, ExternalProverNamePrefix) { | ||
unassignedBatchCount, getCountError := bp.batchOrm.GetUnassignedBatchCount(ctx.Copy(), maxActiveAttempts, maxTotalAttempts) | ||
if getCountError != nil { | ||
log.Error("failed to get unassigned batch proving tasks count", "height", getTaskParameter.ProverHeight, "err", err) | ||
return nil, ErrCoordinatorInternalFailure | ||
} | ||
// Assign external prover if unassigned task number exceeds threshold | ||
if unassignedBatchCount < bp.cfg.ProverManager.ExternalProverThreshold { | ||
return nil, nil | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix logging variable mismatch.
At line 70, you log err
instead of getCountError
, potentially hiding the real error.
- log.Error("failed to get unassigned batch proving tasks count", "height", getTaskParameter.ProverHeight, "err", err)
+ log.Error("failed to get unassigned batch proving tasks count", "height", getTaskParameter.ProverHeight, "err", getCountError)
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
if strings.HasPrefix(taskCtx.ProverName, ExternalProverNamePrefix) { | |
unassignedBatchCount, getCountError := bp.batchOrm.GetUnassignedBatchCount(ctx.Copy(), maxActiveAttempts, maxTotalAttempts) | |
if getCountError != nil { | |
log.Error("failed to get unassigned batch proving tasks count", "height", getTaskParameter.ProverHeight, "err", err) | |
return nil, ErrCoordinatorInternalFailure | |
} | |
// Assign external prover if unassigned task number exceeds threshold | |
if unassignedBatchCount < bp.cfg.ProverManager.ExternalProverThreshold { | |
return nil, nil | |
} | |
} | |
if strings.HasPrefix(taskCtx.ProverName, ExternalProverNamePrefix) { | |
unassignedBatchCount, getCountError := bp.batchOrm.GetUnassignedBatchCount(ctx.Copy(), maxActiveAttempts, maxTotalAttempts) | |
if getCountError != nil { | |
log.Error("failed to get unassigned batch proving tasks count", "height", getTaskParameter.ProverHeight, "err", getCountError) | |
return nil, ErrCoordinatorInternalFailure | |
} | |
// Assign external prover if unassigned task number exceeds threshold | |
if unassignedBatchCount < bp.cfg.ProverManager.ExternalProverThreshold { | |
return nil, nil | |
} | |
} |
@@ -129,9 +126,10 @@ impl<'a> CircuitsHandlerProvider<'a> { | |||
); | |||
if let Some(builder) = self.circuits_handler_builder_map.get(hard_fork_name) { | |||
log::info!("building circuits handler for {hard_fork_name}"); | |||
let handler = builder(self.prover_type, self.config, self.geth_client.clone()) | |||
let handler = builder(prover_type, self.config, self.geth_client.clone()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codebase verification
Inconsistent error handling pattern confirmed - change recommended
The function already uses proper error handling with Result
and the bail!
macro. Using expect
here is inconsistent with the codebase's error handling pattern and could hide recoverable errors in this fallback path.
🔗 Analysis chain
Using expect
may obscure recoverable errors.
Consider returning a proper Result
instead of panicking here. This approach keeps the caller in control of error handling.
- let handler = builder(prover_type, self.config, self.geth_client.clone())
- .expect("failed to build circuits handler");
+ let handler = builder(prover_type, self.config, self.geth_client.clone())
+ .map_err(|e| anyhow::anyhow!("failed to build circuits handler: {}", e))?;
🏁 Scripts executed
The following scripts were executed for the analysis:
Error: No shell scripts found
Script:
#!/bin/bash
# Get the full context of the function containing this line
rg -B 10 -A 10 "builder\(prover_type, self\.config, self\.geth_client\.clone\(\)\)" prover/src/zk_circuits_handler.rs
# Look for similar builder pattern usages
rg "builder\(.*\)\.expect\(" prover/src/
# Find other error handling patterns in the file
rg "map_err|\.expect|\.unwrap" prover/src/zk_circuits_handler.rs
Length of output: 1452
Purpose or design rationale of this PR
Describe your change. Make sure to answer these three questions: What does this PR do? Why does it do it? How does it do it?
PR title
Your PR title must follow conventional commits (as we are doing squash merge for each PR), so it must start with one of the following types:
Deployment tag versioning
Has
tag
incommon/version.go
been updated or have you addedbump-version
label to this PR?Breaking change label
Does this PR have the
breaking-change
label?Summary by CodeRabbit
Release Notes
Configuration Updates
external_prover_threshold
to manage external prover task assignmentTask Management Improvements
Prover Flexibility
These changes optimize task distribution and provide more granular control over prover task assignment, particularly for external provers.