-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix another S3 multipart upload issue with marc exporter (PP-1693) #2053
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #2053 +/- ##
=======================================
Coverage 90.78% 90.78%
=======================================
Files 343 343
Lines 40561 40568 +7
Branches 8787 8787
=======================================
+ Hits 36824 36831 +7
Misses 2483 2483
Partials 1254 1254 ☔ View full report in Codecov by Sentry. |
cd301aa
to
d479a82
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This all looks good. A couple of suggestions below.
src/palace/manager/marc/uploader.py
Outdated
key, upload.upload_id, len(upload.parts), upload.buffer.encode() | ||
key, | ||
upload.upload_id, | ||
len(upload.parts) + 1, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if it would be better to have this "+1" logic in a single function with the explanation in its docstring and call that from the multiple places that need it. That might make it a little more clear what's going on for future us.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I simplified the logic here so its easier to follow, and the part logic with offsets is all encapsulated in the same place.
The simplification comes at the expense of efficiency, since it will require a couple more calls to redis and another call to s3, but that inefficiency probably doesn't make much a difference here and its much more readable.
tests/manager/marc/test_uploader.py
Outdated
# 1. A small record that doesn't need to be uploaded in parts, it just | ||
# gets uploaded directly when complete is called (test1). | ||
# 2. A large record that needs to be uploaded in parts, on the first sync | ||
# call its buffer is large enough to trigger the upload. When complete | ||
# is called, there is no data in the buffer, so no final part needs to be | ||
# uploaded (test2). | ||
# 3. A large record that needs to be uploaded in parts, on the first sync | ||
# call its buffer is large enough to trigger the upload. When complete | ||
# is called, there is data in the buffer, so a final part needs to be | ||
# uploaded (test3). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor/pedantic: It took me a minute to process this explanation because of the sentence structure. Maybe start a new sentence or add a semicolon after "... in parts" at the beginning of each case explanation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I re-worked this comment.
87820af
to
b3c5c67
Compare
Description
MarcUploadManager
Motivation and Context
I forgot an offset that needed to be updated in #2052.
How Has This Been Tested?
Checklist