Fix another S3 multipart upload issue with marc exporter (PP-1693) #2053

Merged
3 commits merged into main from bugfix/marc-integration on Sep 11, 2024

jonathangreen
Member

Description

Motivation and Context

I forgot an offset that needed to be updated in #2052.

How Has This Been Tested?

  • Running tests

Checklist

  • I have updated the documentation accordingly.
  • All new and existing tests passed.

@jonathangreen jonathangreen added the bug Something isn't working label Sep 11, 2024
@jonathangreen jonathangreen requested a review from a team September 11, 2024 15:37

codecov bot commented Sep 11, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 90.78%. Comparing base (3784c8f) to head (bd77517).
Report is 4 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2053   +/-   ##
=======================================
  Coverage   90.78%   90.78%           
=======================================
  Files         343      343           
  Lines       40561    40568    +7     
  Branches     8787     8787           
=======================================
+ Hits        36824    36831    +7     
  Misses       2483     2483           
  Partials     1254     1254           


@jonathangreen jonathangreen force-pushed the bugfix/marc-integration branch from cd301aa to d479a82 Compare September 11, 2024 15:51
Contributor

@tdilauro tdilauro left a comment

This all looks good. A couple of suggestions below.

- key, upload.upload_id, len(upload.parts), upload.buffer.encode()
+ key,
+ upload.upload_id,
+ len(upload.parts) + 1,
Contributor

I wonder if it would be better to have this "+1" logic in a single function with the explanation in its docstring and call that from the multiple places that need it. That might make it a little more clear what's going on for future us.
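
A minimal sketch of what such a helper might look like. The name `next_part_number` and its argument are hypothetical and are not taken from this PR; the only fact relied on is that S3 multipart part numbers start at 1, so the next part number is one more than the count of parts already uploaded.

```python
from collections.abc import Sized


def next_part_number(parts: Sized) -> int:
    """Return the part number to use for the next multipart upload part.

    S3 part numbers are 1-indexed, while ``len(parts)`` counts the parts
    already uploaded, so the next part is ``len(parts) + 1``.

    Hypothetical helper for illustration only; not the code from this PR.
    """
    return len(parts) + 1
```

Each call site would then pass `next_part_number(upload.parts)` instead of repeating the `+ 1`, keeping the explanation in one docstring.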

Member Author

I simplified the logic here so it's easier to follow, and the part logic with offsets is all encapsulated in the same place.

The simplification comes at the expense of efficiency, since it will require a couple more calls to Redis and another call to S3, but that inefficiency probably doesn't make much of a difference here, and it's much more readable.

Comment on lines 354 to 363
# 1. A small record that doesn't need to be uploaded in parts, it just
# gets uploaded directly when complete is called (test1).
# 2. A large record that needs to be uploaded in parts, on the first sync
# call its buffer is large enough to trigger the upload. When complete
# is called, there is no data in the buffer, so no final part needs to be
# uploaded (test2).
# 3. A large record that needs to be uploaded in parts, on the first sync
# call its buffer is large enough to trigger the upload. When complete
# is called, there is data in the buffer, so a final part needs to be
# uploaded (test3).
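
The three cases in the comment above can be illustrated with a small in-memory stand-in. This is only a sketch under stated assumptions: the class `FakeMultipartUpload`, its `add`/`sync`/`complete` methods, and the `min_part_size` threshold are hypothetical names, not the exporter's real API, and no S3 or Redis calls are made.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FakeMultipartUpload:
    """In-memory stand-in for a buffered multipart upload (illustrative only)."""

    # Hypothetical threshold, measured in characters to keep the sketch simple.
    min_part_size: int
    buffer: str = ""
    parts: List[bytes] = field(default_factory=list)
    direct_upload: Optional[bytes] = None

    def add(self, data: str) -> None:
        self.buffer += data

    def sync(self) -> None:
        # Upload a part only once the buffer has reached the threshold.
        if len(self.buffer) >= self.min_part_size:
            self._upload_part()

    def complete(self) -> None:
        if not self.parts:
            # Case 1: nothing was uploaded in parts, so upload directly.
            self.direct_upload = self.buffer.encode()
            self.buffer = ""
        elif self.buffer:
            # Case 3: leftover data becomes the final part.
            self._upload_part()
        # Case 2: parts exist and the buffer is empty; no final part needed.

    def _upload_part(self) -> None:
        # The new part's number would be len(self.parts) + 1 (1-indexed).
        self.parts.append(self.buffer.encode())
        self.buffer = ""
```

Calling `add` and `sync` with a payload below the threshold and then `complete` exercises case 1; a payload at or above the threshold exercises case 2; adding a small tail after the first part has been uploaded exercises case 3.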
Contributor

Minor/pedantic: It took me a minute to process this explanation because of the sentence structure. Maybe start a new sentence or add a semicolon after "... in parts" at the beginning of each case explanation.

Member Author

I re-worked this comment.

@jonathangreen jonathangreen force-pushed the bugfix/marc-integration branch from 87820af to b3c5c67 Compare September 11, 2024 16:36
@jonathangreen jonathangreen merged commit aa8f01a into main Sep 11, 2024
21 checks passed
@jonathangreen jonathangreen deleted the bugfix/marc-integration branch September 11, 2024 17:44