Exception ends upload of very large file #8

Open
khagler opened this issue Oct 31, 2012 · 11 comments

@khagler

khagler commented Oct 31, 2012

Attempts to upload a very large (288.64 GB) file run for about an hour or two, then fail with the following output:

frontier:glacier-cli khagler$ ./glacier.py archive upload Photos ~/Documents/photos.tgz
Traceback (most recent call last):
  File "./glacier.py", line 618, in <module>
    App().main()
  File "./glacier.py", line 604, in main
    args.func(args)
  File "./glacier.py", line 416, in archive_upload
    archive_id = vault.create_archive_from_file(file_obj=args.file, description=name)
  File "/Users/khagler/glacier/glacier-cli/boto/glacier/vault.py", line 141, in create_archive_from_file
    writer.write(data)
  File "/Users/khagler/glacier/glacier-cli/boto/glacier/writer.py", line 152, in write
    self.send_part()
  File "/Users/khagler/glacier/glacier-cli/boto/glacier/writer.py", line 141, in send_part
    content_range, part)
  File "/Users/khagler/glacier/glacier-cli/boto/glacier/layer1.py", line 626, in upload_part
    response_headers=response_headers)
  File "/Users/khagler/glacier/glacier-cli/boto/glacier/layer1.py", line 83, in make_request
    data=data)
  File "/Users/khagler/glacier/glacier-cli/boto/connection.py", line 913, in make_request
    return self._mexe(http_request, sender, override_num_retries)
  File "/Users/khagler/glacier/glacier-cli/boto/connection.py", line 859, in _mexe
    raise e
socket.gaierror: [Errno 8] nodename nor servname provided, or not known

OS: Mac OS X 10.7.5 Server
Python 2.7.1

I tried uploading the same file using FastGlacier with the part size set to 1 GB. It would upload some of each part before failing with a message about the remote host dropping the connection. After setting the part size to 256 MB, it was able to upload individual parts successfully.

Addendum:

After a bit more investigation, I think I've figured out what might be going on. According to the Amazon documentation, the maximum number of parts for a multi-part upload is 10,000. For this (very large) archive to be split evenly into 10,000 parts, each part would have to be about 27.5 MB--or, given the limits on allowable part sizes, 32 MB. It looks like you're using a default part size (which I didn't realize at the time I could change) of 8 MB. If I'm right about that, then an 80 GB file would be a (marginally less painful) valid test.
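
To spell out that arithmetic, something like the following would pick the smallest allowed part size (a rough Python sketch; minimum_part_size is an illustrative name, not an existing boto or glacier-cli function):

# Rough sketch of the part-size arithmetic described above. Glacier allows
# at most 10,000 parts per multipart upload, and each part size must be a
# power-of-two number of MiB between 1 MiB and 4 GiB.
MAX_PARTS = 10000
MIN_PART_SIZE = 1024 * 1024        # 1 MiB
MAX_PART_SIZE = 4 * 1024 ** 3      # 4 GiB

def minimum_part_size(archive_size, default=4 * 1024 * 1024):
    """Smallest allowed part size that fits archive_size into 10,000 parts."""
    part_size = max(default, MIN_PART_SIZE)
    while archive_size > part_size * MAX_PARTS:
        part_size *= 2
        if part_size > MAX_PART_SIZE:
            raise ValueError("archive too large for a Glacier multipart upload")
    return part_size

# For the 288.64 GB file above this returns 32 MiB, matching the estimate.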

@basak
Owner

basak commented Nov 2, 2012

Sorry you're having problems, and I appreciate the report. I've also seen an upload failure, which I assumed was some kind of network problem, so I didn't save the traceback. Can you tell me how this reproduces: every time or intermittently (and if so, in roughly what proportion), and is there any pattern to how much data is uploaded before it fails?

My own single failure prompted me to write automatic upload resume support. This is working and I expect to push it shortly. But despite that, I'd still like uploads to work the first time!

@khagler
Author

khagler commented Nov 3, 2012

It happens every time. I don't know how much data is being uploaded, but it runs for a pretty long time before failing. I'm almost certain that what's happening here is that it starts the upload with the default 4 MB part size, and then 40,000 MB worth of uploading later it tries to upload the next part and Amazon rejects it because the 10,000 part limit has been reached. Exactly how long that takes varies depending on what else I'm doing with my connection at the time (and how many of my neighbors are bittorrenting their favorite TV shows ;-), which accounts for the variable but long time to failure.

I've written a fix that checks the size of the archive to be uploaded and determines the smallest part size that will work if 4 MB is too small. I created a 50 GB dummy file, and found that it did indeed fail to upload as expected without the fix. I'm trying it now with the fix, and it's still running. I'll update when it eventually either finishes or fails.

@fbueno

fbueno commented Nov 3, 2012

Same here.
301MB - OK
2.6GB - OK
37GB - failed

@basak
Owner

basak commented Nov 4, 2012

Based on the code, it looks like the problem is that the particular boto.glacier method I'm using doesn't let me pick a part size and chooses 4 MiB arbitrarily. So a fix would be to automatically determine an appropriate part size, as khagler described, but I think this would need to go into boto rather than into glacier-cli.

khagler: is this what you're working on, or shall I?

@khagler
Author

khagler commented Nov 4, 2012

Yes, basically. I had changed your archive_upload so that it adjusted vault.DefaultPartSize, but I agree that this really ought to be done in boto, so I'll see about moving it there.
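
For reference, the workaround looks roughly like this (a sketch only, not the exact patch; the function signature is illustrative, and it assumes, as described above, that create_archive_from_file honours vault.DefaultPartSize):

# Sketch of the DefaultPartSize workaround described above: grow the part
# size until the archive fits within Glacier's 10,000-part limit, then let
# create_archive_from_file use it.
import os

def archive_upload(vault, file_obj, name):
    size = os.fstat(file_obj.fileno()).st_size
    part_size = vault.DefaultPartSize          # boto's default is 4 MiB
    while size > part_size * 10000:
        part_size *= 2                          # keep part sizes a power of two
    vault.DefaultPartSize = part_size
    return vault.create_archive_from_file(file_obj=file_obj, description=name)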

I've run a few tests, and while my fix does take care of the original problem, it exposes a new one: It seems to be pretty common for individual part uploads to fail (I've been seeing about a 1% failure rate in FastGlacier, which tells you when it happens). Unfortunately, boto doesn't seem to have a way to note this and retry failed parts.
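
Something along these lines is what I mean by retrying (purely illustrative; boto's writer doesn't expose a per-part hook like this, so send_part here just stands in for whatever uploads a single part):

# Illustrative retry wrapper for a single part upload; not part of boto.
import socket
import time

def send_part_with_retries(send_part, max_attempts=3, delay=5):
    for attempt in range(1, max_attempts + 1):
        try:
            return send_part()
        except (socket.error, IOError):
            if attempt == max_attempts:
                raise                      # give up after the final attempt
            time.sleep(delay * attempt)    # simple linear back-off, then retry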

@fastfwd75

I am also getting this. Tried with 4 GB splits and it failed. Tried with 1023 MB splits and it also failed. I can't realistically go smaller. Any hope of a fix for this? The same 1023 MB split uploaded without trouble in "simple amazon glacier uploader", but I prefer command-line tools.

socket.gaierror: [Errno 8] nodename nor servname provided, or not known

@fastfwd75

Full error text:

Traceback (most recent call last):
  File "/Users/jonathan/glacier/glacier-cli/glacier", line 694, in <module>
    App().main()
  File "/Users/jonathan/glacier/glacier-cli/glacier", line 680, in main
    args.func(args)
  File "/Users/jonathan/glacier/glacier-cli/glacier", line 482, in archive_upload
    archive_id = vault.create_archive_from_file(file_obj=args.file, description=name)
  File "/Users/jonathan/glacier/glacier-cli/boto/glacier/vault.py", line 141, in create_archive_from_file
    writer.write(data)
  File "/Users/jonathan/glacier/glacier-cli/boto/glacier/writer.py", line 152, in write
    self.send_part()
  File "/Users/jonathan/glacier/glacier-cli/boto/glacier/writer.py", line 141, in send_part
    content_range, part)
  File "/Users/jonathan/glacier/glacier-cli/boto/glacier/layer1.py", line 626, in upload_part
    response_headers=response_headers)
  File "/Users/jonathan/glacier/glacier-cli/boto/glacier/layer1.py", line 83, in make_request
    data=data)
  File "/Users/jonathan/glacier/glacier-cli/boto/connection.py", line 913, in make_request
    return self._mexe(http_request, sender, override_num_retries)
  File "/Users/jonathan/glacier/glacier-cli/boto/connection.py", line 859, in _mexe
    raise e
socket.gaierror: [Errno 8] nodename nor servname provided, or not known

@dengkai

dengkai commented Jun 17, 2013

I'm running into this issue as well with large files (> 1GB), same error:

Traceback (most recent call last):
  File "./glacier.py", line 730, in <module>
    App().main()
  File "./glacier.py", line 716, in main
    self.args.func()
  File "./glacier.py", line 498, in archive_upload
    file_obj=self.args.file, description=name)
  File "/Users/user/glacier/glacier-cli/boto/glacier/vault.py", line 141, in create_archive_from_file
    writer.write(data)
  File "/Users/user/glacier/glacier-cli/boto/glacier/writer.py", line 152, in write
    self.send_part()
  File "/Users/user/glacier/glacier-cli/boto/glacier/writer.py", line 141, in send_part
    content_range, part)
  File "/Users/user/glacier/glacier-cli/boto/glacier/layer1.py", line 626, in upload_part
    response_headers=response_headers)
  File "/Users/user/glacier/glacier-cli/boto/glacier/layer1.py", line 83, in make_request
    data=data)
  File "/Users/user/glacier/glacier-cli/boto/connection.py", line 913, in make_request
    return self._mexe(http_request, sender, override_num_retries)
  File "/Users/user/glacier/glacier-cli/boto/connection.py", line 859, in _mexe
    raise e
socket.gaierror: [Errno 8] nodename nor servname provided, or not known

@basak
Owner

basak commented Jun 22, 2013

This is an issue within boto, not in glacier-cli directly. Please could anyone still affected post the version of boto you're using, and try the latest?

@dengkai

dengkai commented Jun 24, 2013

I was seeing the issue with boto 2.5.2, but I just updated to 2.9.6 and the issue persists.

@strobe33333

I too am having this problem
