How to proceed when processing within the code times out halfway to completion? #62

Open
esturdivant-usgs opened this issue May 30, 2019 · 3 comments

Comments

@esturdivant-usgs
Owner

esturdivant-usgs commented May 30, 2019

It looks like the process timed out while running sb.upload_files_and_upsert_item(item, up_files).

  • Did the timeout happen because the session went too long without logging in again, or because the command itself was unresponsive for too long?
  • Now that the script has aborted with an error, what's the best way for me to resume processing? Is it possible to avoid redoing what was already completed?

The script had been running for a little over half an hour. It had processed 64 out of 91 XML files. It's hard to say how long the sb.upload_files_and_upsert_item() command was running. I'd guess anywhere from 2 to 15 minutes.

Here's the readout:

UPLOADING: files in directory 'pts_trans_ubw'
Traceback (most recent call last):
  File "//anaconda/envs/sb_py3/lib/python3.6/site-packages/urllib3/connectionpool.py", line 601, in urlopen
    chunked=chunked)
  File "//anaconda/envs/sb_py3/lib/python3.6/site-packages/urllib3/connectionpool.py", line 357, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "//anaconda/envs/sb_py3/lib/python3.6/http/client.py", line 1239, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "//anaconda/envs/sb_py3/lib/python3.6/http/client.py", line 1285, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "//anaconda/envs/sb_py3/lib/python3.6/http/client.py", line 1234, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "//anaconda/envs/sb_py3/lib/python3.6/http/client.py", line 1065, in _send_output
    self.send(chunk)
  File "//anaconda/envs/sb_py3/lib/python3.6/http/client.py", line 986, in send
    self.sock.sendall(data)
  File "//anaconda/envs/sb_py3/lib/python3.6/ssl.py", line 975, in sendall
    v = self.send(byte_view[count:])
  File "//anaconda/envs/sb_py3/lib/python3.6/ssl.py", line 944, in send
    return self._sslobj.write(data)
  File "//anaconda/envs/sb_py3/lib/python3.6/ssl.py", line 642, in write
    return self._sslobj.write(data)
BrokenPipeError: [Errno 32] Broken pipe

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "//anaconda/envs/sb_py3/lib/python3.6/site-packages/requests/adapters.py", line 440, in send
    timeout=timeout
  File "//anaconda/envs/sb_py3/lib/python3.6/site-packages/urllib3/connectionpool.py", line 639, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "//anaconda/envs/sb_py3/lib/python3.6/site-packages/urllib3/util/retry.py", line 357, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "//anaconda/envs/sb_py3/lib/python3.6/site-packages/urllib3/packages/six.py", line 685, in reraise
    raise value.with_traceback(tb)
  File "//anaconda/envs/sb_py3/lib/python3.6/site-packages/urllib3/connectionpool.py", line 601, in urlopen
    chunked=chunked)
  File "//anaconda/envs/sb_py3/lib/python3.6/site-packages/urllib3/connectionpool.py", line 357, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "//anaconda/envs/sb_py3/lib/python3.6/http/client.py", line 1239, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "//anaconda/envs/sb_py3/lib/python3.6/http/client.py", line 1285, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "//anaconda/envs/sb_py3/lib/python3.6/http/client.py", line 1234, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "//anaconda/envs/sb_py3/lib/python3.6/http/client.py", line 1065, in _send_output
    self.send(chunk)
  File "//anaconda/envs/sb_py3/lib/python3.6/http/client.py", line 986, in send
    self.sock.sendall(data)
  File "//anaconda/envs/sb_py3/lib/python3.6/ssl.py", line 975, in sendall
    v = self.send(byte_view[count:])
  File "//anaconda/envs/sb_py3/lib/python3.6/ssl.py", line 944, in send
    return self._sslobj.write(data)
  File "//anaconda/envs/sb_py3/lib/python3.6/ssl.py", line 642, in write
    return self._sslobj.write(data)
urllib3.exceptions.ProtocolError: ('Connection aborted.', BrokenPipeError(32, 'Broken pipe'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "sb_automation.py", line 205, in <module>
    data_item, bigfiles1 = upload_files(sb, data_item, xml_file, max_MBsize=max_MBsize, replace=True, verbose=verbose)
  File "/Users/esturdivant/GitHub/science-base-automation/autoSB.py", line 612, in upload_files
    item = sb.upload_files_and_upsert_item(item, up_files) # upsert should "create or update a SB item"
  File "//anaconda/envs/sb_py3/lib/python3.6/site-packages/sciencebasepy/SbSession.py", line 380, in upload_files_and_upsert_item
    ret = self._session.post(url, params=params, files=files, data=data)
  File "//anaconda/envs/sb_py3/lib/python3.6/site-packages/requests/sessions.py", line 555, in post
    return self.request('POST', url, data=data, json=json, **kwargs)
  File "//anaconda/envs/sb_py3/lib/python3.6/site-packages/requests/sessions.py", line 508, in request
    resp = self.send(prep, **send_kwargs)
  File "//anaconda/envs/sb_py3/lib/python3.6/site-packages/requests/sessions.py", line 618, in send
    r = adapter.send(request, **kwargs)
  File "//anaconda/envs/sb_py3/lib/python3.6/site-packages/requests/adapters.py", line 490, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', BrokenPipeError(32, 'Broken pipe'))
@esturdivant-usgs
Owner Author

If the script aborted during the XML loop, then the JSON dictionaries weren't updated.

I tried running upload_files again for the same XML file and it worked without issue.
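
Since the same upload succeeded on a second try, the failure looks transient, so retrying the call might be enough. Below is a minimal sketch of that idea, not something that exists in autoSB.py or sciencebasepy. The helper name call_with_retries and its parameters retries, wait_seconds, and retry_on are made up for illustration. By default it retries on Python's built-in ConnectionError (which includes BrokenPipeError); to catch the error at the bottom of the traceback above, requests.exceptions.ConnectionError would have to be passed in explicitly.

```python
import time


def call_with_retries(func, *args, retries=3, wait_seconds=5,
                      retry_on=(ConnectionError,), **kwargs):
    """Call func(*args, **kwargs), retrying if the connection drops.

    Hypothetical helper for illustration only; it is not part of
    sciencebasepy or autoSB.py.
    """
    for attempt in range(1, retries + 1):
        try:
            return func(*args, **kwargs)
        except retry_on as err:
            if attempt == retries:
                raise  # out of attempts: let the original error surface
            print(f"Attempt {attempt} failed ({err!r}); "
                  f"retrying in {wait_seconds} s")
            time.sleep(wait_seconds)


# Possible usage (assumes `sb`, `item`, and `up_files` exist as in autoSB.py
# and that `requests` has been imported):
# item = call_with_retries(
#     sb.upload_files_and_upsert_item, item, up_files,
#     retry_on=(requests.exceptions.ConnectionError,))
```

If the underlying cause is an expired session rather than a dropped connection, a plain retry probably would not help without logging in again first.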

@esturdivant-usgs
Owner Author

Right now the answer is to rerun. This might not be necessary if the original folder names were the same as the final data page names.
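
One way a full rerun might be avoided is to write progress to disk after every file, so an aborted run leaves a record of what finished. This is only a sketch under assumptions: the function names load_progress and save_progress, the file name progress.json, and the shape of the dictionary are all hypothetical and do not reflect how the JSON dictionaries in this repository are actually structured.

```python
import json
import os


def load_progress(path):
    """Return the saved progress dictionary, or an empty one if none exists."""
    if not os.path.exists(path):
        return {}
    with open(path) as f:
        return json.load(f)


def save_progress(path, progress):
    """Write the progress dictionary to a temp file, then swap it into place.

    Writing to a temp file first means an interruption during the write
    does not leave a half-written progress file behind.
    """
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump(progress, f, indent=2)
    os.replace(tmp_path, path)


# Possible usage inside the XML loop (names are placeholders):
# progress = load_progress("progress.json")
# for xml_file in xml_files:
#     if xml_file in progress:
#         continue  # already uploaded in an earlier run
#     ...upload...
#     progress[xml_file] = "done"
#     save_progress("progress.json", progress)
```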

@esturdivant-usgs
Owner Author

esturdivant-usgs commented Jun 13, 2019

Added the parameter start_xml_idx, which lets the user specify which XML file to start with. It should be set to the number of the last file whose upload completed. Commit: 9557780
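
The general idea of a start index can be sketched as follows. This is an illustration, not the code from the commit: the function name files_to_process is hypothetical, and it assumes a zero-based index into the list of XML files, so passing the count of files already uploaded resumes at the first unfinished one. The actual indexing convention in sb_automation.py may differ.

```python
def files_to_process(xml_files, start_xml_idx=0):
    """Return (index, path) pairs, skipping entries before start_xml_idx.

    Assumes zero-based indexing: start_xml_idx=2 skips the first two files.
    """
    return [(i, f) for i, f in enumerate(xml_files) if i >= start_xml_idx]


# Possible usage (file names are placeholders):
# for idx, xml_file in files_to_process(xml_files, start_xml_idx=64):
#     ...upload xml_file...
```

Keeping the original index in each pair means log messages can still report a file's position in the full list after a resume.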

esturdivant-usgs referenced this issue Jun 13, 2019
…itch to log_in() in xml loop (#65). Add start_xml_idx parameter to start upload at given point.
esturdivant-usgs changed the title from "How to proceed when processing within the code timeouts halfway to completion?" to "How to proceed when processing within the code times out halfway to completion?" Jun 13, 2019