This PR gives the NCR API the ability to receive training jobs submitted via HTTP (with a sample web interface) and execute the training job on HPF. This works through the following steps:
Training jobs are submitted to the NCR API. Each training job submission consists of a:
Once the user submits their training job, it is sent to a WebDAV server specified by environment variables. Specifically, the WebDAV server receives:
<jobID>.obo
<jobID>.json
<jobID>
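As a rough illustration of this upload step, the sketch below derives the three WebDAV targets for a job and shows a bare HTTP PUT. The environment variable names OBO_WEBDAV_URL, QSUB_WEBDAV_URL and READY_WEBDAV_URL come from this description. The helper names upload_targets and webdav_put are hypothetical. So is the assumption that the bare <jobID> marker lives under READY_WEBDAV_URL, and that no authentication is needed. The PR's actual code may differ.

```python
import os
import urllib.request


def upload_targets(job_id, env=os.environ):
    """Map each uploaded artifact to its WebDAV URL for the given job ID.

    Assumption: the bare <jobID> marker is stored under READY_WEBDAV_URL.
    """
    def base(name):
        # Tolerate a trailing slash in the configured base URL.
        return env[name].rstrip("/")

    return {
        "obo": f"{base('OBO_WEBDAV_URL')}/{job_id}.obo",
        "json": f"{base('QSUB_WEBDAV_URL')}/{job_id}.json",
        "ready": f"{base('READY_WEBDAV_URL')}/{job_id}",
    }


def webdav_put(url, data):
    """Upload bytes with an HTTP PUT (hypothetical helper, no authentication)."""
    request = urllib.request.Request(url, data=data, method="PUT")
    with urllib.request.urlopen(request) as response:
        return response.status
```

Writing the ready marker last would keep the HPF side from picking up a job whose OBO or JSON file has not finished uploading. That ordering is a suggestion, not something this description states.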
The HPF node periodically executes (possibly via crontab):
python generate_qsub_job.py ~/qsub ~/uploaded_obo
This causes READY_WEBDAV_URL to be checked for training jobs that need to be executed. If any job IDs are available, the corresponding OBO and JSON files are downloaded from OBO_WEBDAV_URL/<jobID>.obo and QSUB_WEBDAV_URL/<jobID>.json respectively, and a job is submitted to QSUB to run the training task.
As the training task executes on HPF, progress log messages can be sent to LOGGING_WEBDAV_URL/<jobID>_<messageID> and then viewed by the submitting client by visiting the /log/<jobID> API endpoint.
When the training completes, the trained model is uploaded to the WebDAV server under OUTPUT_WEBDAV_URL/<jobID>_config.json, OUTPUT_WEBDAV_URL/<jobID>_ncr_weights.h5, and OUTPUT_WEBDAV_URL/<jobID>_onto.json.
If all goes well, the name assigned to the model at the start of the training is written to COMPLETE_WEBDAV_URL/JOBCOMPLETE_<jobID>. If there is a failure, the string FAILED is written to FAILED_WEBDAV_URL/JOBFAIL_<jobID>.
When the /models API endpoint is hit, the WebDAV server is queried for completed model training jobs. When completed jobs are available, the models are downloaded and added to the model list of the NCR API for use with the /match/ and /annotate/ text analysis methods.
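The completion check described above can be sketched as follows. This is a minimal illustration, not the PR's implementation. The function names output_files and job_status are hypothetical. The file names found under COMPLETE_WEBDAV_URL and FAILED_WEBDAV_URL are assumed to have been listed already (for example with a WebDAV PROPFIND request) and are passed in as plain collections of strings.

```python
def output_files(job_id):
    """Names of the three model artifacts expected under OUTPUT_WEBDAV_URL."""
    return [
        f"{job_id}_config.json",
        f"{job_id}_ncr_weights.h5",
        f"{job_id}_onto.json",
    ]


def job_status(job_id, complete_names, failed_names):
    """Classify a job from the marker files seen on the WebDAV server.

    complete_names: file names listed under COMPLETE_WEBDAV_URL
    failed_names:   file names listed under FAILED_WEBDAV_URL
    """
    if f"JOBCOMPLETE_{job_id}" in complete_names:
        return "complete"
    if f"JOBFAIL_{job_id}" in failed_names:
        return "failed"
    # Neither marker exists yet: the job is queued or still training.
    return "pending"
```

For a job reported as complete, the names returned by output_files are the files a /models handler would need to download before adding the model to its list.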