Implement queuing priority #56 #58

Merged 6 commits on Jan 16, 2025

patrick-austin

Also includes the changes from #45 and #51, because the priority settings depend on submitting to and polling the queue. See those PRs for those changes in isolation.

  • Add queue priority settings to run.properties.
    • This can be a single number for each of: all users, authenticated users, investigation users, or instrument scientists.
    • For the latter two, it can also be a mapping of Instrument.name or role to a priority level.
  • Class to handle loading and formatting priority levels in JPQL
  • Added functions to IcatClient to perform these queries for a given user to identify their level of priority
  • Modified StatusCheck to get all queued Downloads, then when iterating only submit those with priority 1. If there is still space, start submitting from priority 2 and so on.
  • For queue submission endpoints, now only submit if user has a positive priority
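
The submission order described in the list above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: `PrioritySubmissionSketch`, `QueuedDownload`, `selectForSubmission` and the `capacity` parameter are hypothetical names. It assumes that a lower positive number means the Download is submitted earlier, and that a priority below 1 means the user may not queue at all.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: names and types are illustrative, not taken from the PR.
class PrioritySubmissionSketch {

    /** A queued download together with the priority resolved for its owner. */
    static final class QueuedDownload {
        final long id;
        final int priority;

        QueuedDownload(long id, int priority) {
            this.id = id;
            this.priority = priority;
        }
    }

    /**
     * Picks which queued downloads to submit, lowest priority number first,
     * until the available capacity is used up. Entries with a priority
     * below 1 are treated as disabled and are never submitted.
     */
    static List<QueuedDownload> selectForSubmission(List<QueuedDownload> queued, int capacity) {
        List<QueuedDownload> eligible = new ArrayList<>();
        for (QueuedDownload download : queued) {
            if (download.priority >= 1) {
                eligible.add(download);
            }
        }
        // List.sort is stable, so downloads with equal priority keep their queue order.
        eligible.sort(Comparator.comparingInt(d -> d.priority));
        int count = Math.min(Math.max(capacity, 0), eligible.size());
        return new ArrayList<>(eligible.subList(0, count));
    }
}
```

Sorting once and truncating gives the same result as looping over priority 1, then priority 2, and so on while space remains, which is how the description above phrases it.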

Closes #56

Base automatically changed from 38_prepare_queued_downloads to 36_queuing January 6, 2025 13:49

@kevinphippsstfc kevinphippsstfc left a comment

This looks great! I love the concept of being able to prioritise different types of request based on who the user is and this is exactly what we will need once open data becomes a thing.

Unless I'm missing something the prioritisation is only applied to downloads that are put in the queue via the new Queue Visit and Queue Files endpoints. What I think we probably need to move to is a system where ALL downloads go via the queue and are prioritised but that will need a bit more thinking about. For example, the new recall methods need to have less priority as they are likely to be larger requests even though they are likely to be submitted by Instrument Scientists who may have higher priority. And where do we prioritise those against recalls from open data users? However, I think these are probably changes that we can make in the future, depending on what ideas you have on this.

if (authenticatedPriority < 1 && defaultPriority >= 1) {
    String msg = "queue.priority.authenticated disabled with value " + authenticatedString;
    msg += " but queue.priority.default enabled with value " + defaultPriority;
    msg += "\nAuthenticated users will use default priority if no superseding priority applies";

These last 3 lines in both the if and the else could go below the if/else as they are common.
Although you would also need to initialise msg above the if/else, so that line is borderline worth/not worth it.
However, the final two lines are probably worth moving.

Author

I have refactored the last three lines. This also means introducing an else clause so that we can return early, without warning or setting the default, when neither condition is met (corresponding to a "normal" value for authenticatedPriority).
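
To make the trade-off concrete, here is a rough sketch of the refactored shape. Only the first condition and the message wording come from the diff excerpt in this thread. The second condition, its message, the `resolve` method, its return value and the use of `System.err` in place of a logger are placeholders invented for illustration, since the rest of the method is not shown in this conversation.

```java
// Hypothetical sketch: only the first condition and its message wording come from the diff.
class AuthenticatedPrioritySketch {

    /** Returns the priority that authenticated users end up with. */
    static int resolve(int authenticatedPriority, int defaultPriority) {
        String msg;
        if (authenticatedPriority < 1 && defaultPriority >= 1) {
            msg = "queue.priority.authenticated disabled with value " + authenticatedPriority;
            msg += " but queue.priority.default enabled with value " + defaultPriority;
        } else if (defaultPriority >= 1 && authenticatedPriority > defaultPriority) {
            // Placeholder second case, assumed for illustration only.
            msg = "queue.priority.authenticated " + authenticatedPriority;
            msg += " is lower priority than queue.priority.default " + defaultPriority;
        } else {
            // "Normal" value: nothing to warn about, so return before the shared lines.
            return authenticatedPriority;
        }
        // Shared tail that would otherwise be repeated in both branches.
        msg += "\nAuthenticated users will use default priority if no superseding priority applies";
        System.err.println(msg); // stands in for the real logger call
        return defaultPriority;
    }
}
```

Hoisting the shared lines removes the repetition, but it costs an extra branch and an early return, which is the readability concern raised in this thread.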

@patrick-austin
Author

Unless I'm missing something the prioritisation is only applied to downloads that are put in the queue via the new Queue Visit and Queue Files endpoints.

Yes, this is currently unique to the polling, which moves Downloads from PAUSED with no preparedId (AKA queued) to PREPARING. I didn't want to change or break existing methods that DataGateway currently uses, just add this as a new feature.

What I think we probably need to move to is a system where ALL downloads go via the queue and are prioritised but that will need a bit more thinking about.

Agreed - I wouldn't want to start changing the existing workflow (other than e.g. for checking feasibility) without getting some buy-in from the others (frontend/service management/DLS themselves).

For example, the new recall methods need to have less priority as they are likely to be larger requests even though they are likely to be submitted by Instrument Scientists who may have higher priority. And where do we prioritise those against recalls from open data users? However, I think these are probably changes that we can make in the future, depending on what ideas you have on this.

Yeah, it's not obvious to me what we want to enforce. Practically, I think the current system could be extended to cover some of this. For example, the reason for having different authenticatedPriority and defaultPriority values is to allow us to distinguish the open data users. To combine priorities, I think we can add multiple factors and evaluate the result. For example, I might be an instrument scientist and have a "user" priority of 1, but requesting an entire visit might give me a "content" priority of, say, 3 for a total of 4, which then might be higher than, equal to, or lower than whatever priority is given to an anonymous user requesting a Dataset. The challenge with the current approach will be mapping the "content" or "source" of the Download. Once something is in the queue, how do we know where it came from? If we base this on e.g. just the size of the Download, we have that in the DB, but it might be trickier to work out whether something came via the API or the frontend without either piggybacking on an existing Download field or creating a new dedicated field for priority. At any rate, what we do would depend on requirements, so this is definitely something for the future, I think.
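
The additive idea above could look something like the following. It is only a sketch of one possible scheme and nothing like it is implemented in this PR: `CombinedPrioritySketch`, `combine`, `userPriority` and `contentPriority` are hypothetical names, and treating any factor below 1 as disabling the whole request is an assumption carried over from how the existing settings treat non-positive values.

```java
// Hypothetical sketch of combining priority factors; not part of this PR.
class CombinedPrioritySketch {

    /**
     * Adds a per-user factor and a per-content factor into a single priority.
     * Assumption: a factor below 1 means "disabled", so the total is 0 (disabled).
     */
    static int combine(int userPriority, int contentPriority) {
        if (userPriority < 1 || contentPriority < 1) {
            return 0;
        }
        return userPriority + contentPriority;
    }
}
```

With the numbers from the comment, `combine(1, 3)` gives 4 for an instrument scientist requesting an entire visit. That total would then be compared against whatever total an anonymous Dataset request receives.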

@patrick-austin patrick-austin changed the base branch from 36_queuing to 50_queue_files January 15, 2025 14:23
Base automatically changed from 50_queue_files to 36_queuing January 15, 2025 16:36

@kevinphippsstfc kevinphippsstfc left a comment

Apologies, I didn't take into account the fact that an else would be needed to cater for a direct return from the method. Having seen the updated method side by side with the old one, I prefer the original! Although it has the repeated lines of code, it is simpler and easier to follow. So please just revert this commit and I'll approve the PR.

@patrick-austin patrick-austin merged commit 431a3f3 into 36_queuing Jan 16, 2025
1 check failed
@patrick-austin patrick-austin deleted the 56_queue_priority branch January 16, 2025 14:18