The ScheduledThreadPoolExecutor is configured with a limited number of cores.
Previous code: new ScheduledThreadPoolExecutor(Runtime.getRuntime().availableProcessors()/2,new ThreadFactory() {
New code: new ScheduledThreadPoolExecutor(getMinimumScheduleCorePoolSize(),new ThreadFactory() {
If the core pool size equals 2 (new ScheduledThreadPoolExecutor(2,new ThreadFactory() {) and you run the YAML above, two threads will be blocked in blockingSemaphore.acquire(). That leaves no free thread for the third task, the one that calls blockingSemaphore.release(), so the application hangs forever.
Increasing the core pool size (e.g., to 3) resolves the issue temporarily but allows another scenario in which three threads call blockingSemaphore.acquire(), leading to the same hang. This approach merely postpones the issue without addressing its root cause.
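The starvation described above can be reproduced outside qDup with a small standalone sketch. This is not qDup's code: the class name PoolStarvationDemo and the method releaserRan are made up for illustration, and the semaphore starts with zero permits to stand in for "the permit is held and the release is queued behind the blocked tasks". Timeouts are used so the demo reports the hang instead of hanging itself.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; it only mirrors the shape of the problem.
class PoolStarvationDemo {

    // Returns true if the task that calls release() got a thread within 500 ms.
    public static boolean releaserRan(int corePoolSize) {
        ScheduledThreadPoolExecutor pool = new ScheduledThreadPoolExecutor(corePoolSize);
        // Zero permits: acquire() blocks until some other task calls release().
        Semaphore blockingSemaphore = new Semaphore(0);
        CountDownLatch releaserStarted = new CountDownLatch(1);

        // Two tasks block in acquire(), each one occupying a pool thread.
        for (int i = 0; i < 2; i++) {
            pool.execute(() -> {
                try {
                    blockingSemaphore.acquire();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        // The third task would unblock them, but it needs a free thread to run.
        pool.execute(() -> {
            releaserStarted.countDown();
            blockingSemaphore.release(2);
        });

        boolean ran = false;
        try {
            ran = releaserStarted.await(500, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        pool.shutdownNow(); // interrupts any workers still blocked in acquire()
        return ran;
    }

    public static void main(String[] args) {
        System.out.println("corePoolSize=2, releaser ran: " + releaserRan(2));
        System.out.println("corePoolSize=3, releaser ran: " + releaserRan(3));
    }
}
```

With a pool of 2 the releasing task stays in the queue, and with 3 it runs. Adding a third blocking task to the loop would starve the pool of 3 in the same way, which is why a larger pool only moves the threshold.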
I can find 3 areas where tasks are submitted to the ScheduledThreadPoolExecutor:
The callback from SuffixStream. This occurs when qDup detects the prompt at the end of the stream from the remote shell. qDup uses a callback with a small delay because, if additional characters arrive from the remote stream, the prompt was not the end of the output from the current command. This delay catches cases like sh: echo "$PS1" && doSomething.sh. The callback also runs Sh.postRun(), which calls AbstractShell.shSync to capture the exit code. The shSync call is what uses blockingSemaphore to ensure only one call to the remote shell at a time.
The sleep command pushes a task into the queue to pass execution on to the next command in the script after the desired delay. This is done through the Context.
The timer on a command (usually on an sh) queues a task in the pool with the specified wait duration.
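To make the first item concrete, here is a rough sketch of a prompt callback that waits out a small delay and is cancelled when more output arrives. It is not the SuffixStream implementation: DelayedPromptCallback, onOutput, the "$ " prompt and the 200 ms delay are all assumptions chosen for the example. The sleep and timer cases would use the same pool.schedule(...) call, just without the cancellation.

```java
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the delayed prompt callback; not qDup's SuffixStream.
class DelayedPromptCallback {
    private final ScheduledThreadPoolExecutor pool;
    private final String prompt;
    private final long delayMs;
    private final Runnable onPrompt;
    private ScheduledFuture<?> pending; // callback currently waiting out the delay, if any

    DelayedPromptCallback(ScheduledThreadPoolExecutor pool, String prompt,
                          long delayMs, Runnable onPrompt) {
        this.pool = pool;
        this.prompt = prompt;
        this.delayMs = delayMs;
        this.onPrompt = onPrompt;
    }

    // Called for each chunk of output read from the remote shell.
    synchronized void onOutput(String chunk) {
        if (pending != null) {
            // More characters arrived, so the earlier match was not the end of the output.
            pending.cancel(false);
            pending = null;
        }
        if (chunk.endsWith(prompt)) {
            pending = pool.schedule(onPrompt, delayMs, TimeUnit.MILLISECONDS);
        }
    }

    // Feeds a prompt echoed mid-output, then the real trailing prompt.
    // Returns how many times the callback fired.
    public static int demo() {
        ScheduledThreadPoolExecutor pool = new ScheduledThreadPoolExecutor(1);
        AtomicInteger fired = new AtomicInteger();
        DelayedPromptCallback callback =
                new DelayedPromptCallback(pool, "$ ", 200, fired::incrementAndGet);

        callback.onOutput("$ ");                         // prompt printed by echo "$PS1"
        callback.onOutput("\nrunning doSomething.sh\n"); // cancels the pending callback
        callback.onOutput("done\n$ ");                   // the real prompt at the end

        try {
            Thread.sleep(600); // give the last scheduled callback time to fire
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        pool.shutdownNow();
        return fired.get();
    }

    public static void main(String[] args) {
        System.out.println("prompt callbacks fired: " + demo());
    }
}
```

Each of these scheduled tasks occupies a pool thread while it runs, so anything blocking inside the callback (such as a shSync call waiting on blockingSemaphore) ties up a thread that the other two task sources also depend on.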
I think what you've uncovered is an error in the design. The callback from the suffix stream calling shSync means we need 2 free threads in the pool to complete the callback: one to run the callback with shSync, and a second for the other callback that identifies the prompt at the end of the shSync command.
I think we need to move the core callback that identifies the prompt at the end of output into a separate thread pool and then execute the postRun portion of the callback in the pool with the user commands.
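A minimal sketch of that split, under some assumptions: the class SeparatePromptPoolSketch, the method completes, and the single-thread promptPool are hypothetical, and the sketch assumes the release of blockingSemaphore happens as part of prompt detection. It is meant to show the shape of the idea, not a patch for qDup.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the two-pool idea; names are illustrative only.
class SeparatePromptPoolSketch {

    // Returns true if every blocked command task and the postRun stand-in finish.
    public static boolean completes(int commandPoolSize) {
        ScheduledThreadPoolExecutor commandPool = new ScheduledThreadPoolExecutor(commandPoolSize);
        // Dedicated pool: prompt detection never competes with blocked command tasks.
        ScheduledExecutorService promptPool = Executors.newSingleThreadScheduledExecutor();
        Semaphore blockingSemaphore = new Semaphore(0);
        CountDownLatch done = new CountDownLatch(commandPoolSize + 1);

        // Fill the whole command pool with tasks blocked in acquire().
        for (int i = 0; i < commandPoolSize; i++) {
            commandPool.execute(() -> {
                try {
                    blockingSemaphore.acquire();
                    done.countDown();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        // Prompt detection runs on its own thread, releases the waiters,
        // then hands the postRun portion back to the command pool.
        promptPool.schedule(() -> {
            blockingSemaphore.release(commandPoolSize);
            commandPool.execute(done::countDown); // stand-in for postRun
        }, 50, TimeUnit.MILLISECONDS);

        boolean finished = false;
        try {
            finished = done.await(2, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        commandPool.shutdownNow();
        promptPool.shutdownNow();
        return finished;
    }

    public static void main(String[] args) {
        System.out.println("pool of 2 completes: " + completes(2));
    }
}
```

Because the release no longer waits for a command-pool thread, the command pool can be completely full of blocked tasks and still make progress, regardless of its core size.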
Does my rambling make sense or should I go back to staring at the code some more?
The application utilizes a Semaphore, referred to as blockingSemaphore, to control a shared resource (AbstractShell.java).
The blockingSemaphore.acquire() and blockingSemaphore.release() calls both run on threads created by the ScheduledThreadPoolExecutor mentioned above. A situation like the following will hang if the corePoolSize of the ScheduledThreadPoolExecutor is equal to 2.