Worker API and preventing deadlocks


(Alpár Török) #1

I am considering using the worker API to start Elasticsearch nodes for testing, and I have come up with the following theoretical scenario that I think could lead to deadlocks with --parallel:

Tasks T1 and T2 run in parallel because they are part of different projects.
Each task needs 2 nodes (to form a cluster to test against).
Let’s call these N1-1, N1-2, N2-1 and N2-2 where N1-* belongs to T1.
Suppose that T1 and T2 are implemented to use the worker API to start the nodes.
If this scenario runs with --max-workers=4, T1 and T2 could each occupy a worker and then each start one node of its own, say N1-1 and N2-1. That exhausts all four workers while each task still waits for one more worker to start its second node, so the build would deadlock.
This assumes that T1 and T2 each have an action that waits for its cluster to come online.
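
To make the scenario concrete, here is a toy model of it in plain Java, with a semaphore standing in for the --max-workers limit. It is only a sketch and does not touch Gradle: the class name WorkerStarvationSketch, the accounting of 3 slots per task (one for the task itself plus one per node), and the "reserve everything up front" variant are all assumptions made for illustration, not Gradle behaviour or Gradle API. A timeout is used so the stall is reported instead of hanging.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Toy model only: a Semaphore stands in for the --max-workers limit.
// Nothing here calls Gradle or claims to reproduce its real scheduling.
class WorkerStarvationSketch {

    // Assumption: each simulated task needs one slot for itself plus two nodes.
    static final int SLOTS_PER_TASK = 3;

    /**
     * Runs two simulated tasks (T1, T2) against a shared slot limit.
     * Returns true if both obtained all their slots, false if either stalled.
     */
    static boolean simulate(int maxWorkers, boolean reserveUpFront) throws Exception {
        if (maxWorkers < 4) {
            throw new IllegalArgumentException("this sketch models maxWorkers >= 4 only");
        }
        Semaphore slots = new Semaphore(maxWorkers);
        CountDownLatch firstNodesUp = new CountDownLatch(2);
        CountDownLatch attemptsDone = new CountDownLatch(2);
        Callable<Boolean> body = reserveUpFront
                ? () -> upFrontTask(slots)
                : () -> incrementalTask(slots, firstNodesUp, attemptsDone);
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Future<Boolean> t1 = pool.submit(body);
            Future<Boolean> t2 = pool.submit(body);
            boolean r1 = t1.get();
            boolean r2 = t2.get();
            return r1 && r2;
        } finally {
            pool.shutdownNow();
        }
    }

    // Takes slots one at a time, as in the scenario described in the post.
    private static boolean incrementalTask(Semaphore slots, CountDownLatch firstNodesUp,
                                           CountDownLatch attemptsDone) throws InterruptedException {
        slots.acquire();        // the task itself occupies a slot
        slots.acquire();        // its first node (N1-1 or N2-1) occupies another
        firstNodesUp.countDown();
        firstNodesUp.await();   // force the unlucky interleaving
        // Second node: give up after a timeout instead of blocking forever.
        boolean gotSecond = slots.tryAcquire(200, TimeUnit.MILLISECONDS);
        attemptsDone.countDown();
        attemptsDone.await();   // hold slots until both tasks have tried
        slots.release(gotSecond ? 3 : 2);
        return gotSecond;
    }

    // Hypothetical alternative: take all needed slots atomically, or wait.
    private static boolean upFrontTask(Semaphore slots) throws InterruptedException {
        if (!slots.tryAcquire(SLOTS_PER_TASK, 5, TimeUnit.SECONDS)) {
            return false;
        }
        try {
            Thread.sleep(50);   // stand-in for "cluster is up, run the tests"
            return true;
        } finally {
            slots.release(SLOTS_PER_TASK);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("incremental, 4 slots: " + simulate(4, false));
        System.out.println("up-front,    4 slots: " + simulate(4, true));
    }
}
```

In this model, the incremental variant stalls with 4 slots because both tasks hold two slots and wait for a third, while the up-front variant simply runs the two tasks one after the other. Whether Gradle's worker leases behave like either variant is exactly what I am asking.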

Is this scenario accounted for? Is there a way to specify how many workers a task needs, so that it is never started unless that many are available? Would a task that waits for a worker to complete deadlock the build with --max-workers=1? Is the worker API only suited to fire-and-forget use cases with no coordination involved?

Thanks!
Alpar