Partitioning tests across multiple machines

I’m using Gradle with the Java plugin and TestNG for a project that has a lot of tests.

I want to run builds on multiple machines simultaneously, and have the tests partitioned across multiple nodes. Each node has the TOTAL_NODES and NODE_NUMBER environment variables set to tell it how many nodes there are and which node this particular worker is.

Is there an automatic mechanism for doing this that I haven’t noticed?

Assuming there isn’t a built-in function for this, it seems like I could do it by taking the list of all test classes, partitioning them according to some hash value, and then adding include filters for the tests that fall into the appropriate bin. I’m not sure what the best way of getting the names of the tests or test classes is, though. Does the test plugin provide a way to query that, or should I just crawl the file system?
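To make the hashing idea concrete, here is a minimal sketch in plain Java of assigning class names to nodes. The class and method names (ShardAssigner, shardFor, selectForNode) are made up for illustration, and it assumes NODE_NUMBER is zero-based, which the post does not say. String.hashCode() is specified by the Java language, so every machine should compute the same bin for the same name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ShardAssigner {
    // Maps a fully qualified class name to a shard index in [0, totalNodes).
    // floorMod keeps the result non-negative even when hashCode() is negative.
    static int shardFor(String className, int totalNodes) {
        return Math.floorMod(className.hashCode(), totalNodes);
    }

    // Returns only the class names that belong to the given node.
    static List<String> selectForNode(List<String> classNames, int totalNodes, int nodeNumber) {
        List<String> mine = new ArrayList<>();
        for (String name : classNames) {
            if (shardFor(name, totalNodes) == nodeNumber) {
                mine.add(name);
            }
        }
        return mine;
    }

    public static void main(String[] args) {
        // Assumption: NODE_NUMBER counts from 0; subtract 1 if your CI counts from 1.
        int totalNodes = Integer.parseInt(System.getenv().getOrDefault("TOTAL_NODES", "1"));
        int nodeNumber = Integer.parseInt(System.getenv().getOrDefault("NODE_NUMBER", "0"));

        // Hypothetical class names, standing in for the real list of test classes.
        List<String> all = Arrays.asList(
                "com.example.FooTest", "com.example.BarTest", "com.example.BazTest");

        for (String name : selectForNode(all, totalNodes, nodeNumber)) {
            System.out.println(name);
        }
    }
}
```

Hashing by name needs no coordination between nodes, but the bins are not guaranteed to be the same size or to take the same time to run.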

Alternatively, is there a way to use a TestListener to skip a test?

The use case you describe is definitely something we want to support in the future. It would probably fall into the category of parallel and distributed build execution. At the moment, Gradle does not support this functionality. Are you already using parallel test execution to at least get the most out of the hardware resources on the machine executing your tests?

I cannot think of a straightforward way to implement this scenario with the capabilities we have in place right now. Dividing up the test cases by hash would be a way to shard the tests. The more complex problem is to dish out the test sets to remote machines and to collect/aggregate the test results.

Thanks for the response. We can’t make much use of parallel test execution in our CI environment because each VM runs with a single processor and limited memory. The model is to shard between VMs instead of parallelizing within a single VM. When running locally I do make use of the existing parallel test running, but on our current CI infrastructure it doesn’t really work.

Do you have any suggestions for how to perform the sharding-by-hashing step? I’m stuck at how to disable individual tests from within a Gradle build. Our testing infrastructure will do the worker and environment setup, so Gradle doesn’t have to know how to do that. Likewise, aggregating at the end isn’t so important. The whole build will fail if any one of the shards fails, and it’s not too bad to spelunk through manually or rerun a non-parallel build afterwards to debug anything.
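One possible shape for the "crawl the file system" route, sketched in plain Java rather than as a tested Gradle recipe: walk the test source directory, keep the files whose hash lands on this node, and turn each into a path pattern such as com/example/FooTest.class. Patterns of that form are the kind the Test task's include filters accept. The names here (TestClassScanner, includePatterns, the src/test/java default) and the "files end in Test.java" convention are assumptions for illustration, as is a zero-based NODE_NUMBER.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class TestClassScanner {
    // True when the hash of the relative path falls into this node's bin.
    static boolean belongsToNode(String relativePath, int totalNodes, int nodeNumber) {
        return Math.floorMod(relativePath.hashCode(), totalNodes) == nodeNumber;
    }

    // "com/example/FooTest.java" becomes "com/example/FooTest.class".
    static String toIncludePattern(String relativeSourcePath) {
        int stem = relativeSourcePath.length() - ".java".length();
        return relativeSourcePath.substring(0, stem) + ".class";
    }

    // Walks the source root and returns the include patterns for one node.
    static List<String> includePatterns(Path root, int totalNodes, int nodeNumber) {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                    .filter(Files::isRegularFile)
                    // Normalize separators so every OS hashes the same string.
                    .map(p -> root.relativize(p).toString().replace('\\', '/'))
                    // Assumption: test classes follow a *Test.java naming convention.
                    .filter(rel -> rel.endsWith("Test.java"))
                    .filter(rel -> belongsToNode(rel, totalNodes, nodeNumber))
                    .map(TestClassScanner::toIncludePattern)
                    .sorted()
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path root = Paths.get(args.length > 0 ? args[0] : "src/test/java");
        int totalNodes = Integer.parseInt(System.getenv().getOrDefault("TOTAL_NODES", "1"));
        int nodeNumber = Integer.parseInt(System.getenv().getOrDefault("NODE_NUMBER", "0"));
        includePatterns(root, totalNodes, nodeNumber).forEach(System.out::println);
    }
}
```

The same filtering logic could presumably live directly in the build script, where the resulting patterns would be passed to the test task's include filter; that wiring is not shown here.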