Should tasks handle their own parallelization

You can write tasks in one of two ways: serial or parallel. My question is: should a task, within its internals, try to consume all of the available resources, stepping on Gradle's toes in the process? Or should it stay serial and depend on Gradle (with --parallel) to consume them?

A concrete (pseudocode) example of what I’m going for is:

task doMagic << {
    fileTree('src').each { doLongProcess(it) }
}
task doOtherMagic << {
    def pool = Executors.newFixedThreadPool(Runtime.runtime.availableProcessors())
    fileTree('src').each { file -> pool.submit({ doLongProcess(file) } as Callable) }
    pool.shutdown()
    pool.awaitTermination(1, TimeUnit.HOURS)
}

The former seems like it’ll be a nicer gradle --parallel citizen, in that it won’t eat resources that might have been allocated to another task, but the latter will be faster (on multicore machines) if it’s the only task being run. I’m pretty sure the answer is task-dependent, but I was wondering whether there are any practical benefits to the former, or whether the latter is considered bad practice.

Today --parallel means “try to run tasks in parallel that exist in different projects”. As part of our roadmap we’d like this to extend to tasks in the same project. Eventually, we’d like for operations inside a task to be automatically parallel (if appropriate) and safe.

What you’re asking about is a lot like what we’ve added for parallel native compilation in 2.4. It’s enabled by default (so it doesn’t check for --parallel) and it shares resources as a “good citizen”. The way it does this is currently an internal API and we’re going to be making changes to get all of the parallel work to use these primitives. It looks a lot like an ExecutorService.

Since there’s not a standard way of sharing resources in Gradle right now, I’d recommend making your tasks serial by default, and if there’s a lot to gain by doing something in parallel inside the task, make it opt-in. Just keep in mind that if someone decides to use --parallel and opts in to intra-task parallelization, you might do something surprising (e.g., a “parallel task” in 10 projects, each spawning 4 workers, means 40 worker threads total with --parallel).
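A minimal sketch of that opt-in pattern in plain Java (not Gradle’s API; the doMagic.parallel property name and the doLongProcess stand-in are hypothetical): work runs serially by default, and even when the flag is set, the pool is bounded by the core count rather than unbounded.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;

public class OptInParallel {
    static final AtomicInteger processed = new AtomicInteger();

    // Stand-in for the real per-file work.
    static void doLongProcess(String file) {
        processed.incrementAndGet();
    }

    static void processAll(List<String> files) throws InterruptedException {
        // Hypothetical opt-in flag: stay serial unless -DdoMagic.parallel=true.
        if (!Boolean.getBoolean("doMagic.parallel")) {
            files.forEach(OptInParallel::doLongProcess);
            return;
        }
        // Bound the pool by core count so one task can't spawn unbounded threads.
        ExecutorService pool = Executors.newFixedThreadPool(
                Runtime.getRuntime().availableProcessors());
        try {
            // invokeAll blocks until every file has been processed.
            pool.invokeAll(files.stream()
                    .map(f -> (Callable<Void>) () -> { doLongProcess(f); return null; })
                    .collect(Collectors.toList()));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        processAll(List.of("a.txt", "b.txt", "c.txt"));
        System.out.println(processed.get()); // prints 3
    }
}
```

Either branch leaves the observable result identical; only wall-clock time differs, which is what makes the serial default a safe fallback.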

Another thing that might help for a slow task is to make it an incremental task, so that when it needs to rebuild, it rebuilds only the minimum necessary.

