How to run a Java tool with the Worker API

Hello! I’m an author of spotbugs-gradle-plugin, which runs SpotBugs on projects. The tool consumes CPU and memory heavily and has singletons, so I am motivated to run it in a separate JVM.

Today I want to get help from the community to find better ways to run a tool (process) with the Worker API.

Previously I ran the tool via project.javaexec(), and recently I introduced the Worker API with process isolation. The worker certainly improves build performance, but the workers keep holding about 1 GiB of Java heap even after the build ends.
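For reference, here is a minimal sketch of how such a process-isolated submission looks. The names (AnalysisTask, AnalysisAction, AnalysisParameters) and the 1g heap size are illustrative, not the actual plugin code:

```java
import javax.inject.Inject;

import org.gradle.api.DefaultTask;
import org.gradle.api.file.ConfigurableFileCollection;
import org.gradle.api.tasks.TaskAction;
import org.gradle.workers.WorkAction;
import org.gradle.workers.WorkParameters;
import org.gradle.workers.WorkQueue;
import org.gradle.workers.WorkerExecutor;

// Illustrative parameter type; the real plugin passes more than a classpath.
interface AnalysisParameters extends WorkParameters {
    ConfigurableFileCollection getAnalysisClasspath();
}

// Illustrative work action that runs the tool inside the forked worker process.
abstract class AnalysisAction implements WorkAction<AnalysisParameters> {
    @Override
    public void execute() {
        // invoke the tool against getParameters().getAnalysisClasspath() here
    }
}

public abstract class AnalysisTask extends DefaultTask {

    @Inject
    protected abstract WorkerExecutor getWorkerExecutor();

    @TaskAction
    public void analyze() {
        // Process isolation forks a worker daemon; its heap stays reserved
        // by that daemon even after this build finishes.
        WorkQueue queue = getWorkerExecutor().processIsolation(spec ->
                spec.forkOptions(fork -> fork.setMaxHeapSize("1g")));
        queue.submit(AnalysisAction.class, params ->
                params.getAnalysisClasspath().from(/* tool inputs */));
    }
}
```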

I had several ideas to fix this issue, but none of them seems easy to apply :thinking: :

  1. Use classloader isolation instead. But it cannot set the Java heap size, so build performance could be worse. I’m also not sure the tool can run at this isolation level without trouble.
  2. Stop the worker process after the tool runs. But there is no public API for this: the keepAlive mode of the worker is DAEMON, not SESSION, and that class is in an “internal” package.
  3. Use a no-isolation worker and run project.javaexec in it. But on the worker side we cannot touch the Project instance; if we pass the instance via WorkParameters, it throws a ‘Could not serialize value of type DefaultProject’ error.

I’ve also read some similar tasks, but they seem to depend on internal APIs that we should avoid, e.g. DaemonJavaCompiler and DaemonScalaCompiler.

Please help me find a better solution. Thanks in advance! :slight_smile:

Obvious question, but is there any chance of refactoring to remove the singletons so you can have multiple instances in a single JVM?


Thanks for your response, Lance! :+1:

remove the singletons so you can have multiple instances in a single JVM

It’s technically possible but still challenging. And even after we achieve it, we still have the heap size problem noted in the first item: the tool instance in each worker consumes Java heap heavily (~1 GiB), and:

a. A worker with classloader-level isolation cannot have its own Java heap size set
b. Multiple workers use the same Java heap heavily at the same time, so GC runs frequently and memory-sensitive caches won’t work as designed
c. Workers keep holding huge amounts of memory even after the tool finishes

So it would be nice if we could run one JVM per tool instance. project.javaexec without a worker works, but its performance is not great. That is why I’m motivated to use the Worker API.

I don’t know whether the javaexec tactic is a good one, but if you want to try it, you actually can without using Project.

The new experimental configuration cache feature also greatly restricts where you can use Project: you are not allowed to use it at execution time, only at configuration time.

Because of that, several replacements for things like javaexec were introduced that are configuration-cache compatible, and I think they should also be compatible with the Worker API.

Here you can find a nice list of methods and their replacements: Configuration cache

In your case you would simply inject ExecOperations into your worker class and use ExecOperations#javaexec {} just like you used Project#javaexec {} before.
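Something roughly like this sketch (the parameter type, action name, and the SpotBugs main class are illustrative assumptions, and getMainClass() needs a reasonably recent Gradle version):

```java
import javax.inject.Inject;

import org.gradle.api.file.ConfigurableFileCollection;
import org.gradle.process.ExecOperations;
import org.gradle.workers.WorkAction;
import org.gradle.workers.WorkParameters;

// Illustrative parameters: only serializable/managed values, no Project reference.
interface ToolLaunchParameters extends WorkParameters {
    ConfigurableFileCollection getToolClasspath();
}

// Work action that forks the tool itself through the injected ExecOperations,
// so each run gets its own short-lived JVM instead of a long-lived worker daemon.
abstract class ToolLaunchAction implements WorkAction<ToolLaunchParameters> {

    private final ExecOperations execOperations;

    @Inject
    public ToolLaunchAction(ExecOperations execOperations) {
        this.execOperations = execOperations;
    }

    @Override
    public void execute() {
        execOperations.javaexec(spec -> {
            spec.getMainClass().set("edu.umd.cs.findbugs.FindBugs2"); // illustrative main class
            spec.setClasspath(getParameters().getToolClasspath());
            spec.setMaxHeapSize("1g");
        });
    }
}
```

Since ExecOperations is injected as a service, nothing from Project has to go through WorkParameters, which avoids the serialization error you mentioned.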


Thanks Vampire! :+1:

Based on the suggestion, I’ve implemented a PoC, but it seems that we cannot touch the stdout/stderr of the launched process. I’ll investigate how to resolve this issue and confirm how the performance has changed.
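For the record, one direction I plan to try is capturing the streams on the spec itself. This is only a sketch, assuming that standardOutput/errorOutput on JavaExecSpec behave the same through ExecOperations as through Project.javaexec (the main class is again illustrative, and classpath setup is omitted):

```java
import java.io.ByteArrayOutputStream;

import javax.inject.Inject;

import org.gradle.process.ExecOperations;
import org.gradle.process.ExecResult;
import org.gradle.workers.WorkAction;
import org.gradle.workers.WorkParameters;

// Sketch: capture the forked process's streams instead of relying on console forwarding.
abstract class CapturingToolLaunchAction implements WorkAction<WorkParameters.None> {

    @Inject
    protected abstract ExecOperations getExecOperations();

    @Override
    public void execute() {
        ByteArrayOutputStream stdout = new ByteArrayOutputStream();
        ByteArrayOutputStream stderr = new ByteArrayOutputStream();
        ExecResult result = getExecOperations().javaexec(spec -> {
            spec.getMainClass().set("edu.umd.cs.findbugs.FindBugs2"); // illustrative
            // classpath and other fork settings omitted for brevity
            spec.setStandardOutput(stdout);
            spec.setErrorOutput(stderr);
            spec.setIgnoreExitValue(true); // inspect the exit value ourselves below
        });
        if (result.getExitValue() != 0) {
            throw new RuntimeException("Analysis failed:\n" + stderr.toString());
        }
        System.out.print(stdout.toString());
    }
}
```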