How best to handle GC thrashing tests?

This is an extreme case of being very abusive towards the JVM and is not Gradle’s doing.

I’m working on a rewrite of Guava’s Cache, code named Caffeine. Due to the large number of configuration options, the tests are parameterized to obtain full coverage. There are over 1 million test executions and growing.

A “feature” of the cache is soft and weak references. This may look attractive at first but can quickly become problematic (even Gradle adopted them until the GC thrashing became apparent). These types of references typically require a major (full) garbage collection cycle to eliminate. Soft references litter the heap, causing repeated GC pressure and reducing performance, which is the exact opposite of the user’s intended behavior.

This combination of high test count and reference caching requires a large JVM heap (1 GB) and the G1 collector to perform well. This exceeds the quota on TravisCI, which kills the process with kill -9 as abusive. Profiling shows that the tests are very GC-able, retaining minimal live objects, but shrinking the heap to fit TravisCI causes too much GC thrashing or out of memory errors.

Using ‘forkMode’ does not help because it forks by test class, rather than test method. The only solution that I think might work is to run multiple Gradle test tasks, passing a parameter for whether to use reference permutations. In combination with ‘forkMode’ this might keep each JVM instance small enough for TravisCI.

Question: Is there a better alternative approach? If not, do you have a quick example of multiple test tasks chained together before I dive in to figure that out myself?

Thanks!

Multiple test tasks will solve your problem. That approach has the following drawbacks though:

1. Awkward to set the appropriate test context from the IDE
2. Enforces context boxes, as opposed to each test declaring its dimensions
3. More complexity in the build

Could you create the references through some factory that allows you to clear all the references at the end of the test?

I do that already, because TestNG holds onto all test parameters for reporting (bug). This was causing a memory leak before I started testing reference caching. A lazy stream generates all test parameters and a listener ensures that they are garbage collectable after execution.

Unfortunately this doesn’t help much with soft references, which require a major GC before they are discarded and so force the heap to fill first. TravisCI seems to kill the process before that happens, once the JVM exceeds some unknown memory limit.

Because this is only an issue on TravisCI, the IDE and local development don’t need any of these complexities. My current hack is trying to parameterize the build for multiple executions by passing system properties to enable soft/weak combinations.

So you are explicitly clear()'ing the soft references you create during the test?

Parameterising the build is the wrong way to think about it. You don’t want to have to invoke the build multiple times. You can instead create multiple test tasks that use the same classpath/testsuite as the main ‘test’ task, but parameterise somehow (e.g. pass system properties to test JVM, exclude certain tests etc.). You’d then create a lifecycle task which depends on all these extra test tasks.

There’s some info on extra test tasks here: https://blog.safaribooksonline.com/2013/08/22/gradle-test-organization/
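
The extra-test-task approach described above can be sketched roughly as follows. This is only an illustration, assuming a recent Gradle with the java plugin applied and a TestNG suite. The task names, the list of modes, and the 'cache.referenceMode' system property are hypothetical and not taken from the Caffeine build; the test parameter generator would have to read that property itself.

```groovy
// Hypothetical reference modes; one forked test JVM per mode.
def referenceModes = ['strong', 'weak', 'soft']

def referenceTestTasks = referenceModes.collect { mode ->
  tasks.register("${mode}ReferenceTest", Test) {
    description = "Runs the test suite with ${mode} reference permutations."
    group = 'verification'

    // Reuse the compiled test classes and classpath of the main 'test' task.
    testClassesDirs = sourceSets.test.output.classesDirs
    classpath = sourceSets.test.runtimeClasspath
    useTestNG()

    // Hypothetical property that the parameter generator would check
    // to decide which permutations to emit.
    systemProperty 'cache.referenceMode', mode

    // Keep each forked JVM small enough for the CI memory quota.
    jvmArgs '-Xmx384m', '-XX:+UseG1GC'
  }
}

// Lifecycle task that depends on all of the extra test tasks.
tasks.register('referenceTests') {
  description = 'Runs every reference-mode test task.'
  group = 'verification'
  dependsOn referenceTestTasks
}

// Make a single './gradlew check' run everything.
tasks.named('check') {
  dependsOn 'referenceTests'
}
```

With something like this, a single ./gradlew check invocation covers every mode, and each mode gets its own forked test JVM rather than sharing one heap.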

Yes. All test data is passed in as parameters and explicitly made GC-able after execution. Profiling shows very little memory is used after a garbage collection cycle.

I agree, multiple build executions are wrong. I’ll move over to a lifecycle approach once I have it passing in its hacked form. So far that’s not working because TravisCI kills the process regardless of the memory settings or number of threads used. Unfortunately the hidden thresholds TravisCI uses make it difficult to debug.

I think one of my problems is that I tried passing in the JVM settings as

JAVA_OPTS="-Xmx384m -XX:+UseG1GC" ./gradlew check

That doesn’t appear to be honored by the test runner, which is a forked process. When I set the jvmArgs for forked processes it seems to be reliable.

tasks.withType(JavaForkOptions) {
  jvmArgs '-Xmx384m', '-XX:+UseG1GC', '-XX:SoftRefLRUPolicyMSPerMB=250'
}

That may be the root cause of TravisCI killing the process instead of letting the GC do its job, in which case none of the other workarounds are needed.

That will do it.

‘tasks.withType(JavaForkOptions)’ is a bit aggressive. I’d use ‘tasks.withType(Test)’.
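
For comparison, here is the same configuration narrowed to test tasks, using the flags from the snippet earlier in the thread. Selecting by JavaForkOptions would also match other forking tasks such as JavaExec, whereas this only touches Test tasks.

```groovy
// Apply the JVM flags only to forked test JVMs.
tasks.withType(Test) {
  jvmArgs '-Xmx384m', '-XX:+UseG1GC', '-XX:SoftRefLRUPolicyMSPerMB=250'
}
```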

Okay, this is working beautifully now. I still had to separate the referent types into their own tasks to not overwork the GC. The book example didn’t work out, but I was able to adapt the similar integTest from spring-io/sagan that a Google search turned up.

Thanks!

Glad it worked out. Looking forward to seeing Caffeine develop.