PermGen leak in DefaultIsolatedAntBuilder/MutableURLClassLoader

I have a Gradle plugin that needs to run an Ant build with a certain classpath. I can use the internal DefaultIsolatedAntBuilder class to do that without causing a classpath conflict with the Gradle runtime classpath. However, that in turn causes a java.lang.OutOfMemoryError: PermGen space error once I have executed the code on the same Gradle daemon instance around 30-50 times.

Here’s a buildfile you can use to reproduce the issue:

// Enable the Gradle daemon and run this until failure:
// while gradle --rerun-tasks; do :; done

import org.gradle.api.internal.ClassPathRegistry
import org.gradle.api.internal.DefaultClassPathProvider
import org.gradle.api.internal.DefaultClassPathRegistry
import org.gradle.api.internal.classpath.DefaultModuleRegistry
import org.gradle.api.internal.classpath.ModuleRegistry
import org.gradle.api.internal.project.antbuilder.DefaultIsolatedAntBuilder
import org.gradle.internal.classloader.DefaultClassLoaderFactory

ModuleRegistry moduleRegistry = new DefaultModuleRegistry()
ClassPathRegistry registry = new DefaultClassPathRegistry(new DefaultClassPathProvider(moduleRegistry))
DefaultIsolatedAntBuilder antBuilder = new DefaultIsolatedAntBuilder(registry, new DefaultClassLoaderFactory())

defaultTasks 'test'

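// File.toURL() is deprecated and doesn't escape special characters; go through a URI instead.
URL url = new File('/path/to/any/jar/file.jar').toURI().toURL()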

antBuilder.execute {
    URLClassLoader classLoader = antProject.getClass().classLoader

    if (!classLoader.getURLs().contains(url)) {
        classLoader.addURL(url)
    }

    println classLoader.getURLs()*.toString()
}

task test {
    println 'hello world'
}

Gradle version:

~ λ : gradle --version

------------------------------------------------------------
Gradle 2.11
------------------------------------------------------------

Build time:   2016-02-08 07:59:16 UTC
Build number: none
Revision:     584db1c7c90bdd1de1d1c4c51271c665bfcba978

Groovy:       2.4.4
Ant:          Apache Ant(TM) version 1.9.3 compiled on December 23 2013
JVM:          1.7.0_45 (Oracle Corporation 24.45-b08)
OS:           Mac OS X 10.11.4 x86_64

https://github.com/asciidoctor/asciidoctor-gradle-plugin/issues/61#issuecomment-29147205 sounds like it might be related.

Hi Eero, I can try to help you since I introduced that IsolatedAntBuilder to you in the other thread.

I don’t think this is a bug in Gradle.
Gradle’s internal classes aren’t “supported”, so in general it’s not recommended to use them. In this case, please take a look at how Gradle uses IsolatedAntBuilder in the Checkstyle task.

Is there some reason for you to manipulate the classloader directly? I’d recommend passing the classpath to the withClasspath method, since that’s how it should be used.
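For reference, that usage pattern looks roughly like the sketch below. It assumes the builder can be looked up from the project's internal service registry with services.get(IsolatedAntBuilder). The task name and the JAR path are hypothetical placeholders. Like everything under org.gradle.api.internal, this may change without notice.

```groovy
import org.gradle.api.internal.project.IsolatedAntBuilder

task isolatedEcho {
    doLast {
        // Assumption: the (internal) service registry exposes an IsolatedAntBuilder.
        IsolatedAntBuilder antBuilder = project.services.get(IsolatedAntBuilder)

        // Hypothetical JAR path; replace it with the libraries the Ant tasks need.
        FileCollection antClasspath = files('/path/to/any/jar/file.jar')

        // The closure is executed against an Ant project that is isolated from
        // the Gradle runtime. The entries passed to withClasspath are intended
        // for use by taskdef/typedef calls made inside the closure.
        antBuilder.withClasspath(antClasspath).execute {
            echo(message: 'hello world')
        }
    }
}
```

Because the builder comes from Gradle itself, its lifecycle (including the cleanup done in stop) stays in Gradle's hands rather than in the build script.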

Java seems to require the -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled JVM args to enable class GC. However, there are quite a few Groovy bugs that leave references around and prevent the GC from unloading classes. Gradle has some workarounds in DefaultIsolatedAntBuilder to get around those issues. These workarounds are executed in the stop method, which is another reason why the classloader shouldn’t be manipulated directly when using this class.

I hope this helps. I assume you are seeing some problems in your dita-ot-gradle plugin?

Hi Lari,

Thanks for the help again! Really appreciate it.

Gradle’s internal classes aren’t “supported”, so in general it’s not recommended to use them. In this case, please take a look at how Gradle uses IsolatedAntBuilder in the Checkstyle task.

I appreciate that. However, since adding anything into the (non-isolated) Ant classloader directly potentially causes a conflict with the Gradle runtime, I see no other option than using the internal Gradle classes.

Is there some reason for you to manipulate the classloader directly? I’d recommend passing the classpath to the withClasspath method, since that’s how it should be used.

The reason is that it doesn’t work. Using withClasspath doesn’t seem to actually make those classes available. There’s a comment in IsolatedAntBuilder.java which seems to suggest that it only makes the classes available to the Ant taskdef and typedef tasks.

However, I’m not using those tasks — I need the libraries available for the whole Ant process.

I hope this helps. I assume you are seeing some problems in your dita-ot-gradle plugin?

Yeah, it’s pretty much either try to get this approach to work or give up, since I see no other way to add libraries into the Ant classpath than using the internal classes when running it via Gradle. If you have any suggestions, I’d be happy to hear about them.

(By the way, since PermGen was removed in Java 8, this issue likely mostly affects Java 7 and earlier, but I suppose that doesn’t mean the memory leak goes away altogether.)
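One way to check whether class metadata keeps growing between runs, on either Java version, is to read the JVM's memory pools through java.lang.management. The snippet below is only a sketch: the helper name classMetadataUsage is made up for illustration, and the pool names it matches ("Perm Gen" variants, "Metaspace") are the ones HotSpot uses, so other JVMs may report nothing.

```groovy
import java.lang.management.ManagementFactory

// Hypothetical helper: maps pool name -> used bytes for the pools that hold
// class metadata ("... Perm Gen" on Java 7 and earlier, "Metaspace" on Java 8+).
// Assumes HotSpot pool names.
Map<String, Long> classMetadataUsage() {
    def pools = ManagementFactory.memoryPoolMXBeans.findAll {
        it.name.contains('Perm Gen') || it.name.contains('Metaspace')
    }
    return pools.collectEntries { [(it.name): (it.usage?.used ?: 0L)] }
}

classMetadataUsage().each { name, used ->
    println "${name}: ${used.intdiv(1024)} KiB used"
}
println "Currently loaded classes: ${ManagementFactory.classLoadingMXBean.loadedClassCount}"
```

Printing these numbers at the end of each build on the same daemon should show whether the loaded class count and the pool usage climb steadily from run to run.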

I guess one potential solution might be to somehow add the classes I need into the Ant classloader only when the Gradle daemon initially starts, but I’m not sure there’s a way to do that.

Yes, I agree. No one is going to stop you from using the internal classes. :) It’s just that we might change the API of those classes without any notice.

I assume you mean the thread context classloader when running the Ant task? There could be other ways to load classes, but I think that the typical way is to rely on thread context classloader.

I’d try to get the IsolatedAntBuilder.withClasspath working and try to change the thread context classloader while running the task. Could you try it out and possibly push your experiment to some branch so that I could try to help with it?

Exactly.

Yes I agree. No one is going to stop you from using the internal classes. It’s just that we might change the API of those classes without any notice.

Yep, I’ll just have to live with that and update my plugin accordingly. :)

I assume you mean the thread context classloader when running the Ant task? There could be other ways to load classes, but I think that the typical way is to rely on thread context classloader.

Yes, exactly — pardon the vagueness.

I’d try to get the IsolatedAntBuilder.withClasspath working and try to change the thread context classloader while running the task. Could you try it out and possibly push your experiment to some branch so that I could try to help with it?

Thanks for the suggestion — I’ll give that a try and post here when I’ve come up with something.

Many thanks again for the assistance!

FYI, the DefaultIsolatedAntBuilder does change the context classloader already.

Please push your experiment to some branch once you get that far, so that I can try to help get it working with IsolatedAntBuilder.withClasspath.

Well, I was wrong — the problem is that setting the thread context classloader isn’t enough to make the classes available. The classes specifically need to be in the Ant project classloader (or in the system classloader, I guess, but that’ll cause conflicts).

This buildfile illustrates the issue:

import org.apache.tools.ant.Project as AntProject

FileCollection classpath = fileTree(dir: 'dita-ot/src/main/lib').matching {
    include '**/*.jar'
}

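task test1 << {
    ClassLoader old = Thread.currentThread().getContextClassLoader()
    // URLClassLoader has no no-arg constructor, so the URLs are passed up front.
    // Using the old context classloader as the parent is an assumption.
    URL[] urls = classpath.collect { it.toURI().toURL() } as URL[]
    URLClassLoader classLoader = new URLClassLoader(urls, old)
    Thread.currentThread().setContextClassLoader(classLoader)

    try {
        ant.ant(antfile: 'dita-ot/src/main/build.xml') {
            property name: 'args.input', location: 'examples/simple/dita/root.ditamap'
            property name: 'transtype', value: 'html5'
        }
    } finally {
        // Restore the original context classloader even if the Ant build fails.
        Thread.currentThread().setContextClassLoader(old)
    }
}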

task test2 << {
    URLClassLoader classLoader = AntProject.class.getClassLoader()
    classpath.each { classLoader.addURL(it.toURI().toURL()) }

    ant.ant(antfile: 'dita-ot/src/main/build.xml') {
        property name: 'args.input', location: 'examples/simple/dita/root.ditamap'
        property name: 'transtype', value: 'html5'
    }
}

Here’s how you can try it out:

  1. Check out the repo for my plugin.
  2. Run git submodule update --init --remote.
  3. Run cd dita-ot && gradle and wait for the build to finish.
  4. Save the buildfile above as test.gradle in the root of the repo and run gradle -b test.gradle test1 --info and gradle -b test.gradle test2 --info. test1 doesn’t work, test2 does.
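The save-and-restore of the thread context classloader in test1 can be wrapped in a small helper so that the previous loader is always put back, even when the Ant build throws. This is a minimal sketch in plain Groovy; the helper name withContextClassLoader is made up for illustration, and the empty URLClassLoader only stands in for a real classpath.

```groovy
// Hypothetical helper: runs the closure with `loader` as the thread context
// classloader and restores the previous one afterwards, even on failure.
def withContextClassLoader(ClassLoader loader, Closure action) {
    Thread thread = Thread.currentThread()
    ClassLoader previous = thread.contextClassLoader
    thread.contextClassLoader = loader
    try {
        return action.call()
    } finally {
        thread.contextClassLoader = previous
    }
}

// Usage: an empty URLClassLoader is used here as a placeholder classpath.
def before = Thread.currentThread().contextClassLoader
def loader = new URLClassLoader(new URL[0], before)
def seen = withContextClassLoader(loader) {
    Thread.currentThread().contextClassLoader
}
println "Inside the closure the context classloader was swapped: ${seen.is(loader)}"
```

Note that this only tidies up the swap itself. It does not change the finding above that code which resolves classes through the Ant project classloader will not see libraries that are only reachable through the context classloader.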

Kinda running out of options here… http://enitsys.sourceforge.net/ant-classloadertask/ works if I run it via Ant, but if I run it using Gradle’s IsolatedAntBuilder and do something like this:

antBuilder.withClasspath([new File('ant-classloadertask.jar')]).execute {
    taskdef(resource: 'net/jtools/classloadertask/antlib.xml')

    classloader(loader: 'project') {
        classpath {
            pathelement(path: 'dita-ot/lib/dost.jar')
        }
    }
}

Gradle complains that:

> classloader doesn't support the "loader" attribute

Even though it actually does. I guess I could add a build.xml file into my plugin that uses the Ant classloader task and run that instead of using the Gradle DSL…

I’ve pretty much reached an impasse. If I add libraries into the Ant project classpath outside IsolatedAntBuilder, there are classpath conflicts, and if I do it inside IsolatedAntBuilder, there’s a prohibitive memory leak (the PermGen error occurs after about 20 executions on Java 7 for me).

By the way, as far as I can tell, DefaultIsolatedAntBuilder leaks memory even if I don’t add anything into the Ant project classpath. Take this buildfile:

import org.gradle.api.internal.ClassPathRegistry
import org.gradle.api.internal.DefaultClassPathProvider
import org.gradle.api.internal.DefaultClassPathRegistry
import org.gradle.api.internal.classpath.DefaultModuleRegistry
import org.gradle.api.internal.classpath.ModuleRegistry
import org.gradle.api.internal.project.antbuilder.DefaultIsolatedAntBuilder
import org.gradle.internal.classloader.DefaultClassLoaderFactory

ModuleRegistry moduleRegistry = new DefaultModuleRegistry()
ClassPathRegistry registry = new DefaultClassPathRegistry(new DefaultClassPathProvider(moduleRegistry))
DefaultIsolatedAntBuilder antBuilder = new DefaultIsolatedAntBuilder(registry, new DefaultClassLoaderFactory())

task test1 {
    antBuilder.execute {
        echo 'hello world'
    }

    antBuilder.getClassLoaderCache().stop()
    antBuilder.stop()
}

task test2 {
    antBuilder.execute {
        echo 'hello world'
    }
}

Here’s a YourKit memory snapshot after just a few executions of test1:

It’s a similar story if I run test2.

I’m far from an expert in analyzing memory leaks, but I don’t know if there are supposed to be that many MutableURLClassLoader objects.

EDIT: I meant to say that I’m literally not an expert, just the opposite, so I might be completely off track here. :)

FWIW, I get a PermGen error after ~55 runs on the same daemon by just using withClasspath, too:

import org.gradle.api.internal.ClassPathRegistry
import org.gradle.api.internal.DefaultClassPathProvider
import org.gradle.api.internal.DefaultClassPathRegistry
import org.gradle.api.internal.classpath.DefaultModuleRegistry
import org.gradle.api.internal.classpath.ModuleRegistry
import org.gradle.api.internal.project.antbuilder.DefaultIsolatedAntBuilder
import org.gradle.internal.classloader.DefaultClassLoaderFactory

ModuleRegistry moduleRegistry = new DefaultModuleRegistry()
ClassPathRegistry registry = new DefaultClassPathRegistry(new DefaultClassPathProvider(moduleRegistry))
DefaultIsolatedAntBuilder antBuilder = new DefaultIsolatedAntBuilder(registry, new DefaultClassLoaderFactory())

task test << {
    // It doesn't seem to matter which JAR file I pass to withClasspath.
    antBuilder.withClasspath([new File('dita-ot/lib/xml-resolver-1.2.jar')]).execute {
        echo 'hello world'
    }
}

FWIW, the workaround I came up with for this issue was to construct my own instance of DefaultIsolatedAntBuilder and stick it into a thread-local variable.

I’m frankly not sure why exactly, but that lets me do multiple consecutive runs on the same daemon that reuse the same DefaultIsolatedAntBuilder instance without the memory leak or classpath conflicts crashing the daemon (except if I use the --parallel option, but DITA-OT is not thread-safe anyway, so that’s a bit of a non-starter). The performance is about the same as when I didn’t use DefaultIsolatedAntBuilder, too.

I do realize that thread-locals are kinda evil, though, so I very much welcome any suggestions to improve the code.
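For anyone curious about the shape of the workaround, here is a minimal sketch of the thread-local holder pattern in plain Groovy. It is not the plugin's actual code: ExpensiveBuilder is a hypothetical stand-in for the DefaultIsolatedAntBuilder instance, and BuilderHolder is a made-up name.

```groovy
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical stand-in for an object that is costly to create and that
// should be reused across builds, such as an isolated Ant builder.
class ExpensiveBuilder {
    static final AtomicInteger CREATED = new AtomicInteger()
    ExpensiveBuilder() { CREATED.incrementAndGet() }
}

// Keeps one builder per thread and creates it lazily on first access.
class BuilderHolder {
    private static final ThreadLocal<ExpensiveBuilder> HOLDER = new ThreadLocal<ExpensiveBuilder>()

    static ExpensiveBuilder get() {
        ExpensiveBuilder builder = HOLDER.get()
        if (builder == null) {
            builder = new ExpensiveBuilder()
            HOLDER.set(builder)
        }
        return builder
    }

    // Drops the current thread's instance so that it can be garbage collected.
    static void release() {
        HOLDER.remove()
    }
}

def first = BuilderHolder.get()
def second = BuilderHolder.get()
println "Same instance reused on this thread: ${first.is(second)}"
```

Each thread gets its own instance, which is the basic property of a thread-local. The release method is there because a thread-local that is never cleared keeps its value reachable for as long as the thread lives.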