Buildscript.classpath behaviour when folder is missing


(Vadim Chekan) #1

Hi all.

I ran into unintuitive Gradle behavior while writing a task that deploys a YARN application to a Hadoop cluster. Hadoop requires its configuration files to be on the classpath, and my deployment script is implemented not as a subproject but as a Gradle task itself. So I added the folder to the buildscript’s classpath.

buildscript {
    dependencies {
        classpath files("build/yarn-conf")
        ...
    }
}

The problem is that this folder is created later by one of the tasks, so it does not exist when Gradle starts. In that case the classpath entry is silently ignored (classloader.getResource('yarn-site.xml') returns null). But if the folder already exists, say from a previous build, it works just fine (the classloader finds the resource).

I worked around this by moving my YARN/Hadoop tasks into another Gradle file and invoking it through a GradleBuild task. The nested build evaluates its classpath at a point when the config folder already exists, which is why it works. But I would like to understand why the classpath behaves like a file system snapshot taken at evaluation time. I would have expected “classpath files(…)” to simply add a URL to a collection, with every URL consulted each time classloader.getResource() is called. Instead, it seems some validation is performed, and if the path does not exist at the moment it is added, it is not added at all? Any insights on how the classpath works would be greatly appreciated.


(Stefan Oehme) #2

The buildscript is what builds your code. It cannot depend on the code it is building. That would be a chicken-and-egg situation.

Instead, use a configuration to hold the classpath for the tool you want and instantiate your own URLClassLoader in the task’s implementation.
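
A minimal sketch of what this suggestion could look like, not a definitive implementation. The configuration name yarnTool and the commented-out prepareYarnConf task are hypothetical names chosen for illustration; only the build/yarn-conf folder and yarn-site.xml come from the original question. The configuration is resolved inside doLast, so the folder is only looked at when the task runs, after it has been created.

```groovy
configurations {
    yarnTool   // hypothetical name: holds the classpath for the deployment tool
}

dependencies {
    // The folder does not need to exist while the build script is evaluated.
    yarnTool files("$buildDir/yarn-conf")
}

task startYarnJob {
    // dependsOn 'prepareYarnConf'   // hypothetical task that creates build/yarn-conf
    doLast {
        // Resolved here, at execution time, when the folder is assumed to exist.
        URL[] urls = configurations.yarnTool.files.collect { it.toURI().toURL() } as URL[]
        def loader = new URLClassLoader(urls, getClass().classLoader)
        try {
            println loader.getResource('yarn-site.xml')
        } finally {
            loader.close()
        }
    }
}
```

The parent loader is set to the build script's own loader here so that classes already visible to the script stay visible; whether that is the right parent depends on where the tool's classes live.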


(Vadim Chekan) #3

Thank you Stefan,

Using my own class loader is a much cleaner way. I ended up with this code:

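def startYarnJob(samzaConfigName) {
    // build/yarn-conf must already exist here: File.toURI() only appends the trailing
    // slash that URLClassLoader needs to treat the URL as a directory when it does.
    def url = new File("${buildDir}/yarn-conf").toURI().toURL()
    def oldLoader = Thread.currentThread().contextClassLoader
    // Expose the config folder through the context class loader while the job runs.
    Thread.currentThread().setContextClassLoader(new URLClassLoader([url] as URL[], oldLoader))
    try {
        ...
        org.apache.samza.job.JobRunner.main(params)
    } finally {
        // Restore the original loader even if the job runner throws.
        Thread.currentThread().setContextClassLoader(oldLoader)
    }
}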

(Stefan Oehme) #4

Hey Vadim,

If you want to run a main class, you can use a JavaExec task to make your code much cleaner still. You won’t need any classloader magic, as it will run in an isolated process.
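
A rough sketch of the JavaExec variant, under stated assumptions: the configuration name samzaRuntime is hypothetical, and the dependency coordinates and program arguments are deliberately left out because they are not given in this thread. The main class and the build/yarn-conf folder are taken from the code above.

```groovy
configurations {
    samzaRuntime   // hypothetical name: the jars that JobRunner needs at runtime
}

dependencies {
    // Add the tool's own dependencies to samzaRuntime here (coordinates not shown in this thread).
}

task startYarnJob(type: JavaExec) {
    // mainClass is available from Gradle 6.4; older versions use `main = '...'` instead.
    mainClass = 'org.apache.samza.job.JobRunner'
    // The config folder is placed on the classpath of the forked JVM,
    // so it only has to exist when the task runs.
    classpath = files("$buildDir/yarn-conf") + configurations.samzaRuntime
    // args ...   // the arguments previously passed to JobRunner.main as params
}
```

Because the job runs in a forked JVM, the build's own class loaders and the thread context class loader are left untouched.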

Cheers,
Stefan