Use zipTree as source set for building and static compiling

Background

Project Alice generates Java source code, stores it in sources.jar, then uploads it to a Maven repository. Project Bob pulls sources.jar down and needs to use it when compiling. Bob does not know that Alice exists, only where to find sources.jar.

Versions: JDK 11, Gradle 7.3.1, IntelliJ IDEA 2021.3.1

Problem

Making Gradle (and IntelliJ IDEA) build using source files embedded in a JAR file. To be clear, the JAR file contents resemble:

$ jar -tvf sources.jar 
     0 Thu Feb 03 08:38:56 PST 2022 META-INF/
    52 Thu Feb 03 08:38:56 PST 2022 META-INF/MANIFEST.MF
     0 Thu Feb 03 08:38:30 PST 2022 com/
     0 Thu Feb 03 08:38:32 PST 2022 com/domain/
     0 Thu Feb 03 08:38:30 PST 2022 com/domain/package/
   938 Thu Feb 03 08:38:32 PST 2022 com/domain/package/SourceCode.java

Attempts

A number of approaches have failed.

sourceSets

Changing sourceSets doesn’t work:

sourceSets.main.java.srcDirs += "jar:file:${projectDir}/sources.jar!/"

The error is:

Cannot convert URL ‘jar:file:/home/user/dev/project/sources.jar!/’ to a file.

Using a zipTree with sourceSets doesn’t work, although the error message is telling:

sourceSets.main.java.srcDirs += zipTree(file: "${projectDir}/sources.jar")

Error:

Cannot convert the provided notation to a File or URI.

The following types/formats are supported:

  • A URI or URL instance.

This was expected. What was unexpected was that URL instances are allowed, but seemingly not if embedded within a JAR file.

The following allows building Bob, but the IDE is unable to find SourceCode.java:

sourceSets.main.java.srcDirs += zipTree("${projectDir}/sources.jar").matching {
  include "com"
}

build task

Modifying the build task to extract the generated code first partially works:

task codeGen {
  copy {
    from( zipTree( "sources.jar" ) )
    into( "build/gen/java" )
  }

  sourceSets.main.java.srcDirs += ["build/gen/java"]
}

build { doFirst { codeGen } }

The issue is that removing the build directory then prevents static compiles (because IDEA cannot find the generated source files). In any case, we don’t want to extract the source files because of all the knock-on problems.
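
One likely reason the snippet above only partially works: the copy { } block sits directly in the task's configuration closure, so the extraction runs whenever the build is configured, and build { doFirst { codeGen } } merely references the task without executing it. A minimal sketch of the same idea as a typed task follows. The task name and output directory are carried over from the attempt above, and the snippet is untested against this project. It still extracts files, so it does not address the clean or read-only concerns.

```groovy
plugins {
    id 'java'
}

// Sketch only: a typed Copy task does its work in the execution phase,
// not while the build script is being configured.
tasks.register('codeGen', Copy) {
    from zipTree("${projectDir}/sources.jar")
    into "${buildDir}/gen/java"
}

// Adding the task (rather than a path) as a source directory uses the
// task's output directory as the source root.
sourceSets.main.java.srcDir tasks.named('codeGen')

// Explicit ordering, so that extraction always precedes compilation.
tasks.named('compileJava') {
    dependsOn tasks.named('codeGen')
}
```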

compile task

The following snippet also does not compile:

tasks.withType(JavaCompile) {
  source = zipTree(file: "${projectDir}/sources.jar")
}

And because sourceSets is not updated, the source files inside the JAR file are not discoverable within the IDE.

Question

How would you instruct Gradle to reference and build source files that are stored in an external Java archive file when compiling a project?

I didn’t really understand why you don’t want to extract the source files first.

Create a Sync task that extracts the JAR, and add the output of that task to your sourceSets.

I’ve tested the following and it works:

val fromJarTask by tasks.creating(Sync::class) {
    from(zipTree("test1.jar"))
    into(layout.buildDirectory.dir("jarContent"))
}
sourceSets[SourceSet.MAIN_SOURCE_SET_NAME].java.srcDir(fromJarTask)

(You will need to convert the Kotlin code to Groovy.)

Thank you, Christian, the sync task looks promising.

Unfortunately, the following code causes static build failures in the IDE after running a clean:

def syncTask = task sync(type: Sync) {
    from zipTree("${projectDir}/sources.jar")
    into "${buildDir}/src"

    preserve {
        include 'com/**'
        exclude 'META-INF/**'
    }
}

sourceSets.main.java.srcDir(syncTask)

We could extract the files into the main source directory, instead, such as:

def syncTask = task sync(type: Sync) {
    from zipTree("${projectDir}/sources.jar")
    into "${projectDir}/src/gen/java"

    preserve {
        include 'com/**'
        exclude 'META-INF/**'
    }
}

sourceSets.main.java.srcDir(syncTask)

While that addresses the clean issue, we’re left with the original problems that we’d like to avoid.

I didn’t really understand why you don’t want to extract the source files first.

There are undesired knock-on effects, including:

  • Editable. Extracted source files can be edited in the IDE. We’d like them to be read-only. We could add a task that sets the files read-only, but that feels like solving the wrong problem (added complexity).
  • Synchronization. When a newly generated sources.jar is pulled down, we’ll have to delete the src/gen/java directory to remove any stale .java files that were preserved. If there was a way to avoid extraction, then the act of pulling down the new sources.jar file would ensure correctness (no added complexity). By extracting the .java files, it’s possible to enter an inconsistent state:
$ jar -tvf sources.jar | grep java$ | wc -l
61
$ find src/gen -name "*java" | wc -l
65

If there was a way to treat sources.jar as a source directory without extracting the files, these knock-on effects disappear.
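
On the stale-file point: a Sync task deletes anything in its destination that is not in its source, unless a preserve filter says otherwise. The preserve { include 'com/**' } block in the snippets above therefore asks Sync to keep exactly the files that can go stale. A hedged sketch that filters on the source side instead is shown below. The task name and destination directory are illustrative, not taken from the project.

```groovy
plugins {
    id 'java'
}

tasks.register('extractSources', Sync) {
    // Select what to copy out of the JAR. With no preserve block, Sync
    // removes destination files that are no longer present in the JAR.
    from(zipTree("${projectDir}/sources.jar")) {
        include 'com/**'
    }
    into "${buildDir}/generated-src"
}
```

With this shape, a newly downloaded sources.jar should replace the extracted tree wholesale the next time the task runs. The files are still extracted and still editable.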

We can add a content root to a module through IntelliJ IDEA as follows:

  1. Click File > Project Structure
  2. Click Modules
  3. Expand and click module name (e.g., main)
  4. Click Sources tab
  5. Click Add Content Root
  6. Browse to and select sources.jar
  7. Click OK
  8. Click Mark as: Sources
  9. Click OK to accept the additional sources

For the most part, this gives us the desired results: an externally generated jar file, read-only source files, no need to extract .java files, no additional configuration steps, and guaranteed consistency. However, this only works within the IDE; compiling from the command line won’t honour the IDE settings, of course.

Can build.gradle use the idea plugin to accomplish the same effect? If so, any ideas how?

@thangalin

You don’t want to manually configure project modules in IntelliJ. You want it to read those settings directly from the Gradle configuration.

Here is a simple example that I tried and got to work doing what you are trying to do. I made this example using a configuration. Just change the dependency to a Maven format.

After building/compiling, you will see that the source files are in build/tmp/unzipFiles, and you will see that the files were compiled into build/classes/java/main. After you set this up, you will have to do “Reload All Gradle Projects” from the Gradle tool tab in IntelliJ for it to fully recognize these changes.

As far as the build issues in IntelliJ after clean, I don’t know if there is a way that IntelliJ can be aware that a clean happened and automatically execute a compile task. After clean, if you run the Gradle task (from IntelliJ or command line) “compileJava” or “build” or “classes” …anything that will trigger “compileJava” to run, then the files will be unzipped again due to the task dependencies.

Perhaps you might find a solution to automatically running compile after clean here: Gradle tasks | IntelliJ IDEA

build.gradle.kts

plugins {
    `java-library`
}

repositories {
    mavenCentral()
}

val sourcesJar: Configuration by configurations.creating

dependencies {
    sourcesJar(files(layout.projectDirectory.file("files.jar")))
}

val unzipFiles by tasks.registering(Sync::class) {
    from(zipTree(sourcesJar.singleFile))
    into(temporaryDir)
}

tasks.named("compileJava") {
    dependsOn(unzipFiles)
}

sourceSets {
    main {
        java.srcDir(unzipFiles)
    }
}

The Sync task has a fileMode property, which lets you set Unix file permissions:

val fromJarTask by tasks.creating(Sync::class) {
    from(zipTree("test1.jar"))
    into(layout.buildDirectory.dir("jarContent"))
    fileMode = "444".toInt(8) // octal 0444 (read-only); a bare 444 would be read as decimal
}

This however doesn’t prevent modifications in a Windows environment.
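
Since the build in question uses the Groovy DSL, a rough Groovy equivalent is sketched here. The mode is meant as the octal value 0444, and Groovy accepts octal literals written with a leading zero. The task name, JAR name, and destination are carried over from the Kotlin snippet.

```groovy
tasks.register('fromJarTask', Sync) {
    from zipTree('test1.jar')
    into layout.buildDirectory.dir('jarContent')
    // The leading zero makes this an octal literal: r--r--r--
    fileMode = 0444
}
```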

Using a configuration as demonstrated by EarthCitizen is definitely the correct way.

You should not extract into your source folder.
If you find a way to automatically extract the jar sources (probably using a sync task) for the intellij ide, please let me know.

Thank you.

Unfortunately, my Gradle knowledge isn’t deep enough to resolve the error encountered. We’re using Gradle 7.3 and the val sourcesJar: declaration results in the following error:

Could not find method val() for arguments [{sourcesJar=interface org.gradle.api.artifacts.Configuration}] on project ‘:x’ of type org.gradle.api.Project.

Regarding:

As far as the build issues in IntelliJ after clean, I don’t know if there is a way that IntelliJ can be aware that a clean happened and automatically execute a compile task.

I’m guessing that it may be possible using the idea plug-in. Without IntelliJ doing a static compile, the IDE complains of compile errors because it cannot find the source code.

Thanks again for taking time to respond, Christian.

You should not extract into your source folder.

We really want to avoid extracting the jar file at all, for those editing and synchronization reasons. Many of our developers use Windows. Maybe it’s simply not possible. Java’s compiler API uses a JavaFileObject, itself a FileObject. It may be that there’s no implementation for FileObject that Gradle can use for reading and compiling .java files via jar: URIs. The API documentation for Gradle shows that only file: URIs are acceptable.

Are you using the Groovy (build.gradle) or Kotlin (build.gradle.kts) DSL?

We’re using Groovy, such as:

plugins {
    id 'idea'
    id 'java-library' // added
}

group = 'com.domain.package.alarms'
description = """Alarms Classes"""

apply from: "${GRADLE_SCRIPT_FOLDER}/junitSettings.gradle"

// added
val sourcesJar: Configuration by configurations.creating

dependencies {
    api project(':common')
    api project(':project.alarms')
    api project(':project.common')

    testImplementation project(path: ':common', configuration: 'tests')

    // added
    sourcesJar(files(layout.projectDirectory.file("${projectDir}/sources.jar")))
}

val unzipFiles by tasks.registering(Sync::class) {
    from(zipTree(sourcesJar.singleFile))
    into(temporaryDir)
}

tasks.named("compileJava") {
    dependsOn(unzipFiles)
}

sourceSets {
    main {
        java.srcDir(unzipFiles)
    }
}

OK. You are using the Groovy DSL. You need to create the configuration like this:

configurations {
  sourcesJar
}

And the task creation syntax is different for Groovy:

task unzipFiles(type: Sync) {
    from(zipTree(sourcesJar.singleFile))
    into(temporaryDir)
}

This appears to be incorrect:

sourcesJar(files(layout.projectDirectory.file("${projectDir}/sources.jar")))

It should be:

sourcesJar(files(layout.projectDirectory.file("sources.jar")))

Almost works?

Could not get unknown property ‘sourcesJar’ for task ‘:x:unzipFiles’ of type org.gradle.api.tasks.Sync.

The Gradle docs show a different syntax:

java {
    withJavadocJar()
    withSourcesJar()
}

However, that syntax appears to be for creating a project-sources.jar file after building the project, rather than reading one as input for compiling the project.

I forgot one thing:

task unzipFiles(type: Sync) {
    from(zipTree(configurations.sourcesJar.singleFile))
    into(temporaryDir)
}

As sourcesJar was no longer a global variable, it has to be referenced from the configurations container.

You can rename the configuration sourcesJar to anything you wish. It does not need to have that exact name.

Here is the example I gave using Groovy syntax. I verified that it worked the same:

plugins {
    id 'java-library'
}

repositories {
    mavenCentral()
}

configurations {
    sourcesJar
}

dependencies {
    sourcesJar(files(layout.projectDirectory.file("files.jar")))
}

tasks.register('unzipFiles', Sync) {
    from(zipTree(configurations.sourcesJar.singleFile))
    into(temporaryDir)
}

tasks.named("compileJava") {
    dependsOn(tasks.unzipFiles)
}

sourceSets {
    main {
        java.srcDir(tasks.unzipFiles)
    }
}
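
To follow the earlier advice to “change the dependency to a Maven format”, the files(...) dependency can be swapped for module coordinates. The coordinates below are hypothetical placeholders, and the repository should be whichever one actually hosts sources.jar. This is only a sketch of the notation, not the project's real setup.

```groovy
repositories {
    mavenCentral() // assumption: replace with the repository that hosts the JAR
}

configurations {
    sourcesJar
}

dependencies {
    // Hypothetical coordinates, in the form group:name:version:classifier@extension.
    // The @jar suffix requests just that artifact, without transitive dependencies.
    sourcesJar 'com.example:alice-codegen:1.0.0:sources@jar'
}
```

The rest of the example (the unzipFiles task reading configurations.sourcesJar.singleFile) would stay the same.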

I don’t know the challenge you are facing, but the ideal situation to avoid editing would be to use a JAR that is pre-compiled. I am a bit confused as to why, if you have access to a Maven repository, you don’t simply download a JAR with class files instead of source.

Anyhow, you don’t need to worry about editing, because if the extracted files are modified, Gradle should detect this, erase the copies, and extract them again from the JAR.

Thanks again. The net effect is similar to the following in that the IDE cannot resolve/statically compile the source files within the .jar file:

task genSources {
    copy {
        from zipTree("${projectDir}/sources.jar")
        into "${buildDir}/src"
    }

    sourceSets.main.java.srcDirs += "${buildDir}/src"
}

gradle.projectsEvaluated {
    compileJava.dependsOn(genSources)
}

This still leaves the original problems: clean will prevent the IDE from finding the necessary .java files, build/src can get out-of-sync with respect to sources.jar, and the extracted files are not read-only.

I appreciate the help. I suspect there’s no solution to this problem. We may have to bite the bullet and merge the separate projects, rather than pull the sources.jar file from a remote repository.

the ideal situation to avoid editing would be to use a JAR that is pre-compiled

Circular dependencies. We have:

  • Project-HAL (C source files, Java API, JNI calls, hardware abstraction layer)
  • Project-Java (uses Java API from Project-HAL, main application entry point)
  • Project-CodeGen (generates C header files and Java files using XML/XSLT)

Each project is in its own repository. The .java files produced by Project-CodeGen cannot be compiled without classes from Project-Java. I think the decision to split Project-HAL from Project-Java was a poor design choice that doesn’t offer strong enough benefits to keep them separated. That is, Project-HAL is only used by Project-Java, so coupling them in the same repository is fine. At that point, we can bring Project-CodeGen into the fold as well, resolving the circular dependencies, but it’ll take a lot of time to merge the projects.

OK. Well, the Gradle code I provided works, except that the IDE will not run the initial extraction for you automatically. Once the initial extraction is done, compilation will work. From the command line, all will work automatically.

In addition to the example I have provided, I have used this technique for code gen scenarios in the past. Once the generated (unpacked) code exists, there are no errors in the IDE (IntelliJ).

If you find a way to automatically extract the jar sources (probably using a sync task) for the intellij ide, please let me know.

The following may do the trick (without extraction):

plugins {
    id 'java'
    id 'idea'
}

tasks.withType(JavaCompile) {
    source(zipTree("${projectDir}/sources.jar"))
}

idea.module.iml {
    withXml {
        def baseUrl = 'jar://$MODULE_DIR$/sources.jar!/'
        def component = it.asNode().component[0]
        def jarContent = component.appendNode('content', [url: baseUrl])
        jarContent.appendNode('sourceFolder', [
                url: "${baseUrl}com",
                isTestSource: false,
            ])
    }
}

You’ll have to run ./gradlew idea before starting the IDE. The code recreates the steps to set the content root as per the .idea/misc.xml file using the idea plugin. I haven’t verified whether the solution works.

Thank you both for all the insights and help.

I am no longer persuaded that your Java Code shouldn’t be placed into the src folder.

If you think of your code as having been generated from the jar file, it’s not uncommon to generate into src.

You can also use the idea plugin to tell IntelliJ that this is generated code:

// https://stackoverflow.com/questions/46640670/how-to-configure-gradle-for-code-generation-so-that-intellij-recognises-generate
apply plugin: "idea"

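// Single-quoted Groovy strings are not interpolated, so the directory is
// given relative to buildDir here and as a double-quoted GString below.
sourceSets.main.java.srcDir new File(buildDir, 'generated-src')
idea {
    module {
        // Marks the already(!) added srcDir as "generated"
        generatedSourceDirs += file("${buildDir}/generated-src")
    }
}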

It is still necessary to extract the JAR, but at least IntelliJ should warn you if you try to modify the code.

That’s another good approach, Christian. I couldn’t make it work. The following partially worked:

apply plugin: "idea"

idea {
    module {
        copy {
            from zipTree("${projectDir}/sources.jar")
            into "${buildDir}/generated-src"
        }

        sourceSets.main.java.srcDirs += "${buildDir}/generated-src"

        generatedSourceDirs += file("${buildDir}/generated-src")
    }
}

This still has the problem that clean will cause a static build failure within the IDE. The IDE warning is nice. Re-extracting the files after a clean seems to resolve the issue:

tasks['clean'].doLast({
    copy {
        from zipTree("${projectDir}/sources.jar")
        into "${buildDir}/generated-src"
    }
})
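
Pulling the pieces of this thread together, one possible arrangement is sketched below. It is untested against this project and makes several assumptions: the task name extractSources and the generated-src directory are illustrative, and finalizedBy is used in place of the doLast copy so that the same Sync task handles both the normal build and the re-extraction after clean.

```groovy
plugins {
    id 'java'
    id 'idea'
}

// Sync (not Copy) so that files missing from a newer sources.jar are removed.
def extractSources = tasks.register('extractSources', Sync) {
    from(zipTree("${projectDir}/sources.jar")) {
        exclude 'META-INF/**'
    }
    into "${buildDir}/generated-src"
}

// Use the task's output directory as a source root.
sourceSets.main.java.srcDir extractSources

tasks.named('compileJava') {
    dependsOn extractSources
}

idea {
    module {
        // Ask IntelliJ to treat the directory as generated code.
        generatedSourceDirs += file("${buildDir}/generated-src")
    }
}

// Re-extract right after clean, so the IDE still finds the sources.
tasks.named('clean') {
    finalizedBy extractSources
}
```

This still extracts the files, so it softens rather than removes the original objections: the IDE should warn on edits instead of blocking them, and consistency depends on the task running after each new sources.jar is pulled down.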