Java Incremental Compilation

Hello,

What is incremental compilation? I guess incremental means, do small compilation steps after each small code change. Do as little compilation as possible (i.e. don’t compile what is not affected by the code change). Is this a good explanation? Does incremental compilation depend on the compiler or the mechanism (e.g. Gradle) calling the compiler and telling it what to do? So basically Gradle could do incremental compilation with every compiler, even javac? Or does Gradle need a special compiler and why?

First there was the magical Eclipse compiler. Without any noticeable CPU usage or delay your code was immediately executable. I think this has been possible for almost a decade now. How was Eclipse doing this? I thought the ECJ was building an AST and, whenever a code change took place, knew exactly which classes were affected by that change. Rumor had it that ECJ didn’t even need to compile entire class files but could just replace the code for a single method if the code change was restricted to that method. I am not sure if this is true - maybe somebody knows more?

Then there are build tools like ANT and Gradle. I always thought Gradle was doing incremental compilation even before 2.1? Is this incorrect? Was Gradle always compiling the entire project even if just a single local variable in a private class was modified? Or was it already doing something clever before 2.1, like at least checking file timestamps? Or is checking timestamps not very helpful as long as you don’t have an AST which tells you all the places where global variables etc. are used (inlined by the compiler)? Can you give a few major examples where a small change in a single class potentially affects the entire project and therefore demands compiling everything? I already mentioned global variables. What other examples are there?

Finally there is IDEA which seems to be married to Gradle eventually. IDEA has all kinds of confusing settings. First of all you can pick the Eclipse compiler, automatically make your project, run with compilation errors, etc. I think Gradle doesn’t really benefit from this, right? When using a non-Gradle setup, IDEA can use all these features and is always ready when it comes to compilation - like Eclipse, correct? When using a Gradle project setup, IDEA (I am using EAP14) still compiles its own code but also sometimes uses Gradle tasks, for example to run an executable class. Or is EAP14 now fully using Gradle and all the compiler settings in IDEA are not used at all? Can somebody clarify? Also: what will the future look like? Will IDEA be fully fused with Gradle or will IDEA keep doing its own separate compilation to get its own AST etc.?

I am very confused.

Cheers, Dieter

I’ll try to answer :slight_smile: BTW, did you read the user guide section on incremental compilation?

Do as little compilation as possible (i.e. don’t compile what is not affected by the code change). Is this a good explanation?

Yes. The high-level goal is to make the dev experience better by offering faster and smarter builds. We want to compile as few classes as possible and ensure that as few output class files as possible are ‘changed’ after the compilation.

Does incremental compilation depend on the compiler or the mechanism (e.g. Gradle) calling the compiler and telling it what to do? So basically Gradle could do incremental compilation with every compiler, even javac? Or does Gradle need a special compiler and why?

Gradle uses the standard Java API for compilation (it’s what javac uses under the hood), so no special compiler is needed. We perform bytecode analysis and select only specific classes for compilation.
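To make the "select only specific classes" idea concrete, here is a rough sketch, not Gradle's actual implementation. It assumes the bytecode analysis has already produced a map from each class to the classes that reference it, and it conservatively selects the changed classes plus everything that depends on them transitively. The names `RecompilationSelector` and `classesToRecompile` are made up for illustration.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: picks the classes that need recompilation.
final class RecompilationSelector {

    /**
     * @param dependents maps a class name to the classes that reference it
     *                   (assumed to come from an earlier bytecode analysis)
     * @param changed    classes whose source files were modified
     * @return the changed classes plus all of their transitive dependents
     */
    static Set<String> classesToRecompile(Map<String, Set<String>> dependents,
                                          Set<String> changed) {
        Set<String> affected = new TreeSet<>(changed);
        Deque<String> queue = new ArrayDeque<>(changed);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            Set<String> users = dependents.getOrDefault(current, Collections.<String>emptySet());
            for (String user : users) {
                // add() returns false for classes already selected, which
                // also protects against cycles in the dependency graph
                if (affected.add(user)) {
                    queue.add(user);
                }
            }
        }
        return affected;
    }
}
```

With `C` used by `B` and `B` used by `A`, a change to `C` selects all three classes, while a change to `A` selects only `A`. A real implementation could be less conservative, for example by stopping at classes whose public API did not change.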

Then there are build tools like ANT and Gradle. I always thought Gradle was doing incremental compilation even before 2.1? Is this incorrect?

Gradle’s compilation task has always been incremental ‘at the task level’. If you haven’t changed any compilation inputs (source code, classpath, etc.) and the outputs are present and untampered, then the entire compilation task is considered UP-TO-DATE and compilation is skipped. The new 2.1 feature operates at the source class level. If only one class was changed, in theory it might be that only this single class needs to be recompiled. This should save time and ensure that few outputs are changed (useful for use cases like JRebel).
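The task-level check can be pictured roughly like this. It is a minimal sketch under simplifying assumptions, not how Gradle stores its task history: inputs are modeled as a map from file name to file content, the snapshot is a single SHA-256 fingerprint, and `UpToDateCheck` is a made-up name.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: decides whether a whole compile task can be skipped.
final class UpToDateCheck {

    /** Hashes all inputs in a stable (sorted) order into one hex string. */
    static String fingerprint(Map<String, String> inputs) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            for (Map.Entry<String, String> entry : new TreeMap<>(inputs).entrySet()) {
                digest.update(entry.getKey().getBytes(StandardCharsets.UTF_8));
                digest.update((byte) 0); // separator between name and content
                digest.update(entry.getValue().getBytes(StandardCharsets.UTF_8));
                digest.update((byte) 0); // separator between entries
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required to be available on every Java platform
            throw new IllegalStateException(e);
        }
    }

    /** The task is skipped only if inputs are unchanged and outputs still exist. */
    static boolean isUpToDate(String previousFingerprint,
                              Map<String, String> currentInputs,
                              boolean outputsPresent) {
        return outputsPresent && fingerprint(currentInputs).equals(previousFingerprint);
    }
}
```

Note that this check is all-or-nothing: changing a single character in one source file changes the fingerprint and reruns the whole task, which is exactly the gap the class-level feature in 2.1 is meant to close.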

Was Gradle always compiling the entire project even if just a single local variable in a private class was modified?

Yes :slight_smile:

Can you give a few major examples where a small change in a single class potentially affects the entire project and therefore demands compiling everything? I already mentioned global variables. What other examples are there?

For example, a class with a non-private static constant. The compiler inlines the constant’s value into the classes that use it, so we cannot detect all references to this constant from other classes via byte code analysis. So we recompile everything. Don’t do non-private constants :slight_smile:
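The constant problem can be observed from plain Java. javac copies the value of a compile-time constant into every class that reads it, so the reading class keeps no bytecode reference to the field. One visible side effect, sketched below with made-up class names, is that reading such a constant does not even trigger initialization of the declaring class, while reading a non-final static field does.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative demo; all class and field names are hypothetical.
final class ConstantInliningDemo {

    // Records which nested classes had their static initializer executed.
    static final List<String> initLog = new ArrayList<>();

    static class WithConstant {
        public static final int LENGTH = 10; // compile-time constant, inlined by javac
        static { initLog.add("WithConstant"); }
    }

    static class WithVariable {
        public static int LENGTH = 10;       // ordinary static field, read at run time
        static { initLog.add("WithVariable"); }
    }

    // Compiled as if it were "return 10;", with no reference to WithConstant.
    static int readConstant() { return WithConstant.LENGTH; }

    // Compiled to a real field access, which initializes WithVariable first.
    static int readVariable() { return WithVariable.LENGTH; }

    public static void main(String[] args) {
        System.out.println(readConstant() + " " + initLog); // prints "10 []"
        System.out.println(readVariable() + " " + initLog); // prints "10 [WithVariable]"
    }
}
```

Because the caller of `readConstant` carries only the literal value, changing `WithConstant.LENGTH` to 20 and recompiling just that one class would leave every other class still using 10. That is why a change to such a constant forces a full recompilation in this scheme.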

Finally there is IDEA which seems to be married to Gradle eventually.

At some point yes, but it’s not going to change soon (it needs a lot of effort). Currently compilation is separate in both tools and we strongly encourage keeping the IDE output dirs separate from Gradle’s output dirs (that’s the default).

Hope that helps!

Thank you for the answer.

When I said global variables I meant something like public static int LENGTH = 10; You can detect those via byte code analysis but you cannot detect them if I add a ‘final’ and make them constant? I guess the compiler inlines constants and leaves no trace for byte code analysis to figure this out? What else can’t you detect via byte code analysis? Is there an extensive list?

I think the solution would be to use a compiler that does not inline non-private constants. Did you ever consider using the Eclipse compiler?

Another thing that seems wasteful is how multi-project builds are handled. Example: ProjectA depends on ProjectB. ProjectB depends on ProjectC. The compile configuration of ProjectB contains the jar of ProjectC. Similarly ProjectA has a compile dependency on a jar of ProjectB. The problem here is that if I change something in ProjectC and want to compile ProjectA, we not only have to recompile ProjectC but also generate its jar so ProjectB can be compiled. Then we generate the jar of ProjectB so ProjectA can be compiled. Why not put the main.output directories in the dependency? Why all this wasteful jar creation? A multi-project build should just be a flat pool of source files and their outputs for Gradle/the compiler.