As I have been converting our existing Make / Ant build system over to Gradle, I’ve been running various build-time comparison tests. For a full clean & build, the new Gradle build is much faster than our old Make / Ant build system (likely in large part because the old system had grown so large and cumbersome that it was doing a lot of work it didn’t need to be doing). However, for small changes to files, the build times are much worse under Gradle. According to our analysis, this is because Gradle is “doing it right” and Make & Ant are “doing it wrong”.
Specifically, if I change a single Java source file (say, add an empty line) then in Ant only that one file is recompiled. By default Make behaves the same way. The result is that it is really fast to build! And the resulting build is completely unreliable. If you changed a file in a way that requires other files to be rebuilt, you’re hosed. So you just have to know what you’re doing to decide whether you need to do a clean & build or just a build.
Gradle on the other hand recognizes that any file changed related to a task may require the task to be re-executed (maybe the end result of re-executing the task is an identical output, but Gradle doesn’t know that until the task runs so it has to re-execute the task whenever anything changed that may impact the final result). This produces reliable builds, but also longer build times. In my case adding a single empty line to Node.java rebuilds the entire “graphics” sub-project which itself takes 30 seconds and then builds a half-dozen other dependent projects which takes over a minute in total (and a full clean & build takes 2 minutes).
However, IntelliJ IDEA and Eclipse both seem to have the best of both worlds: reliable builds and exceptionally fast build times. I think the difference is that these IDEs both keep extensive metadata about the classes in a Java project, and so they can minimize which files actually are recompiled.
For example, suppose that following a full build you were to analyze the resulting class files. Each class file records the other classes it refers to (it does not store the source file’s imports as such, but its constant pool names every class it references). Coupled with the information Gradle already has regarding the class path, it would be possible to determine exactly which class files any given class file depends on.
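To make the class-file analysis a bit more concrete, here is a minimal sketch of reading the referenced class names out of a class file's constant pool. The class name `ClassRefs` and both `referencedClasses` methods are made up for illustration; this is not how Gradle or any IDE is known to do it. It also only catches references that survive into the constant pool, so things like compile-time constants that javac inlines would need extra handling.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: lists the classes a compiled class file refers to.
class ClassRefs {

    /** Returns internal names (e.g. "java/lang/Object") of all CONSTANT_Class entries. */
    static Set<String> referencedClasses(InputStream classFile) throws IOException {
        DataInputStream in = new DataInputStream(classFile);
        if (in.readInt() != 0xCAFEBABE) {
            throw new IOException("not a class file");
        }
        in.readUnsignedShort(); // minor version
        in.readUnsignedShort(); // major version
        int count = in.readUnsignedShort(); // constant pool entries are 1..count-1
        String[] utf8 = new String[count];
        int[] classNameIndex = new int[count];
        for (int i = 1; i < count; i++) {
            int tag = in.readUnsignedByte();
            switch (tag) {
                case 1: // Utf8: u2 length + modified UTF-8, the format readUTF expects
                    utf8[i] = in.readUTF();
                    break;
                case 7: // Class: index of the Utf8 entry holding its name
                    classNameIndex[i] = in.readUnsignedShort();
                    break;
                case 8: case 16: case 19: case 20: // String, MethodType, Module, Package
                    in.readUnsignedShort();
                    break;
                case 15: // MethodHandle: u1 kind + u2 index
                    in.readUnsignedByte();
                    in.readUnsignedShort();
                    break;
                case 3: case 4: case 9: case 10: case 11: case 12: case 17: case 18:
                    in.readInt(); // all of these carry 4 bytes of payload
                    break;
                case 5: case 6: // Long, Double: 8 bytes and they occupy two pool slots
                    in.readLong();
                    i++;
                    break;
                default:
                    throw new IOException("unknown constant pool tag " + tag);
            }
        }
        // Resolve names only after the whole pool is read, since a Class entry
        // may point at a Utf8 entry that appears later in the pool.
        Set<String> names = new TreeSet<>();
        for (int idx : classNameIndex) {
            if (idx != 0) {
                names.add(utf8[idx]); // array types show up as descriptors like "[Ljava/lang/String;"
            }
        }
        return names;
    }

    /** Convenience overload: reads the class file of an already-loaded class. */
    static Set<String> referencedClasses(Class<?> c) {
        String path = "/" + c.getName().replace('.', '/') + ".class";
        try (InputStream in = c.getResourceAsStream(path)) {
            if (in == null) {
                throw new IllegalArgumentException("no class file found for " + c);
            }
            return referencedClasses(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Prints every class that ClassRefs itself refers to, one per line.
        referencedClasses(ClassRefs.class).forEach(System.out::println);
    }
}
```

Running this over every class file after a full build would give a "depends on" map per class, which is the raw material for the metadata discussed here.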
Suppose I have A.java, B.java, and C.java:
    public class A { }
    public class B extends A { }
    public class C { }
In this case B is dependent on A, but C is independent of both. In such a case, I should be able to determine statically that if A changes I only have to recompile A and B. If B changes I only have to recompile B. And if C changes I only have to recompile C. All of this holds true regardless of which projects or tasks depend on which other projects or tasks.
This level of optimization could be added to further refine the javac tasks (that is, determining which Java source files need to be recompiled) without affecting the other existing logic for tasks. For example, the tasks could still process resources in the same way as they do now, and task dependencies would be calculated and executed exactly as they are now. The only difference is that if a metadata database exists for the class files in question, then we know what needs to be recompiled. So it would work (perhaps) something like this:
1. Task buildA determines that A has changed.
2. A set of all source files that rely on A.class is constructed based on pre-existing metadata that was computed during the last full build cycle.
3. As each remaining task executes, any javac execution will only build those files contained in this set.

Note that this set contains the transitive closure of all files that are affected by a change in A. So it includes not only B but also, say, D, where public class D { B b = new B("Some Constructor"); }.
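The set construction described above could be sketched roughly as follows. This is only an illustration under my own assumptions: the class `RecompileSet`, the method `affectedBy`, and the hand-written dependency map are hypothetical, and a real implementation would load that map from the metadata saved by the last full build.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: which classes must be recompiled when one class changes.
class RecompileSet {

    /**
     * dependsOn maps each class to the classes it references directly.
     * Returns the changed class plus everything that depends on it, transitively.
     */
    static Set<String> affectedBy(String changed, Map<String, List<String>> dependsOn) {
        // Invert the edges: for each class, who depends on it?
        Map<String, List<String>> dependents = new HashMap<>();
        dependsOn.forEach((cls, deps) -> {
            for (String dep : deps) {
                dependents.computeIfAbsent(dep, k -> new ArrayList<>()).add(cls);
            }
        });

        // Worklist traversal over the inverted graph; the visited set guards against cycles.
        Set<String> affected = new TreeSet<>();
        Deque<String> work = new ArrayDeque<>();
        work.add(changed);
        while (!work.isEmpty()) {
            String cls = work.poll();
            if (affected.add(cls)) {
                work.addAll(dependents.getOrDefault(cls, List.of()));
            }
        }
        return affected;
    }

    public static void main(String[] args) {
        // A, B, C from the example above, plus D, which uses B.
        Map<String, List<String>> dependsOn = Map.of(
                "A", List.of(),
                "B", List.of("A"),
                "C", List.of(),
                "D", List.of("B"));

        System.out.println(affectedBy("A", dependsOn)); // [A, B, D]
        System.out.println(affectedBy("B", dependsOn)); // [B, D]
        System.out.println(affectedBy("C", dependsOn)); // [C]
    }
}
```

Changing C touches nothing else, while changing A pulls in B and, through B, D. Each javac invocation would then be restricted to the sources in the returned set.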
It is not a simple thing to get right, and testing such a system seems like quite a task, but if done right I expect an incredible improvement in incremental build times for large projects when only a single file or a small number of files has changed.