Hi team. Build caching is indeed pretty awesome.
Local caching is super easy to set up; however, it doesn’t really flush out all the build issues I’m seeing when I try out remote caching. I want to share one clarification and ask a question about something I think might be a bug.
First, let me explain my environment. I’m testing a large build with ~1,400 tasks on two different machines. One is Linux with the en_US locale, where UTF-8 is the default charset. The other is a Japanese Windows 10 PC. Its default charset is “windows-31j” and the system property file.encoding is “MS932”, a.k.a. Shift-JIS; in other words, most definitely not UTF-8. Despite these environmental differences, all sources are UTF-8 with Unix-style line breaks. Finally, local caching is disabled on both machines, but both can push to the nginx Docker container suggested by Gradle, which runs on a different VM. Of the 1,400 tasks I mentioned, 168 are cacheable compilation tasks, which I expect to be loaded from the remote cache.
Ultimately, the ‘classpath’ property hash of the compile spec differs between the two machines, and I get cache misses. When I compare the SHAs of each JAR in the classpaths, I see that the locally built JARs for other subproject dependencies have different hashes. This changes the overall ‘classpath’ hash in the compile spec and forces a recompile, even though the sources that go into those JARs are identical.
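For anyone who wants to reproduce that comparison, here is a minimal sketch of a task that prints one digest per compile classpath entry. The task name printClasspathHashes and the choice of SHA-256 are assumptions made for illustration, and the sketch assumes the java plugin is applied so that sourceSets.main exists. These are raw file digests, which is not necessarily how Gradle computes its own ‘classpath’ hash.

```groovy
import java.security.MessageDigest

// Hypothetical diagnostic task; the name is made up for this sketch.
// Assumes the 'java' plugin is applied, so sourceSets.main is available.
task printClasspathHashes {
    doLast {
        project.sourceSets.main.compileClasspath.files
            .findAll { it.isFile() }          // skip class directories, hash JARs only
            .sort { it.name }                 // stable order so the output diffs cleanly
            .each { File f ->
                // SHA-256 is an arbitrary choice here; any digest works for comparison.
                def hex = MessageDigest.getInstance('SHA-256').digest(f.bytes).encodeHex().toString()
                println "${hex}  ${f.name}"
            }
    }
}
```

Running the task on both machines and diffing the two outputs should show which JARs differ.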
So, the first speed bump I hit is related to CopySpec#filteringCharset (PR) on Windows. I want to repeat it here in the hope that it helps someone else:
- CopySpec#filteringCharset defaults to Charset#defaultCharset
- The Jar task is an AbstractCopyTask, which implements CopySpec
- AbstractCopyTask has two CopySpecInternal references, rootSpec and mainSpec
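The chain above can be checked on each machine with a small diagnostic. This is a sketch only: the task name printCharsets is hypothetical, and it reads the value through AbstractCopyTask#getFilteringCharset, which reports the task's own setting rather than every nested child spec.

```groovy
import java.nio.charset.Charset

// Hypothetical diagnostic task; the name is made up for this sketch.
task printCharsets {
    doLast {
        // The JVM-wide default that filteringCharset falls back to.
        println "Charset.defaultCharset(): ${Charset.defaultCharset().name()}"
        println "file.encoding: ${System.getProperty('file.encoding')}"
        // The effective filtering charset of every copy-style task, Jar included.
        project.tasks.withType(AbstractCopyTask).each { AbstractCopyTask t ->
            println "${t.path} filteringCharset: ${t.filteringCharset}"
        }
    }
}
```

If the two machines print different charsets here, the filteringCharset input hashes can be expected to differ as well.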
Even though the Jar task is not cacheable, its CopySpec instances show up as inputs to its cache key. Here’s what I see in my logs:
[DEBUG] [o.g.c.i.t.DefaultTaskOutputCachingBuildCacheKeyBuilder] Appending taskClass to build cache key: org.gradle.api.tasks.bundling.Jar_Decorated
...
[DEBUG] [o.g.c.i.t.DefaultTaskOutputCachingBuildCacheKeyBuilder] Appending inputPropertyHash for 'rootSpec$1.filteringCharset' to build cache key: d746f44d09fb58d2971f341d24d74c35
...
[DEBUG] [o.g.c.i.t.DefaultTaskOutputCachingBuildCacheKeyBuilder] Appending inputPropertyHash for 'rootSpec$2$1.filteringCharset' to build cache key: d746f44d09fb58d2971f341d24d74c35
...
[DEBUG] [o.g.c.i.t.DefaultTaskOutputCachingBuildCacheKeyBuilder] Appending inputPropertyHash for 'rootSpec$2.filteringCharset' to build cache key: d746f44d09fb58d2971f341d24d74c35
...
[DEBUG] [o.g.c.i.t.DefaultTaskOutputCachingBuildCacheKeyBuilder] Appending inputPropertyHash for 'rootSpec$1' to build cache key: 5cd9402cddb4eab80587e698245670a2
[DEBUG] [o.g.c.i.t.DefaultTaskOutputCachingBuildCacheKeyBuilder] Appending inputPropertyHash for 'rootSpec$2' to build cache key: d41d8cd98f00b204e9800998ecf8427e
[DEBUG] [o.g.c.i.t.DefaultTaskOutputCachingBuildCacheKeyBuilder] Appending inputPropertyHash for 'rootSpec$2$1' to build cache key: 2bb8f38c29360e36d2e7e00893dd165f
[INFO] [o.g.a.i.t.e.ResolveBuildCacheKeyExecuter] Cache key for task ':xxx:yyy:jar' is b44ea89ad1e475a2f38b774fba5533b6
It may be inconsequential, but on the Windows machine with Shift-JIS encoding, the filteringCharset hash was different. This may not have affected the bytes of the resulting JAR, but just to make sure, I worked around it with a small bit of configuration:
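    // Pin the filtering charset on every copy-style task (Jar included)
    // so it no longer depends on the platform default charset.
    tasks.withType(AbstractCopyTask) { AbstractCopyTask task ->
        rootSpec.filteringCharset = java.nio.charset.StandardCharsets.UTF_8.name()
        mainSpec.filteringCharset = java.nio.charset.StandardCharsets.UTF_8.name()
    }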
This resolved the problem: the inputs to the Jar task are now identical on Linux and Windows.
But now I have a second issue: the hashes of the resulting JARs do not match, which causes recompilation. When I explode the same JAR on both test machines and compare the folders, the contents are identical. But, I guess as a byproduct of the zip algorithm (?), the hashes of the archives themselves are unequal and I get cache misses. Am I doing something wrong, or is this an unanticipated use case?
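One way to narrow this down is to compare the per-entry metadata stored in each archive, since a ZIP file records entry order and timestamps in addition to the file contents. The sketch below is a diagnostic only and does not confirm the cause. The task name describeJarEntries is hypothetical, and it assumes the java plugin is applied and a Gradle version where the Jar task still exposes archivePath.

```groovy
import java.text.SimpleDateFormat
import java.util.zip.ZipFile

// Hypothetical diagnostic task; the name is made up for this sketch.
// Assumes the 'java' plugin is applied, so a 'jar' task exists.
task describeJarEntries {
    dependsOn 'jar'
    doLast {
        // Timestamps are formatted in the JVM's default time zone.
        def fmt = new SimpleDateFormat('yyyy-MM-dd HH:mm:ss')
        def zip = new ZipFile(project.tasks.jar.archivePath)
        try {
            // Entries are printed in the order they are stored in the archive.
            zip.entries().toList().each { entry ->
                println "${entry.name}  time=${fmt.format(new Date(entry.time))}  " +
                        "crc=${Long.toHexString(entry.crc)}  size=${entry.size}"
            }
        } finally {
            zip.close()
        }
    }
}
```

If the CRCs and sizes match on both machines while the timestamps or the entry order differ, that would explain identical exploded contents with unequal archive hashes. In that case, the preserveFileTimestamps and reproducibleFileOrder properties on archive tasks (present in Gradle 3.4 and later) look like the relevant settings, though whether they close the gap for this build is untested here.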
I’m happy to share more detail if necessary. Thanks in advance.
- Kyle