Sharing/managing the global Gradle cache between Docker containers/hosts with Gradle 5

I am aware of quite a few questions in the forum/Slack, and in general, about sharing the Gradle cache and the new problems around it (Gradle > 3). I want to gather and summarize those questions, but also narrow their scope.

After I broadly migrated from Maven to Gradle, I focused on the bits and pieces of development/CI integration with Gradle.
I found several issues with this setup in general, while I was most probably tricked by one issue in particular and maybe drew the wrong conclusion from it. Long story short, I learned that there are two caches:

a) project cache (class file caches and other things): $PROJECT_PWD/.gradle
b) global cache (AFAIR that's where all the Nexus artifacts are gathered): ~/.gradle/cache

A. Project Cache
In the usual Docker-based Java project, a) becomes an issue by accident due to the nature of the layout of those projects:

~/project
  ./src
  ./build.gradle
  ./docker-compose.yml
  ./.gradle <- project cache

To be able to build in the container, you need the build.gradle file and all its configuration, so you would usually mount the code base using ./:/src into the container (so $PWD is ~/project) and thus accidentally leak the project cache of the host system into the container.
This leads to arbitrary issues with --continuous and much more and should be avoided - but since that is very much the default project layout for any Java project, I suspect this is done a lot.

To mitigate that issue, I run all tasks in the container using --project-cache-dir=/tmp/projectcache, so a non-shared folder is used.
This solves the issue for good.
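For illustration, here is a minimal sketch of such a run (the Gradle image tag and mount paths are assumptions; the relevant bit is --project-cache-dir):

docker run --rm \
  -v "$PWD":/src -w /src \
  gradle:jdk8 \
  gradle build --project-cache-dir=/tmp/projectcache

This way the host's ./.gradle is never written to from inside the container.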

A little side question - is there any way to set project-cache-dir using a .properties file or similar? It seems to be a CLI parameter only (I think we should do something about that).

I understand that sharing the project cache is not desired and have no further questions here - this was just a finding and a thing I had to learn (the long/hard way). AFAICS this particular topic is hardly ever covered when Docker-based Java development stacks are discussed.

B. Global Cache
First I learned that sharing the global cache with the container using ~/.gradle/cache:~/.gradle/cache seems to be a bad idea that leads to issues - but it seems I was wrong and my issues came from sharing the project cache (which I was not aware of at the time).

Since the global cache includes all the downloaded Nexus dependencies, which can become quite a long list very fast, it is very desirable to share it, especially since a lot of common dependencies are found across microservices.

Since those Nexus artifacts are normalized and well packaged, they are actually very "portable" and should be shareable without any big issues.

This leads to the actual question / suggestion / thought:


Question
How would one share the Gradle cache, which holds all the dependency artifacts, across systems/containers?

Motivation
The motivation is for development and CI.

I) Development: This would massively speed up initial compile time across all microservices in the Docker containers by not downloading the Nexus artifacts for each container over and over again.

II) CI: An even more important case is CI, where integration-test and build times can be cut down by more than 50% by sharing the artifact storage; this is widely supported by CIs, e.g. by defining ~/.m2 as persistent storage across all build containers.


The scope of this question should not be how to share compiled classes, build artifacts or more complex, project-specific or very "context aware" assets - but only how to share the dependency artifacts, which are very portable by nature.

  • Can one share ~/.gradle/cache for this purpose, or should we share a folder at a deeper level to avoid sharing build caches? According to this issue it seems to be problematic - but is there any part where just the dependencies are stored?
  • Is this possible at all? Is the Gradle remote cache an alternative (or the only alternative)? Is this part of EE only (for this purpose)? I am not sure this would be the right reference - at least according to that, Gradle EE uses it for storing the build cache (not the dependency cache).

To be clear, I am perfectly aware of running Nexus proxies and all those things, which could solve some issues for CI when running it there - but the transport / IO / time it takes to consume the packages from a Nexus is far from optimal, and for developers that is hardly a good option. I do not think a lot of companies run a Nexus proxy per office to reduce downloads - and even then, IO is a huge factor, not to speak of the devops complexity, compared to just sharing / mounting a local folder (as we were all used to doing with ~/.m2).

Thanks for reading!

UPDATE:

A good read on the different caches can be found here


Using nexus/artifactory or anything, even if it’s a local one, is at least an order of magnitude slower than using the file cache. So this solution is ruled out immediately.

In our earlier project we encountered this issue even with non-containerized builds. We had to set up each executor on the Jenkins nodes to use its own cache directory. This was in the Gradle 1.x ages, so quite some time ago.

Then we containerized the builds. (This was in the early days, even before any plugins existed for managing Docker slaves in Jenkins.) It became impractical to mount a directory and try to keep it consistent. We decided it was easier to warm up the cache in the image build phase. This way most of the dependencies were already there during the build. And, of course, an image rebuild was an easy way to keep the cache very much up to date. The drawback of storing the cache this way is the larger image size, but storage is much cheaper than time. And considering Docker's layering approach, this actually helped us reduce duplication.
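For what it's worth, a minimal sketch of that warm-up idea (the base image, file names and the resolving task are assumptions, not our exact setup):

FROM openjdk:8-jdk
WORKDIR /app
# copy only the wrapper and the build scripts first, so this layer stays cached
# until the dependency declarations actually change
COPY gradlew settings.gradle build.gradle ./
COPY gradle ./gradle
# resolving dependencies here bakes ~/.gradle/caches into the image layers
RUN ./gradlew --no-daemon dependencies
# sources change often, so they are copied last
COPY . .

Rebuilding the image refreshes the cache; builds started from the image then find most dependencies already present.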

Another approach, if supported by the build system, e.g. Circle CI, is caching the directories. Without build system support this can get messy and probably raises the IO requirements quite a bit.

So, although the limitation is annoying, I believe it is better not to rely on infrastructure-provided features to keep the build fast. I prefer not to have to mount anything and to make the caching an inherent feature of the way we build the software. Not having to do any setup and having the caching work out of the box everywhere (CI and dev machines) is a huge benefit.

Sorry if I am confused, but this seems to be somewhat self-conflicting. What CircleCI/Concourse/Travis and others are offering is cache directories - right?
But then you talk about "not wanting to mount anything" - yet that is exactly what those systems are doing - they mount a global named volume on the location and remount it on the containers again and again.

Of course, you do not need to do it yourself - but it is easy to do this as a developer locally too - and that is a good thing :slight_smile:

Just wanting to clarify: to achieve what you enjoy on CircleCI, mounted folders are needed - thus Gradle exposing the Nexus dependencies in a specific folder becomes a requirement.

Maybe I just got you the wrong way, but I think in the end we agree on the above, don't we? Thanks!

Just wanting to clarify: to achieve what you enjoy on CircleCI, mounted folders are needed - thus Gradle exposing the Nexus dependencies in a specific folder becomes a requirement.

Build platforms provide different capabilities. Unfortunately, I have no production-related experience with Circle CI; I'm only aware of this feature from reading the documentation. On Circle CI, yes, those are cache directories. I'm not sure if they are mounted or how they are implemented, but it's internal to the platform implementation. The whole feature is platform specific.

So to clarify, I think it’s better to avoid using these in production setups. At least I would try to avoid depending on them as much as possible. I believe warming up Gradle/m2 caches during image build is fairly easy to implement the way I described and it’s a completely build platform independent solution.

I understand that there may be other types of artifacts that may be cached, too. Gradle implements a build cache for some of those use cases.

PS:

Of course, you do not need to do it yourself - but it is easy to do this as a developer locally too - and that is a good thing :slight_smile:

Unfortunately, as long as we don’t want to build as root, mounting directories on dev machines is another horrible story. :slight_smile:

Let me try to bust some myths, if you allow:

  • a) Cache directories as a CI feature are global folders or, in Docker, global named volumes mounted into each Docker container.

  • b) The "cache directory in CI" feature is absolutely not platform dependent. It can be done with or without Docker, with or without Circle CI.

This is all you need:

docker volume create gradlecache
docker run --name service1 -v gradlecache:/root/.gradle/cache alpine
docker run --name service2 -v gradlecache:/root/.gradle/cache alpine

This is easy as pie; it can be done with anything, including non-Docker NFS/SMB mounts or whatever. It is by no means a feature of CircleCI, since it's just basic tool usage.

The same works, and has worked, for ~/.m2.

  • c) We are not talking about a build cache in terms of class files or anything else - just dependencies, and thus portable .jar files only.

  • d) Including the dependencies in the Docker layers is by no means a good solution, and by no means is it "saving space or time". If you build 2 different images with the exact same dependencies (let's say a Spring application with the exact same deps), the extra layer for the deps is counted twice, transferred twice and takes 2x the storage on the target platform.
    It helps when you create X containers from such an image: then "most" of the deps are preloaded and thus shared in the layer, and the runtime takes less space in the end. But you are still transporting the same dependencies over and over again for all the Spring microservices you have; 95% of the size will be Spring, copied over and over again.
    I would count this as a very specific, limited-use-case solution and by no means as generic as a shared folder would be - and also far less efficient and even more complex, since it means actively changing your Dockerfiles, while shared folders do not require that change.

  • e) Root permissions are not required when you properly use global volumes and use gosu or whatever to create a non-root user in each image, which always ends up being uid/gid 1000:1000 - so you can easily share those in the end (a minimal sketch follows below). Having them all run as root is not an issue either - you certainly do not need to "match the uid/gid" of your host - just never mount host folders.
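A minimal sketch for e), assuming the usual 1000:1000 convention (the image name, the /cache mount point and the GRADLE_USER_HOME override are placeholders, not a recommendation of specific values):

docker volume create gradlecache

# the named volume is root-owned when first created, so hand it over to 1000:1000 once
docker run --rm -v gradlecache:/cache alpine chown -R 1000:1000 /cache

# every build container can then reuse the volume as a non-root user
docker run --rm -u 1000:1000 -e GRADLE_USER_HOME=/cache \
  -v gradlecache:/cache your-build-image ./gradlew build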

To cut it short: I hope we can agree that an undeniable advantage of the image-based caching is that it works independently of the build platform or the capabilities of Gradle. Computing the efficiency of this solution depends on many factors; I wouldn't simplify it so much. In our case it caused a few GBs to be transferred once or twice a week (when the cache was updated) and prevented several GBs from being transferred during every build, probably TBs cumulatively. The benefits were immediate and long term.

A note on your examples. Fortunately, the world realized early on that making things look like files is a general solution, which makes this abstraction suitable for several types of problems. However, sometimes we need to understand its implications. If we consider a database, it would be a bold claim that putting its files on a shared filesystem and running multiple instances of the engine on the same files is safe. In this respect, regarding the Gradle cache as a database is a valid comparison.

In the case of Maven, the cache is a significantly simpler construct. For Maven, it's really just files, without much contention and concurrency handling involved. Maybe "dumbing down" the sophisticated (coordination) features of the Gradle cache could result in a good-enough solution, too.

I'm not completely aware of the technical reasons why Gradle does not support the mounted volumes. It may be worth looking into the code and seeing how it could be fixed.

I am genuinely surprised that that is working.

What I observed when Gradle was running multiple, parallel builds was that there would be collisions with the lock files and builds would subsequently fail. Hence why I went and wrote the Bash script in issue 851 to copy files and checksums back and forth.

Whilst this is not optimal, it does work and avoids the lock-file collisions.

Thanks for starting the conversation here, and let us indeed be clear that there are many Gradle caches.

I will comment only on what seems to be the main issue discussed here:

Sharing the downloaded dependencies between builds.

All this data is to be found under ${GRADLE_USER_HOME}/caches/modules-<version>

And by downloaded dependencies we mean multiple things:

  • metadata file(s) (Maven, Ivy or lately Gradle)
  • and their associated binaries (JARs mostly but not only).

All these files are found in ${GRADLE_USER_HOME}/caches/modules-<version>/files-<filesversion>
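For orientation, on a current Gradle 5 installation that layout looks roughly like this (the group, version and hash values below are illustrative only):

~/.gradle/caches/modules-2/files-2.1/
  org.springframework/spring-core/5.1.6.RELEASE/
    <sha1-of-artifact>/spring-core-5.1.6.RELEASE.jar
    <sha1-of-artifact>/spring-core-5.1.6.RELEASE.pom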

But the Gradle dependency cache is more than just that; there are other dependency-related caches:

  • Indexes to map a given repository / module pair to files on disk. That is useful because it allows Gradle to differentiate between library foo from JCenter and library foo from Maven Central. Of course they should not be different, but if they are, builds are able to see consistently the difference. These indexes currently have absolute paths in them, making their copy useful only when GRADLE_USER_HOME does not change.
    (found in ${GRADLE_USER_HOME}/caches/modules-<version>/metadata-<version>/*.bin)
  • Parsed module metadata, in a binary form that can be loaded fast by Gradle instead of having to reparse the text format.
    (found in ${GRADLE_USER_HOME}/caches/modules-<version>/metadata-<version>/descriptors)

Gradle goes to great lengths to guarantee that this cache will not be corrupted by half downloads, empty files and the like. This is one of the reasons behind its current inability to share this cache between multiple running instances in isolated environments.
When multiple builds run on the same machine, there are lock / unlock operations coordinated across the different builds. But across the boundaries of a container, the Gradle daemons cannot synchronize for this.

As commented, copying and re-using the files cache is your best action for now.
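A rough sketch of what that copying can look like around a build (this assumes the modules-2/files-2.1 layout of current Gradle versions and a shared folder mounted at /shared/gradle-deps; exact paths and cp flags will depend on your setup):

FILES_CACHE="$HOME/.gradle/caches/modules-2/files-2.1"
SHARED="/shared/gradle-deps/files-2.1"
mkdir -p "$FILES_CACHE" "$SHARED"

# seed the container-local cache from the shared copy before the build
cp -a "$SHARED/." "$FILES_CACHE/"

./gradlew build

# copy newly downloaded artifacts back without overwriting existing files
cp -a -n "$FILES_CACHE/." "$SHARED/"

Because only the files cache is copied, the absolute-path indexes and the lock files never leave the container.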

Anything smarter will need Gradle support. This discussion, here and elsewhere, and the active issues show that the community cares about this issue. However the Gradle team cannot (yet) share a design or implementation timeline for this topic.


Oof, I was not aware that the caches used absolute paths. Luckily that will not affect us at the moment (within any given build container, our paths will always be the same and the host never accesses them), but I could see this biting someone rather hard if they did not know.

I appreciate the lengths Gradle is going to in order to resolve some problems, it’s just an unfortunate fact of life that these edge cases come up.

Thanks for the explanation. I can understand why isolated containers are not simple to coordinate. However, I have a few observations:

  • What if it is ensured that the cache is not used concurrently in isolated environments? With jobs not being able to run in multiple instances at the same time, coordinating within a single container should not be prohibited.
  • When we started to spin up several builds in the same container concurrently, we started seeing unpredictable errors. The cache locking was completely overwhelmed and messed up. We had to resort to running all builds with their own caches. I will have to try again with the newest versions of Gradle, but this seems to be quite an adverse result despite the sophisticated methods for keeping the cache consistent.

Hi @EugenMayer,
According to your last answer, it seems like you found the right solution for question B (the global Gradle cache) with a Docker cached volume.

The right answer for question A is to add the .gradle directory to the .dockerignore file.

@ljacomet Thank you so much for the insight, very helpful. I will need some time to process / follow up on that in detail.

It is not nearly so easy, as @ljacomet pointed out - the best solution for the global cache / dependency cache seems to be Mastering Gradle Caching and Incremental Builds | by Fedor Korotkov | CirrusLabs | Medium and, interestingly, the solution is already part of a CI (Cirrus CI). I see that lock files are handled, as well as some of the other things @ljacomet has mentioned.

@ljacomet could you comment a little on the outlined solution in the article - for me it sounds valid.

That is wrong - sorry. .dockerignore deals with excluding folders/files when building the Docker context during docker build and has nothing to do with volume mounts - the latter cannot have any excludes, and that makes the project layout of Gradle one of the key elements and issues.

1

It's not very complex; this is how I'm building and running tests in one of my projects using a Docker cached volume:

test all apps:
  stage: test
  before_script:
    - docker build --force-rm --no-cache
                   --build-arg GRADLE_TASKS='firefox build -S -b ./apps/build.gradle --no-daemon'
                   -f ./settings/docker/Dockerfile.all-tests
                   -t all-tests .
    - docker volume create data
  script:
    - docker run -v data:/home/e2e/.gradle --rm --name run-all-tests all-tests

and my Dockerfile.all-tests file looks like so:

FROM daggerok/e2e:trusty-xvfb-jdk8-firefox-v3
LABEL MAINTAINER="Maksim Kostromin <maksim.kostromin@...com>"
# /home/e2e/app
WORKDIR 'app/'
ARG GRADLE_TASKS='tasks --all'
ENV GRADLE_TASKS_ENV=${GRADLE_TASKS} \
    JAVA_OPTS="${JAVA_OPTS} -Xms1g -Xmx1g -Dselenide.driverManagerEnabled=false"
ENTRYPOINT start-xvfb && ./gradlew ${GRADLE_TASKS_ENV}
COPY . .

Here, if the data volume was already created previously, my tests will run immediately without resolving missing dependencies.

2

  1. Build the Docker image you are going to run, using a .dockerignore.
  2. If on docker run you mount only the Gradle build files and source directories like you mentioned, then the local project cache directory will be created at runtime inside the container, so it shouldn't cause any issues during development with -t.

Probably not applicable for you, but the idea is the same:

In the past I did such development with Maven for a Spring Boot app with devtools:

docker run --rm --name dev -it \
  -v data:/root/.m2 \
  -v $(pwd):/tmp/app \
  -w /tmp/app \
  maven:3.6.0-jdk-8-alpine mvn spring-boot:run

and everything worked nicely for me.
Maybe you can use a similar approach for Gradle, something like this:

docker run --rm --name dev -it \
  -v data:/root/.gradle \
  -v $(pwd):/tmp/app \
  -w /tmp/app \
  openjdk:8u201-jdk-alpine3.9 ash -c './gradlew test -t'

On the first build you are going to download the dependencies from scratch, but next time this will not be needed anymore, because of the Docker cached volume.

Are you running builds in parallel? (Which means the data volume will be used by multiple containers at the same time)

Yes, I'm running them all in parallel. But the data volume is used one per runner: we have 24 GitLab runners, each runner has its own cached data volume, and once dependencies have been resolved on a node, they are reused the next time that node is used again.

Hmm… OK. I experienced issues with multiple Gradle containers running at once and trying to share the same volume; they hit problems with the lock files.
I still share the volume, but now copy all the files into the global cache within the container, do the build, then copy any updates back (ignoring lock files etc., obviously).
It's not ideal, but it does work.

When multiple builds run on the same machine, there are lock / unlock operations coordinated across the different builds. But across the boundaries of a container, the Gradle daemons cannot synchronize for this.

What mechanism does Gradle use for synchronization? Containers can share filesystems and networks. Can this be set up in a way that makes the container layer transparent to Gradle?

I would also like to know if there are any answers to this question.