Gradle cache over re-cloned repo

When your project build has a Gradle property like org.gradle.caching=true and the sources of the subprojects are untouched, you get the build products from the cache.

Surprisingly (which is not a bug, I suppose), even if you delete the project workspace and then clone it again from the remote Git repo, you will still get the build products (outputs) from the cache. [Here I mean Java classes built by the ‘java’ plugin.]

Maybe it is completely OK from Gradle’s perspective, as Gradle examines only the sources’ file-system properties and does not care what kind of git orchestration set them up.
But for the build-machine devops it is hard to swallow: “we do not expect some subprojects’ classes to be taken from Gradle’s cache when a new repo has just been cloned …”

Let’s assume that the devops do not want to edit the Gradle scripts in any way:
they do not want to turn off org.gradle.caching=true
or add outputs.cacheIf { false } to some compileJava tasks.
They also do not want to manually clean the Gradle cache.

Is there a way not to get the outputs from the cache for some subprojects, just by using a specific switch on the gradle command line? (Only for the first build they run with the newly cloned repo.)

To be honest, this request makes absolutely no sense.
One of the biggest strengths of Gradle is to avoid unnecessary work.
And if you compile the exact same sources with the same Java version and the same compiler configuration, why should Gradle not use the already existing result but redo the work it already did before?

By forcing it to do so, the only thing you gain is that you lose time needlessly.

When your project build has a Gradle property like org.gradle.caching=true and the sources of the subprojects are untouched, you get the build products from the cache.

That is not true.
The result will simply be UP-TO-DATE, not taken from the cache.
If you change some sources and compile, and then revert the change and compile again, then the result will be taken from the cache: the inputs changed since the last execution, but the result is already present in the cache.
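A quick way to see the difference is the task outcome label Gradle prints next to each task (e.g. with --console=verbose). A hedged sketch of the sequence just described, assuming a standard Java project with org.gradle.caching=true and a made-up source file path:

```shell
./gradlew compileJava   # executed: compiles and stores the result in the build cache
./gradlew compileJava   # UP-TO-DATE: outputs on disk still match the inputs

echo "// tweak" >> src/main/java/com/example/Foo.java
./gradlew compileJava   # executed: inputs changed, a new cache entry is stored

git checkout -- src/main/java/com/example/Foo.java
./gradlew compileJava   # FROM-CACHE: the inputs match an earlier execution,
                        # so the output is unpacked from the cache instead of recompiled
```

UP-TO-DATE means “the outputs already on disk are still valid”; FROM-CACHE means “the outputs were restored from a cache entry produced by an earlier execution with the same inputs”.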

Surprisingly (which is not a bug, I suppose), even if you delete the project workspace and then clone it again from the remote Git repo, you will still get the build products (outputs) from the cache.

That is not surprising, but exactly what the build cache is for.
If you did the same task with the same inputs before, you can use the result that is already present.
This works if you have different workspaces with the same sources, it works if you switch back and forth between branches, it works if you change files and change them back, and so on.
There is even a remote build cache that you can enable, so that other machines can also reuse results that are already present. For example, a CI build can fill the remote cache, and every developer who needs to do the same task with the same inputs can simply use the result the CI has already built and put into the remote build cache.
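For reference, a remote build cache is configured in settings.gradle. A minimal sketch, assuming an HTTP-backed cache node at a hypothetical URL:

```groovy
// settings.gradle — the URL and the CI detection are assumptions
buildCache {
    local {
        enabled = true
    }
    remote(HttpBuildCache) {
        url = 'https://cache.example.com/cache/'
        // Typically only CI pushes entries; developer machines only pull
        push = System.getenv('CI') != null
    }
}
```

With this in place, a freshly cloned workspace on any machine can resolve FROM-CACHE results that were produced elsewhere.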

So no, this is not a bug, but exactly the build cache feature.

and does not care what kind of git orchestration set them up

Of course it does not care; how the inputs came into existence is absolutely irrelevant. If you need to do the same piece of work with the same inputs, there is no reason to redo the work instead of simply using the already present result.

we do not expect some subprojects’ classes to be taken from Gradle’s cache when a new repo has just been cloned

Really, this statement is complete nonsense.
It effectively says “I don’t want to use one of the biggest strengths of Gradle but instead waste time and resources needlessly”.

Let’s assume that the devops do not want to edit the Gradle scripts in any way:
they do not want to turn off org.gradle.caching=true
or add outputs.cacheIf { false } to some compileJava tasks.
They also do not want to manually clean the Gradle cache.

Definitely, they shouldn’t.
At most the should use an init script which is the tool to customize a build from outside, but even that should not be used here.

Is there a way not to get the outputs from cache for some subprojects … just by using a specific switch in gradle build command-line? (only for the first build they run with the newly cloned repo)

Not for “some subprojects”.
If you want it for “some subprojects”, you probably need to use an init script that does tasks.configureEach { outputs.cacheIf { false } } on those subprojects.
To do it for all projects in a given build, you can simply say --no-build-cache.
Alternatively, on a freshly cloned project, --rerun-tasks has the same effect.
The latter would also prevent any UP-TO-DATE results.
Another alternative would be an init script that disables the configured caches, which would have the same effect as --no-build-cache.
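A sketch of such an init script, covering both variants; the subproject names are placeholders:

```groovy
// disable-cache.gradle — apply with: gradle -I disable-cache.gradle build

// Variant 1: disable caching only for the tasks of selected subprojects
allprojects {
    if (name in ['subA', 'subB']) {
        tasks.configureEach {
            outputs.cacheIf('devops want a full rebuild on a fresh clone') { false }
        }
    }
}

// Variant 2: disable the configured caches entirely
// (equivalent to passing --no-build-cache)
settingsEvaluated { settings ->
    settings.buildCache {
        local.enabled = false
        if (remote != null) {
            remote.enabled = false
        }
    }
}
```

Since an init script is passed on the command line, this at least keeps the build scripts themselves untouched, as the question requires.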

But really, none of that should be used, as the only “benefit” you get from it is wasted time and resources.


Yes, but nonsense questioning provokes answers that expose the deeper sense of good design and principles, as you presented above. Thanks a lot.

I must read your post again and again to understand why the vanilla Gradle behavior does not fit our project… (which might need reshaping)

I must read your post again and again to understand why the vanilla Gradle behavior does not fit our project

What do you mean?

I mean I need to comprehend what is wrong in our scripts and tasks, such that the standard Gradle behavior (“I don’t want to use one of the biggest strengths of Gradle but instead waste time and resources needlessly”) goes against our devops’ expectations.

I think the issue is that we generate Java sources (classes) based on a template just before compileJava, and we add these sources to the source set. On one hand these generated sources’ byte code is not cached; on the other hand, without them the build fails.

What do the admins observe?
After deleting the project workspace and doing a fresh git clone, they get a build failure …
IMO the task dependencies and the task inputs & outputs are not defined correctly.

I think the issue is that we generate Java sources (classes) based on a template just before compileJava, and we add these sources to the source set …

That should not be a problem, especially if you wire it up correctly (not registering plain paths and adding explicit dependsOn, but registering the task that generates the files, or a provider thereof, directly as a source directory).
All files that are compiled are inputs of the compile task, and thus their results are outputs that are part of the cache entry.
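A sketch of that wiring, assuming a hypothetical generateSources task that renders templates into the build directory:

```groovy
// build.gradle — task name and template paths are made up for illustration
def generateSources = tasks.register('generateSources', Copy) {
    from layout.projectDirectory.dir('src/templates')
    into layout.buildDirectory.dir('generated/sources/java')
    // any expand(...) / templating configuration would go here
}

sourceSets {
    main {
        // Registering the task provider (not a plain path) gives Gradle both
        // the task dependency and correct input tracking automatically,
        // with no explicit dependsOn needed
        java.srcDir(generateSources)
    }
}
```

With this setup, compileJava sees the generated files as ordinary inputs, so up-to-date checks and cache keys account for them.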

on one hand these sources’ byte code is not cached

Why should it not?
If they are compiled, they are part of the cache.
And the cache entry will only be used if the generated files are the same as before.

Well, unless you do very strange things like registering a custom task action that produces outputs from inputs that are not registered as inputs of the task; but that would also make the up-to-date checks wrong, so it should be fixed anyway.
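For contrast, a purely illustrative sketch of a custom task whose inputs and outputs are fully declared, so both up-to-date checks and the cache see every input (class and property names are made up):

```groovy
// Every value the action reads is declared as an input;
// every file it writes is declared as an output.
@CacheableTask
abstract class RenderTemplate extends DefaultTask {
    @InputFile
    @PathSensitive(PathSensitivity.NONE)
    abstract RegularFileProperty getTemplate()

    @Input
    abstract Property<String> getVersion()

    @OutputFile
    abstract RegularFileProperty getOutputFile()

    @TaskAction
    void render() {
        def text = template.get().asFile.text.replace('@VERSION@', version.get())
        outputFile.get().asFile.text = text
    }
}
```

If the action instead read some undeclared file or environment value, Gradle could neither invalidate the result when that value changes nor compute a correct cache key.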

After deleting the project workspace and doing a fresh git clone, they get a build failure

Then the cause should be fixed instead.
Disabling the cache is just symptom treatment, but does not help with solving the actual problem.
If you for example have inputs that are not declared, then, as I said, the up-to-date checks would also wrongly report up-to-date even if the inputs had changed.

So the far better approach to make the devops happy is to fix the broken build logic, as it might also have other bad effects besides those build failures.

IMO the task dependencies and task inputs & outputs are not defined correctly

Exactly, that is what you should fix. :slight_smile:
