Identical Gradle distributions duplicated in user home


(Marcin Kuszczak) #1

In our team we have many projects which use Gradle. Every project has an ‘externals’ directory in its root.

|-hpa
|  |-externals
|-shardconnector
   |-externals

The externals folder is shared between all projects (it’s implemented using the Subversion externals property, so it is the same for every project). In the ‘externals’ folder we keep a Gradle distribution: gradle-1.4-bin.zip. The Gradle wrapper uses this exact distribution. Now, when I build the projects, I can see that every project creates its own copy of the distribution in my home directory:

c:\Users\me\.gradle\wrapper\dists\gradle-1.4-bin\

This causes big problems on our CI server, as we have 7 projects using Gradle. Every project takes up about 80 MB in the CI server’s home directory for the same files: the zipped Gradle distribution and its unpacked version. Currently that is about 500 MB on the server, and it will grow if we add more projects. Other (human) users who use Gradle will run into the same problem.

My question is whether the Gradle distribution path really needs to be taken into account. If there are no essential reasons behind the current behavior, I would say that this is a bug.

I would be thankful for your comments and, if possible, a solution to this problem.


#2

The Gradle wrapper is designed to download Gradle from a known location and make it available for local use. It does this by storing the distribution in the ‘~/.gradle’ directory. As far as I know, there’s no way to prevent the Gradle wrapper from copying the distribution zip file and unzipping it.

I assume that you have your Gradle Wrapper configured to ‘download’ the distribution from the SVN externals directory instead? Are all of your CI jobs running with different user home directories?


(Marcin Kuszczak) #3

Storing the Gradle distribution in the home directory is OK. I assume it is necessary for Gradle to work. The problem is that the same distribution is copied multiple times.

The CI jobs are all run by the same user, so that is not the root cause of the problem.

I think that the current Gradle behavior is as follows (I didn’t look at the source code, though):
1. The Gradle wrapper calculates a hash code from the whole path to the Gradle distribution.
2. If there is a directory named with the same hash in c:\Users\me\.gradle\wrapper\dists\gradle-1.4-bin\, it is assumed that the distribution is already installed. If the hash code of the distribution path is different, the distribution is installed again.

The assumption behind the above algorithm holds in many cases. But in general you cannot assume that a different distribution location always means a different distribution. E.g. in my case different distribution paths point to the same distribution.
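A minimal Java sketch of the suspected behavior, to make the duplication concrete. It is not taken from the wrapper source: the class name, the `urlKey` helper, the choice of MD5 with a base-36 encoding, and the example `file:` URLs are all assumptions made for illustration. The only point is that a key derived from the location string alone differs whenever the location differs, even if the zip is byte-for-byte the same.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class UrlKeyedCache {
    // Hypothetical helper: derives a cache directory name from the
    // distribution location string only (the zip contents are never read).
    static String urlKey(String distributionUrl) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(distributionUrl.getBytes(StandardCharsets.UTF_8));
            return new BigInteger(1, digest).toString(36);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed example locations: the same zip reached through two projects.
        String a = "file:/work/hpa/externals/gradle-1.4-bin.zip";
        String b = "file:/work/shardconnector/externals/gradle-1.4-bin.zip";
        System.out.println("dists/gradle-1.4-bin/" + urlKey(a));
        System.out.println("dists/gradle-1.4-bin/" + urlKey(b));
    }
}
```

The two printed directory names differ, so under this scheme each project would get its own unpacked copy.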

I think it would be better to resolve the distribution in the following steps:
1. gradlew should first check the distribution path (as this is fast).
2. If the cache already has information about that path, it should use the installed distribution.
3. If the gradlew cache has no information about the distribution, a hash should be calculated from the whole distribution zip (or at least from the distribution name).
   a. If, according to the calculated hash, a Gradle distribution is already installed in the dists directory, the cache should be updated and the existing distribution used.
   b. If there is no such distribution in dists, the distribution should be downloaded, unpacked and installed.

The above algorithm should be fast and should not cause duplicated distributions, even if the distribution paths are different.
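The proposal above could be sketched roughly like this in Java. Everything here is hypothetical: the `ContentKeyedCache` class, the in-memory path index, the use of SHA-1, and copying the zip in place of unpacking it are stand-ins chosen to keep the example short, not a description of how the wrapper works or would be changed.

```java
import java.io.IOException;
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

class ContentKeyedCache {
    // Steps 1 and 2: location -> content key, so a known path skips hashing.
    // Kept in memory here; a real tool would have to persist it.
    private final Map<String, String> pathIndex = new HashMap<>();
    private final Path distsDir;

    ContentKeyedCache(Path distsDir) {
        this.distsDir = distsDir;
    }

    // Step 3: key derived from the zip contents (reads the whole file at once).
    static String contentKey(Path zip) throws IOException {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            return new BigInteger(1, md.digest(Files.readAllBytes(zip))).toString(36);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    Path resolve(Path zip) throws IOException {
        String location = zip.toAbsolutePath().toString();
        String key = pathIndex.get(location);
        if (key == null) {
            key = contentKey(zip);
            pathIndex.put(location, key);          // step 3a: update the cache
        }
        Path installDir = distsDir.resolve(key);
        if (!Files.isDirectory(installDir)) {      // step 3b: install only once
            Files.createDirectories(installDir);
            // Stand-in for download and unpack: just copy the zip.
            Files.copy(zip, installDir.resolve(zip.getFileName()));
        }
        return installDir;
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("wrapper-sketch");
        ContentKeyedCache cache = new ContentKeyedCache(root.resolve("dists"));
        Path zip1 = Files.createDirectories(root.resolve("project1")).resolve("gradle-bin.zip");
        Path zip2 = Files.createDirectories(root.resolve("project2")).resolve("gradle-bin.zip");
        byte[] content = "pretend zip bytes".getBytes(StandardCharsets.UTF_8);
        Files.write(zip1, content);
        Files.write(zip2, content);
        // Identical bytes in two locations resolve to one install directory.
        System.out.println(cache.resolve(zip1).equals(cache.resolve(zip2)));
    }
}
```

The trade-off is that the first lookup for each new location has to read the whole zip, which is why the path index is checked first.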


#4

It seems odd that you’re experiencing this behaviour. What does your ‘gradle-wrapper.properties’ file look like? Are you configuring the ‘distributionPath’ or the ‘distributionUrl’?


(Marcin Kuszczak) #5

Well, it works this way. It’s simple to reproduce, I have even created a test case:

  1. Create 3 projects in some directory: project1, project2, project3
  2. Put the following ‘build.gradle’ file into each project:
apply plugin: 'java'
task wrapper(type: Wrapper) {
    gradleVersion = '1.5-rc-3'
    distributionUrl = 'gradle-1.5-rc-3-bin.zip'
}
  3. In every project’s root directory execute: gradle wrapper
  4. Put the bin distribution of Gradle 1.5 rc3 into the created /gradle/wrapper directory in every project
  5. Execute ./gradlew test in the root of every project
  6. You will see that Gradle is unzipped and installed 3 times, once for every project

I have the zipped files, but can’t attach them to this thread.


#6

Thanks for the detailed report. I wasn’t aware of the fact that we use a hash of the distributionUrl to isolate Gradle distributions downloaded from different locations. While this is safe, it is not a very efficient mechanism.

At some point we would like to convert the Wrapper to use the same caching/download code that we use for dependency downloads: this always stores files using the artifact hash, and makes use of published ‘.sha1’ files to reduce redundant downloads.

I’ve raised GRADLE-2732 for this issue. The only workaround I can see is for you to ensure that the distributionUrl used by the Gradle wrapper is always the same for a particular Gradle version. I understand this may not be convenient in your case.


(Marcin Kuszczak) #7

Thanks for the information!

I will try to migrate the Gradle distribution to storage available over HTTP.

My intention in reporting this issue was mostly to let you know about it. This knowledge might help others who experience the same issue.


(Marcin Kuszczak) #8

The issue is in fact GRADLE-2723.

Is there a timeline for fixing it? Can you tell me what its current status is?


(Luke Daley) #9

The reason for this is that gradle distributions can be customised, and there is also no other way to “cache” downloads.

What’s your use case for checking the Gradle distribution into your project? This is not a common practice.