Copy task takes multiple seconds although UP-TO-DATE


(haasip.satang) #1

Hi all,

We have some copy tasks that copy files from a zip file to the final distribution. Although the input zip files do not change, those tasks take up to 10 seconds each to execute before they report UP-TO-DATE. With several of them, this adds up to a sizeable share of the total build time (a little more than 50% of a complete UP-TO-DATE build, to be precise), and I would like to reduce this as much as possible.

The configuration of the task is quite simple:

task getDojo(type: Copy) {
    def source = file(project.configurations.distributeDojo.find { it.path.endsWith(".zip") })
    into "${buildDir}/tmp"
    from zipTree(source)
}
/* getDojo {
    inputs.file 'source'
} */

As you can guess by the name the task is copying the whole dojo distribution (about 3000 files) from the zip file (which was retrieved from Nexus). The problem seems to be that the up to date check is extracting the zip file in any case to generate hashes based on every file in the zip rather than to just generate a hash for the single zip input.

Is there a way to work around that? Thanks :slight_smile:


(Peter Niederwieser) #2

Does this do any better in terms of performance? You’d have to change all similar Copy tasks in the same way, even if you don’t execute them. Or comment them out for the time being.

task getDojo(type: Copy) {
    into "${buildDir}/tmp"
    from { zipTree(configurations.distributeDojo.find { it.path.endsWith(".zip") }) }
}

(haasip.satang) #3

Nope, unfortunately it doesn’t change anything. I figured out that the problem seems to be related to both input and output up-to-date checking:

It is taskArtifactState.isUpToDate() in SkipUpToDateTaskExecuter.execute that takes long, and I can see that quite some time gets lost in the constructor of TaskUpToDateState for the calculation of outputFilesState and inputFilesState.

For the outputFilesState, Gradle seems to analyze the whole output dir containing the extracted files from the previous run, which also takes some seconds. Setting “outputs.upToDateWhen {true}” doesn’t prevent this check either. In my opinion the output check could be ignored completely, as the only thing that matters here is the existence of a new version of the input file. Is there any way to disable this check completely?

For the inputFilesState, it seems that no matter which of the configurations I use, Gradle always uses all the files in the zip to generate the hash.

Any other ideas?


(Tye Howard) #4

Have you tried extending the Copy task?

If you create a task with something like:

@Input
File zipFileToExplode

and then just pass the zipTree to the Copy task, Gradle should generate the checksum based on the entire zip.
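A rough sketch of what such an extended task might look like in a build script. The type name ExplodeZip is made up for illustration, and @InputFile is used instead of @Input on the assumption that a file-valued property should be tracked by content. Note that this only adds the zip as an extra input; it does not by itself remove the inputs the Copy task already registers for its sources.

```groovy
// Hypothetical subclass of Copy; the name ExplodeZip is illustrative only.
// Copy and InputFile are available in build scripts without explicit imports.
class ExplodeZip extends Copy {

    // The zip file itself, declared as an additional task input.
    @InputFile
    File zipFileToExplode
}

task getDojo(type: ExplodeZip) {
    // Same lookup as in the original task configuration.
    zipFileToExplode = configurations.distributeDojo.find { it.path.endsWith(".zip") }

    // The zip contents are still handed to the Copy machinery as sources.
    from zipTree(zipFileToExplode)
    into "${buildDir}/tmp"
}
```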


(Peter Niederwieser) #5

This will add another input, but won’t make the existing ones go away. (It would also have to be ‘@InputFile’.)


(Peter Niederwieser) #6

For the outputFilesState gradle seems to analyze the whole output dir containing the extracted files from the previous run which also takes some seconds.

This is expected.

Setting “outputs.upToDateWhen {true}” doesn’t prevent this check either.

I think the programmatic check is ANDed with the built-in declared-outputs check, in the same way that multiple programmatic checks are (as per the docs).

Any other ideas?

Instead of using a ‘Copy’ task, you could script a custom task (or task type) that internally uses the ‘project.copy’ method (which doesn’t do any up-to-date checking). This will allow you to tailor up-to-date checking to fit your needs. For example, you could declare one input (file), no outputs, and an ‘outputs.upToDateWhen’ that just checks if the target directory exists.

I’ve heard of plans to improve efficiency of up-to-date checks, but I can’t say when this will be tackled.
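A minimal sketch of the task-type variant of this idea, for build scripts with several such tasks. The type name UnzipOnce and its zipFile and targetDir properties are hypothetical, not part of Gradle; only the zip is declared as an input, and the target directory is deliberately left undeclared so the extracted files are not scanned.

```groovy
// Hypothetical task type; the name and its properties are illustrative only.
// DefaultTask, InputFile and TaskAction are available in build scripts
// without explicit imports.
class UnzipOnce extends DefaultTask {

    // The only declared input: the archive itself, not its entries.
    @InputFile
    File zipFile

    // Deliberately not declared as an output, so the extracted files are
    // not hashed on every run. Newer Gradle versions may require marking
    // this property @Internal (an assumption; not needed at the time).
    File targetDir

    UnzipOnce() {
        // Considered up to date when the zip is unchanged and the
        // target directory exists.
        outputs.upToDateWhen { targetDir != null && targetDir.isDirectory() }
    }

    @TaskAction
    void unzip() {
        def archive = zipFile
        def destination = targetDir
        // project.copy does no up-to-date checking of its own.
        project.copy {
            from project.zipTree(archive)
            into destination
        }
    }
}

task getDojo(type: UnzipOnce) {
    zipFile = configurations.distributeDojo.find { it.path.endsWith(".zip") }
    targetDir = file("${buildDir}/tmp")
}
```

If other tasks also write into the same directory, checking for a more specific subdirectory in the up-to-date condition would be safer than checking the shared parent.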


(haasip.satang) #7

Thanks Peter. That did the trick. The task now takes less than a second (if up to date) and is written like this:

def dojoSource = file(project.configurations.distributeDojo.find { it.path.endsWith(".zip") })

task getDojo << {
    copy {
        into "${buildDir}/tmp"
        from zipTree(dojoSource)
    }
}

getDojo {
    inputs.files dojoSource
    outputs.upToDateWhen { file("${buildDir}/tmp/dojo-1.5.0").isDirectory() }
}