Strip toplevel directory from tarTree

Is there a way to strip a top-level directory when copying from a tarTree (or any other archive)?

I'm trying to copy the full contents of the Tomcat .tar.gz download into my project distribution. The tarball contains a single directory called “apache-tomcat-x.x.x”, but I want it to appear in the distribution as “tomcat”.

So I need to be able to either

  • rename the directory while copying OR

  • strip the directory from the target path entirely (and use into())

I’d like to avoid renaming the target directory after copying, as I’m trying to directly copy into a distribution archive.

Thanks for any hints,
Rainer

Check out the ‘rename()’ method of copy task, or the ‘eachFile {}’ API which gives you powerful file by file renaming/relocating capabilities.

‘rename()’ doesn’t work, as it works on the file name only, not the full path.

Thanks for the pointer to ‘eachFile {}’, I had overlooked that. I found one problem, though: the directories within the tree are still created with their original paths (this is within a larger Tar task with a lot of child specs):

into("tomcat") {
    def tomcatFile = configurations.tomcat.singleFile
    // name of the archive's single top-level directory, e.g. apache-tomcat-7.0.23
    def archivePath = tomcatFile.name - ".tar.gz"
    includeEmptyDirs = false
    from(tarTree(tomcatFile))
    exclude "**/webapps/**"
    eachFile { details ->
        // strip the top-level directory from each file's target path
        details.path = details.path - "$archivePath/"
    }
}

The Tar file contains (tar -tf | sort):

server-4.3.0.999/tomcat/
// these dirs are unwanted
server-4.3.0.999/tomcat/apache-tomcat-7.0.23/
server-4.3.0.999/tomcat/apache-tomcat-7.0.23/bin/
server-4.3.0.999/tomcat/apache-tomcat-7.0.23/conf/
server-4.3.0.999/tomcat/apache-tomcat-7.0.23/lib/
server-4.3.0.999/tomcat/apache-tomcat-7.0.23/logs/
server-4.3.0.999/tomcat/apache-tomcat-7.0.23/temp/
server-4.3.0.999/tomcat/apache-tomcat-7.0.23/work/
// these paths are correct
server-4.3.0.999/tomcat/bin/
server-4.3.0.999/tomcat/bin/bootstrap.jar
server-4.3.0.999/tomcat/bin/catalina-tasks.xml
server-4.3.0.999/tomcat/bin/catalina.bat
server-4.3.0.999/tomcat/bin/catalina.sh
// ....
server-4.3.0.999/tomcat/conf/catalina.properties
// ...

What happens if you specify ‘includeEmptyDirs = false’ at the top of the copy task? i.e. not in the ‘into’ child.


It makes no difference. I also tried to do something with ‘FileCopyDetails.isDirectory()’, but the eachFile closure does not seem to be called for directory entries.

I can’t find a way around this using our API. You’ll have to manually remove the directories in a doLast for the time being.

I found a bunch of open issues about this in the tracker. GRADLE-2255 is an example.

If you don’t mind, I’ll close this topic off as there’s nothing more we can do until this issue is fixed.

I see.

Could you give me a hint on how to remove the directories from the target, as this is actually within a Zip or Tar archive task, not within a file copy task. Or do I need to resort to a copy-delete-archive setup? (I might have to do that for other reasons anyway)

That’s one option.

Another is to just add another filtering step…

task createZip(type: Zip) {
  appendix = "interim" // need to differentiate the name from the zip below
  // configuration
}

task correctZip(type: Zip, dependsOn: createZip) {
  from zipTree(createZip.archivePath)
  includeEmptyDirs = false
}

Thanks for that suggestion. Do you see an advantage of a temporary archive over a temporary directory?

I face another problem with the same task: I have a kind of overlay directory in the project that replaces some files from the original archive, and copying this directory as well leads to duplicate archive entries. Of course I could try to define excludes, but “exclude all files that also exist in a different source directory” is not trivially possible, as far as I know.

Both problems seem easier to solve with a temporary directory, hence my question: does that have any practical disadvantage (apart from being an anti-pattern for the majority of standard use cases)?

Go with the temp directory. I only suggested the interim archive because, had the original issue been the only problem, it would have been less work to back out once the core issue is fixed.
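For later readers, the temp-directory setup might look roughly like the sketch below. It is an illustration only: the task names `stageTomcat` and `dist`, the staging path under `$buildDir` and the overlay directory `src/overlay/tomcat` are made up, and `configurations.tomcat` is the configuration from the snippet earlier in the thread. The leftover `apache-tomcat-x.x.x` skeleton is deleted in a `doLast`, and the overlay is copied afterwards so that its files replace the unpacked ones.

```groovy
// Staging directory; the path is an assumption, not from the thread.
def stagingDir = file("$buildDir/staging/tomcat")

task stageTomcat(type: Copy) {
    def tomcatFile = configurations.tomcat.singleFile
    def archiveDir = tomcatFile.name - ".tar.gz"   // e.g. apache-tomcat-7.0.23

    from tarTree(tomcatFile)
    exclude "**/webapps/**"
    into stagingDir

    // Strip the archive's top-level directory from every file path.
    eachFile { details ->
        details.path = details.path - "$archiveDir/"
    }

    doLast {
        // eachFile is not called for directory entries, so the original
        // directory skeleton may still be created; remove it by hand.
        project.delete new File(stagingDir, archiveDir)

        // Copy the (hypothetical) overlay last so its files overwrite
        // the ones unpacked from the archive.
        project.copy {
            from "src/overlay/tomcat"
            into stagingDir
        }
    }
}

// Package the cleaned-up staging directory under "tomcat/".
task dist(type: Tar, dependsOn: stageTomcat) {
    compression = Compression.GZIP
    into("tomcat") {
        from stagingDir
    }
}
```

Because the archive task only ever sees the staging directory, neither the unwanted directories nor duplicate entries from the overlay reach the final tarball.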

I’ll do that. Luke, thanks for your help - I appreciate that a lot.

I just came across another case of basically the same problem, this time copying a subtree of the archive into the destination. This seems to be quite a common use case, so could Gradle provide some way to do the logical equivalent of

from("path/to/archive.zip/path/within/archive") {....}

Even when the original issue here is fixed, having to use an eachFile call and hand-code the destination path modification seems bloated and not Gradle-like.

(Another approach would be a ‘rename’ method in ‘CopySpec’ that operates on the full path instead of the file name only, but I think specifying a source directory within an archive is expressing the intent much more clearly).
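Until something like that exists, the subtree case can be approximated with `include` plus `eachFile`, rewriting each file's `relativePath`. A rough sketch, using the placeholder archive path and subtree from the example above; the task name `copySubtree` and the destination directory are invented here:

```groovy
import org.gradle.api.file.RelativePath

// Hypothetical task; the destination directory is an assumption.
task copySubtree(type: Copy) {
    def subtree = "path/within/archive"
    def depth = subtree.split("/").length   // number of leading segments to drop

    from(zipTree("path/to/archive.zip")) {
        // Only take files below the wanted subtree.
        include subtree + "/**"
        eachFile { details ->
            // Rebuild the relative path without the leading segments.
            def segments = details.relativePath.segments
            details.relativePath = new RelativePath(true, segments.drop(depth) as String[])
        }
    }
    into "$buildDir/subtree"

    // Asks Gradle to skip empty directories. In the version discussed in
    // this thread the original directories were still created regardless.
    includeEmptyDirs = false
}
```

This is still the eachFile workaround, so the directory issue described earlier in the thread applies to it as well.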

Should I make this a Jira issue or a separate suggestion post in the forum?


Hi Rainer, please open a separate idea post in the forum. We can discuss it there.

I solved this problem using the Gradle filter API.

Let's say you have a tar archive with a directory structure like:

org | gradle |
             | wrapper.properties
             | conf |
                    | gradle.conf

and you want to just copy gradle.conf and wrapper.properties into the rootDir without their parent directories. You can filter on the pathname like so:

copy {
   // keep only files whose path matches one of the two names
   from tarTree(myArchive).filter { it.path.matches(".*(gradle\\.conf|wrapper\\.properties).*") }
   into rootDir
}

And that will copy only the two files, and not their directories.

The original case was different.

Source archive:

org | gradle |
             | wrapper.properties
             | conf |
                    | gradle.conf

Desired output:

gradle |
       | wrapper.properties
       | conf |
              | gradle.conf