Ant filtering (ExpandProperties) does not handle UTF-8 encoding


(jesper.holmberg) #1

We are in the middle of transitioning from a rather large Maven build, and because of this, I need to filter some resources containing ${property}-style placeholders. In previous Gradle builds, I have always used camel-case properties, but as I can’t change the way Maven is set up, I am forced to use properties of the form multi.level.property, i.e. with dots in them. Because of this, I use the Ant ExpandProperties filter.

The problem is that ExpandProperties doesn’t seem to handle UTF-8 very well. Files that are filtered get corrupted if they contain non-ASCII characters.

I have the following build.gradle:

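import org.apache.tools.ant.filters.ExpandProperties

apply plugin: 'java'

processResources {
    // Copy every non-null Gradle project property into the Ant project,
    // so that ExpandProperties can resolve dotted names such as multi.level.property.
    project.properties.each { k, v ->
        if (v != null) {
            ant.properties[k] = v
        }
    }
    // Expand ${...} placeholders using the Ant project's properties.
    filter(ExpandProperties, project: ant.project)
}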

And the following file in src/main/resources/test.txt:

Try “this”.

Note the curly quotes in the resource file. This file actually doesn’t contain any placeholders, but that’s just because I’ve simplified the example as much as I can.

When I look at the resulting test.txt inside the build jar, it looks like this:

Try “thisâ?.

So the curly end quote has not been handled correctly during filtering. If I don’t use the ExpandProperties filtering, the file is copied correctly, so I assume it is a problem with the Ant filter. Is this a known limitation/bug in the Ant filter, and does anyone have an idea how to get around it?


(Dmitry Maslakov) #2

I have the same issue with the ‘Copy’ task and a custom filter class. The source file is UTF-8 text. After copying, the resulting file contains ? in place of every non-ASCII character (each multi-byte character is replaced by a single-byte question mark).

I suspect the problem lies where the data is read from the filter class instance (as char[], where the characters are still correct) and then written to the target file through some ‘Writer’ instance created without an explicitly specified encoding, so that the default encoding from the system property ‘file.encoding’ is used.

To verify this, it is enough to add the command-line parameter ‘-Dfile.encoding=UTF-8’. Although that could serve as a workaround, it has undesired side effects.
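
The suspected mechanism can be reproduced outside Gradle with a minimal Java sketch. It is not Gradle's actual copy code: the class name WriterEncodingDemo and the helper writeThenDecodeAsUtf8 are made up for illustration, and US-ASCII merely stands in for a platform default encoding that cannot represent the curly quotes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Gradle or Ant.
final class WriterEncodingDemo {

    // Writes the text through a Writer that uses writerCharset,
    // then decodes the resulting bytes as UTF-8 (what a reader of the jar would do).
    static String writeThenDecodeAsUtf8(String text, Charset writerCharset) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(bytes, writerCharset)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(bytes.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "Try “this”." written with escapes so the source file itself stays ASCII.
        String original = "Try \u201Cthis\u201D.";

        // Writer with an explicit UTF-8 encoding: the text survives unchanged.
        String viaUtf8 = writeThenDecodeAsUtf8(original, StandardCharsets.UTF_8);
        System.out.println(original.equals(viaUtf8)); // prints "true"

        // Writer with an encoding that lacks the curly quotes (stand-in for a
        // non-UTF-8 default): each unmappable character is replaced by '?'.
        String viaAscii = writeThenDecodeAsUtf8(original, StandardCharsets.US_ASCII);
        System.out.println(viaAscii); // prints "Try ?this?."
    }
}
```

With a Writer created without an explicit charset, the encoding would instead come from the JVM default, which is why setting ‘-Dfile.encoding=UTF-8’ changes the outcome.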


(Luke Daley) #3

This is a known issue: GRADLE-1267