5.3.1, 6.8.3: ubuntu: zip distribution replaces BOM with question marks

I have a Gradle v5.3.1 build script (edit: I can also reproduce this behavior with 6.8.3) that creates a zip distribution. Some of the included files have substitutions performed on them, like so:

apply plugin: 'distribution'
distributions {
    main {
        contents {
            baseName = project.ext.mavenArtifactId
            into("/node_modules") { from "${projectDir}/node_modules" }
            into("/src") {
                from "${projectDir}/src"
                filter { it.replaceAll('\\$VERSION_MAJOR\\$', '1') }
                filter { it.replaceAll('\\$VERSION_MINOR\\$', '4') }
                filter { it.replaceAll('\\$VERSION_BUILD\\$', project.hasProperty('buildNumber') ? project.buildNumber : defaultBuildNumber) }
            }
...

This has worked fine for years, until we recently changed the OS where this build script runs. It used to run on Windows, then on CentOS for a couple of days, until we needed to switch again to Ubuntu.

Now that we are on Ubuntu (18.04.4 LTS - Bionic Beaver), the JavaScript source files that pass through that filter get the character sequence ??? injected at the beginning, where you’d typically find the UTF-8 BOM, and the BOM itself is removed.

Has anyone else encountered this problem? Is there a workaround that doesn’t involve stripping the BOM out of the source files?

When you do filter { ... }, the files are read and written as text.
For this it has to use an encoding.
If the encoding that is used does not match the files you are manipulating, it might destroy your files.
If you don’t specify an encoding, the system’s default encoding is used, whatever that currently is.
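To illustrate how the ??? can come about, here is a minimal Groovy sketch. It assumes the default encoding on the new machine is something ASCII-like (for example from a POSIX/C locale), which is a guess and not something confirmed in this thread. The sample content and variable names are made up for the illustration.

```groovy
// A UTF-8 file that starts with a BOM (U+FEFF is encoded as the bytes EF BB BF).
byte[] original = '\uFEFFvar x = 1;'.getBytes('UTF-8')

// Simulate filtering with the wrong charset (assumed here: US-ASCII).
// Each BOM byte is undecodable and becomes U+FFFD when read,
// and each U+FFFD becomes '?' when written back out.
String misread = new String(original, 'US-ASCII')
byte[] rewritten = misread.getBytes('US-ASCII')
println new String(rewritten, 'US-ASCII')   // prints: ???var x = 1;

// The same round trip with the matching charset leaves the bytes, BOM included, unchanged.
byte[] roundTripped = new String(original, 'UTF-8').getBytes('UTF-8')
assert Arrays.equals(original, roundTripped)
```

Three BOM bytes turn into three question marks, which matches what you are seeing.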
You should always specify the filteringCharset on the CopySpec. If the files you filter have different encodings, you need a separate copy spec for each encoding, because unfortunately filteringCharset is a property of CopySpec and not of ContentFilterable (see "Make filteringCharset a property of ContentFilterable", Issue #1938, gradle/gradle on GitHub).
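Applied to the build script from the question, that could look roughly like the sketch below. It assumes all files under src are UTF-8, and only one of the filters is shown to keep it short.

```groovy
apply plugin: 'distribution'

distributions {
    main {
        contents {
            into("/src") {
                from "${projectDir}/src"
                // Read and write filtered files as UTF-8 instead of the platform default
                // (assumption: every file under src is UTF-8 encoded).
                filteringCharset = 'UTF-8'
                filter { it.replaceAll('\\$VERSION_MAJOR\\$', '1') }
            }
        }
    }
}
```

Since filteringCharset is set on the nested copy spec, it only affects the files copied into /src. Files with a different encoding would need their own into/from block with their own filteringCharset.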

Excellent, thank you. Adding the encoding fixed the problem.