Feature request: reproducible jars

not-a-bug

(Marc Pawlowsky) #1

There is a lot of discussion on getting a byte-for-byte identical JAR file if a jar task is executed with the same inputs multiple times.

If we had two new parameters for the Jar task, we should be able to get the same jar file across multiple builds.

  • a sort option that would ensure that the files are inserted into the jar file in a deterministic order.

  • a timestamp option that would override the timestamps of the component files, in a manner like tar’s `--mtime=date`
    http://www.gnu.org/software/tar/manual/tar.html
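
To illustrate what these two options would amount to, here is a minimal sketch using only java.util.zip. It is not Gradle's API: the class name, the write method and the FIXED_TIME_MILLIS constant are hypothetical, and the in-memory map merely stands in for the task's input files.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class DeterministicJarSketch {
    // Hypothetical fixed timestamp (milliseconds since the Unix epoch); any constant would do.
    static final long FIXED_TIME_MILLIS = 1_000_000_000_000L;

    /** Writes the entries in sorted name order, each stamped with the same fixed time. */
    public static byte[] write(Map<String, byte[]> files) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream out = new ZipOutputStream(bytes)) {
            // "sort" option: a TreeMap iterates in natural String order,
            // whatever order the inputs were discovered in.
            for (Map.Entry<String, byte[]> e : new TreeMap<>(files).entrySet()) {
                ZipEntry entry = new ZipEntry(e.getKey());
                // "timestamp" option: ignore the source file's own modification time.
                // Note: setTime converts using the JVM's default time zone.
                entry.setTime(FIXED_TIME_MILLIS);
                out.putNextEntry(entry);
                out.write(e.getValue());
                out.closeEntry();
            }
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] first = "first".getBytes(StandardCharsets.UTF_8);
        byte[] second = "second".getBytes(StandardCharsets.UTF_8);

        // Same contents, different insertion order.
        Map<String, byte[]> a = new LinkedHashMap<>();
        a.put("b.txt", second);
        a.put("a.txt", first);

        Map<String, byte[]> b = new LinkedHashMap<>();
        b.put("a.txt", first);
        b.put("b.txt", second);

        // Prints true when both archives are byte-for-byte identical.
        System.out.println(Arrays.equals(write(a), write(b)));
    }
}
```

One caveat with this sketch: because ZipEntry.setTime uses the default time zone when encoding the time, builds on machines in different time zones would also need to agree on the zone to get identical bytes.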


(Steven Ruppert) #2

Seconding this request. There is some slim documentation from Debian about making jars deterministic after the fact:

https://wiki.debian.org/ReproducibleBuilds/TimestampsInJarFiles

Bazel (Google’s open-source build system) produces jars with timestamps set to 80-Jan-01 01:00.

I have also used this simple script:

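import java.nio.file.attribute.FileTime;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Reads a jar on stdin and writes it to stdout with every entry's timestamps set to the Unix epoch.
public class EpochJar {
  public static void main(String[] args) throws Exception {
    FileTime epoch = FileTime.fromMillis(0);
    try (
      ZipInputStream in = new ZipInputStream(System.in);
      ZipOutputStream out = new ZipOutputStream(System.out)
    ) {
      byte[] buffer = new byte[8192];
      for (ZipEntry z = in.getNextEntry(); z != null; z = in.getNextEntry()) {
        // Write a fresh entry rather than reusing z: z can carry the compressed size
        // recorded in the input, and the write fails with a ZipException if the data
        // recompresses to a different size.
        ZipEntry copy = new ZipEntry(z.getName());
        copy.setCreationTime(epoch)
            .setLastAccessTime(epoch)
            .setLastModifiedTime(epoch);
        out.putNextEntry(copy);

        // Copy the entry's (decompressed) contents.
        int n;
        while ((n = in.read(buffer, 0, buffer.length)) > 0) {
          out.write(buffer, 0, n);
        }
        out.closeEntry();
      }
    }
  }
}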

Usage:

java -cp . EpochJar <input.jar >output.jar

It’d be nicer to have this done automatically, however.


(Stefan Wolf) #3

Hi Marc and Steven,

As part of the task output cache project, we need to fix this for exactly the reasons given: otherwise we would never be able to reuse a jar from the task output cache. So expect updates on this soon.

Regards,
Stefan