Signing breaks reproducible builds?

Following the docs, I have set up my build process to produce reproducible builds, which initially worked just fine:

1dd63064294febffc20a4611b25ad619  machine1/app-release-unsigned.apk
1dd63064294febffc20a4611b25ad619  machine2/app-release-unsigned.apk

The APKs produced by running the build on machine 1 and machine 2 share the same MD5 checksum. However, when I add signing to my build.gradle (with the same keystore and the same key alias, of course), I start to see differences:

57adfd7c2a7240dc9c3ed79a525f71be  machine1/app-release.apk
91e9168e5d5242059408b4fdff0e34aa  machine2/app-release.apk

The APK contents stay identical though, as I was able to verify by extracting each APK and calling find . -type f -exec md5sum {} \; | sort | md5sum on its contents:

be2eec6df5ead25198bc0a711849b3ee  -  # cumulative checksum of contents from machine1/app-release.apk
be2eec6df5ead25198bc0a711849b3ee  -  # cumulative checksum of contents from machine2/app-release.apk
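For anyone who wants to repeat that content check, here is a minimal sketch of the same find/md5sum pipeline wrapped in a function. The name content_checksum is made up for illustration, and it assumes the APK has already been extracted into a directory. Forcing the C locale for sort is my own addition so that the ordering does not depend on machine settings.

```shell
#!/bin/sh
# content_checksum DIR  (hypothetical helper name, not part of any tool)
# Prints a single MD5 computed over the per-file MD5 sums of every
# regular file below DIR, i.e. the "cumulative checksum" used above.
content_checksum() {
    (
        # Run in a subshell so the cd does not leak into the caller.
        # Hashing relative paths keeps the result independent of where
        # the extracted directory happens to live.
        cd "$1" || exit 1
        # LC_ALL=C gives the same sort order regardless of locale.
        find . -type f -exec md5sum {} \; | LC_ALL=C sort | md5sum
    )
}
```

Calling content_checksum on the two extracted directories should print the same line for both when their file contents and relative paths are identical.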

Is this a bug, or is that just what happens when an APK gets signed, i.e. despite using the same certificate, some other metadata is added that inevitably leads to differing outputs?

I was under the impression that, assuming I have access to the original keystore (which is the case on both machine 1 and machine 2), it would be possible to produce two identical signed APKs. Only someone else who does not have access to my keystore and wants to reproduce the build would have to resort to techniques like signature copying.

I have no idea how Android signatures are created, as I’m not into Android development.
But typically a signature contains a verified timestamp from a trusted timestamping authority,
so that the signature remains valid after the signing certificate has expired or been revoked:
with that timestamp you know when the signature was created and whether the certificate was still valid at that time.

If that is also the case for the Android signature, every signing process will produce a different result unless you disable the timestamping.

Björn, thanks for the reply! Yes, what you’re saying absolutely makes sense. I was basically wondering what the point of a reproducible build is, if the final signed binary has a differing checksum every time anyway.

However, I didn’t realize how straightforward it is to simply remove the signature from a signed upstream release. Once that is done, you can verify the reproducible build again by comparing the stripped release to your local unsigned build:

$ # Mismatching checksums due to signature
$ md5sum *
44a01e06d17913b51de6130f6ddb995c  app-release.apk
66f8a12ed303bf7fd446fdbeb7e59030  app-release-unsigned.apk
$ # Remove signature from signed release
$ zip -d app-release.apk META-INF/CERT.SF META-INF/CERT.RSA META-INF/MANIFEST.MF
deleting: META-INF/CERT.SF
deleting: META-INF/CERT.RSA
deleting: META-INF/MANIFEST.MF
$ # Now both APKs are unsigned and checksums match 🎉
$ md5sum *
66f8a12ed303bf7fd446fdbeb7e59030  app-release.apk
66f8a12ed303bf7fd446fdbeb7e59030  app-release-unsigned.apk

Conclusion: It’s okay that signed builds have differing checksums every time. Use unsigned builds to verify the reproducible build process by comparing your locally created unsigned build to the upstream signed build after removing its signature.
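The manual steps above could be scripted roughly like this. It is only a sketch: strip_v1_signature and same_md5 are names I made up, and the three META-INF entries are the ones from my APK, so other signers may use different file names. It also only covers signatures stored as files inside the ZIP. Newer APK signature schemes keep the signature in a separate signing block that zip -d does not touch.

```shell
#!/bin/sh
# strip_v1_signature SIGNED_APK OUT_APK  (hypothetical helper)
# Copies SIGNED_APK to OUT_APK and deletes the signature files from the
# copy, leaving the original untouched. The entry names are the ones
# seen in this thread and may differ for other signing setups.
strip_v1_signature() {
    cp "$1" "$2" || return 1
    zip -q -d "$2" META-INF/CERT.SF META-INF/CERT.RSA META-INF/MANIFEST.MF
}

# same_md5 FILE_A FILE_B  (hypothetical helper)
# Returns 0 if both files have the same MD5 checksum, 1 if they differ,
# and 2 if either file is missing.
same_md5() {
    [ -f "$1" ] && [ -f "$2" ] || return 2
    a=$(md5sum < "$1" | cut -d' ' -f1)
    b=$(md5sum < "$2" | cut -d' ' -f1)
    [ "$a" = "$b" ]
}
```

Usage would be something like: strip_v1_signature app-release.apk stripped.apk && same_md5 stripped.apk app-release-unsigned.apk && echo "builds match". Whether the stripped file ends up byte-identical depends on how the signer wrote the archive. It did in my case.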

I was basically wondering what the point of a reproducible build is, if the final signed binary has a differing checksum every time anyway.

That depends on what you want from a reproducible build.
A reproducible build also means that there are no unlocked version ranges, so you do not compile against different library versions when building at different points in time. Reproducible builds are therefore also about reliable builds without surprises, not necessarily about a byte-for-byte identical result.

If you want it for verification and it is a signed jar, yes, just delete the signature files before doing the comparison.