While publishing a large (1.6GiB) file to an S3 bucket using the maven-publish plugin on Gradle 6.0, I keep getting the following exception:
Caused by: com.amazonaws.ResetException: The request to the service failed with a retryable reason, but resetting the request input stream has failed. See exception.getExtraInfo or debug-level logging for the original failure that caused this retry.; If the request involves an input stream, the maximum stream buffer size can be configured via request.getRequestClientOptions().setReadLimit(int)
According to various resources, this seems to be due to the reset point falling further back than the read limit on the stream. One can change the read limit either by calling the function mentioned in the error, or by passing an option to the AWS SDK.
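A sketch of that option, assuming it is AWS SDK v1's com.amazonaws.sdk.s3.defaultStreamBufferSize system property (the value is the read limit in bytes), passed to the Gradle daemon via gradle.properties:

```properties
# gradle.properties -- sketch, assuming the AWS SDK v1 property name.
# Sets the SDK's mark/reset read limit comfortably above the 1.6GiB
# artifact, so a retry can rewind anywhere within the upload.
org.gradle.jvmargs=-Dcom.amazonaws.sdk.s3.defaultStreamBufferSize=1800000000
```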
I’ve verified that the option above modifies the read limit accordingly, but now I’m running out of Java heap space, so the upload still fails. Presumably the SDK buffers up to the read limit in memory to support mark/reset, so a read limit sized to the whole file demands a comparably large heap.
I’m not sure if the S3 upload API needs to be used in a different way to support retries for large files, or if running with a large Java heap is my only option, but any suggestions are welcome.
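The large-heap option would presumably look something like the following, combining the buffer setting above with a bigger daemon heap (the -Xmx value is just an illustration; it has to comfortably exceed the stream buffer size):

```properties
# gradle.properties -- sketch: the heap must be able to hold the whole
# mark/reset buffer, so -Xmx has to exceed the read limit with room to spare.
org.gradle.jvmargs=-Xmx4g -Dcom.amazonaws.sdk.s3.defaultStreamBufferSize=1800000000
```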
I started running into this issue recently on Gradle 8.3. After looking through the maven-publish plugin source code, I found that it does switch to multipart upload once the upload size exceeds 100MiB (104857600 bytes). So the stream buffer doesn’t need to be as big as your largest published file (1.6GiB in your case); it only needs to be big enough for the largest single partition (100MiB).
100MiB is still too large in my opinion, but it’s a fair bit smaller than the 1GiB-plus file I need to publish, and with this setting I’m able to publish without intermittent failures and without running out of heap space.
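Concretely, by "this setting" I mean sizing the stream buffer to one partition rather than the whole artifact; a sketch, again assuming the AWS SDK v1 property name:

```properties
# gradle.properties -- sketch: buffer one 100MiB multipart partition,
# plus one byte of slack for the SDK's mark().
org.gradle.jvmargs=-Dcom.amazonaws.sdk.s3.defaultStreamBufferSize=104857601
```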
In my opinion this should be addressed. Since the maven-publish plugin sets the partition size for the multipart upload, it should also set the stream buffer size on each request to match that partition size, instead of requiring a global setting to be changed. I also think 100MiB is a little too big a threshold for switching to multipart, especially in multithreaded builds where each thread could be allocating that much memory.
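To illustrate what I mean by matching the buffer to the partition, here is a sketch against the plain AWS SDK v1 API; the helper and its names are hypothetical, not the plugin's actual code:

```java
import java.io.InputStream;

import com.amazonaws.services.s3.model.UploadPartRequest;

class PartUploadSketch {
    /**
     * Hypothetical helper: builds a part request whose mark/reset read
     * limit matches the partition size, so a retry of this part can
     * always rewind its stream without any global buffer setting.
     */
    static UploadPartRequest partRequest(String bucket, String key, String uploadId,
                                         int partNumber, InputStream in, long partSize) {
        UploadPartRequest request = new UploadPartRequest()
                .withBucketName(bucket)
                .withKey(key)
                .withUploadId(uploadId)
                .withPartNumber(partNumber)
                .withInputStream(in)
                .withPartSize(partSize);
        // One byte past the part size, per the usual recommendation for
        // the SDK's mark(), so resets within this part always succeed.
        request.getRequestClientOptions().setReadLimit((int) partSize + 1);
        return request;
    }
}
```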
If you think “this should be addressed”, you should probably address it in the right place.
This is a community forum, and writing a post here will most probably not lead to any code change.
Instead, you should open one or more feature requests, bug reports, or pull requests on GitHub.