Good day!
At my company we run a Gradle build cache node, and it can be under fairly heavy load (35+ requests per second at peak) because of the number of developers and commits.
Lately we started seeing this error come up in nearly all builds:
Could not load entry c9cb9a9025f3db96589b76afd8316793 from remote build cache: Loading entry from 'https://<cache-addr>/cache/c9cb9a9025f3db96589b76afd8316793' response status 504: Gateway time out
We tried to find the root cause, but failed:
The node currently has 8 cores, but never uses more than 50% of the CPU.
We gave the server 4 GB of RAM instead of the default 1 GB. It had no effect, but at least it let us rule out GC as a cause.
Network TX/RX and the SSD are also underutilized.
I’m now leaning toward the idea that the node has a fixed pool of N threads and cannot serve more than N concurrent connections. But I cannot find anything about this in the docs, and there is no guide to scaling the node.
Is the number of threads hardcoded, with no way of scaling the node?
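To make the hypothesis concrete, here is a rough back-of-the-envelope sketch using Little's law (requests in flight = arrival rate × time per request). Only the 35 rps figure comes from our monitoring. The per-request latencies and the pool size of 16 are made-up values for illustration, since I don't know the node's real thread count.

```python
# Little's law: average requests in flight = arrival rate * time per request.

def in_flight(rps: float, latency_s: float) -> float:
    """Average number of requests being served at the same moment."""
    return rps * latency_s


def max_rps(workers: int, latency_s: float) -> float:
    """Highest sustained rate a fixed pool of `workers` threads can serve."""
    return workers / latency_s


PEAK_RPS = 35            # observed peak load
ASSUMED_WORKERS = 16     # hypothetical pool size, NOT a documented value

# Hypothetical per-request service times in seconds.
for latency_s in (0.25, 0.5, 2.0):
    needed = in_flight(PEAK_RPS, latency_s)
    ceiling = max_rps(ASSUMED_WORKERS, latency_s)
    verdict = "saturated" if needed > ASSUMED_WORKERS else "ok"
    print(f"latency={latency_s}s in_flight={needed} ceiling={ceiling} rps -> {verdict}")
```

If something like this is what happens, a fixed pool could be exhausted by slow transfers of large cache entries while the CPU, network and disk all stay far from their limits, which would match what we see.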
It was also quite frustrating to find that, although I can see the 504 in the build logs, there is no indication of it in node.log or even in stdout/stderr. How can I find the reason for the 504 if the node gives me no data to work with?
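Since the node itself logs nothing, the only thing I can think of trying from the client side is a small probe that sends concurrent GETs to a cache URL and records the status and latency of each, to see at what concurrency the 504s start. This is only a sketch: the function names are my own, the URL is passed on the command line, and it assumes the cache allows unauthenticated reads (credentials would have to be added otherwise).

```python
import sys
import time
import urllib.error
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def probe(url, timeout=30.0):
    """Send one GET; return (status or None, elapsed seconds)."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            resp.read()
            status = resp.status
    except urllib.error.HTTPError as err:
        status = err.code      # 4xx/5xx responses, including 504
    except OSError:
        status = None          # connection refused, reset, or timed out
    return status, time.monotonic() - start


def probe_concurrently(url, concurrency, total):
    """Run `total` probes with at most `concurrency` in flight at once."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(lambda _: probe(url), range(total)))


if __name__ == "__main__" and len(sys.argv) > 1:
    target = sys.argv[1]       # e.g. the full URL of one cache entry
    for level in (1, 8, 32, 64):
        results = probe_concurrently(target, level, level * 4)
        slowest = max(elapsed for _, elapsed in results)
        failed = sum(1 for status, _ in results if status != 200)
        print(f"concurrency={level} non-200={failed} slowest={slowest:.2f}s")
```

If the failures appear only above some concurrency level, that would at least support the fixed-N theory, but it still does not tell me where the limit is configured.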