Details
Type: Bug
Status: Resolved
Priority: P0
Resolution: Fixed
Description
Non-batch requests use RetryHttpRequestInitializer, which sets the read timeout to 80 seconds and performs more retries.
The auto-generated Google Cloud JSON library doesn't set an HttpRequestInitializer for batch requests.
GcsUtil uses storageClient.batch(), which is defined here:
https://github.com/vparfonov/google-api-java-client/blob/master/google-api-client/src/main/java/com/google/api/client/googleapis/services/AbstractGoogleClient.java#L256
Without the HttpRequestInitializer, the default read timeout is 20 seconds.
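The timeout difference described above can be illustrated with a small self-contained model. This is only a sketch: `BatchTimeoutSketch`, `Request`, and `buildRequest` are hypothetical stand-ins and not the real google-api-java-client or Beam classes. The only values taken from this issue are the 20-second default and the 80-second timeout attributed to RetryHttpRequestInitializer.

```java
import java.util.function.Consumer;

// Hypothetical stand-ins for illustration; not the real client library types.
final class BatchTimeoutSketch {
    // Default read timeout noted in this issue when no initializer is applied.
    static final int DEFAULT_READ_TIMEOUT_MS = 20_000;
    // Read timeout this issue attributes to RetryHttpRequestInitializer.
    static final int RETRY_READ_TIMEOUT_MS = 80_000;

    /** Minimal model of an HTTP request that carries a read timeout. */
    static final class Request {
        private int readTimeoutMs = DEFAULT_READ_TIMEOUT_MS;

        int getReadTimeoutMs() {
            return readTimeoutMs;
        }

        void setReadTimeoutMs(int ms) {
            readTimeoutMs = ms;
        }
    }

    /** Builds a request, applying the initializer only when one is supplied. */
    static Request buildRequest(Consumer<Request> initializer) {
        Request request = new Request();
        if (initializer != null) {
            initializer.accept(request);
        }
        return request;
    }

    public static void main(String[] args) {
        // Models the batch path: no initializer, so the default timeout stays.
        Request withoutInitializer = buildRequest(null);
        // Models the non-batch path: the initializer raises the read timeout.
        Request withInitializer =
            buildRequest(r -> r.setReadTimeoutMs(RETRY_READ_TIMEOUT_MS));

        System.out.println(withoutInitializer.getReadTimeoutMs());
        System.out.println(withInitializer.getReadTimeoutMs());
    }
}
```

In this model, the request built without an initializer keeps the default timeout, which mirrors what batch requests see when no HttpRequestInitializer is set.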
A possible fix is: https://github.com/apache/incubator-beam/pull/1608
In addition, we can partially roll back https://github.com/apache/incubator-beam/pull/1359 to keep using the non-batch API for fileSize() on single files. This ensures that existing code keeps working the same way.
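The partial rollback idea can be sketched as a simple dispatch: a single path goes through the non-batch call, and only multiple paths use the batch call. The names here (`FileSizeDispatchSketch`, `fileSizes`, and the two function parameters) are hypothetical and do not reflect the actual GcsUtil signatures or the contents of the linked PRs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch; the real GcsUtil API may look different.
final class FileSizeDispatchSketch {
    /**
     * Returns the size of each path. A single path uses the non-batch lookup
     * (assumed to go through the retrying initializer); several paths use the
     * batch lookup.
     */
    static List<Long> fileSizes(
            List<String> paths,
            Function<String, Long> nonBatchLookup,
            Function<List<String>, List<Long>> batchLookup) {
        if (paths.size() == 1) {
            List<Long> result = new ArrayList<>();
            result.add(nonBatchLookup.apply(paths.get(0)));
            return result;
        }
        return batchLookup.apply(paths);
    }
}
```

With this shape, callers that ask for the size of one file never touch the batch path, so their timeout and retry behavior stays as it was before the batch API was introduced.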
PR: https://github.com/apache/incubator-beam/pull/1611