Details
Type: Bug
Status: Resolved
Priority: P1
Resolution: Fixed
Description
When writing files using FileIO, a subdirectory is created inside the given temporary directory for each FileBasedSink. This is useful for non-windowed output, where the temporary subdirectory can be matched in order to delete leftover files that were lost during processing.
For windowed writes, however, such subdirectories are unnecessary and cause all temporary files to share a common prefix. Additionally, this common prefix varies per job, so autoscaling established for the previous prefix is no longer effective; see
https://cloud.google.com/storage/docs/request-rate#randomness_after_sequential_prefixes_is_not_as_effective
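To make the prefix issue concrete, here is a minimal sketch that compares the two layouts. It is not Beam's actual implementation: the helper names, the `.temp-beam-` subdirectory naming, the bucket path, and the short file IDs are all assumptions chosen for illustration.

```python
import posixpath

# Hypothetical values, for illustration only.
TEMP_DIR = "gs://example-bucket/tmp"
SINK_ID = "sink-1"                   # stands in for a per-sink, per-job identifier
FILE_IDS = ["3f9a", "c07e", "81b2"]  # stand-ins for randomly generated file names


def temp_path_with_subdir(temp_dir, sink_id, file_id):
    """Layout described in the report: one subdirectory per sink."""
    return posixpath.join(temp_dir, ".temp-beam-" + sink_id, file_id)


def temp_path_flat(temp_dir, file_id):
    """Alternative layout: temporary files placed directly in temp_dir."""
    return posixpath.join(temp_dir, file_id)


def shared_prefix(paths):
    """Longest character-wise prefix common to all paths."""
    return posixpath.commonprefix(paths)


nested = [temp_path_with_subdir(TEMP_DIR, SINK_ID, f) for f in FILE_IDS]
flat = [temp_path_flat(TEMP_DIR, f) for f in FILE_IDS]

# With a per-sink subdirectory, every file shares a prefix that changes per job.
print(shared_prefix(nested))
# Without it, the shared prefix stops at the stable temporary directory,
# and the varying file names begin immediately after it.
print(shared_prefix(flat))
```

In the nested layout the shared prefix includes the per-job subdirectory name, so every job starts on a prefix the storage backend has not seen before. In the flat layout the shared part is only the stable temporary directory.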