Details
Type: New Feature
Status: Resolved
Priority: P2
Resolution: Fixed
Description
I'd like to write elements as individual files.
Rather than combining thousands of outputs into a handful of sharded files as TextIO does (output-00000-of-00005, output-00001-of-00005, ...), I want to write each element to its own file.
For example, if I used WholeFileIO from BEAM-2750 to read in three files (hi.txt, what.txt, and yes.txt), I'd like to write the processed contents out to individual files with user-defined or data-defined filenames (like hi-modified.txt, what-modified.txt, and yes-modified.txt).
With a WholeFileIO, this would look like:
PCollection<KV<String, Byte[]>> fileNamesAndBytes =
    p.apply("Read", WholeFileIO.read().from("/path/to/input/dir/*"));
...
// Do stuff that changes contents and file names
PCollection<KV<String, Byte[]>> modifiedFileNamesAndBytes = ...
...
modifiedFileNamesAndBytes.apply("Write", WholeFileIO.write().to("/path/to/output/dir/"));
This ticket complements BEAM-2750.
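To make the requested semantics concrete, here is a minimal sketch in plain Java (java.nio only, no Beam dependency) of "one output file per element". The class name PerElementFileWriter and the method writeEach are hypothetical and used only for illustration. They are not part of WholeFileIO or any Beam API, and a real transform would also have to deal with distributed execution, retries, and remote filesystems.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, for illustration only: not a Beam or WholeFileIO API.
public class PerElementFileWriter {

    /** Writes each (fileName, contents) pair to its own file under outputDir. */
    static void writeEach(Path outputDir, Map<String, byte[]> fileNamesAndBytes)
            throws IOException {
        Path root = outputDir.normalize();
        Files.createDirectories(root);
        for (Map.Entry<String, byte[]> element : fileNamesAndBytes.entrySet()) {
            // The file name comes from the data itself (the key of the pair).
            Path target = root.resolve(element.getKey()).normalize();
            // Refuse names such as "../x.txt" that would land outside outputDir.
            if (!target.startsWith(root)) {
                throw new IllegalArgumentException(
                        "File name escapes output directory: " + element.getKey());
            }
            // One element, one file; an existing file with that name is overwritten.
            Files.write(target, element.getValue());
        }
    }

    public static void main(String[] args) throws IOException {
        Map<String, byte[]> modified = new LinkedHashMap<>();
        modified.put("hi-modified.txt", "hi".getBytes(StandardCharsets.UTF_8));
        modified.put("what-modified.txt", "what".getBytes(StandardCharsets.UTF_8));
        modified.put("yes-modified.txt", "yes".getBytes(StandardCharsets.UTF_8));

        Path outputDir = Files.createTempDirectory("per-element-output");
        writeEach(outputDir, modified);
        System.out.println("Wrote " + modified.size() + " files to " + outputDir);
    }
}
```

The key of each pair plays the role of the data-defined filename from the description, so the number of output files equals the number of elements rather than a fixed shard count.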