Details
Type: Bug
Status: Open
Priority: P3
Resolution: Unresolved
Affects Version/s: 2.34.0, 2.35.0
Description
When apache_beam.io.fileio.WriteToFiles is called with a file_naming argument that adds a directory to the path, the current implementation fails to write files if the underlying file storage requires a mkdirs (or analogous) call first.
For example, given
apache_beam.io.fileio.WriteToFiles(
    path="some/base/dir",
    sink=...,
    destination=lambda x: "events",
    file_naming=lambda *x: "subdir/file.txt",
)
the current fileio implementation calls mkdirs with some/base/dir instead of some/base/dir/subdir.
The offending code is currently at https://github.com/apache/beam/blob/67bcf1e16e3fdf68cdea7a4b42b9c003e4b8948c/sdks/python/apache_beam/io/fileio.py#L605.
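To make the expected behavior concrete, here is a minimal sketch of how the directory passed to mkdirs could be derived from the full output path rather than from the base path. The helper name mkdirs_target is hypothetical and is not part of Beam; it only illustrates the path arithmetic, assuming POSIX-style separators.

```python
import posixpath


def mkdirs_target(base_path, file_name):
    """Hypothetical helper: return the directory that must exist before writing.

    The directory is taken from the full path (base path joined with the
    name produced by file_naming), not from the base path alone.
    """
    full_path = posixpath.join(base_path, file_name)
    return posixpath.dirname(full_path)


# The name from the example above adds a "subdir" component.
print(mkdirs_target("some/base/dir", "subdir/file.txt"))
# A flat name leaves the target at the base directory.
print(mkdirs_target("some/base/dir", "file.txt"))
```

With the values from the example, the first call yields some/base/dir/subdir, which is the directory the report says mkdirs should receive.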
====
Personally, I would recommend changing the FileSystems interface so that `open` calls `mkdirs` on storage backends that require parent directories to be created before a file can be written.
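A rough sketch of that idea, using only the Python standard library and the local disk as the storage backend. The class LocalStorageSketch and its methods are hypothetical stand-ins and do not mirror Beam's actual FileSystems signatures; the point is only that the parent directory is created inside `open`, so callers never have to work out which directory to pass to `mkdirs`.

```python
import os


class LocalStorageSketch:
    """Hypothetical storage backend; not Beam's FileSystems API."""

    def mkdirs(self, path):
        # exist_ok=True makes repeated calls for the same directory harmless.
        os.makedirs(path, exist_ok=True)

    def open(self, path, mode="wb"):
        # Only modes that can create a file need the parent directory up front.
        if any(flag in mode for flag in "wax"):
            parent = os.path.dirname(path)
            if parent:
                self.mkdirs(parent)
        # Inside a method, the bare name `open` resolves to the builtin.
        return open(path, mode)
```

Under this design a caller could write to some/base/dir/subdir/file.txt directly, and read-only opens would still fail on a missing path as they do today. Object stores with no real directories would presumably implement the directory step as a no-op.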