Details
Bug
Status: Resolved
P2
Resolution: Fixed
2.20.0
None
Description
While using `WriteToParquet` I encounter this issue:
  File "/usr/local/lib/python3.7/site-packages/apache_beam/io/iobase.py", line 1066, in finish_bundle
    self.writer.close()
  File "/usr/local/lib/python3.7/site-packages/apache_beam/io/filebasedsink.py", line 423, in close
    self.sink.close(self.temp_handle)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/io/parquetio.py", line 538, in close
    self._flush_buffer()
  File "/usr/local/lib/python3.7/site-packages/apache_beam/io/parquetio.py", line 570, in _flush_buffer
    size = size + b.size
AttributeError: 'NoneType' object has no attribute 'size'
This happens because an empty array, instantiated as `array=pa.array([])`, returns `[None]` from `array.buffers()`. However, `_flush_buffer` currently assumes that every buffer is a real object (never `None`) when it adds `b.size` to `size`.
One simple fix would be to add `if b is not None:` before incrementing `size`.
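The proposed guard could look roughly like the sketch below. It is not the actual `parquetio.py` code: `FakeBuffer` and `total_buffer_size` are hypothetical names introduced here so the example runs without pyarrow, and the buffer list is passed in directly instead of coming from `array.buffers()`.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class FakeBuffer:
    """Hypothetical stand-in for a pyarrow buffer; only `size` is modelled."""
    size: int


def total_buffer_size(buffers: Iterable[Optional[FakeBuffer]]) -> int:
    """Sum buffer sizes, skipping the None entries an empty array can yield."""
    size = 0
    for b in buffers:
        # The guard suggested in this report: without it, a None entry
        # raises AttributeError on `b.size`.
        if b is not None:
            size = size + b.size
    return size


# Buffers as reported for an empty array: a single None entry.
print(total_buffer_size([None]))
# A mix of a missing buffer and two real ones.
print(total_buffer_size([None, FakeBuffer(16), FakeBuffer(8)]))
```

With the guard in place, a `[None]` buffer list contributes nothing to `size` instead of raising.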