Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Version(s): 3.5.0, 4.0.0
Description
When an Avro file contains empty blocks, Spark returns 0 records while "fastavro" and "avro-python-3" both read the file correctly and return records.
This happens because Spark does not handle empty blocks correctly. A call to `hasNext` loads the next block and, if that block is empty, returns false. Instead of exiting the loop at that point, the reader needs to keep probing the following blocks until the sync point is reached.
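The difference between the two behaviours can be illustrated with a small, self-contained sketch. This is not Spark's reader code: `BlockReader`, `read_all`, and the `stop_at_empty_block` flag are hypothetical names invented for the illustration, blocks are modelled as plain lists, and the end of the list stands in for the sync point that bounds the read.

```python
from typing import List, Sequence


class BlockReader:
    """Toy stand-in for a block-based file reader (hypothetical, not Spark's API)."""

    def __init__(self, blocks: Sequence[Sequence[str]], stop_at_empty_block: bool):
        self._blocks = list(blocks)
        self._stop_at_empty_block = stop_at_empty_block
        self._next_block = 0          # index of the next block to load
        self._current: List[str] = []  # records of the block currently loaded
        self._pos = 0                 # position inside the current block

    def has_next(self) -> bool:
        # Records are still available in the block that is already loaded.
        if self._pos < len(self._current):
            return True
        # Otherwise load blocks until one has records or the input is exhausted.
        while self._next_block < len(self._blocks):
            self._current = list(self._blocks[self._next_block])
            self._next_block += 1
            self._pos = 0
            if self._current:
                return True
            if self._stop_at_empty_block:
                # Behaviour described in the report: an empty block ends the read.
                return False
            # Intended behaviour: skip the empty block and probe the next one.
        return False

    def next(self) -> str:
        if not self.has_next():
            raise StopIteration
        record = self._current[self._pos]
        self._pos += 1
        return record


def read_all(blocks: Sequence[Sequence[str]], stop_at_empty_block: bool) -> List[str]:
    """Drain the reader with the usual has_next/next loop."""
    reader = BlockReader(blocks, stop_at_empty_block)
    records = []
    while reader.has_next():
        records.append(reader.next())
    return records


# A file whose first block is empty, followed by blocks with data.
blocks = [[], ["a", "b"], [], ["c"]]
print(read_all(blocks, stop_at_empty_block=True))   # reported behaviour
print(read_all(blocks, stop_at_empty_block=False))  # intended behaviour
```

With the flag set, the loop ends at the first empty block and no records are returned, which mirrors the symptom in the description; with the flag cleared, empty blocks are skipped and all records are read.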