Description
We have introduced a blocking version of the writer in Java that lets readers skip large arrays and maps efficiently. The Avro format encodes arrays and maps as the number of elements followed by the elements themselves. An element count of zero indicates that the array or map has ended. The change is that a negative element count is followed by the byte count of the encoded elements that follow. A reader that sees a negative element count should flip the sign to get the actual number of elements. If it wants to support fast skip, it should use the byte count to skip the elements en bloc instead of decoding them individually. If it does not want to support fast skip, it only has to read the byte count and ignore its value.
The changes have already been made in Java's ValueReader to support this. Similar changes need to be made in Python as well.
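A minimal sketch of what the Python side might look like, offered as an illustration and not as the actual Avro Python API. The function names (read_long, write_long, read_array, skip_array) are hypothetical. It assumes that counts and byte counts are encoded as Avro longs (zigzag, variable-length) and that the stream is seekable.

```python
import io


def read_long(stream):
    """Decode one zigzag, variable-length long from a binary stream."""
    shift = 0
    accum = 0
    while True:
        byte = stream.read(1)
        if not byte:
            raise EOFError("stream ended inside a long")
        b = byte[0]
        accum |= (b & 0x7F) << shift
        if not (b & 0x80):
            break
        shift += 7
    # Undo the zigzag mapping.
    return (accum >> 1) ^ -(accum & 1)


def write_long(stream, n):
    """Encode one long (used here only to build example data)."""
    n = (n << 1) ^ (n >> 63)
    while n & ~0x7F:
        stream.write(bytes([(n & 0x7F) | 0x80]))
        n >>= 7
    stream.write(bytes([n]))


def read_array(stream, read_item):
    """Read every element; the byte count is read and ignored."""
    items = []
    while True:
        count = read_long(stream)
        if count == 0:
            return items
        if count < 0:
            count = -count
            read_long(stream)  # byte count, not needed when decoding
        items.extend(read_item(stream) for _ in range(count))


def skip_array(stream, skip_item):
    """Skip the array; return how many elements were skipped."""
    total = 0
    while True:
        count = read_long(stream)
        if count == 0:
            return total
        if count < 0:
            # Blocked form: jump over the whole block at once.
            count = -count
            byte_count = read_long(stream)
            stream.seek(byte_count, io.SEEK_CUR)
        else:
            # Unblocked form: no byte count, so skip item by item.
            for _ in range(count):
                skip_item(stream)
        total += count
```

A reader that does not care about fast skip would use only the read_array path. Maps would follow the same pattern, with each element being a key followed by a value.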
Attachments
Issue Links
- is blocked by AVRO-88: BlockingBinaryEncoder should override writeEnum() method (Closed)