Details
Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Description
This is a follow-up to the AVRO-1282 issue.
AVRO-1282 used Unsafe to significantly improve the performance of reflection-based serialization. But Unsafe can also be used to improve the performance of IO streams, which would benefit all kinds of serializers, not only reflection-based ones. Experience with Kryo shows that this can boost performance even more than the speedups provided by AVRO-1282.
Pros:
- Overall performance boost
- The biggest speedups from this optimization are expected for arrays of primitive types, as they can be written very efficiently using bulk operations instead of element by element.
- It is possible to write directly into off-heap memory buffers at native speed, without using intermediate byte arrays. This can be interesting for Big Data apps, which often keep a lot of data off-heap.
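As a rough illustration of the bulk-write point above, here is a minimal sketch that copies a whole double[] into a byte[] with a single Unsafe.copyMemory call and compares it with an element-by-element loop. The class and method names (BulkDoubleWrite, writeOneByOne, writeBulk) are made up for this example and are not part of Avro or of AVRO-1282. The bulk path writes in the platform's native byte order, so the baseline uses native order too.

```java
import java.lang.reflect.Field;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import sun.misc.Unsafe;

// Hypothetical demo class, not Avro code.
public class BulkDoubleWrite {
    private static final Unsafe UNSAFE = loadUnsafe();
    private static final long BYTE_BASE = UNSAFE.arrayBaseOffset(byte[].class);
    private static final long DOUBLE_BASE = UNSAFE.arrayBaseOffset(double[].class);

    // Unsafe has no public constructor; the usual workaround is reflection.
    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not available", e);
        }
    }

    /** Baseline: one putDouble call per element, in native byte order. */
    static byte[] writeOneByOne(double[] values) {
        ByteBuffer buf = ByteBuffer.allocate(values.length * Double.BYTES)
                                   .order(ByteOrder.nativeOrder());
        for (double v : values) {
            buf.putDouble(v);
        }
        return buf.array();
    }

    /** Bulk: a single memory copy of the whole array, in native byte order. */
    static byte[] writeBulk(double[] values) {
        byte[] out = new byte[values.length * Double.BYTES];
        UNSAFE.copyMemory(values, DOUBLE_BASE, out, BYTE_BASE, out.length);
        return out;
    }

    public static void main(String[] args) {
        double[] data = {1.5, -2.25, 0.0, 1e300};
        boolean same = java.util.Arrays.equals(writeOneByOne(data), writeBulk(data));
        System.out.println("bulk copy matches element-wise write: " + same);
    }
}
```

The same copyMemory call can target an off-heap address (a null base object plus an address obtained from allocateMemory), which is how the intermediate byte array could be skipped.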
Cons:
- Unsafe can efficiently write only primitive types in their native byte order and at their fixed size. This is not quite compatible with Avro's format.
  (While one can still use Unsafe to write single elements more efficiently, even with their variable-length encoding, the biggest benefits of bulk array serialization would be lost.)
- Introducing this feature may require defining a new format for Avro. This format would be very fast, but not very space-efficient, as it would use a fixed-size representation.
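To make the compatibility concern concrete, the sketch below encodes longs the way Avro's binary encoding does (zigzag followed by a base-128 variable-length encoding) and compares the resulting size with a fixed 8-bytes-per-element layout. Because each element can occupy a different number of bytes, an array of longs cannot be produced by one raw memory copy. The class and method names (VarintVsFixed, encodeVarLong, varSize, fixedSize) are hypothetical and chosen only for this illustration.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical demo class, not Avro code.
public class VarintVsFixed {
    /** Zigzag + base-128 varint encoding of a long. */
    static byte[] encodeVarLong(long n) {
        long z = (n << 1) ^ (n >> 63);          // zigzag: small magnitudes map to small values
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((z & ~0x7FL) != 0) {             // more than 7 bits left
            out.write((int) ((z & 0x7F) | 0x80)); // low 7 bits plus continuation flag
            z >>>= 7;
        }
        out.write((int) z);                     // final byte, continuation flag clear
        return out.toByteArray();
    }

    /** Total bytes when every element is encoded as a varint. */
    static int varSize(long[] values) {
        int total = 0;
        for (long v : values) {
            total += encodeVarLong(v).length;
        }
        return total;
    }

    /** Total bytes when every element takes a fixed 8 bytes. */
    static int fixedSize(long[] values) {
        return values.length * Long.BYTES;
    }

    public static void main(String[] args) {
        long[] data = {0, 1, -1, 64, -64};
        System.out.println("variable-length bytes: " + varSize(data));
        System.out.println("fixed-size bytes:      " + fixedSize(data));
    }
}
```

For small values the variable-length form is much more compact, which is the space cost a fixed-size format would pay in exchange for bulk copies.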
BTW, initial tests in which a sketch of the proposed optimization was applied only to Floats and Doubles have shown an immediate boost of 35%.