- Type: Sub-task
- Status: Closed
- Priority: Major
- Resolution: Fixed
- Affects Version/s: None
- Component/s: Connectors / Common
- Labels: None
The Split Readers hand over a batch of records at a time from the I/O thread (fetching and decoding) to the main operator processing thread.
These batch structures can be memory-intensive and expensive to allocate, so performance benefits greatly from reusing them. This is especially true for high-performance format readers like ORC and Parquet.
While previous sources (where I/O was in the main thread) could reuse objects in a trivial manner, the new Split Reader API (with multiple threads) needs an explicit recycle() hook to allow returning/reusing these objects.
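To illustrate the idea, here is a minimal sketch of such a hook. It is not the actual Flink API: `RecordBatch`, `BatchPool`, and `runDemo` are hypothetical names, and the fixed-size pool backed by a blocking queue is only one assumed way to implement the reuse. The I/O thread takes a free batch, fills it, and hands it over; the processing thread calls `recycle()` once it is done, which returns the batch to the pool.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class RecycleSketch {

    /** Hypothetical reusable batch; stands in for an ORC/Parquet column batch. */
    static final class RecordBatch {
        final long[] values;
        int size;
        private final BlockingQueue<RecordBatch> pool;

        RecordBatch(int capacity, BlockingQueue<RecordBatch> pool) {
            this.values = new long[capacity];
            this.pool = pool;
        }

        /** The explicit hook: the consumer signals it no longer needs this batch. */
        void recycle() {
            size = 0;
            // add() throws IllegalStateException if the pool is already full,
            // which surfaces an accidental double recycle.
            pool.add(this);
        }
    }

    /** Hypothetical fixed-size pool of pre-allocated batches. */
    static final class BatchPool {
        private final BlockingQueue<RecordBatch> free;

        BatchPool(int numBatches, int capacity) {
            free = new ArrayBlockingQueue<>(numBatches);
            for (int i = 0; i < numBatches; i++) {
                free.add(new RecordBatch(capacity, free));
            }
        }

        /** Blocks until a batch has been recycled and is free again. */
        RecordBatch take() throws InterruptedException {
            return free.take();
        }

        int available() {
            return free.size();
        }
    }

    /** Runs an I/O thread and a processing loop that share two batches. */
    static long runDemo(int totalBatches) throws InterruptedException {
        BatchPool pool = new BatchPool(2, 4);
        BlockingQueue<RecordBatch> handover = new ArrayBlockingQueue<>(2);

        // I/O thread: fetch and "decode" into a reused batch, then hand it over.
        Thread io = new Thread(() -> {
            try {
                for (int b = 0; b < totalBatches; b++) {
                    RecordBatch batch = pool.take();
                    for (int i = 0; i < batch.values.length; i++) {
                        batch.values[i] = b * 10L + i;
                    }
                    batch.size = batch.values.length;
                    handover.put(batch);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        io.start();

        // Processing thread: consume each batch, then return it for reuse.
        long sum = 0;
        for (int b = 0; b < totalBatches; b++) {
            RecordBatch batch = handover.take();
            for (int i = 0; i < batch.size; i++) {
                sum += batch.values[i];
            }
            batch.recycle();
        }
        io.join();
        return sum;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("sum = " + runDemo(5));
    }
}
```

In this sketch only two batches are ever allocated regardless of how many are read, and the blocking `take()` gives natural backpressure: the I/O thread cannot run ahead of the processing thread by more than the pool size.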