Details
Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Description
Currently the table_source node does not appear in our documentation.
Also, in TableSourceNodeOptions we have:
    // Size of batches to emit from this node
    // If the table is larger the node will emit multiple batches from the
    // the table to be processed in parallel.
    int64_t batch_size;
However, when looking into a performance issue today, I realized this description is incomplete. In reality we should probably call this parameter max_batch_size.
Furthermore, we should make it clear that a table whose batches are smaller than batch_size will emit those smaller batches directly (this is a good thing in my case); the node will not concatenate small batches together into a larger batch.
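To make the intended semantics concrete, here is a minimal sketch of the behavior described above. It does not use the Arrow API: EmittedBatchSizes is a hypothetical helper, and its max_batch_size parameter simply takes the name proposed in this ticket. It only models row counts, assuming the node slices oversized batches and passes undersized ones through unchanged.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of Arrow): models how many rows each emitted
// batch would contain when batch_size is treated as an upper bound only.
// chunk_sizes: row counts of the batches already present in the table.
// max_batch_size: the cap on emitted batch size (proposed name for batch_size).
std::vector<int64_t> EmittedBatchSizes(const std::vector<int64_t>& chunk_sizes,
                                       int64_t max_batch_size) {
  assert(max_batch_size > 0);
  std::vector<int64_t> emitted;
  for (int64_t remaining : chunk_sizes) {
    // Oversized batches are split into slices of at most max_batch_size rows.
    while (remaining > 0) {
      const int64_t rows = std::min(remaining, max_batch_size);
      emitted.push_back(rows);
      remaining -= rows;
    }
    // Undersized batches pass through as-is; nothing is merged across batches.
  }
  return emitted;
}
```

Under this model, a table holding batches of 10 and 3 rows with a cap of 4 would emit batches of 4, 4, 2 and 3 rows, while a table holding three 2-row batches with a cap of 1000 would still emit three 2-row batches.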