Details
Type: Improvement
Status: Open
Priority: Minor
Resolution: Unresolved
Description
Maybe I'm overlooking something, but I don't see anything on the API surface that returns the number of rows in an Arrow file without reading all the record batches. This is useful when we want to read into contiguous buffers, because it lets us allocate the right sizes up front.
I'd like to propose that we add `num_rows` as a field in the file footer so it's easy to query without reading the whole file.
Meanwhile, before that is added to the official format fbs, it would be nice to have a method that iterates over the record batch headers and sums up the lengths without reading the actual record batch bodies.