  1. Apache Arrow
  2. ARROW-2296

[C++] Add num_rows to file footer

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: C++, Format
    • Labels: None

      Description

      Maybe I'm overlooking something, but I don't see anything on the API surface that returns the number of rows in an Arrow file without reading all the record batches. A row count is useful when we want to read into contiguous buffers, because it lets us allocate the right sizes up front.

      I'd like to propose that we add `num_rows` as a field in the file footer so it's easy to query without reading the whole file.

      Meanwhile, until that is added to the official format definition (the .fbs schema), it would be nice to have a method that iterates over the record batch headers and sums their lengths without reading the record batch bodies.
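
The interim approach can be illustrated with a toy model. This is only a sketch under loud assumptions: it does not use the Arrow library or the real IPC encoding. In the actual format, the footer lists one block per record batch (offset, metadata length, body length) and the row count lives in FlatBuffers-encoded metadata; here a fixed 16-byte header of two little-endian int64 values stands in for that metadata, and `write_batches` and `count_rows` are hypothetical helper names.

```python
import io
import struct

# Toy stand-in for a record batch message header (NOT the Arrow wire format):
# two little-endian int64 values, (num_rows, body_length).
HEADER = struct.Struct("<qq")


def write_batches(stream, batches):
    """Write (num_rows, body_bytes) pairs; return the offset of each header.

    The returned offsets play the role of the block list in a file footer.
    """
    offsets = []
    for num_rows, body in batches:
        offsets.append(stream.tell())
        stream.write(HEADER.pack(num_rows, len(body)))
        stream.write(body)
    return offsets


def count_rows(stream, header_offsets):
    """Sum the row counts by reading only the fixed-size headers."""
    total = 0
    for offset in header_offsets:
        stream.seek(offset)  # jump straight to the header, skipping bodies
        num_rows, _body_length = HEADER.unpack(stream.read(HEADER.size))
        total += num_rows
    return total


# Three batches with 3, 2, and 1 rows; the bodies are just placeholder bytes.
buf = io.BytesIO()
offsets = write_batches(buf, [(3, b"\x00" * 24), (2, b"\x00" * 16), (1, b"\x00" * 8)])
total_rows = count_rows(buf, offsets)
print(total_rows)
```

The cost is one seek and one small read per batch, so it scales with the number of batches rather than the size of the data. A `num_rows` field in the footer, as proposed above, would reduce that to a single read.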

            People

            • Assignee: Unassigned
            • Reporter: Lawrence Chan (llchan)
            • Votes: 0
            • Watchers: 2
