Details
Type: Improvement
Status: Closed
Priority: Minor
Resolution: Implemented
Description
As discussed in ARROW-10901, 64-bit integer vectors expose values64 getters on platforms that support BigInt typed arrays. Column-oriented dataframe libraries (such as UW's arquero) generally use the Chunked::toArray convenience method rather than dealing with chunks or vectors directly, and therefore always receive the int32/uint32 data.
I think there are a few alternatives for improving high-level access to a 64-bit column's values:
- An optional bit width (or is64Bit, like the <T>::from variants) parameter in Chunked::toArray and IntVector::toArray.
- A new Chunked::toArray64 method, and the same on IntVector (or at least on the 64-bit variants).
- Use values64 directly in the consuming library (loop over the chunks and copy into a destination typed array).
The toArray64 option would probably be a bit of a mess (requiring a fallback to toArray on BaseVector), so an optional parameter might be the cleanest approach.
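For reference, the third alternative (copying values64 out of each chunk in the consuming library) might look roughly like the sketch below. It is only an illustration: Int64ChunkLike and concatValues64 are hypothetical names, not part of Arrow's API, and the sketch assumes each chunk exposes a length and a values64 getter returning a BigInt64Array. Null handling is ignored.

```typescript
// Hypothetical minimal shape of a 64-bit integer chunk.
// This is an assumption for illustration, not Arrow's actual vector type.
interface Int64ChunkLike {
  length: number;
  values64: BigInt64Array;
}

// Copy the 64-bit values of every chunk into one contiguous BigInt64Array.
function concatValues64(chunks: Int64ChunkLike[]): BigInt64Array {
  // First pass: total number of values across all chunks.
  let total = 0;
  for (const chunk of chunks) {
    total += chunk.length;
  }

  const out = new BigInt64Array(total);

  // Second pass: copy each chunk's values at the running offset.
  let offset = 0;
  for (const chunk of chunks) {
    // subarray() trims the view in case the backing array is longer
    // than the chunk's logical length.
    out.set(chunk.values64.subarray(0, chunk.length), offset);
    offset += chunk.length;
  }
  return out;
}
```

This works without any change to the library, but every consumer has to repeat the loop and feature-detect BigInt typed arrays itself, which is part of why a parameter on toArray seems preferable.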