Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Version: 4.0.0
Description
Currently, when the array_sort_indices compute function is called on a ChunkedArray of two or more Arrays, it returns a ChunkedArray in which each Array holds the local sort indices of the corresponding input Array. Demonstrating this with the R bindings:
> x <- ChunkedArray$create(c(2L, 1L), c(4L, 3L))
> arrow:::call_function("array_sort_indices", x, options = list(order = FALSE))
ChunkedArray
[
[
1,
0
],
[
1,
0
]
]
Compare to the sort_indices compute function, which returns a single Array of global sort indices in this case:
> arrow:::call_function("sort_indices", x, options = list(names = "", orders = 0L))
Array
<uint64>
[
1,
0,
3,
2
]
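To make the difference between the two results concrete, here is a minimal sketch in plain Python. It does not call Arrow at all: chunks are modeled as ordinary lists, and the function names local_sort_indices and global_sort_indices are hypothetical, chosen only for this illustration. It assumes ascending order.

```python
def local_sort_indices(chunks):
    """Sort each chunk on its own; indices restart at 0 in every chunk."""
    return [sorted(range(len(chunk)), key=chunk.__getitem__) for chunk in chunks]


def global_sort_indices(chunks):
    """Sort across all chunks; indices point into the concatenated values."""
    flat = [value for chunk in chunks for value in chunk]
    return sorted(range(len(flat)), key=flat.__getitem__)


# Same data as the ChunkedArray above: two chunks, (2, 1) and (4, 3).
chunks = [[2, 1], [4, 3]]
print(local_sort_indices(chunks))   # [[1, 0], [1, 0]]
print(global_sort_indices(chunks))  # [1, 0, 3, 2]
```

With this particular input, adding each chunk's offset to its local indices happens to give the global result. That is not true in general: for chunks (3, 4) and (1, 2) the local indices are [[0, 1], [0, 1]] while the global indices are [2, 3, 0, 1], so the per-chunk output cannot be used as a substitute for a global sort.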
Is this behavior deliberate? If so, we should document it clearly. If not, we should change it.
Note that the docs currently state that array_sort_indices only works on Arrays (see note (4) at https://arrow.apache.org/docs/cpp/compute.html#sorts-and-partitions), but evidently that is not exactly correct.