Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
Description
The C++ docs for SetLookupOptions give this explanation of the skip_nulls option:
    /// Whether nulls in `value_set` count for lookup.
    ///
    /// If true, any null in `value_set` is ignored and nulls in the input
    /// produce null (IndexIn) or false (IsIn) values in the output.
    /// If false, any null in `value_set` is successfully matched in
    /// the input.
    bool skip_nulls;
However, for IsIn this explanation doesn't seem to hold in practice:
    In [16]: arr = pa.array([1, 2, None])

    In [17]: pc.is_in(arr, value_set=pa.array([1, None]), skip_null=True)
    Out[17]:
    <pyarrow.lib.BooleanArray object at 0x7fcf666f9408>
    [
      true,
      false,
      true
    ]

    In [18]: pc.is_in(arr, value_set=pa.array([1, None]), skip_null=False)
    Out[18]:
    <pyarrow.lib.BooleanArray object at 0x7fcf666b13a8>
    [
      true,
      false,
      true
    ]
This documentation was added in https://github.com/apache/arrow/pull/7695 (ARROW-8989).
BTW, for "index_in", it works as documented:
    In [19]: pc.index_in(arr, value_set=pa.array([1, None]), skip_null=True)
    Out[19]:
    <pyarrow.lib.Int32Array object at 0x7fcf666f04c8>
    [
      0,
      null,
      null
    ]

    In [20]: pc.index_in(arr, value_set=pa.array([1, None]), skip_null=False)
    Out[20]:
    <pyarrow.lib.Int32Array object at 0x7fcf666f0ee8>
    [
      0,
      null,
      1
    ]
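For comparison, here is a minimal pure-Python sketch of what the docstring describes, as I read it. The helper names `is_in_documented` and `index_in_documented` are hypothetical and not part of pyarrow or Arrow C++; they model only the documented semantics on plain lists, not the actual kernel implementation.

```python
def is_in_documented(values, value_set, skip_nulls):
    """Hypothetical model of the documented IsIn behaviour (not Arrow code)."""
    set_has_null = any(v is None for v in value_set)
    non_null = {v for v in value_set if v is not None}
    out = []
    for v in values:
        if v is None:
            # Documented: with skip_nulls, a null input yields false;
            # otherwise a null in value_set matches a null input.
            out.append(False if skip_nulls else set_has_null)
        else:
            out.append(v in non_null)
    return out


def index_in_documented(values, value_set, skip_nulls):
    """Hypothetical model of the documented IndexIn behaviour (not Arrow code)."""
    positions = {}
    for i, v in enumerate(value_set):
        if v is None and skip_nulls:
            continue  # nulls in value_set are ignored for lookup
        positions.setdefault(v, i)  # keep the first occurrence
    # None in the result stands for a null output slot.
    return [positions.get(v) for v in values]


arr = [1, 2, None]
value_set = [1, None]
print(is_in_documented(arr, value_set, skip_nulls=True))
print(is_in_documented(arr, value_set, skip_nulls=False))
print(index_in_documented(arr, value_set, skip_nulls=True))
print(index_in_documented(arr, value_set, skip_nulls=False))
```

Under this reading, the index_in results above match the model for both settings, while is_in with skip_null=True would be expected to give false for the last element rather than the true shown in Out[17].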