Details
- Type: Improvement
- Status: Open
- Priority: Major
- Resolution: Unresolved
Description
I am trying to join two Arrow tables in which some columns have the list<float> data type. My join keys are all primitive types; only some of the non-key columns are list<float>. PyArrow's join() cannot join such tables, although pandas can. It fails with
ArrowInvalid: Data type list<item: float> is not supported in join non-key field
when I execute this piece of code
joined_table = table_1.join(table_2, ['k1', 'k2', 'k3'])
A Stack Overflow response pointed out that Arrow currently cannot handle non-fixed types in joins. Can this be fixed, or is it intentional?
Issue Links
- relates to ARROW-8991 [C++][Compute] Add scalar_hash function (In Progress)