[ARROW-5566] [Python] Overhaul type unification from Python sequence in arrow::py::InferArrowType - ASF JIRA

Details

Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Component/s: Python
Labels:
- python-conversion

External issue URL:
https://github.com/apache/arrow/issues/16758

Description

I'm working on ~~ARROW-4324~~ and there's some technical debt lying in arrow/python/inference.cc: when NumPy scalars are mixed with non-NumPy Python scalar values, all hell breaks loose. In particular, the innocuous numpy.nan is a Python float, not a NumPy float64, so the sequence [np.float16(1.5), np.nan] can be converted incorrectly.
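
To make the mixed case concrete, here is a small check using only NumPy. It shows that the two elements of the sequence come from different type hierarchies, which is why an inference path that handles only NumPy scalars never sees the second one. The variable names are illustrative and are not taken from inference.cc.

```python
import numpy as np

values = [np.float16(1.5), np.nan]

# Name of the concrete Python type of each element.
kinds = [type(v).__name__ for v in values]

# np.nan is a plain Python float, so it is not a NumPy scalar at all.
nan_is_numpy_scalar = isinstance(np.nan, np.generic)

# np.float16 does not subclass Python float (only np.float64 does).
half_is_python_float = isinstance(values[0], float)

print(kinds, nan_is_numpy_scalar, half_is_python_float)
```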

Part of what's messy is that NumPy dtype unification is split from general type unification. These should be combined, with the NumPy types mapping onto an intermediate value (for unification purposes) that then ultimately maps onto an Arrow type.
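
A minimal Python sketch of that idea, restricted to None and floating-point values for brevity. Everything here is hypothetical: `Kind`, `kind_of`, and `infer_arrow_type_name` are made-up names, the promotion ladder (null < half < single < double) is an assumption, and the real logic lives in C++ in inference.cc. The final mapping produces Arrow type names as strings so the sketch does not depend on pyarrow.

```python
import enum
from functools import reduce

import numpy as np


class Kind(enum.IntEnum):
    """Hypothetical intermediate values; a larger value is a wider type."""
    NULL = 0
    HALF_FLOAT = 1
    FLOAT = 2
    DOUBLE = 3


# NumPy dtypes map onto the same intermediate values as Python scalars.
_NUMPY_KINDS = {
    np.dtype("float16"): Kind.HALF_FLOAT,
    np.dtype("float32"): Kind.FLOAT,
    np.dtype("float64"): Kind.DOUBLE,
}

# Final step: intermediate value to Arrow type name (assumed mapping).
_ARROW_NAMES = {
    Kind.NULL: "null",
    Kind.HALF_FLOAT: "halffloat",
    Kind.FLOAT: "float",
    Kind.DOUBLE: "double",
}


def kind_of(value):
    """Classify one Python or NumPy scalar as an intermediate Kind."""
    if value is None:
        return Kind.NULL
    if isinstance(value, np.floating):
        kind = _NUMPY_KINDS.get(value.dtype)
        if kind is None:
            raise TypeError(f"unsupported NumPy dtype: {value.dtype}")
        return kind
    if isinstance(value, float):
        # Plain Python floats, including np.nan, are 64-bit doubles.
        return Kind.DOUBLE
    raise TypeError(f"unsupported value type: {type(value).__name__}")


def infer_arrow_type_name(values):
    """Unify all values in a single pass, then map to an Arrow type name."""
    unified = reduce(max, (kind_of(v) for v in values), Kind.NULL)
    return _ARROW_NAMES[unified]


print(infer_arrow_type_name([np.float16(1.5), np.nan]))
```

Because NumPy and non-NumPy scalars share one intermediate representation, the mixed sequence from the description is unified in the same pass as any other input instead of going through a separate NumPy-only code path.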

Issue Links

is related to

ARROW-4324 [Python] Array dtype inference incorrect when created from list of mixed numpy scalars

Resolved

People

Assignee: Unassigned

Reporter: Wes McKinney

Votes: 0

Watchers: 6

Dates

Created: 12/Jun/19 03:13

Updated: 11/Jan/23 07:41