Apache Arrow / ARROW-5566

[Python] Overhaul type unification from Python sequence in arrow::py::InferArrowType


Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Python

    Description

      I'm working on ARROW-4324 and there's some technical debt lying in arrow/python/inference.cc: when NumPy scalars are mixed with non-NumPy Python scalar values, all hell breaks loose. In particular, the innocuous numpy.nan is a Python float, not a NumPy float64, so the sequence [np.float16(1.5), np.nan] can be converted incorrectly.
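A quick way to see the mismatch described above, using only NumPy (this does not touch the Arrow inference code; the variable names are illustrative):

```python
import numpy as np

values = [np.float16(1.5), np.nan]

# np.nan is a plain Python float, so the two elements have unrelated scalar types.
kinds = [type(v).__name__ for v in values]

# Only the first element is a NumPy scalar (a subclass of np.generic).
is_numpy_scalar = [isinstance(v, np.generic) for v in values]

print(kinds)
print(is_numpy_scalar)
```

Any inference path that branches on "is this a NumPy scalar?" will therefore treat the two elements of this sequence differently.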

      Part of what's messy is that NumPy dtype unification is split from general type unification. The two should be combined, with the NumPy types mapping onto an intermediate value (for unification purposes) that then maps onto an Arrow type.
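A minimal Python sketch of that idea, restricted to nulls and floats. Everything here is hypothetical and for illustration only: `Kind`, `kind_of`, and `infer_arrow_type_name` are invented names, not part of arrow::py::InferArrowType, and the widening rule (take the widest kind seen) is an assumption. In particular, the issue does not say which result is correct for [np.float16(1.5), np.nan]; this sketch widens to float64.

```python
import enum
import numpy as np


class Kind(enum.IntEnum):
    # Hypothetical intermediate values; a larger value is a wider type.
    NULL = 0
    FLOAT16 = 1
    FLOAT32 = 2
    FLOAT64 = 3


# Final mapping from the intermediate value to an Arrow type, given here
# as plain strings so the sketch does not depend on pyarrow.
ARROW_TYPE_NAME = {
    Kind.NULL: "null",
    Kind.FLOAT16: "float16",
    Kind.FLOAT32: "float32",
    Kind.FLOAT64: "float64",
}


def kind_of(value):
    """Map a Python or NumPy scalar onto the shared intermediate Kind."""
    if value is None:
        return Kind.NULL
    if isinstance(value, np.float16):
        return Kind.FLOAT16
    if isinstance(value, np.float32):
        return Kind.FLOAT32
    # Python float (including np.nan) and np.float64 land in the same bucket.
    if isinstance(value, (float, np.floating)):
        return Kind.FLOAT64
    raise TypeError(f"unsupported value in this sketch: {value!r}")


def infer_arrow_type_name(values):
    """Unify all values through Kind, then map once onto an Arrow type name."""
    unified = Kind.NULL
    for v in values:
        unified = max(unified, kind_of(v))
    return ARROW_TYPE_NAME[unified]


print(infer_arrow_type_name([np.float16(1.5), np.nan]))
```

The point of the single `Kind` lattice is that Python and NumPy scalars go through one unification step, rather than two separate ones that have to be reconciled afterwards.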

            People

              Assignee: Unassigned
              Reporter: Wes McKinney (wesm)
