Apache Arrow / ARROW-2722

[Python] ndarray to arrow conversion fails when downcasted from pandas to_numeric

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.9.0
    • Fix Version/s: 0.10.0
    • Component/s: C++, Python
    • Environment:
      Windows 10 64-bit

      Description

      The following snippet:

      import numpy as np
      import pandas as pd
      import pyarrow as pa
      
      pa.array(pd.to_numeric(pd.Series(np.array([65536,2,3], dtype=np.uint64)), downcast='unsigned'), 
      from_pandas=True, type='uint32')
      

      fails to convert with message:

      ArrowNotImplementedError Traceback (most recent call last)
      <ipython-input-2-b259c5cb7044> in <module>()
      4 
      5 pa.array(pd.to_numeric(pd.Series(np.array([65536,2,3], dtype=np.uint64)), downcast='unsigned'), 
      ----> 6 from_pandas=True, type='uint32')
      
      array.pxi in pyarrow.lib.array()
      
      array.pxi in pyarrow.lib._ndarray_to_array()
      
      error.pxi in pyarrow.lib.check_status()
      
      ArrowNotImplementedError: Unsupported numpy type 6
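
The number in the message is a NumPy type number. In NumPy's type enumeration, 6 is the C `unsigned int` type (character code `'I'`). The sketch below is only a diagnostic aid, not part of the original report: it prints the type number, character code and item size of a few unsigned dtypes so they can be compared on the affected machine. The helper name `describe` is made up for illustration.

```python
import numpy as np

def describe(dtype_like):
    """Summarize how NumPy identifies a dtype on this platform."""
    dt = np.dtype(dtype_like)
    return {"name": dt.name, "num": dt.num, "char": dt.char, "itemsize": dt.itemsize}

# 'I' = C unsigned int, 'L' = C unsigned long, 'Q' = C unsigned long long.
# np.uint32 is an alias for whichever of these NumPy picks as its 32-bit type.
for candidate in ("I", "L", "Q", np.uint32):
    print(describe(candidate))
```

If `np.uint32` and `'I'` report different `num` values even though both have `itemsize` 4, then a conversion that only recognizes one of the two type numbers would reject the other. That is a hypothesis to check, not a confirmed diagnosis.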

       

      This is a Windows 64-bit machine, running Python 3.6.5, pyarrow 0.9.0, pandas 0.23.1 and numpy 1.14.5.

      It seems to work fine when downcasting to uint16 or uint8. Unfortunately I didn't have the time to dig deeper or try on a Linux machine, but it feels like it's related to the LLP64 data model.
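
For context on the LLP64 suspicion: under LLP64 (64-bit Windows) both `unsigned int` and `unsigned long` are 32 bits wide, whereas under LP64 (typical 64-bit Linux) `unsigned long` is 64 bits. The following is a minimal sketch using only the standard library to show which situation applies on a given machine. The function name `unsigned_widths` is hypothetical, and the sketch does not by itself demonstrate the cause of the failure.

```python
import ctypes

def unsigned_widths():
    """Return the size in bytes of the C unsigned integer types on this platform."""
    return {
        "unsigned int": ctypes.sizeof(ctypes.c_uint),
        "unsigned long": ctypes.sizeof(ctypes.c_ulong),
        "unsigned long long": ctypes.sizeof(ctypes.c_ulonglong),
    }

widths = unsigned_widths()
print(widths)

# Two distinct C types sharing the 32-bit width is the LLP64-style situation.
if widths["unsigned int"] == widths["unsigned long"]:
    print("unsigned int and unsigned long have the same width (LLP64-like)")
else:
    print("unsigned long is wider than unsigned int (LP64-like)")
```

When two C types share a width, a library that dispatches on the C type rather than on the bit width has to handle both explicitly, which is one way a 32-bit downcast could fail while 16-bit and 8-bit downcasts succeed.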

              People

              • Assignee: Antoine Pitrou (pitrou)
              • Reporter: Augusto Radtke (aradtke)
              • Votes: 0
              • Watchers: 3

                  Time Tracking

                  • Estimated: Not Specified
                  • Remaining: 0h
                  • Logged: 40m