Apache Arrow / ARROW-4885

[Python] read_csv() can't handle decimal128 columns

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.12.1
    • Fix Version/s: 0.14.0
    • Component/s: C++, Python
    • Environment:
      Python: 3.7.2, 2.7.15
      PyArrow: 0.12.1
      OS: MacOS 10.13.4 (High Sierra)

      Description

      Summary

      read_csv() raises ArrowNotImplementedError when a decimal128 column type is passed in its convert options. The cause is that the CSV reader has no converter for decimal types.

      To Reproduce

      1) First, create a CSV file like so and save it somewhere:

      Header
      123.45
      

      2) Run the following code on Python 2 or 3:

      import pyarrow as pa
      import pyarrow.csv as pa_csv
      
      # Ask the CSV reader to parse the 'Header' column as decimal128(11, 2)
      types = {'Header': pa.decimal128(11, 2)}
      convert_options = pa_csv.ConvertOptions(column_types=types)
      # Path to the CSV file saved in step 1
      pa_csv.read_csv('/home/dargueta/Desktop/test.csv', convert_options=convert_options)
      

      read_csv() fails with the following exception:

      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
        File "pyarrow/_csv.pyx", line 397, in pyarrow._csv.read_csv
        File "pyarrow/error.pxi", line 89, in pyarrow.lib.check_status
      pyarrow.lib.ArrowNotImplementedError: CSV conversion to decimal(11, 2) is not supported
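
On affected versions, one possible workaround is to parse the column with the standard library and build the decimal array by hand. This is only a sketch and is not the fix that shipped in 0.14.0. It assumes Python 3, and the helper names read_decimal_column and to_decimal128_array are hypothetical, chosen for illustration. pyarrow is imported lazily, so the parsing helper can be tried without it.

```python
import csv
import io
from decimal import Decimal


def read_decimal_column(csv_text, column):
    """Parse one column of CSV text into a list of decimal.Decimal values."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [Decimal(row[column]) for row in reader]


def to_decimal128_array(values, precision, scale):
    """Build an Arrow decimal128 array from Decimal values (requires pyarrow)."""
    import pyarrow as pa  # lazy import: only needed for this step
    return pa.array(values, type=pa.decimal128(precision, scale))


# Same content as the CSV file from step 1
values = read_decimal_column("Header\n123.45\n", "Header")
print(values)  # [Decimal('123.45')]
```

With pyarrow installed, to_decimal128_array(values, 11, 2) should give an array of the same type that the failing read_csv() call was asked to produce.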
      

              People

              • Assignee: Micah Kornfield (emkornfield@gmail.com)
              • Reporter: Diego Argueta (yiannisliodakis)
              • Votes: 0
              • Watchers: 2

                Dates

                • Created:
                  Updated:
                  Resolved:

                  Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 1.5h