Apache Arrow / ARROW-3053

[Python] Pandas decimal conversion segfault

      Description

      The following example segfaults when converting a pandas DataFrame that has a decimal column and at least one other object column to a pyarrow Table after a round trip through HDF5:

      import decimal
      import pandas as pd
      import pyarrow as pa
      
      data = {'a': {0: 'a'}, 'b': {0: decimal.Decimal('0.0')}}
      
      df = pd.DataFrame.from_dict(data)
      df.to_hdf('test.h5', 'test')
      df = pd.read_hdf('test.h5', 'test')
      
      table = pa.Table.from_pandas(df)
      

      This is the gdb backtrace:

      #0 0x00007f188a08fc0b in arrow::py::internal::PandasObjectIsNull(_object*) () from /home/ashieh/.local/lib/python2.7/site-packages/pyarrow/libarrow_python.so.10
      #1 0x00007f188a09931c in arrow::py::NumPyConverter::ConvertDecimals() () from /home/ashieh/.local/lib/python2.7/site-packages/pyarrow/libarrow_python.so.10
      #2 0x00007f188a09ef4b in arrow::py::NumPyConverter::ConvertObjectsInfer() () from /home/ashieh/.local/lib/python2.7/site-packages/pyarrow/libarrow_python.so.10
      #3 0x00007f188a09f5db in arrow::py::NumPyConverter::ConvertObjects() () from /home/ashieh/.local/lib/python2.7/site-packages/pyarrow/libarrow_python.so.10
      #4 0x00007f188a09f715 in arrow::py::NumPyConverter::Convert() () from /home/ashieh/.local/lib/python2.7/site-packages/pyarrow/libarrow_python.so.10
      #5 0x00007f188a0a0f5e in arrow::py::NdarrayToArrow(arrow::MemoryPool*, _object*, _object*, bool, std::shared_ptr<arrow::DataType> const&, std::shared_ptr<arrow::ChunkedArray>*) () from /home/ashieh/.local/lib/python2.7/site-packages/pyarrow/libarrow_python.so.10
      #6 0x00007f188ab1a13e in __pyx_pw_7pyarrow_3lib_79array(_object*, _object*, _object*) () from /home/ashieh/.local/lib/python2.7/site-packages/pyarrow/lib.so
      #7 0x00000000004c37ed in PyEval_EvalFrameEx ()
      #8 0x00000000004b9ab6 in PyEval_EvalCodeEx ()
      #9 0x00000000004c1e6f in PyEval_EvalFrameEx ()
      #10 0x00000000004b9ab6 in PyEval_EvalCodeEx ()
      #11 0x00000000004d55f3 in ?? ()
      #12 0x00007f188aa75eac in __pyx_pw_7pyarrow_3lib_5Table_17from_pandas(_object*, _object*, _object*) () from /home/ashieh/.local/lib/python2.7/site-packages/pyarrow/lib.so
      #13 0x00000000004bc3fa in PyEval_EvalFrameEx ()
      #14 0x00000000004b9ab6 in PyEval_EvalCodeEx ()
      #15 0x00000000004eb30f in ?? ()
      #16 0x00000000004e5422 in PyRun_FileExFlags ()
      #17 0x00000000004e3cd6 in PyRun_SimpleFileExFlags ()
      #18 0x0000000000493ae2 in Py_Main ()
      #19 0x00007f18a79c4830 in __libc_start_main (main=0x4934c0 <main>, argc=2, argv=0x7fffcf079508, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffcf0794f8) at ../csu/libc-start.c:291
      #20 0x00000000004933e9 in _start ()
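
The backtrace ends in the object-column inference path (`ConvertObjectsInfer`, then `ConvertDecimals`, then `PandasObjectIsNull`), so it may help to check which Python types the object columns actually hold before they are handed to pyarrow. The following is a minimal, stdlib-only sketch of such a check. The helper name `summarize_object_types` is hypothetical and is not part of pandas or pyarrow; the sample values simply mirror the two columns in the report.

```python
import decimal
from collections import Counter


def summarize_object_types(values):
    """Count the Python type name of each element in an iterable of objects.

    Hypothetical diagnostic helper, not a pandas or pyarrow API.
    """
    return Counter(type(v).__name__ for v in values)


# Sample values mirroring columns 'a' and 'b' from the report.
columns = {
    "a": ["a"],
    "b": [decimal.Decimal("0.0")],
}

for name, values in columns.items():
    print(name, dict(summarize_object_types(values)))
```

With a real DataFrame, one could pass `df[col].tolist()` for each object-dtype column, once before `to_hdf` and once after `read_hdf`, and compare the two summaries to see whether the round trip changed the element types.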
      

              People

              Assignee: Wes McKinney (wesmckinn)
              Reporter: Albert Shieh (adshieh)

                  Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 3h 10m