Uploaded image for project: 'Apache Arrow'
  1. Apache Arrow
  2. ARROW-7781

[C++][Dataset] Filtering on a non-existent column gives a segfault

    XMLWordPrintableJSON

Details

    Description

      Example with python code:

      In [1]: import pandas as pd
      
      In [2]: df = pd.DataFrame({'a': [1, 2, 3]})
      
      In [3]: df.to_parquet("test-filter-crash.parquet")
      
      In [4]: import pyarrow.dataset as ds
      
      In [5]: dataset = ds.dataset("test-filter-crash.parquet")
      
      In [6]: dataset.to_table(filter=ds.field('a') > 1).to_pandas()
      Out[6]:
         a
      0  2
      1  3
      
      In [7]: dataset.to_table(filter=ds.field('b') > 1).to_pandas()
      ../src/arrow/dataset/filter.cc:929:  Check failed: _s.ok() Operation failed: maybe_value.status()
      Bad status: Invalid: attempting to cast non-null scalar to NullScalar
      /home/joris/miniconda3/envs/arrow-dev/lib/libarrow.so.16(+0x11f744c)[0x7fb1390f444c]
      /home/joris/miniconda3/envs/arrow-dev/lib/libarrow.so.16(+0x11f73ca)[0x7fb1390f43ca]
      /home/joris/miniconda3/envs/arrow-dev/lib/libarrow.so.16(+0x11f73ec)[0x7fb1390f43ec]
      /home/joris/miniconda3/envs/arrow-dev/lib/libarrow.so.16(_ZN5arrow4util8ArrowLogD1Ev+0x57)[0x7fb1390f4759]
      /home/joris/miniconda3/envs/arrow-dev/lib/libarrow_dataset.so.16(+0x169fc6)[0x7fb145594fc6]
      /home/joris/miniconda3/envs/arrow-dev/lib/libarrow_dataset.so.16(+0x16b9be)[0x7fb1455969be]
      /home/joris/miniconda3/envs/arrow-dev/lib/libarrow_dataset.so.16(_ZN5arrow7dataset15VisitExpressionINS0_23InsertImplicitCastsImplEEEDTclfp0_fp_EERKNS0_10ExpressionEOT_+0x2ae)[0x7fb1455a0dee]
      /home/joris/miniconda3/envs/arrow-dev/lib/libarrow_dataset.so.16(_ZN5arrow7dataset19InsertImplicitCastsERKNS0_10ExpressionERKNS_6SchemaE+0x44)[0x7fb145596d4e]
      /home/joris/scipy/repos/arrow/python/pyarrow/_dataset.cpython-37m-x86_64-linux-gnu.so(+0x48286)[0x7fb1456dd286]
      /home/joris/scipy/repos/arrow/python/pyarrow/_dataset.cpython-37m-x86_64-linux-gnu.so(+0x49220)[0x7fb1456de220]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(+0x170f37)[0x55e5127e1f37]
      /home/joris/scipy/repos/arrow/python/pyarrow/_dataset.cpython-37m-x86_64-linux-gnu.so(+0x22bd6)[0x7fb1456b7bd6]
      /home/joris/scipy/repos/arrow/python/pyarrow/_dataset.cpython-37m-x86_64-linux-gnu.so(+0x33b81)[0x7fb1456c8b81]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(_PyMethodDef_RawFastCallKeywords+0x305)[0x55e5127d9c75]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(_PyCFunction_FastCallKeywords+0x21)[0x55e5127d9cf1]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(_PyEval_EvalFrameDefault+0x5460)[0x55e512847c40]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(_PyEval_EvalCodeWithName+0x2f9)[0x55e5127881a9]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(PyEval_EvalCodeEx+0x44)[0x55e512789064]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(PyEval_EvalCode+0x1c)[0x55e51278908c]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(+0x1e1650)[0x55e512852650]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(_PyMethodDef_RawFastCallKeywords+0xe9)[0x55e5127d9a59]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(_PyCFunction_FastCallKeywords+0x21)[0x55e5127d9cf1]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(_PyEval_EvalFrameDefault+0x48e4)[0x55e5128470c4]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(_PyGen_Send+0x2a2)[0x55e5127e31a2]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(_PyEval_EvalFrameDefault+0x1a83)[0x55e512844263]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(_PyGen_Send+0x2a2)[0x55e5127e31a2]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(_PyEval_EvalFrameDefault+0x1a83)[0x55e512844263]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(_PyGen_Send+0x2a2)[0x55e5127e31a2]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(_PyMethodDef_RawFastCallKeywords+0x8c)[0x55e5127d99fc]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(_PyMethodDescr_FastCallKeywords+0x4f)[0x55e5127e1fdf]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(_PyEval_EvalFrameDefault+0x4ddc)[0x55e5128475bc]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(_PyFunction_FastCallKeywords+0xfb)[0x55e5127d915b]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(_PyEval_EvalFrameDefault+0x416)[0x55e512842bf6]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(_PyFunction_FastCallKeywords+0xfb)[0x55e5127d915b]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(_PyEval_EvalFrameDefault+0x6f3)[0x55e512842ed3]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(_PyEval_EvalCodeWithName+0x2f9)[0x55e5127881a9]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(_PyFunction_FastCallKeywords+0x387)[0x55e5127d93e7]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(_PyEval_EvalFrameDefault+0x14e4)[0x55e512843cc4]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(_PyEval_EvalCodeWithName+0x2f9)[0x55e5127881a9]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(_PyFunction_FastCallKeywords+0x325)[0x55e5127d9385]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(_PyEval_EvalFrameDefault+0x6f3)[0x55e512842ed3]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(_PyEval_EvalCodeWithName+0x2f9)[0x55e5127881a9]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(_PyFunction_FastCallKeywords+0x325)[0x55e5127d9385]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(_PyEval_EvalFrameDefault+0x6f3)[0x55e512842ed3]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(_PyFunction_FastCallKeywords+0xfb)[0x55e5127d915b]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(_PyEval_EvalFrameDefault+0x6f3)[0x55e512842ed3]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(_PyEval_EvalCodeWithName+0x2f9)[0x55e5127881a9]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(_PyFunction_FastCallDict+0x400)[0x55e5127894a0]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(_PyObject_Call_Prepend+0x63)[0x55e5127a8393]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(PyObject_Call+0x6e)[0x55e51279adce]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(_PyEval_EvalFrameDefault+0x1f5b)[0x55e51284473b]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(_PyEval_EvalCodeWithName+0x2f9)[0x55e5127881a9]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(_PyFunction_FastCallKeywords+0x387)[0x55e5127d93e7]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(_PyEval_EvalFrameDefault+0x416)[0x55e512842bf6]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(_PyEval_EvalCodeWithName+0x2f9)[0x55e5127881a9]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(PyEval_EvalCodeEx+0x44)[0x55e512789064]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(PyEval_EvalCode+0x1c)[0x55e51278908c]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(+0x230344)[0x55e5128a1344]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(PyRun_FileExFlags+0xa1)[0x55e5128ab5c1]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(PyRun_SimpleFileExFlags+0x1c3)[0x55e5128ab7b3]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(+0x23b8cf)[0x55e5128ac8cf]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(_Py_UnixMain+0x3c)[0x55e5128ac9ec]
      /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7)[0x7fb1576bfb97]
      /home/joris/miniconda3/envs/arrow-dev/bin/python(+0x1e171d)[0x55e51285271d]
      Aborted (core dumped)
      

      which is not very nice

      Attachments

        Issue Links

          Activity

            People

              fsaintjacques Francois Saint-Jacques
              jorisvandenbossche Joris Van den Bossche
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated:
                  Original Estimate - Not Specified
                  Not Specified
                  Remaining:
                  Remaining Estimate - 0h
                  0h
                  Logged:
                  Time Spent - 0.5h
                  0.5h