Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Description
I triggered these from R, so the reproducers are in R.
1. Calling "filter" with no args segfaults.
arrow:::compute__CallFunction("filter", list(), list(keep_na = FALSE))
Top of the backtrace from lldb:
* frame #0: 0x0000000109e1c2c7 libarrow.100.dylib`arrow::Datum::type() const + 7
  frame #1: 0x000000010a14a232 libarrow.100.dylib`arrow::compute::internal::(anonymous namespace)::FilterMetaFunction::ExecuteImpl(std::__1::vector<arrow::Datum, std::__1::allocator<arrow::Datum> > const&, arrow::compute::FunctionOptions const*, arrow::compute::ExecContext*) const + 66
  frame #2: 0x0000000109fc32c9 libarrow.100.dylib`arrow::compute::MetaFunction::Execute(std::__1::vector<arrow::Datum, std::__1::allocator<arrow::Datum> > const&, arrow::compute::FunctionOptions const*, arrow::compute::ExecContext*) const + 41
  frame #3: 0x0000000109fb3d3c libarrow.100.dylib`arrow::compute::CallFunction(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::vector<arrow::Datum, std::__1::allocator<arrow::Datum> > const&, arrow::compute::FunctionOptions const*, arrow::compute::ExecContext*) + 844
  frame #4: 0x0000000109fb3c47 libarrow.100.dylib`arrow::compute::CallFunction(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::vector<arrow::Datum, std::__1::allocator<arrow::Datum> > const&, arrow::compute::FunctionOptions const*, arrow::compute::ExecContext*) + 599
This is not the case with at least some other functions. If I call "sum" with no args, I get "Invalid: Function accepts 1 arguments but passed 0" and no segfault.
2. Something is strange with is_null. It creates what appears to be a valid boolean array, but if I pass it to filter, it segfaults. I'm adding bindings for this in ARROW-9187 but this should run on current master:
library(arrow)
a <- Array$create(1:4)
b <- arrow:::shared_ptr(Array, arrow:::call_function("is_null", a))
a$Filter(b)
Backtrace:
* frame #0: 0x000000010a120bb6 libarrow.100.dylib`arrow::compute::internal::GetFilterOutputSize(arrow::ArrayData const&, arrow::compute::FilterOptions::NullSelectionBehavior) + 38
  frame #1: 0x000000010a125659 libarrow.100.dylib`arrow::compute::internal::(anonymous namespace)::PrimitiveFilter(arrow::compute::KernelContext*, arrow::compute::ExecBatch const&, arrow::Datum*) + 121
  frame #2: 0x0000000109fbbea4 libarrow.100.dylib`arrow::compute::detail::VectorExecutor::ExecuteBatch(arrow::compute::ExecBatch const&, arrow::compute::detail::ExecListener*) + 996
  frame #3: 0x0000000109fba3e6 libarrow.100.dylib`arrow::compute::detail::VectorExecutor::Execute(std::__1::vector<arrow::Datum, std::__1::allocator<arrow::Datum> > const&, arrow::compute::detail::ExecListener*) + 150
  frame #4: 0x0000000109fc0948 libarrow.100.dylib`arrow::compute::Function::Execute(std::__1::vector<arrow::Datum, std::__1::allocator<arrow::Datum> > const&, arrow::compute::FunctionOptions const*, arrow::compute::ExecContext*) const + 1016
  frame #5: 0x0000000109fb3d3c libarrow.100.dylib`arrow::compute::CallFunction(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::vector<arrow::Datum, std::__1::allocator<arrow::Datum> > const&, arrow::compute::FunctionOptions const*, arrow::compute::ExecContext*) + 844
  frame #6: 0x000000010a14a9b5 libarrow.100.dylib`arrow::compute::internal::(anonymous namespace)::FilterMetaFunction::ExecuteImpl(std::__1::vector<arrow::Datum, std::__1::allocator<arrow::Datum> > const&, arrow::compute::FunctionOptions const*, arrow::compute::ExecContext*) const + 1989
  frame #7: 0x0000000109fc32c9 libarrow.100.dylib`arrow::compute::MetaFunction::Execute(std::__1::vector<arrow::Datum, std::__1::allocator<arrow::Datum> > const&, arrow::compute::FunctionOptions const*, arrow::compute::ExecContext*) const + 41
  frame #8: 0x0000000109fb3d3c libarrow.100.dylib`arrow::compute::CallFunction(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::vector<arrow::Datum, std::__1::allocator<arrow::Datum> > const&, arrow::compute::FunctionOptions const*, arrow::compute::ExecContext*) + 844
  frame #9: 0x0000000109fb3c47 libarrow.100.dylib`arrow::compute::CallFunction(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::vector<arrow::Datum, std::__1::allocator<arrow::Datum> > const&, arrow::compute::FunctionOptions const*, arrow::compute::ExecContext*) + 599
BUT: if I call as.vector on b before using it as a filter, it works, even though I've discarded the as.vector result and am still using the Array to filter.
library(arrow)
a <- Array$create(1:4)
b <- arrow:::shared_ptr(Array, arrow:::call_function("is_null", a))
as.vector(b)
a$Filter(b)
Just printing (calling ToString) on b doesn't prevent the segfault. And I have not observed this with other boolean kernels. E.g. this does not segfault:
library(arrow)
a <- Array$create(1:4)
b <- arrow:::shared_ptr(Array, arrow:::call_function("greater", a, Scalar$create(3L)))
a$Filter(b)
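For the first issue, the backtrace shows arrow::Datum::type() being reached from FilterMetaFunction::ExecuteImpl, which is consistent with the argument list being read before its length is checked (that reading is my assumption, not something confirmed here). The following is only a minimal sketch of the kind of arity check that "sum" appears to perform. FakeDatum, CheckArity, and DispatchFilter are hypothetical names made up for illustration and are not Arrow's actual code; the error text mirrors the message quoted above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for arrow::Datum; NOT the real class.
struct FakeDatum {
  std::string type_name;
};

// Returns true when the argument count matches. Otherwise fills *error with
// a message shaped like the one "sum" reports. Illustrative helper only.
bool CheckArity(std::size_t expected, const std::vector<FakeDatum>& args,
                std::string* error) {
  if (args.size() != expected) {
    *error = "Invalid: Function accepts " + std::to_string(expected) +
             " arguments but passed " + std::to_string(args.size());
    return false;
  }
  return true;
}

// Sketch of a dispatch routine that validates before touching args[0].
// "filter" takes two arguments: the values and the boolean selection.
std::string DispatchFilter(const std::vector<FakeDatum>& args) {
  std::string error;
  if (!CheckArity(2, args, &error)) {
    return error;  // report the problem instead of reading an empty vector
  }
  return "ok: filtering " + args[0].type_name;
}
```

With a check like this, calling DispatchFilter with an empty vector returns an error string rather than dereferencing a nonexistent first argument.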