Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
Description
In several places in the Arrow code, invocations of Expression::Bind() leave the ExecContext argument at its default. As a result, the default function registry is used for expression manipulation. This becomes a problem when the user wants to use a non-default function registry, e.g., by passing one to the ExecContext of an ExecPlan, which is how I discovered this issue. The problematic Expression::Bind() invocations I found are in:
- cpp/src/arrow/dataset/file_parquet.cc
- cpp/src/arrow/dataset/scanner.cc
- cpp/src/arrow/compute/exec/project_node.cc
- cpp/src/arrow/compute/exec/hash_join_node.cc
- cpp/src/arrow/compute/exec/filter_node.cc
There are also other places in test and benchmark code (grep for 'Bind()').
Another case of bad defaulting of an ExecContext argument is in Inequality::simplifies_to in cpp/src/arrow/compute/exec/expression.cc, where a fresh ExecContext is created and passed to BindNonRecursive instead of being received from the caller.
I'd argue that an ExecContext variable should not be allowed to default, except perhaps in the highest-level/user-facing APIs.
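To illustrate the failure mode, here is a minimal, self-contained sketch. It does not use Arrow's real API: Registry, Context, Call, BindDefaulted, BindExplicit, FilterNodeBuggy and FilterNodeFixed are all hypothetical stand-ins, chosen only to mirror the roles of the function registry, ExecContext, an expression, Expression::Bind() and an exec node. It shows how a defaulted context argument silently drops a caller-supplied registry, and how requiring the argument avoids that.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical stand-in for a function registry (not Arrow's class).
struct Registry {
  std::set<std::string> functions;
  bool Has(const std::string& name) const { return functions.count(name) > 0; }
};

// Process-wide default registry holding only "built-in" functions.
inline Registry* DefaultRegistry() {
  static Registry registry{std::set<std::string>{"add", "equal"}};
  return &registry;
}

// Hypothetical stand-in for an execution context.
// A null registry means "use the default registry".
struct Context {
  explicit Context(Registry* r = nullptr)
      : registry(r != nullptr ? r : DefaultRegistry()) {}
  Registry* registry;
};

// Hypothetical stand-in for an expression that calls one function.
struct Call {
  std::string function_name;
};

// Pitfall: when ctx is omitted, a fresh context with the default
// registry is used. Returns true if the function could be resolved.
inline bool BindDefaulted(const Call& call, Context* ctx = nullptr) {
  Context fallback;
  if (ctx == nullptr) ctx = &fallback;
  return ctx->registry->Has(call.function_name);
}

// Safer internal API: the context is a required argument.
inline bool BindExplicit(const Call& call, Context* ctx) {
  assert(ctx != nullptr);
  return ctx->registry->Has(call.function_name);
}

// An internal node that forgets to forward the plan's context.
inline bool FilterNodeBuggy(const Call& filter, Context* plan_ctx) {
  (void)plan_ctx;  // the caller's registry is silently ignored
  return BindDefaulted(filter);
}

// The same node, forwarding the plan's context to the bind step.
inline bool FilterNodeFixed(const Call& filter, Context* plan_ctx) {
  return BindExplicit(filter, plan_ctx);
}
```

In this sketch, a plan context built around a custom registry that contains "my_udf" fails to resolve that function through FilterNodeBuggy but resolves it through FilterNodeFixed. With a required parameter, the forgotten forwarding in FilterNodeBuggy would not compile, which is the practical argument for not defaulting the context below the user-facing API.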