Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
0.17.1
-
None
-
None
Description
See discussion https://github.com/apache/arrow/pull/7410#discussion_r439704017
Testing against random data fuzzes kernel implementations and provides sanity checks across a wide swath of parameter space with minimal configuration. Currently, our random tests carry a lot of boilerplate for generating inputs and a lot of ad-hoc code for computing expected values. It might be worthwhile to have an interface for specifying randomized tests more uniformly.
Since kernels provide introspection of their input and output types, we can generate inputs of those types (both scalar and array). For ScalarFunctions, a function with signature Result<std::shared_ptr<Scalar>>(const ScalarVector& args, const FunctionOptions*) is sufficient to specify the expected behavior: the expected output can be generated by applying that specification element-wise to the broadcast inputs.
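To make the idea concrete, here is a minimal standalone sketch of "scalar specification applied to broadcast inputs". It deliberately does not use Arrow: ToyScalar, ToyArray, ToyDatum, ScalarSpec, RandomArray, ExpectedOutput and AddSpec are all hypothetical stand-ins invented for illustration, FunctionOptions is omitted, and the null-propagating addition is only an assumed example of a kernel's expected behavior. It is not the implementation from the linked PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <optional>
#include <random>
#include <variant>
#include <vector>

// Toy stand-ins (NOT Arrow types): a nullable int64 scalar, an array of
// them, and a datum that is either one scalar or one array.
using ToyScalar = std::optional<int64_t>;
using ToyArray = std::vector<ToyScalar>;
using ToyDatum = std::variant<ToyScalar, ToyArray>;

// Scalar-level specification of expected behavior, similar in shape to the
// signature proposed above (options omitted for brevity).
using ScalarSpec = std::function<ToyScalar(const std::vector<ToyScalar>&)>;

// Generate a random array; each slot is null with the given probability.
ToyArray RandomArray(std::size_t length, double null_probability,
                     std::mt19937_64& rng) {
  std::uniform_int_distribution<int64_t> values(-1000, 1000);
  std::bernoulli_distribution is_null(null_probability);
  ToyArray out;
  out.reserve(length);
  for (std::size_t i = 0; i < length; ++i) {
    if (is_null(rng)) {
      out.push_back(std::nullopt);
    } else {
      out.push_back(values(rng));
    }
  }
  return out;
}

// Element i of a datum. A scalar broadcasts: it yields the same value for
// every index.
ToyScalar At(const ToyDatum& datum, std::size_t i) {
  if (const auto* scalar = std::get_if<ToyScalar>(&datum)) return *scalar;
  return std::get<ToyArray>(datum)[i];
}

// Compute the expected output by applying the scalar specification to each
// row of the broadcast inputs.
ToyArray ExpectedOutput(const ScalarSpec& spec,
                        const std::vector<ToyDatum>& args,
                        std::size_t length) {
  for (const auto& arg : args) {
    if (const auto* array = std::get_if<ToyArray>(&arg)) {
      assert(array->size() == length);  // all array arguments share a length
    }
  }
  ToyArray expected;
  expected.reserve(length);
  for (std::size_t i = 0; i < length; ++i) {
    std::vector<ToyScalar> row;
    row.reserve(args.size());
    for (const auto& arg : args) row.push_back(At(arg, i));
    expected.push_back(spec(row));
  }
  return expected;
}

// Assumed example specification: addition where any null input yields null.
ToyScalar AddSpec(const std::vector<ToyScalar>& args) {
  int64_t sum = 0;
  for (const auto& arg : args) {
    if (!arg) return std::nullopt;
    sum += *arg;
  }
  return sum;
}
```

A randomized test would then generate inputs with RandomArray (mixing in scalar arguments), run the kernel under test on them, and compare its result against ExpectedOutput(AddSpec, args, length). Only the per-row specification has to be written by hand; input generation and broadcasting are shared across tests.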
Example impl https://github.com/apache/arrow/pull/7410/commits/908504810ee9332f651cceb33a8b40f253383efe