Uploaded image for project: 'Apache Trafodion (Retired)'
  1. Apache Trafodion (Retired)
  2. TRAFODION-2912

Non-deterministic scalar UDFs not executed once per row


    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • 2.0-incubating
    • 2.3
    • sql-cmp
    • None


      This problem was found by Andy Yang.

      Andy created a random generator scalar UDF and found that it did not return a different random value for each row:

      >>select scalar_rand_udf(), scalar_rand_udf()
      +>from (values (1), (2), (3)) T(s);
      RND RND 
      ----------- -----------
      846930886 1804289383
      846930886 1804289383
      846930886 1804289383
      --- 3 row(s) selected.

      Here is the explain, it shows that we are using hash joins, not nested joins, to evaluate the UDFs:

      >>explain options 'f' s;
      LC   RC   OP   OPERATOR              OPT       DESCRIPTION           CARD
      ---- ---- ---- --------------------  --------  --------------------  ---------
      5    .    6    root                                                  3.00E+000
      4    1    5    hybrid_hash_join                                      3.00E+000
      3    2    4    hybrid_hash_join                                      1.00E+000
      .    .    3    isolated_scalar_udf             SCALAR_RAND_UDF       1.00E+000
      .    .    2    isolated_scalar_udf             SCALAR_RAND_UDF       1.00E+000
      .    .    1    tuplelist                                             3.00E+000
      --- SQL operation complete.

      The problem is that we don't check for non-deterministic UDFs when we transform a TSJ to a regular join in the transformer or normalizer. We don't even set the non-deterministic flag in the group attributes of the IsolatedScalarUDF node.

      The fix is to set this flag correctly and to add a check and not transform routine joins for non-deterministic isolated scalar UDFs into a regular join.

      To recreate:

      Here is the source code of the UDF:

      #include "sqludr.h"
      #include <stdlib.h>
      SQLUDR_LIBFUNC SQLUDR_INT32 scalar_rand_udf(SQLUDR_INT32 *out1,
                                                  SQLUDR_INT16 *outInd1,
        if (calltype == SQLUDR_CALLTYPE_FINAL)
          return SQLUDR_SUCCESS;
        (*out1) = rand();
        return SQLUDR_SUCCESS;

      Compile the UDF:

      gcc -g -Wall -I$TRAF_HOME/export/include/sql -shared -fPIC -o scalar_rand_udf.so scalar_rand_udf.c

      Create the UDF and run it:

      drop function scalar_rand_udf;
      drop library scalar_rand_udf_lib;
      create library scalar_rand_udf_lib
       file '/home/zellerh/src/scalar_rand_udf/scalar_rand_udf.so';
      create function scalar_rand_udf() returns (rnd int)
        external name 'scalar_rand_udf' library scalar_rand_udf_lib
        not deterministic no sql no transaction required;
      prepare s from
      select scalar_rand_udf(), scalar_rand_udf()
      from (values (1), (2), (3)) T(s);
      explain options 'f' s;
      execute s;


        Issue Links



              hzeller Hans Zeller
              hzeller Hans Zeller
              0 Vote for this issue
              2 Stop watching this issue



                Issue deployment