Calcite / CALCITE-3760

Rewriting non-deterministic function can break query semantics

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Component/s: core

    Description

      Calcite rewrites some SqlFunctions during validation, but it does not consider whether the function being rewritten is deterministic. For a non-deterministic operator, the rewrite can break query semantics. Additionally, there is no interface for users to specify whether a UDF/UDAF is deterministic.

      Say I have a non-deterministic UDF and a non-deterministic UDAF, and run SQL like the following:

      select coalesce(udf(col0), 100) from foo;
      select nullif(udaf(col0), 1024) from foo;

      They will be rewritten as:

      select case when udf(col0) is not null then udf(col0) else 100 end
      from foo;

      select case when udaf(col0) = 1024 then null else udaf(col0) end
      from foo;

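      As we can see, the non-deterministic UDF and UDAF are each called more than once after the rewrite. The value returned by the THEN or ELSE branch may therefore differ from the value that was tested in the WHEN clause, so the condition is no longer guaranteed to hold for the result.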
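
      The effect can be reproduced outside Calcite with a minimal sketch. Everything here is hypothetical and only for illustration: `FlakyUdf` stands in for a non-deterministic UDF (it alternates between a value and null), and `coalesceOnce` / `coalesceRewritten` mimic the original COALESCE call and its CASE rewrite. None of these names are Calcite APIs.

```java
import java.util.function.Supplier;

class CoalesceRewriteDemo {
    /** Hypothetical non-deterministic UDF: returns 42 on odd calls, null on even calls. */
    static final class FlakyUdf implements Supplier<Integer> {
        int calls = 0;

        @Override
        public Integer get() {
            calls++;
            return (calls % 2 == 1) ? Integer.valueOf(42) : null;
        }
    }

    /** COALESCE(udf(), fallback): the UDF is evaluated exactly once. */
    static Integer coalesceOnce(Supplier<Integer> udf, int fallback) {
        Integer v = udf.get();
        if (v != null) {
            return v;
        }
        return fallback;
    }

    /** CASE WHEN udf() IS NOT NULL THEN udf() ELSE fallback END: the UDF may be evaluated twice. */
    static Integer coalesceRewritten(Supplier<Integer> udf, int fallback) {
        if (udf.get() != null) {
            // Second, independent evaluation: it need not match the value tested above.
            return udf.get();
        }
        return fallback;
    }

    public static void main(String[] args) {
        FlakyUdf a = new FlakyUdf();
        System.out.println("single evaluation: " + coalesceOnce(a, 100) + " (calls=" + a.calls + ")");

        FlakyUdf b = new FlakyUdf();
        System.out.println("rewritten form:    " + coalesceRewritten(b, 100) + " (calls=" + b.calls + ")");
    }
}
```

      With this stand-in UDF, the single-evaluation form yields 42 after one call, while the rewritten form calls the UDF twice and yields null, a result COALESCE with a non-null fallback should never produce.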

      We need to provide an interface for users to specify whether a UDF/UDAF is deterministic, and the rewrite should take into account whether a SqlNode is deterministic.
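
      One possible shape for such a check is sketched below. This is not Calcite code and not a proposed patch: `Expr`, its `deterministic` flag, and `rewriteCoalesce` are hypothetical names invented for illustration. The assumption is that each operator carries a user-declared determinism flag, and that an expression counts as deterministic only if its operator and all of its operands are.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class DeterminismAwareRewrite {
    /** Hypothetical expression node; a stand-in for illustration, not Calcite's SqlNode. */
    static final class Expr {
        final String op;             // operator name, column name, or literal text
        final boolean deterministic; // determinism declared for this operator itself
        final List<Expr> args;

        Expr(String op, boolean deterministic, Expr... args) {
            this.op = op;
            this.deterministic = deterministic;
            this.args = Arrays.asList(args);
        }

        /** Deterministic only if the operator and every operand are deterministic. */
        boolean isDeterministic() {
            return deterministic && args.stream().allMatch(Expr::isDeterministic);
        }

        @Override
        public String toString() {
            if (args.isEmpty()) {
                return op;
            }
            return op + args.stream().map(Expr::toString).collect(Collectors.joining(", ", "(", ")"));
        }
    }

    /** Expands COALESCE(a, b) into CASE only when duplicating the first operand is safe. */
    static String rewriteCoalesce(Expr call) {
        Expr first = call.args.get(0);
        Expr second = call.args.get(1);
        if (!first.isDeterministic()) {
            // Keep the call as-is so the first operand is evaluated once.
            return call.toString();
        }
        return "CASE WHEN " + first + " IS NOT NULL THEN " + first + " ELSE " + second + " END";
    }

    public static void main(String[] args) {
        Expr col = new Expr("col0", true);
        Expr lit = new Expr("100", true);
        Expr safe = new Expr("COALESCE", true, new Expr("udf", true, col), lit);
        Expr unsafe = new Expr("COALESCE", true, new Expr("udf", false, col), lit);
        System.out.println(rewriteCoalesce(safe));
        System.out.println(rewriteCoalesce(unsafe));
    }
}
```

      In this sketch the non-deterministic call is simply left unexpanded. An alternative would be to evaluate the operand once and reuse the value in both branches, but that depends on how the planner represents shared subexpressions.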

            People

              jinxing6042@126.com Jin Xing
              Votes: 0
              Watchers: 12

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 20m