Uploaded image for project: 'IMPALA'
  1. IMPALA
  2. IMPALA-4779

Conditional functions isfalse(), istrue(), isnotfalse() and isnottrue() don't work with codegen

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: Impala 2.3.0, Impala 2.5.0, Impala 2.4.0, Impala 2.6.0, Impala 2.7.0, Impala 2.8.0
    • Fix Version/s: Impala 2.9.0
    • Component/s: Backend
    • Labels:

      Description

      As titled, the conditional functions isfalse(), istrue(), isnotfalse(), isnottrue() can only be interpreted as they are cross-compiled. This will lead to unexpected query failure when they are used in expressions in ExecNodes which are codegen'ed.

      [localhost:21000] > select isfalse(true);
      Query: select isfalse(true)
      Query submitted at: 2017-01-18 10:24:52 (Coordinator: http://kwho-desktop:25000)
      Query progress can be monitored at: http://kwho-desktop:25000/query_plan?query_id=a3455b5e54215a5c:2468bbc00000000
      +---------------+
      | isfalse(true) |
      +---------------+
      | false         |
      +---------------+
      Fetched 1 row(s) in 0.17s
      [localhost:21000] > set enable_expr_rewrites=0;
      ENABLE_EXPR_REWRITES set to 0
      [localhost:21000] > set EXEC_SINGLE_NODE_ROWS_THRESHOLD=0;
      EXEC_SINGLE_NODE_ROWS_THRESHOLD set to 0
      [localhost:21000] > select isfalse(true);
      Query: select isfalse(true)
      Query submitted at: 2017-01-18 10:25:31 (Coordinator: http://kwho-desktop:25000)
      ERROR: Builtin 'isfalse' with symbol '_ZN6impala20ConditionalFunctions7IsFalseEPN10impala_udf15FunctionContextERKNS1_10BooleanValE' does not exist. Verify that all your impalads are the same version.
      
      [localhost:21000] > select isnotfalse(true);
      Query: select isnotfalse(true)
      Query submitted at: 2017-01-18 10:25:46 (Coordinator: http://kwho-desktop:25000)
      ERROR: Builtin 'isnotfalse' with symbol '_ZN6impala20ConditionalFunctions10IsNotFalseEPN10impala_udf15FunctionContextERKNS1_10BooleanValE' does not exist. Verify that all your impalads are the same version.
      
      [localhost:21000] > select istrue(true);
      Query: select istrue(true)
      Query submitted at: 2017-01-18 10:25:51 (Coordinator: http://kwho-desktop:25000)
      ERROR: Builtin 'istrue' with symbol '_ZN6impala20ConditionalFunctions6IsTrueEPN10impala_udf15FunctionContextERKNS1_10BooleanValE' does not exist. Verify that all your impalads are the same version.
      
      [localhost:21000] > select isnottrue(true);
      Query: select isnottrue(true)
      Query submitted at: 2017-01-18 10:25:55 (Coordinator: http://kwho-desktop:25000)
      ERROR: Builtin 'isnottrue' with symbol '_ZN6impala20ConditionalFunctions9IsNotTrueEPN10impala_udf15FunctionContextERKNS1_10BooleanValE' does not exist. Verify that all your impalads are the same version.
      

        Activity

        Hide
        kwho Michael Ho added a comment -

        https://github.com/apache/incubator-impala/commit/1d933919ee964d8766ba028623d66ec20cd123ac

        IMPALA-4705, IMPALA-4779, IMPALA-4780: Fix some Expr bugs with codegen
        This change fixes expr-test.cc to work with codegen as it's
        originally intended. Fixing it uncovers a couple of bugs fixed
        in this patch:

        IMPALA-4705: When an IR function is materialized, its
        function body is parsed to find all its callee functions
        to be materialized too. However, the old code doesn't
        detect callee fnctions referenced indirectly (e.g. a
        callee function passed as argument to another function).

        This change fixes the problem above inspecting the use
        lists of llvm::Function objects. When parsing the bitcode
        module into memory, LLVM already establishes a use list
        for each llvm::Value object which llvm::Function is a
        subclass of. A use list contains all the locations in
        the module in which the Value is referenced. For a
        llvm::Function object, that would be its call sites and
        constant expressions referencing the functions. By using
        the use lists of llvm::Function in the module, a global
        map is established at Impala initialization time to map
        functions to their corresponding callee functions. This
        map is then used when materializing a function to ensure
        all its callee functions are also materialized recursively.

        IMPALA-4779: conditional function isfalse(), istrue(),
        isnotfalse(), isnotrue() aren't cross-compiled so they
        will lead to unexpected query failure when codegen is enabled.
        This change will cross-compile these functions.

        IMPALA-4780: next_day() always returns NULL when codegen
        is enabled. The bound checks for next_day() use some class
        static variables initialized in the global constructors
        (@llvm.global_ctors). However, we never execute the global
        constructors before calling the JIT compiled functions.
        This causes these variables to remain as zero, causing all
        executions of next_day() to fail the bound checks. The reason
        why these class static variables aren't compiled as global
        constants in LLVM IR is that TimestampFunctions::MIN_YEAR is
        not a compile time constant. This change fixes the problem
        above by setting TimestampFunctions::MIN_YEAR to a known constant
        value. A DCHECK is added to verify that it matches the value
        defined in the boost library.

        Change-Id: I40fdb035a565ae2f9c9fbf4db48a548653ef7608
        Reviewed-on: http://gerrit.cloudera.org:8080/5732
        Reviewed-by: Michael Ho <kwho@cloudera.com>
        Tested-by: Impala Public Jenkins

        Show
        kwho Michael Ho added a comment - https://github.com/apache/incubator-impala/commit/1d933919ee964d8766ba028623d66ec20cd123ac IMPALA-4705 , IMPALA-4779 , IMPALA-4780 : Fix some Expr bugs with codegen This change fixes expr-test.cc to work with codegen as it's originally intended. Fixing it uncovers a couple of bugs fixed in this patch: IMPALA-4705 : When an IR function is materialized, its function body is parsed to find all its callee functions to be materialized too. However, the old code doesn't detect callee fnctions referenced indirectly (e.g. a callee function passed as argument to another function). This change fixes the problem above inspecting the use lists of llvm::Function objects. When parsing the bitcode module into memory, LLVM already establishes a use list for each llvm::Value object which llvm::Function is a subclass of. A use list contains all the locations in the module in which the Value is referenced. For a llvm::Function object, that would be its call sites and constant expressions referencing the functions. By using the use lists of llvm::Function in the module, a global map is established at Impala initialization time to map functions to their corresponding callee functions. This map is then used when materializing a function to ensure all its callee functions are also materialized recursively. IMPALA-4779 : conditional function isfalse(), istrue(), isnotfalse(), isnotrue() aren't cross-compiled so they will lead to unexpected query failure when codegen is enabled. This change will cross-compile these functions. IMPALA-4780 : next_day() always returns NULL when codegen is enabled. The bound checks for next_day() use some class static variables initialized in the global constructors (@llvm.global_ctors). However, we never execute the global constructors before calling the JIT compiled functions. This causes these variables to remain as zero, causing all executions of next_day() to fail the bound checks. The reason why these class static variables aren't compiled as global constants in LLVM IR is that TimestampFunctions::MIN_YEAR is not a compile time constant. This change fixes the problem above by setting TimestampFunctions::MIN_YEAR to a known constant value. A DCHECK is added to verify that it matches the value defined in the boost library. Change-Id: I40fdb035a565ae2f9c9fbf4db48a548653ef7608 Reviewed-on: http://gerrit.cloudera.org:8080/5732 Reviewed-by: Michael Ho <kwho@cloudera.com> Tested-by: Impala Public Jenkins
        Hide
        jbapple Jim Apple added a comment -
        Show
        jbapple Jim Apple added a comment - Patch from Michael Ho in review: https://gerrit.cloudera.org/#/c/5732/
        Hide
        kwho Michael Ho added a comment -

        The scan example was done with the default query options (i.e. no need to disable expr rewrite etc). So, yes, they will get a failed query even without changing the query options. They can work around it by disabling codegen.

        Show
        kwho Michael Ho added a comment - The scan example was done with the default query options (i.e. no need to disable expr rewrite etc). So, yes, they will get a failed query even without changing the query options. They can work around it by disabling codegen.
        Hide
        jbapple Jim Apple added a comment -

        I notice that this is an ERROR in the report but WARNING here. Is that due to the different contexts? Could users get an ERROR without setting the variables?

        Show
        jbapple Jim Apple added a comment - I notice that this is an ERROR in the report but WARNING here. Is that due to the different contexts? Could users get an ERROR without setting the variables?
        Hide
        kwho Michael Ho added a comment -

        It will happen as long as any of this function is embedded in codegen enabled operator. The query options were just for working around the logic in fe-support.cc for the simple query select istrue(false)

        [localhost:21000] > select count(*) from tpch_parquet.lineitem where istrue(cast(l_orderkey as BOOLEAN));
        Query: select count(*) from tpch_parquet.lineitem where istrue(cast(l_orderkey as BOOLEAN))
        Query submitted at: 2017-01-26 10:37:26 (Coordinator: http://kwho-desktop:25000)
        Query progress can be monitored at: http://kwho-desktop:25000/query_plan?query_id=ff41db6dddd8decd:1c130e5800000000
        WARNINGS:
        Builtin 'istrue' with symbol '_ZN6impala20ConditionalFunctions6IsTrueEPN10impala_udf15FunctionContextERKNS1_10BooleanValE' does not exist. Verify that all your impalads are the same version.
        
        
        
        
        [localhost:21000] >
        
        Show
        kwho Michael Ho added a comment - It will happen as long as any of this function is embedded in codegen enabled operator. The query options were just for working around the logic in fe-support.cc for the simple query select istrue(false) [localhost:21000] > select count(*) from tpch_parquet.lineitem where istrue(cast(l_orderkey as BOOLEAN)); Query: select count(*) from tpch_parquet.lineitem where istrue(cast(l_orderkey as BOOLEAN)) Query submitted at: 2017-01-26 10:37:26 (Coordinator: http://kwho-desktop:25000) Query progress can be monitored at: http://kwho-desktop:25000/query_plan?query_id=ff41db6dddd8decd:1c130e5800000000 WARNINGS: Builtin 'istrue' with symbol '_ZN6impala20ConditionalFunctions6IsTrueEPN10impala_udf15FunctionContextERKNS1_10BooleanValE' does not exist. Verify that all your impalads are the same version. [localhost:21000] >
        Hide
        jbapple Jim Apple added a comment -

        In order for a user to hit this, they have to set enable_expr_rewrites=0 and EXEC_SINGLE_NODE_ROWS_THRESHOLD=0? Is there another way they could end up with those variables set that way?

        If not, is this really a blocker?

        Show
        jbapple Jim Apple added a comment - In order for a user to hit this, they have to set enable_expr_rewrites=0 and EXEC_SINGLE_NODE_ROWS_THRESHOLD=0? Is there another way they could end up with those variables set that way? If not, is this really a blocker?

          People

          • Assignee:
            kwho Michael Ho
            Reporter:
            kwho Michael Ho
          • Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development