IMPALA-4705

Impala may miss materialization of indirectly referenced functions

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: Impala 2.7.0, Impala 2.8.0
    • Fix Version/s: Impala 2.9.0
    • Component/s: Backend
    • Labels:
    • Environment:
      Staging

      Description

      F1221 06:39:38.647959 30325 llvm-codegen.cc:106] LLVM hit fatal error: Program used external function '__cxx_global_array_dtor.92' which could not be resolved!

      The daemon is running on CentOS 6.6
      CDH 5.9.0
      Impala version v2.7.0-cdh5.9.0

        Activity

        tarmstrong Tim Armstrong added a comment -

        Do you have any info about what query was running at the time? Are you using any IR udfs?

        stoevp_impala_2d6a Plamen Stoev added a comment -

        Hi,

        Not sure, but it happens occasionally on different nodes.
        The stack trace is:
        {{
        Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
        E1227 00:55:55.578109 5338 logging.cc:121] stderr will be logged to this file.
        E1227 00:57:14.915649 5828 impala-server.cc:1345] There was an error processing the impalad catalog update. Requesting a full topic update to recover: CatalogException: Detected catalog service ID change. Aborting updateCatalog()
        F1227 04:07:03.869304 20930 llvm-codegen.cc:106] LLVM hit fatal error: Program used external function '__cxx_global_array_dtor.92' which could not be resolved!

        *** Check failure stack trace: ***
            @ 0x1b9673d (unknown)
            @ 0x1b99066 (unknown)
            @ 0x1b9625d (unknown)
            @ 0x1b99b0e (unknown)
            @ 0xc21ce8 (unknown)
            @ 0x1ae0373 (unknown)
            @ 0x1879aef (unknown)
            @ 0x1879b14 (unknown)
            @ 0x172b06f (unknown)
            @ 0x172b3f1 (unknown)
            @ 0xc29050 (unknown)
            @ 0xdf6f52 (unknown)
            @ 0xdf9989 (unknown)
            @ 0xb4e7d8 (unknown)
            @ 0xb47422 (unknown)
            @ 0xb488da (unknown)
            @ 0xbf5b09 (unknown)
            @ 0xbf64a4 (unknown)
            @ 0xe5c7aa (unknown)
            @ 0x7f2bf270f9d1 start_thread
            @ 0x7f2bf245c8fd clone
        Picked up JAVA_TOOL_OPTIONS:
        Wrote minidump to 15aadfbd-6529-3b4a-4635b0ed-2c1f089d.dmp.gz
        }}

        Attached minidump file (if this could help)

        stoevp_impala_2d6a Plamen Stoev added a comment -

        Maybe it is related to IMPALA-4266

        tarmstrong Tim Armstrong added a comment -

        Thanks for attaching the minidump - we can take a look to see if there are any clues there.

        This error is likely to be associated with a particular query. If you have any idea which query might be triggering the crash, that would also help us investigate.

        stoevp_impala_2d6a Plamen Stoev added a comment -

        Back again.

        What I found so far:
        Executing a query that combines DISTINCT with a function call:
        set DISABLE_CODEGEN=0;
        select distinct id,timeofday() from history.items limit 10;
        results in an Impala daemon crash.

        With DISABLE_CODEGEN=1 the query passes, but with reduced performance and increased memory usage.

        tarmstrong Tim Armstrong added a comment -

        Thanks, we'll have to try to reproduce this. I'm setting it to a blocker since it sounds like a crash that may be a new one.

        jbapple Jim Apple added a comment -

        Patch available for review: https://gerrit.cloudera.org/#/c/5732/

        kwho Michael Ho added a comment -

        https://github.com/apache/incubator-impala/commit/1d933919ee964d8766ba028623d66ec20cd123ac

        IMPALA-4705, IMPALA-4779, IMPALA-4780: Fix some Expr bugs with codegen
        This change fixes expr-test.cc to work with codegen as
        originally intended. Fixing it uncovered a couple of bugs,
        which are also fixed in this patch:

        IMPALA-4705: When an IR function is materialized, its
        function body is parsed to find all its callee functions
        to be materialized too. However, the old code doesn't
        detect callee functions referenced indirectly (e.g. a
        callee function passed as argument to another function).

        This change fixes the problem above by inspecting the use
        lists of llvm::Function objects. When parsing the bitcode
        module into memory, LLVM already establishes a use list
        for each llvm::Value object, of which llvm::Function is a
        subclass. A use list contains all the locations in
        the module in which the Value is referenced. For a
        llvm::Function object, that would be its call sites and
        constant expressions referencing the functions. By using
        the use lists of llvm::Function in the module, a global
        map is established at Impala initialization time to map
        functions to their corresponding callee functions. This
        map is then used when materializing a function to ensure
        all its callee functions are also materialized recursively.
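
The idea in the paragraph above can be sketched without LLVM. What follows is a toy model, not the code from the patch: use lists are represented as plain maps from a function name to the names of the functions that reference it, and every identifier (`UseLists`, `BuildCalleeMap`, `Materialize`, `SortRows`, `CompareRows` and so on) is hypothetical. It only illustrates why inverting use lists also captures a function that is passed as an argument rather than called directly.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy stand-in for LLVM use lists: for each function, the names of the
// functions whose bodies reference it. A reference may be a direct call or
// an indirect one, such as passing the function as an argument.
using UseLists = std::map<std::string, std::vector<std::string>>;
using CalleeMap = std::map<std::string, std::set<std::string>>;

// Invert the use lists once, up front: every user of F gets F recorded as
// something it depends on.
CalleeMap BuildCalleeMap(const UseLists& use_lists) {
  CalleeMap callees;
  for (const auto& entry : use_lists) {
    const std::string& referenced = entry.first;
    for (const std::string& user : entry.second) {
      callees[user].insert(referenced);
    }
  }
  return callees;
}

// Materialize fn and, recursively, everything it references. The set doubles
// as a visited marker, so cycles (recursive functions) terminate.
void Materialize(const std::string& fn, const CalleeMap& callees,
                 std::set<std::string>* materialized) {
  if (!materialized->insert(fn).second) return;  // already materialized
  auto it = callees.find(fn);
  if (it == callees.end()) return;  // references nothing
  for (const std::string& callee : it->second) {
    Materialize(callee, callees, materialized);
  }
}

// Hypothetical module: SortRows calls Sort and hands it CompareRows as an
// argument; CompareRows calls HashRow. Unrelated is only referenced by
// Other, which is not reachable from SortRows.
std::set<std::string> MaterializeSortRows() {
  UseLists use_lists = {
      {"Sort", {"SortRows"}},
      {"CompareRows", {"SortRows"}},  // indirect: passed as an argument
      {"HashRow", {"CompareRows"}},
      {"Unrelated", {"Other"}},
  };
  std::set<std::string> materialized;
  Materialize("SortRows", BuildCalleeMap(use_lists), &materialized);
  return materialized;
}
```

In this model `MaterializeSortRows()` yields SortRows, Sort, CompareRows and HashRow. A scan that only followed direct call instructions would miss CompareRows and, through it, HashRow, which is the kind of gap that surfaces as an unresolved external function at JIT time.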

        IMPALA-4779: The conditional functions isfalse(), istrue(),
        isnotfalse() and isnotrue() aren't cross-compiled, so they
        lead to unexpected query failures when codegen is enabled.
        This change cross-compiles these functions.

        IMPALA-4780: next_day() always returns NULL when codegen
        is enabled. The bound checks for next_day() use some class
        static variables initialized in the global constructors
        (@llvm.global_ctors). However, we never execute the global
        constructors before calling the JIT compiled functions.
        This causes these variables to remain as zero, causing all
        executions of next_day() to fail the bound checks. The reason
        why these class static variables aren't compiled as global
        constants in LLVM IR is that TimestampFunctions::MIN_YEAR is
        not a compile time constant. This change fixes the problem
        above by setting TimestampFunctions::MIN_YEAR to a known constant
        value. A DCHECK is added to verify that it matches the value
        defined in the boost library.
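
A minimal sketch of the pattern described for IMPALA-4780, under stated assumptions: the names (`kMinYear`, `kMaxYear`, `LibraryMinYear`, `LibraryMaxYear`, `YearInRange`) are hypothetical, the numeric limits are illustrative values not taken from the issue, and the upper bound is included only to make the bound check complete. It shows hard-coded compile-time constants plus a runtime cross-check against the library, in the spirit of the DCHECK mentioned above.

```cpp
#include <cassert>

// Hypothetical stand-ins for the limits a date library would report. The
// numbers are assumptions for illustration and are not taken from the issue.
inline int LibraryMinYear() { return 1400; }
inline int LibraryMaxYear() { return 9999; }

// Compile-time constants: they are emitted as plain constants and need no
// global constructor to receive their values.
constexpr int kMinYear = 1400;
constexpr int kMaxYear = 9999;
static_assert(kMinYear < kMaxYear, "bounds are usable at compile time");

// The kind of bound check described above.
bool YearInRange(int year, int min_year, int max_year) {
  return year >= min_year && year <= max_year;
}

// Runtime cross-check in the spirit of the DCHECK mentioned in the commit:
// the hard-coded constants must agree with what the library reports.
bool ConstantsMatchLibrary() {
  return kMinYear == LibraryMinYear() && kMaxYear == LibraryMaxYear();
}
```

If the bounds were still zero because no global constructor had run, `YearInRange(2017, 0, 0)` would be false for any realistic year, which matches the reported symptom of every bound check failing. With the constants, `YearInRange(2017, kMinYear, kMaxYear)` holds regardless of initialization order.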

        Change-Id: I40fdb035a565ae2f9c9fbf4db48a548653ef7608
        Reviewed-on: http://gerrit.cloudera.org:8080/5732
        Reviewed-by: Michael Ho <kwho@cloudera.com>
        Tested-by: Impala Public Jenkins


          People

          • Assignee:
            kwho Michael Ho
          • Reporter:
            stoevp_impala_2d6a Plamen Stoev
          • Votes:
            0
          • Watchers:
            5
