Uploaded image for project: 'IMPALA'
  1. IMPALA
  2. IMPALA-5252

Java UDF returning string can lead to crash under memory pressure.

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: Impala 2.8.0
    • Fix Version/s: Impala 2.9.0
    • Component/s: Backend
    • Labels:

      Description

      Look at this code from hive-udf-call.cc HiveUdfCall::GetStringVal():

      StringVal HiveUdfCall::GetStringVal(ExprContext* ctx, const TupleRow* row) {
        DCHECK_EQ(type_.type, TYPE_STRING);
        StringVal result = *reinterpret_cast<StringVal*>(Evaluate(ctx, row));
      
        // Copy the string into a local allocation with the usual lifetime for expr results.
        // Needed because the UDF output buffer is owned by the Java UDF executor and may be
        // freed or reused by the next call into the Java UDF executor.
        FunctionContext* fn_ctx = ctx->fn_context(fn_context_index_);
        uint8_t* local_alloc = fn_ctx->impl()->AllocateLocal(result.len);
        memcpy(local_alloc, result.ptr, result.len); <--- crash when local_alloc is nullptr
        result.ptr = local_alloc;
        return result;
      }
      

        Issue Links

          Activity

          Hide
          alex.behm Alexander Behm added a comment -

          Michael Brown, just to put this on your radar, if not already. We should add UDFs to the stress test.

          Show
          alex.behm Alexander Behm added a comment - Michael Brown , just to put this on your radar, if not already. We should add UDFs to the stress test.
          Hide
          mikesbrown Michael Brown added a comment -

          Alexander Behm Do you mean adding/removing UDFs to the stress test rapidly, in a stressful way (these are metadata operations), or do you mean setting a set of UDFs as part of a setup phase, and then using them repeatedly in queries during the stress test phase?

          Show
          mikesbrown Michael Brown added a comment - Alexander Behm Do you mean adding/removing UDFs to the stress test rapidly, in a stressful way (these are metadata operations), or do you mean setting a set of UDFs as part of a setup phase, and then using them repeatedly in queries during the stress test phase?
          Hide
          alex.behm Alexander Behm added a comment -

          Michael Brown, in the context of this bug, the latter: Doing a one-time setup, and then invoking UDFs in queries.

          Show
          alex.behm Alexander Behm added a comment - Michael Brown , in the context of this bug, the latter: Doing a one-time setup, and then invoking UDFs in queries.
          Hide
          dhecht Dan Hecht added a comment -

          We also have FLAGS_stress_free_pool_alloc option that we use to exercise these paths, so hopefully we can exercise this during normal end-to-end tests.

          Show
          dhecht Dan Hecht added a comment - We also have FLAGS_stress_free_pool_alloc option that we use to exercise these paths, so hopefully we can exercise this during normal end-to-end tests.
          Hide
          dhecht Dan Hecht added a comment -

          commit 741421de09f4236d9a45c00753501d6e6abe90ca
          Author: Dan Hecht <dhecht@cloudera.com>
          Date: Thu Apr 27 16:03:31 2017 -0700

          IMPALA-5252: Fix crash in HiveUdfCall::GetStringVal() when mem_limit exceeded

          We need to check for AllocateLocal() returning NULL. CopyFrom() takes
          care of that for us. Also adjust a few other places in the code base
          that didn't have the check.

          The new test reproduces the crash, but in order to get this test file to
          execute, I had to move the xfail to be a function decorator. Apparently
          xfail as a statement causes the test to not run at all. We should run
          all of these queries even if they are non-determistic to at least verify
          that impalad does not crash.

          Change-Id: Iafefef24479164cc4d2b99191d2de28eb8b311b6
          Reviewed-on: http://gerrit.cloudera.org:8080/6761
          Reviewed-by: Dan Hecht <dhecht@cloudera.com>
          Tested-by: Impala Public Jenkins

          Show
          dhecht Dan Hecht added a comment - commit 741421de09f4236d9a45c00753501d6e6abe90ca Author: Dan Hecht <dhecht@cloudera.com> Date: Thu Apr 27 16:03:31 2017 -0700 IMPALA-5252 : Fix crash in HiveUdfCall::GetStringVal() when mem_limit exceeded We need to check for AllocateLocal() returning NULL. CopyFrom() takes care of that for us. Also adjust a few other places in the code base that didn't have the check. The new test reproduces the crash, but in order to get this test file to execute, I had to move the xfail to be a function decorator. Apparently xfail as a statement causes the test to not run at all. We should run all of these queries even if they are non-determistic to at least verify that impalad does not crash. Change-Id: Iafefef24479164cc4d2b99191d2de28eb8b311b6 Reviewed-on: http://gerrit.cloudera.org:8080/6761 Reviewed-by: Dan Hecht <dhecht@cloudera.com> Tested-by: Impala Public Jenkins

            People

            • Assignee:
              dhecht Dan Hecht
              Reporter:
              alex.behm Alexander Behm
            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development