IMPALA-2284

Handle large string allocations (>1GB) in built-in UDFs gracefully


Details

    Description

      When group_concat() runs over a large input, the built-in function can try to allocate more than 1GB for the concatenated string. When that happens, Impala crashes instead of failing the query with an error.

      To reproduce:

      select length(group_concat(l_comment, "!")) from (select l_comment from tpch_parquet.lineitem union all
      select l_comment from tpch_parquet.lineitem union all
      select l_comment from tpch_parquet.lineitem union all
      select l_comment from tpch_parquet.lineitem union all
      select l_comment from tpch_parquet.lineitem union all
      select l_comment from tpch_parquet.lineitem union all
      select l_comment from tpch_parquet.lineitem union all
      select l_comment from tpch_parquet.lineitem union all
      select l_comment from tpch_parquet.lineitem union all
      select l_comment from tpch_parquet.lineitem union all
      select l_comment from tpch_parquet.lineitem union all
      select l_comment from tpch_parquet.lineitem union all
      select l_comment from tpch_parquet.lineitem union all
      select l_comment from tpch_parquet.lineitem union all
      select l_comment from tpch_parquet.lineitem union all
      select l_comment from tpch_parquet.lineitem union all
      select l_comment from tpch_parquet.lineitem union all
      select l_comment from tpch_parquet.lineitem union all
      select l_comment from tpch_parquet.lineitem union all
      select l_comment from tpch_parquet.lineitem union all
      select l_comment from tpch_parquet.lineitem union all
      select l_comment from tpch_parquet.lineitem union all
      select l_comment from tpch_parquet.lineitem union all
      select l_comment from tpch_parquet.lineitem union all
      select l_comment from tpch_parquet.lineitem union all
      select l_comment from tpch_parquet.lineitem union all
      select l_comment from tpch_parquet.lineitem union all
      select l_comment from tpch_parquet.lineitem union all
      select l_comment from tpch_parquet.lineitem union all
      select l_comment from tpch_parquet.lineitem union all
      select l_comment from tpch_parquet.lineitem union all
      select l_comment from tpch_parquet.lineitem union all
      select l_comment from tpch_parquet.lineitem union all
      select l_comment from tpch_parquet.lineitem union all
      select l_comment from tpch_parquet.lineitem union all
      select l_comment from tpch_parquet.lineitem union all
      select l_comment from tpch_parquet.lineitem union all
      select l_comment from tpch_parquet.lineitem union all
      select l_comment from tpch_parquet.lineitem union all
      select l_comment from tpch_parquet.lineitem union all
      select l_comment from tpch_parquet.lineitem union all
      select l_comment from tpch_parquet.lineitem union all
      select l_comment from tpch_parquet.lineitem union all
      select l_comment from tpch_parquet.lineitem union all
      select l_comment from tpch_parquet.lineitem union all
      select l_comment from tpch_parquet.lineitem union all
      select l_comment from tpch_parquet.lineitem union all
      select l_comment from tpch_parquet.lineitem union all
      select l_comment from tpch_parquet.lineitem union all
      select l_comment from tpch_parquet.lineitem) a;
      

      Call stack:

      #0 0x0000003eaf232925 in raise () from /lib64/libc.so.6 
      #1 0x0000003eaf234105 in abort () from /lib64/libc.so.6 
      #2 0x00007f4eba0a9155 in os::abort(bool) () from /usr/java/jdk1.7.0_45-cloudera/jre/lib/amd64/server/libjvm.so 
      #3 0x00007f4eba228087 in VMError::report_and_die() () from /usr/java/jdk1.7.0_45-cloudera/jre/lib/amd64/server/libjvm.so 
      #4 0x00007f4eba0adadf in JVM_handle_linux_signal () from /usr/java/jdk1.7.0_45-cloudera/jre/lib/amd64/server/libjvm.so 
      #5 <signal handler called> 
      #6 0x00000000009965ba in impala_udf::FunctionContext::Allocate(int) () 
      #7 0x000000000078f3c8 in impala::AggregateFunctions::StringConcat(impala_udf::FunctionContext*, impala_udf::StringVal const&, impala_udf::StringVal const&, impala_udf::StringVal*) () 
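
      The crash above happens inside the allocation made by the concatenation step. As a rough illustration of what "graceful" handling could look like, the sketch below checks the combined length before allocating and records an error instead of aborting. SketchContext, SketchString and ConcatChecked are hypothetical stand-ins written for this example; they are not the impala_udf types or the fix that was applied, and the configurable max_len only exists so the limit can be exercised without allocating 1GB.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical stand-in for a UDF context; NOT impala_udf::FunctionContext.
struct SketchContext {
  explicit SketchContext(int64_t max_len) : max_len(max_len) {}

  int64_t max_len;    // largest allocation allowed, e.g. 1LL << 30 for 1GB
  std::string error;  // stays empty until the first error is recorded

  bool has_error() const { return !error.empty(); }
  void SetError(const std::string& msg) {
    if (error.empty()) error = msg;  // keep only the first error
  }
  // Returns nullptr for invalid or oversized requests instead of crashing.
  uint8_t* Allocate(int64_t bytes) {
    if (bytes < 0 || bytes > max_len) return nullptr;
    return static_cast<uint8_t*>(std::malloc(bytes == 0 ? 1 : bytes));
  }
  void Free(uint8_t* p) { std::free(p); }
};

// Hypothetical accumulator; ptr == nullptr means "no value appended yet".
struct SketchString {
  uint8_t* ptr = nullptr;
  int64_t len = 0;
};

// Appends sep + src to *dst. On failure, records an error on ctx, leaves
// *dst unchanged, and returns false.
bool ConcatChecked(SketchContext* ctx, const std::string& src,
                   const std::string& sep, SketchString* dst) {
  assert(ctx != nullptr && dst != nullptr);
  if (ctx->has_error()) return false;

  const int64_t src_len = static_cast<int64_t>(src.size());
  const int64_t sep_len =
      dst->ptr == nullptr ? 0 : static_cast<int64_t>(sep.size());
  const int64_t new_len = dst->len + sep_len + src_len;  // 64-bit arithmetic

  // Check the size before allocating so an oversized request becomes an error.
  if (new_len > ctx->max_len) {
    ctx->SetError("concatenated string exceeds the allocation limit");
    return false;
  }
  uint8_t* buf = ctx->Allocate(new_len);
  if (buf == nullptr) {
    ctx->SetError("failed to allocate buffer for concatenated string");
    return false;
  }
  if (dst->len > 0) std::memcpy(buf, dst->ptr, dst->len);
  if (sep_len > 0) std::memcpy(buf + dst->len, sep.data(), sep_len);
  if (src_len > 0) std::memcpy(buf + dst->len + sep_len, src.data(), src_len);

  ctx->Free(dst->ptr);  // free(nullptr) is a no-op on the first call
  dst->ptr = buf;
  dst->len = new_len;
  return true;
}
```

      In this sketch the caller would stop feeding rows once ConcatChecked returns false and surface ctx.error as the query error, so the accumulated value is never left half-written.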
      

          People

            mgrund_impala_bb91 Martin Grund