IMPALA-12223

Coordinator crash in serializing huge profile

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: Impala 4.2.0
    • Fix Version/s: Impala 4.3.0
    • Component/s: Backend
    • Labels: None

    Description

      Bugs such as IMPALA-11200 and IMPALA-12204 can produce huge profiles (>4GB). The coordinator may crash when serializing such a profile. Here is the resolved backtrace:

      (gdb) bt
      #0  0x00007f208a9bd514 in __memcpy_ssse3_back () from /lib64/libc.so.6
      #1  0x00000000030a2a82 in apache::thrift::transport::TMemoryBuffer::writeSlow(unsigned char const*, unsigned int) ()
      #2  0x0000000001156d25 in apache::thrift::protocol::TCompactProtocolT<apache::thrift::transport::TMemoryBuffer>::writeBinary (this=0x269e1970, str=...)
          at ../../../toolchain/toolchain-packages-gcc7.5.0/thrift-0.16.0-p3/include/thrift/protocol/TCompactProtocol.tcc:285
      #3  0x00000000011fb3d2 in writeString (str=..., this=0x269e1970) at ../../../toolchain/toolchain-packages-gcc7.5.0/thrift-0.16.0-p3/include/thrift/protocol/TProtocol.h:345
      #4  impala::TRuntimeProfileNode::write<apache::thrift::protocol::TProtocol> (this=this@entry=0xf28f2490, oprot=oprot@entry=0x269e1970) at ../../generated-sources/gen-cpp/RuntimeProfile_types.tcc:2003
      #5  0x00000000011fd18b in impala::TRuntimeProfileTree::write<apache::thrift::protocol::TProtocol> (this=0x7f16c4ba0920, oprot=0x269e1970) at ../../generated-sources/gen-cpp/RuntimeProfile_types.tcc:2181
      #6  0x0000000001686020 in SerializeToBuffer<impala::TRuntimeProfileTree> (buffer=<synthetic pointer>, len=<synthetic pointer>, obj=0x7f16c4ba0920, this=0x7f16c4ba08e0) at ../rpc/thrift-util.h:83
      #7  SerializeToVector<impala::TRuntimeProfileTree> (result=<synthetic pointer>, obj=0x7f16c4ba0920, this=0x7f16c4ba08e0) at ../rpc/thrift-util.h:71
      #8  impala::RuntimeProfile::Compress (this=this@entry=0x32d1f2c0, out=out@entry=0x7f16c4ba0c20) at runtime-profile.cc:1563
      #9  0x0000000001686538 in impala::RuntimeProfile::SerializeToArchiveString (this=this@entry=0x32d1f2c0, out=0x7f16c4ba0e60) at runtime-profile.cc:1627
      #10 0x00000000013cf91e in impala::ImpalaServer::GetRuntimeProfileOutput (this=this@entry=0xefa7400, user=..., query_handle=..., format=format@entry=impala::TRuntimeProfileFormat::BASE64, profile=profile@entry=0x7f16c4ba0dd0)
          at impala-server.cc:695
      #11 0x00000000013d167e in impala::ImpalaServer::GetRuntimeProfileOutput (this=this@entry=0xefa7400, query_id=..., user=..., format=format@entry=impala::TRuntimeProfileFormat::BASE64, profile=profile@entry=0x7f16c4ba0dd0) at impala-server.cc:809
      #12 0x00000000013b3bff in impala::ImpalaHttpHandler::QueryProfileHelper (this=0xe3e8660, req=..., document=document@entry=0x7f16c4ba1130, format=format@entry=impala::TRuntimeProfileFormat::BASE64) at impala-http-handler.cc:330
      #13 0x00000000013b5c86 in impala::ImpalaHttpHandler::QueryProfileEncodedHandler (this=<optimized out>, req=..., document=0x7f16c4ba1130) at impala-http-handler.cc:344
      #14 0x00000000016cf73d in operator() (a1=0x7f16c4ba1130, a0=..., this=0xf507538) at ../../../toolchain/toolchain-packages-gcc7.5.0/boost-1.61.0-p2/include/boost/function/function_template.hpp:770
      #15 impala::Webserver::RenderUrlWithTemplate (this=this@entry=0xe62eec0, connection=connection@entry=0x401aa000, req=..., url_handler=..., output=output@entry=0x7f16c4ba1730, content_type=content_type@entry=0x7f16c4ba15bc) at webserver.cc:897
      #16 0x00000000016d1cf3 in impala::Webserver::BeginRequestCallback (this=0xe62eec0, connection=0x401aa000, request_info=0x401aa000) at webserver.cc:772
      #17 0x00000000016e7ed1 in handle_request ()
      #18 0x00000000016ea5f8 in worker_thread ()
      #19 0x00007f208de89ea5 in start_thread () from /lib64/libpthread.so.0
      #20 0x00007f208a965b0d in clone () from /lib64/libc.so.6

      It crashes in memcpy at a move instruction that writes to memory:

      (gdb) x/5i $pc-6
         0x7f208a9bd50e <__memcpy_ssse3_back+6302>:	add    %al,(%rax)
         0x7f208a9bd510 <__memcpy_ssse3_back+6304>:	movdqu (%rsi),%xmm1
      => 0x7f208a9bd514 <__memcpy_ssse3_back+6308>:	movdqu %xmm0,(%r8)
         0x7f208a9bd519 <__memcpy_ssse3_back+6313>:	movdqa %xmm1,(%rdi)
         0x7f208a9bd51d <__memcpy_ssse3_back+6317>:	sub    $0x10,%rdx 

      Further debugging shows the cause is a write overflow in the Thrift library (THRIFT-5716). The bug has existed since thrift-0.14.0.
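
For illustration only: this ticket does not describe the defect beyond "write overflow"; the actual details are in THRIFT-5716. The backtrace does show that TMemoryBuffer::writeSlow takes its length as an unsigned int, so the sketch below shows one generic way 32-bit size arithmetic can go wrong once the data approaches 4GB. The helper names RequiredSizeUnchecked and RequiredSizeChecked are hypothetical; they are not Thrift or Impala APIs, and this is not the Thrift code.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper (not a Thrift API): computes the number of bytes a
// buffer must hold after appending `len` bytes to `used` bytes, using only
// 32-bit arithmetic. Unsigned arithmetic wraps modulo 2^32, so a total
// above 4GB silently becomes a small number.
inline std::uint32_t RequiredSizeUnchecked(std::uint32_t used, std::uint32_t len) {
  return used + len;
}

// Hypothetical checked variant: does the addition in 64 bits and reports
// failure instead of wrapping. Returns true and stores the total in *out
// only if the total fits in 32 bits.
inline bool RequiredSizeChecked(std::uint32_t used, std::uint32_t len,
                                std::uint32_t* out) {
  assert(out != nullptr);
  const std::uint64_t total = static_cast<std::uint64_t>(used) + len;
  if (total > std::numeric_limits<std::uint32_t>::max()) {
    return false;  // Would exceed what a 32-bit size can represent.
  }
  *out = static_cast<std::uint32_t>(total);
  return true;
}
```

With used = 0xFFFFFFF0 and len = 0x20, RequiredSizeUnchecked returns 0x10, so a capacity check against that value would pass even though the real requirement exceeds 4GB, and a later memcpy could write past the allocation. RequiredSizeChecked returns false for the same inputs.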

    People

      Assignee: Quanlong Huang (stigahuang)
      Reporter: Quanlong Huang (stigahuang)