Uploaded image for project: 'IMPALA'
  1. IMPALA
  2. IMPALA-7014

Disable stacktrace symbolisation by default

    XMLWordPrintableJSON

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: Not Applicable
    • Fix Version/s: Impala 2.13.0, Impala 3.1.0
    • Component/s: Backend
    • Labels:
      None

      Description

      We got burned by the code of producing stacktrace again with IMPALA-6996. I did a quick investigation into this, based on the hypothesis that the symbolisation was the expensive part, rather than getting the addresses. I added a stopwatch to GetStackTrace() to measure the time in nanoseconds and ran a test that produces a backtrace

      The first experiment was

      $ start-impala-cluster.py --impalad_args='--symbolize_stacktrace=true' && impala-py.test tests/query_test/test_scanners.py -k codec
      
      I0511 09:45:11.897944 30904 debug-util.cc:283] stacktrace time: 75175573
      I0511 09:45:11.897956 30904 status.cc:125] File 'hdfs://localhost:20500/test-warehouse/test_bad_compression_codec_308108.db/bad_codec/bad_codec.parquet' uses an unsupported compression: 5000 for column 'id'.
          @          0x18782ef  impala::Status::Status()
          @          0x2cbe96f  impala::ParquetMetadataUtils::ValidateRowGroupColumn()
          @          0x205f597  impala::BaseScalarColumnReader::Reset()
          @          0x1feebe6  impala::HdfsParquetScanner::InitScalarColumns()
          @          0x1fe6ff3  impala::HdfsParquetScanner::NextRowGroup()
          @          0x1fe58d8  impala::HdfsParquetScanner::GetNextInternal()
          @          0x1fe3eea  impala::HdfsParquetScanner::ProcessSplit()
          @          0x1f6ba36  impala::HdfsScanNode::ProcessSplit()
          @          0x1f6adc4  impala::HdfsScanNode::ScannerThread()
          @          0x1f6a1c4  _ZZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS_18ThreadResourcePoolEENKUlvE_clEv
          @          0x1f6c2a6  _ZN5boost6detail8function26void_function_obj_invoker0IZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS3_18ThreadResourcePoolEEUlvE_vE6invokeERNS1_15function_bufferE
          @          0x1bd3b1a  boost::function0<>::operator()()
          @          0x1ebecd5  impala::Thread::SuperviseThread()
          @          0x1ec6e71  boost::_bi::list5<>::operator()<>()
          @          0x1ec6d95  boost::_bi::bind_t<>::operator()()
          @          0x1ec6d58  boost::detail::thread_data<>::run()
          @          0x31b3ada  thread_proxy
          @     0x7f9be67d36ba  start_thread
          @     0x7f9be650941d  clone
      
      

      The stacktrace took 75ms, which is pretty bad! It would be worse on a production system with more memory maps.

      The next experiment was to disable it:

      start-impala-cluster.py --impalad_args='--symbolize_stacktrace=false' && impala-py.test tests/query_test/test_scanners.py -k codec
      
      I0511 09:43:47.574185 29514 debug-util.cc:283] stacktrace time: 29528
      I0511 09:43:47.574193 29514 status.cc:125] File 'hdfs://localhost:20500/test-warehouse/test_bad_compression_codec_cb5d0225.db/bad_codec/bad_codec.parquet' uses an unsupported compression: 5000 for column 'id'.
          @          0x18782ef
          @          0x2cbe96f
          @          0x205f597
          @          0x1feebe6
          @          0x1fe6ff3
          @          0x1fe58d8
          @          0x1fe3eea
          @          0x1f6ba36
          @          0x1f6adc4
          @          0x1f6a1c4
          @          0x1f6c2a6
          @          0x1bd3b1a
          @          0x1ebecd5
          @          0x1ec6e71
          @          0x1ec6d95
          @          0x1ec6d58
          @          0x31b3ada
          @     0x7fbdcbdef6ba
          @     0x7fbdcbb2541d
      

      That's 2545x faster! If the addresses are in the statically linked binary, we can use addrline to get back the line numbers:

      $ addr2line -e be/build/latest/service/impalad 0x2cbe96f
      /home/tarmstrong/Impala/incubator-impala/be/src/exec/parquet-metadata-utils.cc:166
      

        Attachments

          Activity

            People

            • Assignee:
              zoram Zoram Thanga
              Reporter:
              tarmstrong Tim Armstrong
            • Votes:
              0 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: