Details
- Type: Improvement
- Status: Resolved
- Priority: Critical
- Resolution: Fixed
- Not Applicable
- None
- Labels: ghx-label-8

Description
We got burned by the cost of producing stacktraces again with IMPALA-6996. I did a quick investigation into this, based on the hypothesis that the symbolisation was the expensive part, rather than getting the addresses. I added a stopwatch to GetStackTrace() to measure the time in nanoseconds and ran a test that produces a backtrace.
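For illustration, the measurement amounts to something like the following minimal, self-contained sketch. It times a full capture-and-symbolize pass in nanoseconds; glibc's backtrace()/backtrace_symbols() stand in here for Impala's GetStackTrace() in debug-util.cc, which is implemented differently, and the logging is plain printf rather than glog.

#include <execinfo.h>

#include <chrono>
#include <cstdio>
#include <cstdlib>

// Sketch only: time how long capturing and symbolizing a stack trace takes.
// Stand-in for instrumenting Impala's GetStackTrace(); not the actual patch.
void LogStackTraceTimed() {
  auto start = std::chrono::steady_clock::now();
  void* frames[64];
  int depth = backtrace(frames, 64);                   // collect raw addresses (cheap)
  char** symbols = backtrace_symbols(frames, depth);   // symbolize (the slow part)
  auto elapsed = std::chrono::steady_clock::now() - start;
  std::printf("stacktrace time: %lld\n",
      static_cast<long long>(
          std::chrono::duration_cast<std::chrono::nanoseconds>(elapsed).count()));
  for (int i = 0; i < depth; ++i) std::printf("    @ %s\n", symbols[i]);
  std::free(symbols);
}

int main() { LogStackTraceTimed(); }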
The first experiment was with symbolization enabled:
$ start-impala-cluster.py --impalad_args='--symbolize_stacktrace=true' && impala-py.test tests/query_test/test_scanners.py -k codec
I0511 09:45:11.897944 30904 debug-util.cc:283] stacktrace time: 75175573
I0511 09:45:11.897956 30904 status.cc:125] File 'hdfs://localhost:20500/test-warehouse/test_bad_compression_codec_308108.db/bad_codec/bad_codec.parquet' uses an unsupported compression: 5000 for column 'id'.
    @ 0x18782ef  impala::Status::Status()
    @ 0x2cbe96f  impala::ParquetMetadataUtils::ValidateRowGroupColumn()
    @ 0x205f597  impala::BaseScalarColumnReader::Reset()
    @ 0x1feebe6  impala::HdfsParquetScanner::InitScalarColumns()
    @ 0x1fe6ff3  impala::HdfsParquetScanner::NextRowGroup()
    @ 0x1fe58d8  impala::HdfsParquetScanner::GetNextInternal()
    @ 0x1fe3eea  impala::HdfsParquetScanner::ProcessSplit()
    @ 0x1f6ba36  impala::HdfsScanNode::ProcessSplit()
    @ 0x1f6adc4  impala::HdfsScanNode::ScannerThread()
    @ 0x1f6a1c4  _ZZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS_18ThreadResourcePoolEENKUlvE_clEv
    @ 0x1f6c2a6  _ZN5boost6detail8function26void_function_obj_invoker0IZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS3_18ThreadResourcePoolEEUlvE_vE6invokeERNS1_15function_bufferE
    @ 0x1bd3b1a  boost::function0<>::operator()()
    @ 0x1ebecd5  impala::Thread::SuperviseThread()
    @ 0x1ec6e71  boost::_bi::list5<>::operator()<>()
    @ 0x1ec6d95  boost::_bi::bind_t<>::operator()()
    @ 0x1ec6d58  boost::detail::thread_data<>::run()
    @ 0x31b3ada  thread_proxy
    @ 0x7f9be67d36ba  start_thread
    @ 0x7f9be650941d  clone
The stacktrace took 75ms, which is pretty bad! It would be worse on a production system with more memory maps.
The next experiment was to disable it:
$ start-impala-cluster.py --impalad_args='--symbolize_stacktrace=false' && impala-py.test tests/query_test/test_scanners.py -k codec
I0511 09:43:47.574185 29514 debug-util.cc:283] stacktrace time: 29528
I0511 09:43:47.574193 29514 status.cc:125] File 'hdfs://localhost:20500/test-warehouse/test_bad_compression_codec_cb5d0225.db/bad_codec/bad_codec.parquet' uses an unsupported compression: 5000 for column 'id'.
    @ 0x18782ef
    @ 0x2cbe96f
    @ 0x205f597
    @ 0x1feebe6
    @ 0x1fe6ff3
    @ 0x1fe58d8
    @ 0x1fe3eea
    @ 0x1f6ba36
    @ 0x1f6adc4
    @ 0x1f6a1c4
    @ 0x1f6c2a6
    @ 0x1bd3b1a
    @ 0x1ebecd5
    @ 0x1ec6e71
    @ 0x1ec6d95
    @ 0x1ec6d58
    @ 0x31b3ada
    @ 0x7fbdcbdef6ba
    @ 0x7fbdcbb2541d
That's 2545x faster! If the addresses are in the statically linked binary, we can use addr2line to get back the line numbers:
$ addr2line -e be/build/latest/service/impalad 0x2cbe96f
/home/tarmstrong/Impala/incubator-impala/be/src/exec/parquet-metadata-utils.cc:166
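For comparison, the cheap path amounts to roughly the sketch below: capture only the raw frame addresses at runtime and print them as hex (as with --symbolize_stacktrace=false), deferring symbolization to an offline addr2line pass against the same binary. This again uses glibc's backtrace() for illustration, not Impala's actual implementation.

#include <execinfo.h>

#include <cstdio>

// Sketch only: walk the stack and emit raw addresses, no symbol lookup.
// The printed hex values can later be resolved offline with
//   addr2line -e <binary> <address> ...
void PrintRawStackTrace() {
  void* frames[64];
  int depth = backtrace(frames, 64);  // address capture only; no symbolization
  for (int i = 0; i < depth; ++i) {
    std::printf("    @ %18p\n", frames[i]);
  }
}

int main() { PrintRawStackTrace(); }

As noted above, feeding the addresses straight to addr2line works when they fall inside the statically linked, non-relocated binary; for a position-independent executable the module load offset would have to be subtracted first.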