Details
- Improvement
- Status: Open
- Major
- Resolution: Unresolved
- Impala 4.1.0
- None
- None
Description
When there is a crash in codegen'd code, the minidump's stack trace often does not cleanly hook up with the regular code that called into codegen. It can sometimes find the first frame through stack scanning, but then it often also finds garbage:
Thread 499 (crashed)
 0  0x7fd40930df9c
    rax = 0x00000000048d1d80   rdx = 0x0000000000000001
    rcx = 0x0000000000000000   rbx = 0x000000000cf83960
    rsi = 0x0000000000000000   rdi = 0x000000000ab04000
    rbp = 0x0000000000000000   rsp = 0x00007fd31c1dc340
     r8 = 0x00000000001d0000    r9 = 0xffffffffffffe000
    r10 = 0x00007fd31c1dc3b0   r11 = 0x0000000000000001
    r12 = 0x00000000034a34f0   r13 = 0x000000000c613800
    r14 = 0x000000000cf83960   r15 = 0x000000000c613a98
    rip = 0x00007fd40930df9c
    Found by: given as instruction pointer in context
 1  linux-gate.so + 0xc30
    rsp = 0x00007fd31c1dc350   rip = 0x00007ffd41f73c30
    Found by: stack scanning
 2  impalad!impala::TopNNode::Open(impala::RuntimeState*) [topn-node.cc : 325 + 0xb]
    rsp = 0x00007fd31c1dc3d0   rip = 0x00000000018935fe
    Found by: stack scanning
 3  impalad!std::pair<std::_Rb_tree_iterator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, bool> std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::_Identity<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >::_M_insert_unique<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) [stl_tree.h : 2049 + 0x1]
    rsp = 0x00007fd31c1dc430   rip = 0x0000000000cd5d01
    Found by: stack scanning
If we add "-fno-omit-frame-pointer" to the LLVM IR compilation (i.e. CLANG_IR_CXX_FLAGS), the minidump cleanly resolves the connection to the regular code. This would be very useful for debugging and performance tracing work.
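To illustrate why frame pointers matter here, the following is a toy model, not the minidump processor's actual algorithm. It assumes a simplified stack made of 64-bit words in which each frame stores the caller's frame pointer followed by a return address. All names (ToyStack, WalkFramePointers, ScanStack, MakeExampleStack) and all addresses are hypothetical. Following the frame-pointer chain yields exactly the real return addresses, while scanning reports every word that merely looks like a code address, including a stale leftover value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Sentinel marking the end of the frame-pointer chain in this toy model.
constexpr std::uint64_t kEnd = ~0ULL;

// Toy stack: words[i] is the 64-bit value at "address" i.
// Any value in [code_begin, code_end) is considered to look like code.
struct ToyStack {
  std::vector<std::uint64_t> words;
  std::uint64_t code_begin;
  std::uint64_t code_end;
  bool LooksLikeCode(std::uint64_t v) const {
    return v >= code_begin && v < code_end;
  }
};

// Frame layout assumed here: words[fp] = caller's fp, words[fp + 1] = return address.
// Follows the saved frame pointers and collects one return address per frame.
std::vector<std::uint64_t> WalkFramePointers(const ToyStack& s, std::uint64_t fp) {
  std::vector<std::uint64_t> out;
  while (fp != kEnd && fp + 1 < s.words.size()) {
    out.push_back(s.words[fp + 1]);
    std::uint64_t next = s.words[fp];
    if (next != kEnd && next <= fp) break;  // refuse to loop or walk backwards
    fp = next;
  }
  return out;
}

// Heuristic fallback: report every stack word that looks like a code address.
std::vector<std::uint64_t> ScanStack(const ToyStack& s, std::size_t sp) {
  assert(s.code_begin < s.code_end);
  std::vector<std::uint64_t> out;
  for (std::size_t i = sp; i < s.words.size(); ++i) {
    if (s.LooksLikeCode(s.words[i])) out.push_back(s.words[i]);
  }
  return out;
}

// Two real frames plus one stale value that happens to look like code.
ToyStack MakeExampleStack() {
  ToyStack s;
  s.code_begin = 0x400000;
  s.code_end = 0x500000;
  s.words = {
      0x10,      // [0] local variable
      0x401234,  // [1] stale leftover that looks like a code address
      5,         // [2] frame A: saved fp of the caller (frame B at index 5)
      0x400100,  // [3] frame A: return address
      7,         // [4] local variable
      kEnd,      // [5] frame B: end of chain
      0x400200,  // [6] frame B: return address
  };
  return s;
}
```

In this model, WalkFramePointers(MakeExampleStack(), 2) returns only the two real return addresses, while ScanStack(MakeExampleStack(), 0) also reports the stale value at index 1, which is analogous to the garbage frames in the trace above.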
A basic small-scale TPC-H run (42) on Parquet showed no regression, but we should still double-check the performance impact. Unless there is a clear performance problem, we should add "-fno-omit-frame-pointer".
Issue Links
- relates to: IMPALA-7524 Keep frame pointers for generated code (Open)