Details
- Improvement
- Status: Resolved
- Major
- Resolution: Fixed
- Impala 4.0.0
- None
Description
The FROM_UNIXTIME function is implemented by calling TimestampValue::ToString() in TimestampFunctions::FromUnix().
We found that evaluation of TimestampValue::ToString() can stall waiting on the tcmalloc::CentralFreeList lock, as shown in this pstack output:
#0 0x000000000277d81a in base::internal::SpinLockDelay(int volatile*, int, int) ()
#1 0x00000000027d17f9 in SpinLock::SlowLock() ()
#2 0x000000000287a399 in tcmalloc::CentralFreeList::RemoveRange(void**, void**, int) ()
#3 0x00000000028882f3 in tcmalloc::ThreadCache::FetchFromCentralCache(unsigned long, unsigned long) ()
#4 0x00000000029c5e88 in tc_newarray ()
#5 0x00007faedc677169 in std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) () from /opt/cloudera/parcels/CDH-6.2.0-1.cdh6.2.0.p4948.16676264/lib/impala/lib/libstdc++.so.6
#6 0x0000000000f769de in impala::TimestampValue::ToString() const ()
#7 0x00007faeb317e08e in ?? ()
#8 0x00007fad62af6068 in ?? ()
#9 0x00007faedc8c20c0 in ?? () from /opt/cloudera/parcels/CDH-6.2.0-1.cdh6.2.0.p4948.16676264/lib/impala/lib/libstdc++.so.6
#10 0x0000000000000000 in ?? ()
This is presumably due to the combined use of stringstream, boost::gregorian::to_iso_extended_string and boost::posix_time::to_simple_string, which involves multiple string allocations and copies.
This can be problematic when FROM_UNIXTIME is evaluated for millions of rows.
We should come up with a better implementation that involves fewer string allocations and less copying.
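As a rough illustration of the direction proposed above, the sketch below formats a Unix timestamp into a caller-provided buffer without any heap allocation, so the tcmalloc central free list is never touched on this path. It is not Impala's actual fix: the names FormatUnixTime, WriteDigits and kTimestampLen are hypothetical, and it assumes UTC, whole seconds, a fixed "YYYY-MM-DD HH:MM:SS" layout and years 0 to 9999. The date arithmetic is the commonly used days-to-civil-date conversion for the proleptic Gregorian calendar.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical names throughout; none of these are Impala APIs.
constexpr std::size_t kTimestampLen = 19;  // strlen("YYYY-MM-DD HH:MM:SS")

// Writes `value` as exactly `width` zero-padded decimal digits at `out`.
inline void WriteDigits(char* out, int width, std::int64_t value) {
  assert(value >= 0);
  for (int i = width - 1; i >= 0; --i) {
    out[i] = static_cast<char>('0' + value % 10);
    value /= 10;
  }
}

// Formats `unix_seconds` (UTC) into `buf` with no heap allocation.
// Returns the number of characters written (no NUL terminator is added),
// or 0 if the buffer is too small or the year falls outside 0..9999.
inline std::size_t FormatUnixTime(std::int64_t unix_seconds, char* buf,
                                  std::size_t buf_len) {
  if (buf_len < kTimestampLen) return 0;

  // Split into whole days and seconds within the day, flooring for negatives.
  std::int64_t days = unix_seconds / 86400;
  std::int64_t secs = unix_seconds % 86400;
  if (secs < 0) {
    secs += 86400;
    --days;
  }

  // Convert days since 1970-01-01 into a civil (year, month, day) date.
  const std::int64_t z = days + 719468;  // shift epoch to 0000-03-01
  const std::int64_t era = (z >= 0 ? z : z - 146096) / 146097;
  const std::int64_t doe = z - era * 146097;  // day of era, 0..146096
  const std::int64_t yoe =
      (doe - doe / 1460 + doe / 36524 - doe / 146096) / 365;  // 0..399
  const std::int64_t doy = doe - (365 * yoe + yoe / 4 - yoe / 100);
  const std::int64_t mp = (5 * doy + 2) / 153;  // month index, March = 0
  const std::int64_t day = doy - (153 * mp + 2) / 5 + 1;
  const std::int64_t month = mp < 10 ? mp + 3 : mp - 9;
  const std::int64_t year = yoe + era * 400 + (month <= 2 ? 1 : 0);
  if (year < 0 || year > 9999) return 0;

  // Lay down the separators once, then overwrite the digit positions.
  std::memcpy(buf, "0000-00-00 00:00:00", kTimestampLen);
  WriteDigits(buf, 4, year);
  WriteDigits(buf + 5, 2, month);
  WriteDigits(buf + 8, 2, day);
  WriteDigits(buf + 11, 2, secs / 3600);
  WriteDigits(buf + 14, 2, secs / 60 % 60);
  WriteDigits(buf + 17, 2, secs % 60);
  return kTimestampLen;
}
```

A caller would typically pass a stack buffer, or a slice of the result buffer it already owns, and copy the 19 bytes out once. The result is then written in place rather than assembled through several temporary std::string objects.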
Issue Links
- is related to
- IMPALA-12724 Better Implementation for Impala PrettyPrint Functions
- Open