Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Description
Queries with select distinct on a complex type fail.
With structs, we get a FE exception:
set disable_codegen=1;
use functional_parquet;
select distinct(struct_val) from alltypes_structs;
[1] com.google.common.base.Preconditions.checkState (Preconditions.java:486)
[2] org.apache.impala.analysis.SlotRef.addStructChildrenAsSlotRefs (SlotRef.java:249)
[3] org.apache.impala.analysis.SlotRef.<init> (SlotRef.java:91)
[4] org.apache.impala.analysis.AggregateInfoBase.createTupleDesc (AggregateInfoBase.java:135)
[5] org.apache.impala.analysis.AggregateInfoBase.createTupleDescs (AggregateInfoBase.java:101)
[6] org.apache.impala.analysis.AggregateInfo.create (AggregateInfo.java:150)
[7] org.apache.impala.analysis.AggregateInfo.create (AggregateInfo.java:171)
[8] org.apache.impala.analysis.MultiAggregateInfo.analyze (MultiAggregateInfo.java:297)
[9] org.apache.impala.analysis.SelectStmt$SelectAnalyzer.buildAggregateExprs (SelectStmt.java:1,148)
[10] org.apache.impala.analysis.SelectStmt$SelectAnalyzer.analyze (SelectStmt.java:355)
[11] org.apache.impala.analysis.SelectStmt$SelectAnalyzer.access$100 (SelectStmt.java:282)
[12] org.apache.impala.analysis.SelectStmt.analyze (SelectStmt.java:274)
[13] org.apache.impala.analysis.AnalysisContext.analyze (AnalysisContext.java:521)
[14] org.apache.impala.analysis.AnalysisContext.analyzeAndAuthorize (AnalysisContext.java:468)
[15] org.apache.impala.service.Frontend.doCreateExecRequest (Frontend.java:2,116)
[16] org.apache.impala.service.Frontend.getTExecRequest (Frontend.java:2,003)
[17] org.apache.impala.service.Frontend.createExecRequest (Frontend.java:1,805)
[18] org.apache.impala.service.JniFrontend.createExecRequest (JniFrontend.java:164)
With collections the BE hits a DCHECK and crashes:
use functional_parquet;
select distinct(arr1) from complextypes_arrays;
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1 0x00007f2bae54b7f1 in __GI_abort () at abort.c:79
#2 0x00000000020a7763 in google::DumpStackTraceAndExit() [clone .cold] ()
#3 0x000000000593c4ad in google::LogMessage::Fail() ()
#4 0x000000000593e3e4 in google::LogMessage::SendToLog() ()
#5 0x000000000593be8c in google::LogMessage::Flush() ()
#6 0x000000000593e909 in google::LogMessageFatal::~LogMessageFatal() ()
#7 0x000000000278b90e in impala::SlotDescriptor::SlotDescriptor (this=0xe82a300, tdesc=..., parent=0xe40d710, children_tuple_descriptor=0x0) at /home/danielbecker/Impala/be/src/runtime/descriptors.cc:114
#8 0x00000000027909aa in impala::DescriptorTbl::CreateInternal (pool=0x18ec7b30, thrift_tbl=..., tbl=0x18ec7aa8) at /home/danielbecker/Impala/be/src/runtime/descriptors.cc:638
#9 0x000000000279050a in impala::DescriptorTbl::Create (pool=0x18ec7b30, serialized_thrift_tbl=..., tbl=0x18ec7aa8) at /home/danielbecker/Impala/be/src/runtime/descriptors.cc:609
#10 0x00000000026dfd6d in impala::QueryState::StartFInstances (this=0x18ec7200) at /home/danielbecker/Impala/be/src/runtime/query-state.cc:822
#11 0x00000000026cedd7 in impala::QueryExecMgr::ExecuteQueryHelper (this=0xeeb4480, qs=0x18ec7200) at /home/danielbecker/Impala/be/src/runtime/query-exec-mgr.cc:162
#12 0x00000000026d84b4 in boost::_mfi::mf1<void, impala::QueryExecMgr, impala::QueryState*>::operator() (this=0x12444800, p=0xeeb4480, a1=0x18ec7200) at /home/danielbecker/Impala/toolchain/toolchain-packages-gcc10.4.0/boost-1.74.0-p1/include/boost/bind/mem_fn_template.hpp:165
#13 0x00000000026d7d44 in boost::_bi::list2<boost::_bi::value<impala::QueryExecMgr*>, boost::_bi::value<impala::QueryState*> >::operator()<boost::_mfi::mf1<void, impala::QueryExecMgr, impala::QueryState*>, boost::_bi::list0> (this=0x12444810, f=..., a=...) at /home/danielbecker/Impala/toolchain/toolchain-packages-gcc10.4.0/boost-1.74.0-p1/include/boost/bind/bind.hpp:319
#14 0x00000000026d729b in boost::_bi::bind_t<void, boost::_mfi::mf1<void, impala::QueryExecMgr, impala::QueryState*>, boost::_bi::list2<boost::_bi::value<impala::QueryExecMgr*>, boost::_bi::value<impala::QueryState*> > >::operator() (this=0x12444800) at /home/danielbecker/Impala/toolchain/toolchain-packages-gcc10.4.0/boost-1.74.0-p1/include/boost/bind/bind.hpp:1294
#15 0x00000000026d66db in boost::detail::function::void_function_obj_invoker0<boost::_bi::bind_t<void, boost::_mfi::mf1<void, impala::QueryExecMgr, impala::QueryState*>, boost::_bi::list2<boost::_bi::value<impala::QueryExecMgr*>, boost::_bi::value<impala::QueryState*> > >, void>::invoke (function_obj_ptr=...) at /home/danielbecker/Impala/toolchain/toolchain-packages-gcc10.4.0/boost-1.74.0-p1/include/boost/function/function_template.hpp:158
#16 0x000000000267535a in boost::function0<void>::operator() (this=0x7f2988fa9ba0) at /home/danielbecker/Impala/toolchain/toolchain-packages-gcc10.4.0/boost-1.74.0-p1/include/boost/function/function_template.hpp:763
#17 0x0000000002d7a671 in impala::Thread::SuperviseThread(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, boost::function<void ()>, impala::ThreadDebugInfo const*, impala::Promise<long, (impala::PromiseMode)0>*) (name=..., category=..., functor=..., parent_thread_info=0x7f2b6cbb2840, thread_started=0x7f2b6cbb14e0) at /home/danielbecker/Impala/be/src/util/thread.cc:360
#18 0x0000000002d82ffe in boost::_bi::list5<boost::_bi::value<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, boost::_bi::value<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, boost::_bi::value<boost::function<void ()> >, boost::_bi::value<impala::ThreadDebugInfo*>, boost::_bi::value<impala::Promise<long, (impala::PromiseMode)0>*> >::operator()<void (*)(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, boost::function<void ()>, impala::ThreadDebugInfo const*, impala::Promise<long, (impala::PromiseMode)0>*), boost::_bi::list0>(boost::_bi::type<void>, void (*&)(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, boost::function<void ()>, impala::ThreadDebugInfo const*, impala::Promise<long, (impala::PromiseMode)0>*), boost::_bi::list0&, int) (this=0xf07c300, f=@0xf07c2f8: 0x2d7a32e <impala::Thread::SuperviseThread(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, boost::function<void ()>, impala::ThreadDebugInfo const*, impala::Promise<long, (impala::PromiseMode)0>*)>, a=...) at /home/danielbecker/Impala/toolchain/toolchain-packages-gcc10.4.0/boost-1.74.0-p1/include/boost/bind/bind.hpp:531
#19 0x0000000002d82f29 in boost::_bi::bind_t<void, void (*)(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, boost::function<void ()>, impala::ThreadDebugInfo const*, impala::Promise<long, (impala::PromiseMode)0>*), boost::_bi::list5<boost::_bi::value<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, boost::_bi::value<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, boost::_bi::value<boost::function<void ()> >, boost::_bi::value<impala::ThreadDebugInfo*>, boost::_bi::value<impala::Promise<long, (impala::PromiseMode)0>*> > >::operator()() (this=0xf07c2f8) at /home/danielbecker/Impala/toolchain/toolchain-packages-gcc10.4.0/boost-1.74.0-p1/include/boost/bind/bind.hpp:1294
#20 0x0000000002d82ef0 in boost::detail::thread_data<boost::_bi::bind_t<void, void (*)(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, boost::function<void ()>, impala::ThreadDebugInfo const*, impala::Promise<long, (impala::PromiseMode)0>*), boost::_bi::list5<boost::_bi::value<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, boost::_bi::value<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, boost::_bi::value<boost::function<void ()> >, boost::_bi::value<impala::ThreadDebugInfo*>, boost::_bi::value<impala::Promise<long, (impala::PromiseMode)0>*> > > >::run() (this=0xf07c1c0) at /home/danielbecker/Impala/toolchain/toolchain-packages-gcc10.4.0/boost-1.74.0-p1/include/boost/thread/detail/thread.hpp:120
#21 0x00000000046f3c67 in thread_proxy ()
#22 0x00007f2bb18b46db in start_thread (arg=0x7f2988faa700) at pthread_create.c:463
#23 0x00007f2bae62c61f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
In both cases the error is related to the itemTupleDesc (or itemTupleId) not being set: the struct case fails a precondition check, and the collection case hits a DCHECK.
The error also occurs when DISTINCT is not written directly on the complex-typed expression but that expression is still present in the same tuple:
select distinct(id), struct_val from alltypes_structs;
select distinct(id), arr1 from complextypes_arrays;
Until we implement equality and hashing for complex types, which select distinct needs in order to work, we should return a clear error message in these cases instead of failing a precondition check or a DCHECK.
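A minimal sketch of the kind of analysis-time check proposed here, in Java because the frontend is Java. Everything in it is hypothetical and only illustrates the idea: DistinctComplexTypeCheck, TypeKind, SelectItem, AnalysisError and the message wording are made-up stand-ins, not Impala's actual classes or the actual fix. The point is that the query is rejected while the select list is being analyzed, before any tuple descriptor for the aggregation is built.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for illustration only; none of these are Impala classes.
class DistinctComplexTypeCheck {
  enum TypeKind { SCALAR, STRUCT, ARRAY, MAP }

  // One expression in the select list: its label and the kind of its type.
  static final class SelectItem {
    final String name;
    final TypeKind kind;

    SelectItem(String name, TypeKind kind) {
      this.name = name;
      this.kind = kind;
    }

    boolean isComplex() {
      return kind != TypeKind.SCALAR;
    }
  }

  // Hypothetical error type reported to the user instead of a failed
  // precondition check or DCHECK.
  static final class AnalysisError extends RuntimeException {
    AnalysisError(String msg) {
      super(msg);
    }
  }

  // Rejects SELECT DISTINCT when any select-list item has a complex type.
  // DISTINCT covers the whole select list, so every item is checked, not only
  // the one written inside the parentheses.
  static void checkDistinct(boolean isDistinct, List<SelectItem> items) {
    if (!isDistinct) return;
    for (SelectItem item : items) {
      if (item.isComplex()) {
        throw new AnalysisError("Complex types are not supported in SELECT DISTINCT: '"
            + item.name + "' has type " + item.kind);
      }
    }
  }

  public static void main(String[] args) {
    // Mirrors: select distinct(id), struct_val from alltypes_structs;
    List<SelectItem> items = Arrays.asList(
        new SelectItem("id", TypeKind.SCALAR),
        new SelectItem("struct_val", TypeKind.STRUCT));
    try {
      checkDistinct(true, items);
    } catch (AnalysisError e) {
      System.out.println("Rejected: " + e.getMessage());
    }
  }
}
```

In this sketch the same select list passes when DISTINCT is absent, so plain selects that return complex types are unaffected by the check.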