Details
-
Bug
-
Status: Resolved
-
Blocker
-
Resolution: Fixed
-
Impala 4.0.0
Description
Hit a DCHECK when inserting into a partitioned Parquet table with Z-ordering. I'm on the master branch (commit=b8a2b75).
F1012 15:04:27.726274 3868 dml-exec-state.cc:432] a6479cc4725101fd:b86db2a100000003] Check failed: per_partition_status_.find(name) == per_partition_status_.end()
*** Check failure stack trace: ***
    @ 0x51ff3cc google::LogMessage::Fail()
    @ 0x5200cbc google::LogMessage::SendToLog()
    @ 0x51fed2a google::LogMessage::Flush()
    @ 0x5202928 google::LogMessageFatal::~LogMessageFatal()
    @ 0x234ba18 impala::DmlExecState::AddPartition()
    @ 0x2817786 impala::HdfsTableSink::GetOutputPartition()
    @ 0x2813151 impala::HdfsTableSink::WriteClusteredRowBatch()
    @ 0x28156c4 impala::HdfsTableSink::Send()
    @ 0x23139dd impala::FragmentInstanceState::ExecInternal()
    @ 0x230fe10 impala::FragmentInstanceState::Exec()
    @ 0x227bb79 impala::QueryState::ExecFInstance()
    @ 0x2279f7b _ZZN6impala10QueryState15StartFInstancesEvENKUlvE_clEv
    @ 0x227e2c2 _ZN5boost6detail8function26void_function_obj_invoker0IZN6impala10QueryState15StartFInstancesEvEUlvE_vE6invokeERNS1_15function_bufferE
    @ 0x2137699 boost::function0<>::operator()()
    @ 0x2715d7d impala::Thread::SuperviseThread()
    @ 0x271dd1a boost::_bi::list5<>::operator()<>()
    @ 0x271dc3e boost::_bi::bind_t<>::operator()()
    @ 0x271dbff boost::detail::thread_data<>::run()
    @ 0x3f05f01 thread_proxy
    @ 0x7fb18bebb6b9 start_thread
    @ 0x7fb188a474dc clone
It seems the Z-order sort node doesn't keep the rows sorted by partition keys. This violates the assumption of HdfsTableSink::WriteClusteredRowBatch() that its input is ordered by the partition key expressions. As a result, a partition key was removed from the partition_keys_to_output_partitions_ map and then inserted into it again.
/// Maps all rows in 'batch' to partitions and appends them to their temporary Hdfs
/// files. The input must be ordered by the partition key expressions.
Status WriteClusteredRowBatch(RuntimeState* state, RowBatch* batch) WARN_UNUSED_RESULT;
The key was removed here while processing a new partition key: https://github.com/apache/impala/blob/b8a2b754669eb7f8d164e8112e594ac413e436ef/be/src/exec/hdfs-table-sink.cc#L334
It was then reinserted here, which triggered the DCHECK: https://github.com/apache/impala/blob/b8a2b754669eb7f8d164e8112e594ac413e436ef/be/src/exec/hdfs-table-sink.cc#L590
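To make the failure mode concrete, here is a minimal standalone sketch of the remove-then-reinsert sequence described above. It is a simplified model, not Impala's code: the function name FirstDuplicateRegistration and the two local containers are hypothetical stand-ins for the sink's partition_keys_to_output_partitions_ map and the per_partition_status_ bookkeeping, and each row is reduced to just its partition key.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Simplified model of a clustered write (hypothetical, not Impala's classes).
// Each element of 'partition_keys' is the partition key of one input row.
// Returns the index of the first row that would register an already
// registered partition (the condition the DCHECK guards), or -1 if none.
int FirstDuplicateRegistration(const std::vector<std::string>& partition_keys) {
  // Stand-in for per_partition_status_: entries are never removed.
  std::set<std::string> per_partition_status;
  // Stand-in for partition_keys_to_output_partitions_: key -> rows written.
  std::map<std::string, int> open_partitions;

  std::string current_key;
  bool have_current = false;
  for (size_t i = 0; i < partition_keys.size(); ++i) {
    const std::string& key = partition_keys[i];
    if (have_current && key != current_key) {
      // Clustered assumption: a key change means the previous partition is
      // complete, so it is closed and dropped from the map.
      open_partitions.erase(current_key);
    }
    current_key = key;
    have_current = true;

    if (open_partitions.find(key) == open_partitions.end()) {
      // Partition is not open, so it gets registered as new. If the key was
      // seen before, this is the duplicate registration.
      if (!per_partition_status.insert(key).second) return static_cast<int>(i);
      open_partitions[key] = 0;
    }
    ++open_partitions[key];
  }
  return -1;
}
```

With keys ordered by partition, such as a, a, b, b, every partition is registered exactly once. With an interleaved order such as a, b, a, the third row finds partition a already closed and tries to register it a second time, which corresponds to the failed check in DmlExecState::AddPartition().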
Issue Links
- relates to IMPALA-8755 Implement Z-ordering for Impala (Resolved)