Uploaded image for project: 'IMPALA'
  1. IMPALA
  2. IMPALA-4808

Crash in old hash join node for full outer join

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: Impala 2.3.0, Impala 2.5.0, Impala 2.6.0, Impala 2.7.0, Impala 2.8.0, Impala 2.9.0
    • Fix Version/s: Impala 2.9.0
    • Component/s: Backend
    • Labels:

      Description

      We saw a crash in HashJoinNode::GetNext() where EvalConjuncts was touching invalid memory here:

          if (match_all_probe_ && !matched_probe_ && probe_row_exists) {
            int row_idx = out_batch->AddRow();
            TupleRow* out_row = out_batch->GetRow(row_idx);
            CreateOutputRow(out_row, current_probe_row_, NULL);
            if (EvalConjuncts(conjunct_ctxs, num_conjunct_ctxs, out_row)) { <===
              out_batch->CommitLastRow();
              VLOG_ROW << "match row: " << PrintRow(out_row, row_desc());
              ++num_rows_returned_;
              COUNTER_SET(rows_returned_counter_, num_rows_returned_);
              matched_probe_ = true;
              if (out_batch->AtCapacity() || ReachedLimit()) {
                *eos = ReachedLimit();
                return Status::OK();
              }
            }
      }
      

      We believe the bug is that 'probe_row_exists' can be true even if there is no probe row. E.g. in the code below, it may break out of the loop and leave the node in a state where there is no valid probe row.

          if (probe_batch_pos_ == probe_batch_->num_rows()) {
            // pass on resources, out_batch might still need them
            probe_batch_->TransferResourceOwnership(out_batch);
            probe_batch_pos_ = 0;
            if (out_batch->AtCapacity()) return Status::OK();
            // get new probe batch
            if (!probe_side_eos_) {
              while (true) {
                probe_timer.Stop();
                RETURN_IF_ERROR(child(0)->GetNext(state, probe_batch_.get(), &probe_side_eos_));
                probe_timer.Start();
                if (probe_batch_->num_rows() == 0) {
                  // Empty batches can still contain IO buffers, which need to be passed up to
                  // the caller; transferring resources can fill up out_batch.
                  probe_batch_->TransferResourceOwnership(out_batch);
                  if (probe_side_eos_) {
                    eos_ = true;
                    break;
                  }
                  if (out_batch->AtCapacity()) return Status::OK(); <===== HERE
                  continue;
                } else {
                  COUNTER_ADD(probe_row_counter_, probe_batch_->num_rows());
                  break;
                }
              }
            } else {
              eos_ = true;
            }
            // finish up right outer join
            if (eos_ && match_all_build_) {
              hash_tbl_iterator_ = hash_tbl_->Begin();
            }
      }
      

        Activity

        Hide
        tarmstrong Tim Armstrong added a comment -

        IMPALA-4808: old hash join can reference invalid memory

        The bug was that 'probe_rows_exist' could be true even if
        there was no current probe row. The node can get into this
        state if it takes the branch at line 390.

        I tried to reproduce the crash but was unable to after a
        few attempts.

        Change-Id: Ic068bbc3e029264d1ce814d286e372391639fa68
        Reviewed-on: http://gerrit.cloudera.org:8080/5850
        Reviewed-by: Matthew Jacobs <mj@cloudera.com>
        Reviewed-by: Dan Hecht <dhecht@cloudera.com>
        Tested-by: Impala Public Jenkins

        Show
        tarmstrong Tim Armstrong added a comment - IMPALA-4808 : old hash join can reference invalid memory The bug was that 'probe_rows_exist' could be true even if there was no current probe row. The node can get into this state if it takes the branch at line 390. I tried to reproduce the crash but was unable to after a few attempts. Change-Id: Ic068bbc3e029264d1ce814d286e372391639fa68 Reviewed-on: http://gerrit.cloudera.org:8080/5850 Reviewed-by: Matthew Jacobs <mj@cloudera.com> Reviewed-by: Dan Hecht <dhecht@cloudera.com> Tested-by: Impala Public Jenkins

          People

          • Assignee:
            tarmstrong Tim Armstrong
            Reporter:
            tarmstrong Tim Armstrong
          • Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development