IMPALA-6461

DataStreamSender::Channel::AddRow needs some micro-optimizations to remove the per-row function call and data dependency


Details

    • Improvement
    • Status: Resolved
    • Minor
    • Resolution: Fixed
    • None
    • Impala 2.11.0
    • Distributed Exec
    • None

    Description

      While analyzing the performance of the partition exchange operator, I noticed that there is a data dependency and a function call per row in the hot path.

      
      // hash-partition batch's rows across channels
      // TODO: encapsulate this in an Expr as we've done for Kudu above and remove this case
      // once we have codegen here.
      int num_channels = channels_.size();
      for (int i = 0; i < batch->num_rows(); ++i) {
        TupleRow* row = batch->GetRow(i);
        uint64_t hash_val = EXCHANGE_HASH_SEED;
        for (int j = 0; j < partition_exprs_.size(); ++j) {
          ScalarExprEvaluator* eval = partition_expr_evals_[j];
          void* partition_val = eval->GetValue(row);
          // We can't use the crc hash function here because it does not result in
          // uncorrelated hashes with different seeds. Instead we use FastHash.
          // TODO: fix crc hash/GetHashValue()
          DCHECK(&(eval->root()) == partition_exprs_[j]);
          hash_val = RawValue::GetHashValueFastHash(
              partition_val, partition_exprs_[j]->type(), hash_val);
        }
        RETURN_IF_ERROR(channels_[hash_val % num_channels]->AddRow(row));
      }
      
      

      Force-inlining DataStreamSender::Channel::AddRow and breaking up the loop improve partition exchange performance by 5%.

      Code generation of the hash computation (IMPALA-5168) should give another 10% speedup.
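
      As a rough illustration of what "breaking up the loop" could look like, the sketch below splits the work into two passes: the first computes a channel index for every row in the batch, and the second appends each row to its channel through an AddRow that is defined in-class so the compiler can inline it. This is not the actual Impala patch. Row, Channel, MixHash, kSeed and PartitionBatch are hypothetical stand-ins for TupleRow, DataStreamSender::Channel, RawValue::GetHashValueFastHash, EXCHANGE_HASH_SEED and the sender's partitioning loop, and error handling (RETURN_IF_ERROR) is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for TupleRow: a row with one partition key.
struct Row {
  std::uint64_t key;
};

// Hypothetical stand-in for DataStreamSender::Channel.
struct Channel {
  std::vector<const Row*> rows;

  // Defined in-class so the compiler can inline it into the caller's loop.
  // A compiler-specific "always inline" attribute could be used to force this.
  inline void AddRow(const Row* row) { rows.push_back(row); }
};

// Hypothetical seed; plays the role of EXCHANGE_HASH_SEED.
constexpr std::uint64_t kSeed = 0x9e3779b97f4a7c15ULL;

// Simple 64-bit mixing function used only for illustration.
// It is NOT Impala's FastHash.
inline std::uint64_t MixHash(std::uint64_t value, std::uint64_t seed) {
  std::uint64_t h = seed ^ value;
  h ^= h >> 33;
  h *= 0xff51afd7ed558ccdULL;
  h ^= h >> 33;
  return h;
}

// Hash-partitions the rows of `batch` across `channels` in two passes.
void PartitionBatch(const std::vector<Row>& batch,
                    std::vector<Channel>& channels) {
  assert(!channels.empty());
  const std::size_t num_rows = batch.size();
  const std::size_t num_channels = channels.size();

  // Pass 1: compute the target channel of every row. Iterations are
  // independent of each other and contain no call into Channel.
  std::vector<std::uint32_t> channel_ids(num_rows);
  for (std::size_t i = 0; i < num_rows; ++i) {
    channel_ids[i] = static_cast<std::uint32_t>(
        MixHash(batch[i].key, kSeed) % num_channels);
  }

  // Pass 2: hand each row to its channel using the precomputed index.
  for (std::size_t i = 0; i < num_rows; ++i) {
    channels[channel_ids[i]].AddRow(&batch[i]);
  }
}
```

      Separating the passes means the hash loop no longer waits on AddRow, and the second loop only reads a precomputed index per row. Whether this reproduces the 5% figure reported above depends on the real row layout and expression evaluation, which this sketch does not model.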

          People

            mmokhtar Mostafa Mokhtar