Uploaded image for project: 'IMPALA'
  1. IMPALA
  2. IMPALA-3349

Kudu scanner : KuduScanBatch::RowPtr::Get* has very expensive checks in the hot path

    Details

      Description

      KuduScanBatch::RowPtr::Get

      gets called by the Impala scanner per row, per column and RowPtr::Get has check on type which not necessarily needed.
      Also similar to IMPALA-3347, it would help if the Batch has a column level nullability bit to skip the NULL check.

      template<typename T>
      Status KuduScanBatch::RowPtr::Get(int col_idx, typename T::cpp_type* val) const {
        const ColumnSchema& col = schema_->column(col_idx);
        if (PREDICT_FALSE(col.type_info()->type() != T::type)) {
          // TODO: at some point we could allow type coercion here.
          return Status::InvalidArgument(
              Substitute("invalid type $0 provided for column '$1' (expected $2)",
                         T::name(),
                         col.name(), col.type_info()->name()));
        }
      
        if (col.is_nullable() && IsNull(col_idx)) {
          return Status::NotFound("column is NULL");
        }
      
        memcpy(val, row_data_ + schema_->column_offset(col_idx), sizeof(*val));
        return Status::OK();
      }
      

      From perf

      Samples: 182K of event 'cycles', Event count (approx.): 65636591846
       11.59%  libkudu_client.so.0.1.0               [.] kudu::client::KuduScanBatch::RowPtr::GetInt64(int, long*) const                                                                         
       10.03%  impalad                               [.] impala::KuduScanner::KuduRowToImpalaTuple(kudu::client::KuduScanBatch::RowPtr const&, impala::RowBatch*, impala::Tuple*)                
        7.28%  impalad                               [.] impala::KuduScanner::DecodeRowsIntoRowBatch(impala::RowBatch*, impala::Tuple**, bool*)                                                  
        3.69%  [kernel]                              [k] copy_user_enhanced_fast_string                                                                                                          
        3.26%  kudu-tserver                          [.] kudu::SerializeRowBlock(kudu::RowBlock const&, kudu::RowwiseRowBlockPB*, kudu::Schema const*, kudu::faststring*, kudu::faststring*)     
        3.01%  libkudu_client.so.0.1.0               [.] kudu::client::KuduScanBatch::RowPtr::IsNull(int) const                                                                                  
      

      And this shows that most of the cost is from the check before the memcpy

      kudu::client::KuduScanBatch::RowPtr::GetInt64(int, long*) const  /opt/cloudera/parcels/KUDU-0.7.1-1.kudu0.7.1.p0.36/lib64/libkudu_client.so.0.1.0
        2.37 │       mov    %rdi,%r12
        0.42 │       sub    $0xd8,%rsp
        2.62 │       mov    (%rsi),%r8
        2.37 │       lea    (%rbx,%rbx,8),%rsi
        0.46 │       mov    (%r8),%rax
        0.61 │       lea    (%rax,%rsi,8),%r9
        2.76 │       mov    0x8(%r9),%rax
        6.81 │       cmpl   $0x7,(%rax)
       12.14 │     ↓ jne    e8
        2.57 │       cmpb   $0x0,0x10(%r9)
        0.57 │     ↓ jne    68
        1.95 │ 3b:   mov    0x40(%r8),%rax
        0.68 │       mov    (%rax,%rbx,8),%rax
        2.92 │       add    0x8(%r13),%rax
        0.75 │       mov    (%rax),%rax
       32.71 │       mov    %rax,(%rcx)
        2.62 │       movq   $0x0,(%r12)
        0.52 │ 55:   lea    -0x18(%rbp),%rsp
        0.21 │       mov    %r12,%rax
        2.13 │       pop    %rbx
        2.48 │       pop    %r12
        0.52 │       pop    %r13
        0.22 │       pop    %rbp
        2.46 │     ← retq
             │       nop
             │ 68:   mov    %edx,%esi
             │       mov    %r13,%rdi
             │       mov    %rcx,-0xe8(%rbp)
             │     → callq  kudu::client::KuduScanBatch::RowPtr::IsNull(int) const@plt
             │       test   %al,%al
      

        Issue Links

          Activity

          Hide
          mjacobs Matthew Jacobs added a comment -

          Is there anything on the Impala side here or can we track this issue with KUDU-1417?

          Show
          mjacobs Matthew Jacobs added a comment - Is there anything on the Impala side here or can we track this issue with KUDU-1417 ?
          Hide
          tlipcon Todd Lipcon added a comment -

          This probably needs some coordination between the two sides. I'm fine tracking it upstream.

          Show
          tlipcon Todd Lipcon added a comment - This probably needs some coordination between the two sides. I'm fine tracking it upstream.
          Hide
          mjacobs Matthew Jacobs added a comment -

          We think this was fixed with Alex's change to use the Kudu tuple format.

          Show
          mjacobs Matthew Jacobs added a comment - We think this was fixed with Alex's change to use the Kudu tuple format.
          Hide
          jbapple Jim Apple added a comment -

          This is a bulk comment on all issues with Fix Version 2.8.0 that were resolved on or after 2016-12-09.

          2.8.0 was branched on December 9, with only two changes to master cherry-picked to the 2.8.0 release branch after that:

          https://github.com/apache/incubator-impala/commits/2.8.0

          Issues fixed after December 9 might not be fixed in 2.8.0. If you are the one who marked this issue Resolved, can you check to see if the patch is in 2.8.0 by using the link above? If the patch is not in 2.8.0, can you change the Fix Version to 2.9.0?

          Thank you!

          Show
          jbapple Jim Apple added a comment - This is a bulk comment on all issues with Fix Version 2.8.0 that were resolved on or after 2016-12-09. 2.8.0 was branched on December 9, with only two changes to master cherry-picked to the 2.8.0 release branch after that: https://github.com/apache/incubator-impala/commits/2.8.0 Issues fixed after December 9 might not be fixed in 2.8.0. If you are the one who marked this issue Resolved, can you check to see if the patch is in 2.8.0 by using the link above? If the patch is not in 2.8.0, can you change the Fix Version to 2.9.0? Thank you!

            People

            • Assignee:
              mjacobs Matthew Jacobs
              Reporter:
              mmokhtar Mostafa Mokhtar
            • Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development