Details

    • Type: Task
    • Status: Resolved
    • Priority: Minor
    • Resolution: Won't Fix
    • Affects Version/s: Impala 2.2
    • Fix Version/s: None
    • Component/s: Perf Investigation

    Description

      Investigate increasing batch size for better performance.

      Initial results (columns are batch sizes, in rows; blank cells had no reported value):

      Query | 10,000,000 | 1,000,000 | 100,000 | 10,000 | 1,000 | 100 | 10
      broadcast_join_1 | 5 | 4 | 4 | 4 | 3 | 3 | 33
      broadcast_join_2 | 9 | 7 | 8 | 8 | 12 | 18 | 96
      broadcast_join_3 | 55 | 63 | 67 | 73 | 81 | 133 | 556
      exchange_broadcast |  | 115.59 | 118 | 119 | 149 | 285 | 1,193
      exchange_shuffle | 196 | 187 | 186 | 191 | 191.81 |  |
      filter_bigint_non_selective | 9 | 6 | 6 | 6 | 7 | 31 | 275
      filter_bigint_selective | 3 | 3 | 3 | 3 | 3 | 3 | 13
      filter_decimal_non_selective | 3 | 3 | 3 | 3 | 3 | 9 | 65
      filter_decimal_selective | 3 | 3 | 3 | 3 | 3 | 10 | 46
      filter_string_non_selective | 2 | 2 | 2 | 2 | 3 | 17 | 143
      filter_string_selective | 2 | 2 | 2 | 2 | 2 | 7 | 34
      groupBy_bigint_highndv | 58 | 55 | 55 | 58 | 55 | 69 | 254
      groupBy_bigint_lowndv | 13 | 9 | 9 | 9 | 10 | 26 | 202
      groupBy_decimal_highndv | 116 | 87 | 81 | 105 | 102 | 103 | 257
      groupBy_decimal_lowndv | 30 | 27 | 28 | 29 | 33 | 45 | 217
      groupBy_spilling |  |  | 566 | 534 | 527 | 523 | 546
      insert_partitioned | 392 | 383 | 375 | 385 | 483 | 451 | 486
      insert | 392 | 383 | 375 | 385 | 483 | 451 | 486
      orderby_all |  |  | 158 | 176 | 173 | 191 | 323
      orderby_bigint | 30 | 34 | 34 | 35 | 34 | 49 | 281
      shuffle_join_one_to_many_string_with_groupby | 554 | 568 | 613 | 561 | 549 | 577 | 739
      shuffle_join_union_all_with_groupby |  |  | 97 | 109 | 122 | 119 | 262
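
      The blow-up in the rightmost columns is consistent with a fixed cost that is paid once per row batch: total overhead grows with the number of batches, so it dominates at the smallest batch sizes. A rough standalone model of that effect (illustrative only, not Impala code; the row count and per-batch cost are made-up constants):

      // Back-of-the-envelope model: with a fixed cost per row batch, shrinking
      // the batch size multiplies the number of batches and therefore the total
      // per-batch overhead for a scan of a given number of rows.
      #include <cstdint>
      #include <cstdio>

      int main() {
        const int64_t kNumRows = 100LL * 1000 * 1000;  // rows scanned (assumed)
        const double kPerBatchOverheadUs = 2.0;        // hypothetical fixed cost per batch

        for (int64_t batch_size : {10000000LL, 1000000LL, 100000LL, 10000LL, 1000LL, 100LL, 10LL}) {
          int64_t num_batches = (kNumRows + batch_size - 1) / batch_size;
          double overhead_ms = num_batches * kPerBatchOverheadUs / 1000.0;
          std::printf("batch_size=%10lld  batches=%10lld  modeled overhead=%10.1f ms\n",
                      static_cast<long long>(batch_size),
                      static_cast<long long>(num_batches), overhead_ms);
        }
        return 0;
      }

      With these assumed numbers the modeled overhead is negligible at millions of rows per batch and reaches tens of seconds at 10-row batches, which is the same qualitative pattern as the last two columns above.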

          Activity

            Todd Lipcon (tlipcon) added a comment -

            +1, I've seen good speedups on Kudu queries with larger batch sizes as well. We spend a lot of time in a few areas which are per-batch:

            • contention on the high water mark memory tracker metric (this could easily be striped per-CPU or per-thread for an easy win; a sketch of the idea follows this comment)
            • contention on the row batch queue (could improve the blocking queue implementation; a queue sketch follows the IMPALA-2403 comment below)
            • contention on the allocator (could "recycle" row batches back to scanner threads rather than deleting and re-allocating; a pool sketch also follows this comment)

            If bigger row batches have some subtle issues, maybe attacking the above areas would help narrow the gap a bit.
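
            A minimal sketch of the first idea (striping the consumption counter per thread so the hot Consume() path never writes a single shared atomic); this is illustrative only and is not Impala's MemTracker:

            #include <array>
            #include <atomic>
            #include <cstddef>
            #include <cstdint>
            #include <functional>
            #include <thread>

            class StripedMemTracker {
             public:
              // Hot path: bump one stripe chosen by thread id, so concurrent scanner
              // threads rarely touch the same cache line.
              void Consume(int64_t bytes) {
                stripes_[StripeIndex()].value.fetch_add(bytes, std::memory_order_relaxed);
              }
              void Release(int64_t bytes) { Consume(-bytes); }

              // Cold path: sum the stripes and fold the total into the high water mark.
              // Called periodically (or when the peak is actually needed), not per batch.
              int64_t UpdateAndGetPeak() {
                int64_t total = 0;
                for (const auto& s : stripes_) total += s.value.load(std::memory_order_relaxed);
                int64_t peak = peak_.load(std::memory_order_relaxed);
                while (total > peak &&
                       !peak_.compare_exchange_weak(peak, total, std::memory_order_relaxed)) {
                }
                return peak_.load(std::memory_order_relaxed);
              }

             private:
              static constexpr int kNumStripes = 16;
              struct alignas(64) Stripe {  // one cache line per stripe to avoid false sharing
                std::atomic<int64_t> value{0};
              };
              std::size_t StripeIndex() const {
                return std::hash<std::thread::id>()(std::this_thread::get_id()) % kNumStripes;
              }
              std::array<Stripe, kNumStripes> stripes_;
              std::atomic<int64_t> peak_{0};
            };

            The trade-off is that the peak is only as fresh as the last reconciliation, which is usually acceptable for a high-water-mark metric.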

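            And a sketch of the batch-recycling idea: consumed batches go back onto a free list that producers draw from, so the hot path skips the allocator. RowBatch here is a stand-in type, not Impala's class:

            #include <memory>
            #include <mutex>
            #include <vector>

            struct RowBatch {
              std::vector<char> tuple_mem;         // stand-in for the batch's tuple buffers
              void Reset() { tuple_mem.clear(); }  // drop contents, keep the allocation
            };

            class RowBatchPool {
             public:
              // Producer side: reuse a recycled batch if one is available, otherwise allocate.
              std::unique_ptr<RowBatch> Acquire() {
                std::lock_guard<std::mutex> l(lock_);
                if (!free_list_.empty()) {
                  std::unique_ptr<RowBatch> batch = std::move(free_list_.back());
                  free_list_.pop_back();
                  return batch;
                }
                return std::make_unique<RowBatch>();
              }

              // Consumer side: hand the batch back instead of destroying it, so its buffers
              // (and the allocator work behind them) are reused by the next Acquire().
              void Recycle(std::unique_ptr<RowBatch> batch) {
                batch->Reset();
                std::lock_guard<std::mutex> l(lock_);
                free_list_.push_back(std::move(batch));
              }

             private:
              std::mutex lock_;
              std::vector<std::unique_ptr<RowBatch>> free_list_;
            };

            A real version would also cap the free list so idle memory is eventually released.
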
            Daniel Hecht (dhecht) added a comment -

            contention on the row batch queue (could improve the blocking queue implementation)

            This one is already tracked by IMPALA-2403.
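
            For illustration only (this is not the IMPALA-2403 change): one common way to cut contention on a producer/consumer queue is to let the consumer drain everything available in a single critical section, paying the lock and wakeup cost once per group of batches rather than once per batch:

            #include <condition_variable>
            #include <deque>
            #include <iterator>
            #include <mutex>
            #include <utility>
            #include <vector>

            template <typename T>
            class DrainingBlockingQueue {
             public:
              void Put(T item) {
                {
                  std::lock_guard<std::mutex> l(lock_);
                  queue_.push_back(std::move(item));
                }
                not_empty_.notify_one();
              }

              // Blocks until at least one item is queued, then returns all of them,
              // so the caller pays the synchronization cost once per drained group.
              std::vector<T> TakeAll() {
                std::unique_lock<std::mutex> l(lock_);
                not_empty_.wait(l, [this] { return !queue_.empty(); });
                std::vector<T> items(std::make_move_iterator(queue_.begin()),
                                     std::make_move_iterator(queue_.end()));
                queue_.clear();
                return items;
              }

             private:
              std::mutex lock_;
              std::condition_variable not_empty_;
              std::deque<T> queue_;
            };

            A real row batch queue would also enforce a capacity bound, which this sketch omits.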

            Daniel Hecht (dhecht) added a comment -

            And this one:

            contention on the high water mark memory tracker metric (this could easily be striped per-CPU or per-thread for an easy win)

            is related to IMPALA-2399 (which removes the per-batch polling of the mem-tracker).


            People

              Assignee: Mostafa Mokhtar (mmokhtar)
              Reporter: Mostafa Mokhtar (mmokhtar)
              Votes: 0
              Watchers: 5
