Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-23880

Bloom filters can be merged in a parallel way in VectorUDAFBloomFilterMerge

    XMLWordPrintableJSON

Details

    Description

      Merging bloom filters in semijoin reduction can become the main bottleneck in case of large number of source mapper tasks (~1000, Map 1 in below example) and a large amount of expected entries (50M) in bloom filters.

      For example in TPCDS Q93:

      select /*+ semi(store_returns, sr_item_sk, store_sales, 70000000)*/ ss_customer_sk
                  ,sum(act_sales) sumsales
            from (select ss_item_sk
                        ,ss_ticket_number
                        ,ss_customer_sk
                        ,case when sr_return_quantity is not null then (ss_quantity-sr_return_quantity)*ss_sales_price
                                                                  else (ss_quantity*ss_sales_price) end act_sales
                  from store_sales left outer join store_returns on (sr_item_sk = ss_item_sk
                                                                     and sr_ticket_number = ss_ticket_number)
                      ,reason
                  where sr_reason_sk = r_reason_sk
                    and r_reason_desc = 'reason 66') t
            group by ss_customer_sk
            order by sumsales, ss_customer_sk
      limit 100;
      

      On 10TB-30TB scale there is a chance that from 3-4 mins of query runtime 1-2 mins are spent with merging bloom filters (Reducer 2), as in: lipwig-output3605036885489193068.svg

      ----------------------------------------------------------------------------------------------
              VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
      ----------------------------------------------------------------------------------------------
      Map 3 ..........      llap     SUCCEEDED      1          1        0        0       0       0
      Map 1 ..........      llap     SUCCEEDED   1263       1263        0        0       0       0
      Reducer 2             llap       RUNNING      1          0        1        0       0       0
      Map 4                 llap       RUNNING   6154          0      207     5947       0       0
      Reducer 5             llap        INITED     43          0        0       43       0       0
      Reducer 6             llap        INITED      1          0        0        1       0       0
      ----------------------------------------------------------------------------------------------
      VERTICES: 02/06  [====>>----------------------] 16%   ELAPSED TIME: 149.98 s
      ----------------------------------------------------------------------------------------------
      

      For example, 70M entries in bloom filter leads to a 436 465 696 bits, so merging 1263 bloom filters means running ~ 1263 * 436 465 696 bitwise OR operation, which is very hot codepath, but can be parallelized.

      Attachments

        1. lipwig-output3605036885489193068.svg
          126 kB
          László Bodor

        Issue Links

          Activity

            People

              abstractdog László Bodor
              abstractdog László Bodor
              Votes:
              0 Vote for this issue
              Watchers:
              8 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated:
                  Original Estimate - Not Specified
                  Not Specified
                  Remaining:
                  Remaining Estimate - 0h
                  0h
                  Logged:
                  Time Spent - 8h 40m
                  8h 40m