Hive / HIVE-20683

Add the Ability to push Dynamic Between and Bloom filters to Druid

    Details

      Description

      To optimize joins, Hive generates a BETWEEN filter (built from the min and max key values) and a Bloom filter, and uses them to filter one side of a semi-join.
      Druid 0.13.0 will support Bloom filters (added via https://github.com/apache/incubator-druid/pull/6222).
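
      As a rough illustration of why these two filters help, the sketch below builds a min-max range and a toy Bloom filter from the keys on the small side of a join and uses both to prune keys on the large side. The class `SemiJoinReductionSketch`, its `ToyBloomFilter`, and the sizing and hashing choices are hypothetical and exist only for this example; they are not Hive's or Druid's implementation.

```java
import java.util.BitSet;
import java.util.List;
import java.util.stream.Collectors;

/** Toy illustration of semi-join reduction with a min-max range plus a Bloom filter. */
final class SemiJoinReductionSketch {

    /** Minimal Bloom filter over long keys; size and hash mixing are illustrative only. */
    static final class ToyBloomFilter {
        private final BitSet bits;
        private final int numBits;
        private final int numHashes;

        ToyBloomFilter(int numBits, int numHashes) {
            this.bits = new BitSet(numBits);
            this.numBits = numBits;
            this.numHashes = numHashes;
        }

        /** Double hashing: derive the i-th probe position from two base hashes. */
        private int index(long key, int i) {
            long h1 = key * 0x9E3779B97F4A7C15L;
            long h2 = Long.rotateLeft(h1, 31) ^ 0xC2B2AE3D27D4EB4FL;
            return (int) Math.floorMod(h1 + i * h2, (long) numBits);
        }

        void add(long key) {
            for (int i = 0; i < numHashes; i++) {
                bits.set(index(key, i));
            }
        }

        /** May return true for a key that was never added, but never false for one that was. */
        boolean mightContain(long key) {
            for (int i = 0; i < numHashes; i++) {
                if (!bits.get(index(key, i))) {
                    return false;
                }
            }
            return true;
        }
    }

    /** Keeps only the large-side keys that pass both the BETWEEN range and the Bloom filter. */
    static List<Long> reduce(List<Long> smallSideKeys, List<Long> largeSideKeys) {
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;
        ToyBloomFilter bloom = new ToyBloomFilter(1 << 10, 3);
        for (long key : smallSideKeys) {
            min = Math.min(min, key);
            max = Math.max(max, key);
            bloom.add(key);
        }
        final long lo = min;
        final long hi = max;
        return largeSideKeys.stream()
                .filter(k -> k >= lo && k <= hi)   // BETWEEN min AND max
                .filter(bloom::mightContain)       // Bloom filter membership probe
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Long> small = List.of(10L, 20L, 30L);
        List<Long> large = List.of(5L, 10L, 15L, 20L, 25L, 30L, 35L);
        // 5 and 35 are always dropped by the range check; 15 and 25 are dropped
        // unless the Bloom filter happens to report a false positive for them.
        System.out.println(reduce(small, large));
    }
}
```

      The range check is cheap and removes keys outside the observed bounds, while the Bloom filter removes most keys inside the bounds that have no match. Any false positives are still eliminated by the join itself.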

      Implementation details:

      1. Hive generates the filters and passes them as part of 'filterExpr' in the TableScan.
      2. DruidQueryBasedRecordReader receives this filter as part of the conf.
      3. During the execution phase, before DruidQueryBasedRecordReader sends the query to Druid, we deserialize this filter, translate it into a Druid DimFilter, and add it to the existing Druid query. The Tez executor already ensures that all dynamic values are initialized by the time we start reading results from the record reader.
      4. Explaining a Druid query also prints the query sent to Druid as druid.json.query, so the Druid query shown there must be updated with the filters as well. During explain the actual values of the dynamic values are not available, so instead of values we print the dynamic expression itself as part of the Druid query.
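
      The steps above can be sketched roughly as follows. This is a minimal, string-based illustration of the JSON shape of Druid's "bound", "bloom", and "and" filters, and of falling back to the dynamic expression text when no runtime value is available (as during explain). The class `DruidDynamicFilterSketch`, all of its method names, the sample expression strings, and the choice of numeric ordering are assumptions made for this example; the patch itself may build the filters quite differently.

```java
import java.util.Base64;
import java.util.Optional;

/** Hypothetical helper that renders dynamic semi-join filters as Druid filter JSON. */
final class DruidDynamicFilterSketch {

    /** Escapes backslashes and double quotes so the value is safe inside a JSON string. */
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    private static String field(String key, String value) {
        return "\"" + key + "\":\"" + escape(value) + "\"";
    }

    /** At execution time the runtime value is present; at explain time print the expression. */
    static String valueOrExpression(Optional<String> runtimeValue, String expression) {
        return runtimeValue.orElse(expression);
    }

    /** BETWEEN lower AND upper as a Druid bound filter (numeric ordering is assumed here). */
    static String boundFilter(String dimension, String lower, String upper) {
        return "{" + String.join(",",
                field("type", "bound"),
                field("dimension", dimension),
                field("lower", lower),
                field("upper", upper),
                field("ordering", "numeric")) + "}";
    }

    /** Bloom filter as a Druid bloom filter; the value is the Base64-encoded serialized filter. */
    static String bloomFilter(String dimension, String serializedBloom) {
        return "{" + String.join(",",
                field("type", "bloom"),
                field("dimension", dimension),
                field("bloomKFilter", serializedBloom)) + "}";
    }

    /** Combines already-rendered filter JSON objects with a Druid "and" filter. */
    static String and(String... filters) {
        return "{\"type\":\"and\",\"fields\":[" + String.join(",", filters) + "]}";
    }

    /** Adds the dynamic filter to the query's existing filter, if there is one. */
    static String withFilter(String existingFilter, String dynamicFilter) {
        return existingFilter == null ? dynamicFilter : and(existingFilter, dynamicFilter);
    }

    public static void main(String[] args) {
        // Execution time: dynamic values are initialized, so real values are rendered.
        String lower = valueOrExpression(Optional.of("10"), "DynamicValue(key_min)");
        String upper = valueOrExpression(Optional.of("30"), "DynamicValue(key_max)");
        // Stand-in bytes; a real caller would pass the serialized Bloom filter here.
        String bloom = Base64.getEncoder().encodeToString(new byte[] {1, 2, 3});
        String dynamic = and(boundFilter("key", lower, upper), bloomFilter("key", bloom));
        System.out.println(withFilter(null, dynamic));

        // Explain time: no values yet, so the dynamic expressions are printed instead.
        String explainLower = valueOrExpression(Optional.empty(), "DynamicValue(key_min)");
        String explainUpper = valueOrExpression(Optional.empty(), "DynamicValue(key_max)");
        System.out.println(boundFilter("key", explainLower, explainUpper));
    }
}
```

      Plain string building keeps the sketch free of dependencies; real code would more likely construct Druid's own filter objects and let its JSON mapper serialize them.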

      Note: This work requires Druid to be upgraded to version 0.13.0.

        Attachments

        1. HIVE-20683.patch
          290 kB
          Nishant Bangarwa
        2. HIVE-20683.8.patch
          215 kB
          Nishant Bangarwa
        3. HIVE-20683.6.patch
          213 kB
          Nishant Bangarwa
        4. HIVE-20683.5.patch
          213 kB
          Nishant Bangarwa
        5. HIVE-20683.4.patch
          213 kB
          Nishant Bangarwa
        6. HIVE-20683.3.patch
          213 kB
          Nishant Bangarwa
        7. HIVE-20683.2.patch
          209 kB
          Nishant Bangarwa
        8. HIVE-20683.10.patch
          228 kB
          Nishant Bangarwa
        9. HIVE-20683.1.patch
          180 kB
          Nishant Bangarwa

              People

              • Assignee: Nishant Bangarwa (nishantbangarwa)
              • Reporter: Nishant Bangarwa (nishantbangarwa)

                  Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 5.5h