Hive / HIVE-22157

Hive Pushing Aggr extension to Druid


    Details

    • Type: Wish
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 3.1.0, 3.0.0, 3.1.1, 3.1.2
    • Fix Version/s: None
    • Component/s: Druid integration
    • Labels:
      None

      Description

      Currently, Hive cannot push down an aggregation spec when one wants to use a custom Druid extension for execution.

      When using Hive, the query below is rewritten with no aggregations defined:

      EXPLAIN SELECT floor_day(`_time`), count(DISTINCT visitor_id) AS uv FROM druid GROUP BY floor_day(`_time`);

      .....

      "limitSpec":{"type":"default"},

      "aggregations":[],

      "intervals":["1900-01-01T00:00:00.000Z/3000-01-01T00:00:00.000Z"]

      ...... 

       

      But what one really needs is:

       

      "aggregations": [

      { "type": "distinctCount", "name": "uv", "fieldName": "visitor_id" }

      ]

       

      Here the aggregations spec uses the druid-distinctcount extension.

       

      If we could call Druid's native UDAFs from HiveQL and push them into the generated Druid query spec, it would be a valuable enhancement to the Hive-Druid integration.
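
For illustration only, here is a minimal Python sketch of the Druid query spec that such a pushdown might produce. It is not Hive code. The helper names (`distinct_count`, `build_groupby_query`), the `groupBy` query type, the `"day"` granularity standing in for `floor_day`, and the data source name `druid` are all assumptions; only the `limitSpec`, `intervals`, and `distinctCount` aggregation come from the description above.

```python
import json


def distinct_count(name, field_name):
    """Aggregator spec in the shape requested above (druid-distinctcount extension)."""
    return {"type": "distinctCount", "name": name, "fieldName": field_name}


def build_groupby_query(data_source, aggregations):
    """Assemble a Druid native groupBy query as a plain dict.

    Assumptions (not from the issue): queryType, granularity, and dimensions.
    limitSpec and intervals are copied from the EXPLAIN output in the issue.
    """
    return {
        "queryType": "groupBy",
        "dataSource": data_source,   # assumed: actual Druid datasource name is not given
        "granularity": "day",        # assumed stand-in for floor_day(`_time`)
        "dimensions": [],
        "limitSpec": {"type": "default"},
        "aggregations": aggregations,
        "intervals": ["1900-01-01T00:00:00.000Z/3000-01-01T00:00:00.000Z"],
    }


query = build_groupby_query("druid", [distinct_count("uv", "visitor_id")])
print(json.dumps(query, indent=2))
```

The point of the sketch is that the `aggregations` array carries the extension's aggregator instead of being empty, which is the difference between the current plan and the desired one.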

       
