Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-15639

Hive/Druid integration: wrong semantics for ordering within groupBy queries

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Critical
    • Resolution: Fixed
    • 2.2.0
    • 3.0.0
    • Druid integration
    • None

    Description

      The scope of the ordering seems to be for each different granularity value instead of globally.

      EXPLAIN
      SELECT i_brand_id, floor_day(`__time`), max(ss_quantity), sum(ss_wholesale_cost) as s
      FROM store_sales_sold_time_subset
      GROUP BY i_brand_id, floor_day(`__time`)
      ORDER BY i_brand_id
      LIMIT 10;
      OK
      Plan optimized by CBO.
      
      Stage-0
        Fetch Operator
          limit:-1
          Select Operator [SEL_1]
            Output:["_col0","_col1","_col2","_col3"]
            TableScan [TS_0]
              Output:["i_brand_id","__time","$f2","$f3"],properties:{"druid.query.json":"{\"queryType\":\"groupBy\",\"dataSource\":\"druid_tpcds_ss_sold_time_subset\",\"granularity\":\"DAY\",\"dimensions\":[\"i_brand_id\"],\"limitSpec\":{\"type\":\"default\",\"limit\":10,\"columns\":[{\"dimension\":\"i_brand_id\",\"direction\":\"ascending\"}]},\"aggregations\":[{\"type\":\"longMax\",\"name\":\"$f2\",\"fieldName\":\"ss_quantity\"},{\"type\":\"doubleSum\",\"name\":\"$f3\",\"fieldName\":\"ss_wholesale_cost\"}],\"intervals\":[\"1900-01-01T00:00:00.000Z/3000-01-01T00:00:00.000Z\"]}","druid.query.type":"groupBy"}
      

      Attachments

        Issue Links

          Activity

            People

              julianhyde Julian Hyde
              jcamacho Jesús Camacho Rodríguez
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: