HIVE-19762: Druid queries containing joins give wrong results

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.1.0
    • Component/s: Druid integration
    • Labels: None

      Description

      Druid queries that join a table against itself give wrong results.
      For example:

       
      SELECT
        username AS `username`,
        SUM(double1) AS `sum_double1`
      FROM druid_table_with_nulls `tbl1`
      JOIN (
        SELECT
          username AS `username`,
          SUM(double1) AS `sum_double2`
        FROM druid_table_with_nulls
        GROUP BY `username`
        ORDER BY `sum_double2` DESC
        LIMIT 10
      ) `tbl2`
        ON (`tbl1`.`username` = `tbl2`.`username`)
      GROUP BY `tbl1`.`username`;
      

      In this case, one of the queries is a Druid scan query and the other is a groupBy query.
      During planning, the properties of these queries are set on the tableDesc and serdeInfo. While setting up the map work, we overwrite those properties with the ones present in serdeInfo. As a result, the scan query results are deserialized using the wrong column names, which produces NULL values.
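
      The overwrite described above can be illustrated with a small Python toy model. This is only a sketch of the failure mode, not Hive's code: the `deserialize` helper, the property keys (`query_type`, `columns`), and the dictionaries standing in for tableDesc and serdeInfo are all hypothetical, and the sample row is made up for illustration.

```python
# Toy model of the failure mode; none of these names are Hive APIs.

def deserialize(druid_row, props):
    """Build a row by looking up each expected column in the Druid result."""
    return {col: druid_row.get(col) for col in props["columns"]}

# Per-query properties, as they might be recorded at planning time.
scan_props = {"query_type": "scan", "columns": ["username", "double1"]}
group_by_props = {"query_type": "groupBy", "columns": ["username", "sum_double2"]}

# A made-up row, shaped like what the scan query would return.
scan_row = {"username": "alice", "double1": 1.5}

# Intended behavior: each alias keeps its own properties.
per_alias = {"tbl1": dict(scan_props), "tbl2": dict(group_by_props)}
good = deserialize(scan_row, per_alias["tbl1"])

# Buggy behavior: both aliases read the same table, so one shared entry
# is written twice and the later write wins.
shared = {}
for props in (scan_props, group_by_props):
    shared["druid_table_with_nulls"] = dict(props)
bad = deserialize(scan_row, shared["druid_table_with_nulls"])

print(good)
print(bad)
```

      In this model the scan row is read with the groupBy query's column names, so the lookup for `sum_double2` finds nothing and yields a null, which mirrors the NULL values reported here.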

        Attachments

        1. HIVE-19762.patch
          13 kB
          Nishant Bangarwa

            People

            • Assignee: Nishant Bangarwa (nishantbangarwa)
            • Reporter: Nishant Bangarwa (nishantbangarwa)
            • Votes: 0
            • Watchers: 3
