  1. Hive
  2. HIVE-19762

Druid queries containing joins give wrong results.

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.1.0
    • Component/s: Druid integration
    • Labels: None

    Description

      Druid queries that join a table against itself give wrong results.
      For example:

       
      SELECT
        username AS `username`,
        SUM(double1) AS `sum_double1`
      FROM druid_table_with_nulls `tbl1`
      JOIN (
        SELECT
          username AS `username`,
          SUM(double1) AS `sum_double2`
        FROM druid_table_with_nulls
        GROUP BY `username`
        ORDER BY `sum_double2` DESC
        LIMIT 10
      ) `tbl2`
        ON (`tbl1`.`username` = `tbl2`.`username`)
      GROUP BY `tbl1`.`username`;
      

      In this case, one of the queries is a Druid scan query and the other is a groupBy query.
      During planning, the properties of each query are set on the tableDesc and the serdeInfo. While setting up the map work, we overwrite those properties with the ones present in serdeInfo. As a result, the scan query results are deserialized using the wrong column names, which produces NULL values.
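      The failure mode described above can be illustrated with a small, self-contained Java sketch. This is not Hive code: the class name, the `columns` key, the helper methods, and the sample row are all hypothetical stand-ins, and the real planner logic is more involved. It only assumes that each query carries its own column list and that a shared, per-table set of properties overwrites it before deserialization.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

public class PropertyOverwriteSketch {
    // Hypothetical key standing in for the property that carries the column list.
    static final String COLUMNS = "columns";

    // Build a per-query property set with the given comma-separated column names.
    static Properties propsFor(String columns) {
        Properties p = new Properties();
        p.setProperty(COLUMNS, columns);
        return p;
    }

    // Mimic the overwrite: start from the query's own properties,
    // then copy every shared (serde-level) property on top of them.
    static Properties overwrite(Properties own, Properties shared) {
        Properties result = new Properties();
        result.putAll(own);
        result.putAll(shared);
        return result;
    }

    // Read a result row by the column names found in props.
    // A name that is absent from the row comes back as null.
    static List<Object> deserialize(Map<String, Object> row, Properties props) {
        List<Object> out = new ArrayList<>();
        for (String col : props.getProperty(COLUMNS).split(",")) {
            out.add(row.get(col));
        }
        return out;
    }

    public static void main(String[] args) {
        // Two queries over the same table, each with its own column names (assumed).
        Properties scanProps = propsFor("username,double1");
        Properties groupByProps = propsFor("username,sum_double2");

        // One shared slot for the table: the last query written wins.
        Properties shared = overwrite(scanProps, groupByProps);

        // A row as the scan query would return it (made-up sample data).
        Map<String, Object> scanRow = new LinkedHashMap<>();
        scanRow.put("username", "alice");
        scanRow.put("double1", 1.5);

        // Overwritten properties: the scan row is read with the groupBy names.
        System.out.println(deserialize(scanRow, overwrite(scanProps, shared))); // [alice, null]

        // Per-query properties kept intact: the value is read correctly.
        System.out.println(deserialize(scanRow, scanProps)); // [alice, 1.5]
    }
}
```

      In this sketch the null appears only because the scan row is looked up under the groupBy query's column name. Keeping the properties separate per query avoids it in the sketch; the exact change made by the attached patch is not shown here.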

      Attachments

        1. HIVE-19762.patch
          13 kB
          Nishant Bangarwa

          People

            Assignee: nishantbangarwa Nishant Bangarwa
            Reporter: nishantbangarwa Nishant Bangarwa
            Votes: 0
            Watchers: 2