Hive / HIVE-15272

"LEFT OUTER JOIN" Is not populating correct records with Hive On Spark

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.1.0
    • Fix Version/s: None
    • Component/s: Hive, Spark
    • Labels: None
    • Environment: Hive 1.1.0, CentOS, Cloudera 5.7.4

    Description

      I ran the following Hive query multiple times with each execution engine: Hive on Spark and Hive on MapReduce.

      SELECT COUNT(DISTINCT t1.region, t1.amount)
      FROM my_db.my_table1 t1
      LEFT OUTER JOIN my_db.my_table2 t2
        ON (t1.id = t2.id AND t1.name = t2.name)

      With Hive on Spark: the result (count) was different on every execution.
      With Hive on MapReduce: the result (count) was the same on every execution.

      It seems that Hive on Spark behaves differently on each execution and does not produce the correct result.
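
      A minimal sketch of how the comparison above could be reproduced in a single Hive session, assuming both engines are available on the cluster and the tables are not written to between runs. The last query is a suggested cross-check that was not part of the original report: a LEFT OUTER JOIN keeps every row of t1 and the SELECT references only t1 columns, so the join should not change the distinct count.

```sql
-- Run the reported query under MapReduce.
SET hive.execution.engine=mr;
SELECT COUNT(DISTINCT t1.region, t1.amount)
FROM my_db.my_table1 t1
LEFT OUTER JOIN my_db.my_table2 t2
  ON (t1.id = t2.id AND t1.name = t2.name);

-- Run the same query under Spark; repeat a few times and compare the counts.
SET hive.execution.engine=spark;
SELECT COUNT(DISTINCT t1.region, t1.amount)
FROM my_db.my_table1 t1
LEFT OUTER JOIN my_db.my_table2 t2
  ON (t1.id = t2.id AND t1.name = t2.name);

-- Cross-check (an addition, not from the report): the same count without the join.
-- Both engines should return this value for the joined query as well.
SELECT COUNT(DISTINCT region, amount)
FROM my_db.my_table1;
```

      If the Spark count differs from the join-free count while the MapReduce count matches it, that would point at the Spark execution path rather than at the data.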

          People

            Assignee: Rui Li (lirui)
            Reporter: Vikash Pareek (VPareek)
            Votes: 0
            Watchers: 6

            Dates

              Created:
              Updated: