  1. Hive
  2. HIVE-8699 Enable support for common map join [Spark Branch]
  3. HIVE-9007

Hive may generate wrong plan for map join queries due to IdentityProjectRemover [Spark Branch]

    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: spark-branch
    • Fix Version/s: 1.1.0
    • Component/s: Spark
    • Labels:
      None

      Description

      HIVE-8435 introduces a new logical optimizer called IdentityProjectRemover, which may cause map join in the Spark branch to generate a wrong plan.

      Currently, map join conversion in the Spark branch first goes through the method convertJoinMapJoin, which replaces a join op with a mapjoin op, removes the RS associated with the big table, and keeps the RSs for all small tables. Afterwards, SparkReduceSinkMapJoinProc replaces all parent RSs of the mapjoin op with HTS (note that it does not check whether an RS belongs to a small table or to the big table).
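
      The two passes described above can be sketched on a toy operator tree. This is a minimal illustration, not Hive code: the Op class and the functions convert_join_to_mapjoin and replace_rs_with_hts are hypothetical stand-ins for the operator tree, convertJoinMapJoin, and SparkReduceSinkMapJoinProc.

```python
from dataclasses import dataclass, field


@dataclass
class Op:
    """Hypothetical operator node: kind is TS, RS, HTS, JOIN or MAPJOIN."""
    kind: str
    parents: list = field(default_factory=list)


def convert_join_to_mapjoin(join, big_pos):
    """Pass 1 (stand-in for convertJoinMapJoin): swap JOIN for MAPJOIN
    and splice out the RS on the big-table branch."""
    new_parents = []
    for pos, parent in enumerate(join.parents):
        if pos == big_pos and parent.kind == "RS":
            # Remove the big-table RS by connecting its own parent directly.
            new_parents.append(parent.parents[0])
        else:
            # Small-table RSs are kept for now.
            new_parents.append(parent)
    return Op("MAPJOIN", new_parents)


def replace_rs_with_hts(mapjoin):
    """Pass 2 (stand-in for SparkReduceSinkMapJoinProc): turn every RS
    parent into an HTS, without checking big vs. small table."""
    for parent in mapjoin.parents:
        if parent.kind == "RS":
            parent.kind = "HTS"
    return mapjoin


# TS -> RS -> JOIN <- RS <- TS, with the big table at position 1.
join = Op("JOIN", [Op("RS", [Op("TS")]), Op("RS", [Op("TS")])])
mapjoin = replace_rs_with_hts(convert_join_to_mapjoin(join, big_pos=1))
print([p.kind for p in mapjoin.parents])  # ['HTS', 'TS']
```

      With exactly one RS per branch, the second pass is safe: by the time it runs, the only RSs left under the mapjoin op belong to small tables.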

      The issue arises when IdentityProjectRemover comes into play: it may leave an operator tree with two consecutive RSs. Consider the following example:

                Join               MapJoin
                / \                /   \
              RS   RS   --->     RS     RS
             /      \           /         \
            TS       RS       TS          TS (big table)
                      \      (small table)
                       TS
      

      In this case, all parents of the mapjoin op are RSs, including the one on the big-table branch. In SparkReduceSinkMapJoinProc, all of them will be replaced with HTS, which is incorrect for the big table.
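
      The failure can be reproduced on a toy model. This is a sketch under stated assumptions: Op, naive_replace, and guarded_replace are hypothetical names, and the position check in guarded_replace is only one possible guard, not necessarily what the attached patch does.

```python
from dataclasses import dataclass, field


@dataclass
class Op:
    """Hypothetical operator node: kind is TS, RS, HTS or MAPJOIN."""
    kind: str
    parents: list = field(default_factory=list)


def build_mapjoin():
    """State after the first pass when the big-table branch had two
    consecutive RSs: one was removed, one is still a direct parent."""
    small_branch = Op("RS", [Op("TS")])
    big_branch = Op("RS", [Op("TS")])
    return Op("MAPJOIN", [small_branch, big_branch])


def naive_replace(mapjoin):
    """Replace every RS parent with HTS, as the description says the
    current processor does."""
    for parent in mapjoin.parents:
        if parent.kind == "RS":
            parent.kind = "HTS"


def guarded_replace(mapjoin, big_pos):
    """Illustrative guard (an assumption): leave the big-table parent alone."""
    for pos, parent in enumerate(mapjoin.parents):
        if parent.kind == "RS" and pos != big_pos:
            parent.kind = "HTS"


bad = build_mapjoin()
naive_replace(bad)
naive_kinds = [p.kind for p in bad.parents]

ok = build_mapjoin()
guarded_replace(ok, big_pos=1)
guarded_kinds = [p.kind for p in ok.parents]

print(naive_kinds)    # ['HTS', 'HTS']  big-table branch wrongly becomes HTS
print(guarded_kinds)  # ['HTS', 'RS']   big-table branch is left untouched
```

      The sketch only shows that the replacement step needs to know which parent feeds the big table; how the leftover RS on that branch should then be handled is outside what this description states.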

        Attachments

        1. HIVE-9007.2-spark.patch
          6 kB
          Szehon Ho
        2. HIVE-9007-spark.patch
          6 kB
          Szehon Ho

              People

              • Assignee: Szehon Ho (szehon)
              • Reporter: Chao Sun (csun)
              • Votes: 0
              • Watchers: 3