  1. Hive
  2. HIVE-7292 Hive on Spark
  3. HIVE-8639

Convert SMBJoin to MapJoin [Spark Branch]

    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: spark-branch
    • Fix Version/s: 1.1.0
    • Component/s: Spark
    • Labels: None

      Description

      HIVE-8202 supports auto-conversion of SMB (sort-merge bucket) join. However, if the tables are partitioned, the join can slow down, because each mapper may need to fetch a very small chunk of a partition that holds a single key. In such scenarios it is beneficial to convert the SMB join to a map join.

      The task is to research and support the conversion from SMB join to map join for the Spark execution engine. See the MapReduce equivalent in SortMergeJoinResolver.
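
      For context, Hive on MapReduce gates this kind of conversion behind session configuration. The fragment below is a sketch of a session that enables it. The property names are existing Hive settings, but whether the Spark branch honors them in the same way is an assumption and is not stated in this issue.

```sql
-- Allow joins on bucketed, sorted tables to be converted to SMB joins.
SET hive.auto.convert.sortmerge.join=true;
-- Allow common joins to be converted to map joins when the small side fits in memory.
SET hive.auto.convert.join=true;
-- After an SMB conversion, also try each table as the big table of a map join.
SET hive.auto.convert.sortmerge.join.to.mapjoin=true;
```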

        Attachments

        1. HIVE-8639.1-spark.patch
          221 kB
          Szehon Ho
        2. HIVE-8639.2-spark.patch
          281 kB
          Szehon Ho
        3. HIVE-8639.3-spark.patch
          281 kB
          Szehon Ho
        4. HIVE-8639.3-spark.patch
          281 kB
          Szehon Ho
        5. HIVE-8639.4-spark.patch
          389 kB
          Szehon Ho

              People

              • Assignee: Szehon Ho
              • Reporter: Szehon Ho
              • Votes: 0
              • Watchers: 6
