  Project: Hive
  Parent: HIVE-7292 Hive on Spark
  Issue: HIVE-8639

Convert SMBJoin to MapJoin [Spark Branch]

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: spark-branch
    • Fix Version/s: 1.1.0
    • Component/s: Spark
    • Labels: None

    Description

      HIVE-8202 supports auto-conversion of SMB join. However, if the tables are partitioned, there can be a slowdown, because each mapper would need to fetch a very small chunk of a partition that holds a single key. In such scenarios it is beneficial to convert the SMB join to a map join.

      The task is to research and support the conversion from SMB join to map join for the Spark execution engine. For the MapReduce equivalent, see SortMergeJoinResolver.
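
      For readers unfamiliar with the two strategies, here is a minimal, engine-independent sketch of the difference. It is not Hive code and does not reflect SortMergeJoinResolver or the attached patches; the names `JoinSketch`, `Row`, `sortMergeJoin`, and `mapJoin` are hypothetical, and partitioning and bucketing are not modeled. A sort-merge join walks two key-sorted inputs in lockstep, while a map join loads the small side into an in-memory hash table and streams the big side past it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class JoinSketch {
    /** Hypothetical row type: a join key plus one payload column. */
    static final class Row {
        final int key;
        final String value;
        Row(int key, String value) { this.key = key; this.value = value; }
    }

    /** Sort-merge join: both inputs must already be sorted by key. */
    static List<String> sortMergeJoin(List<Row> left, List<Row> right) {
        List<String> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < left.size() && j < right.size()) {
            int lk = left.get(i).key;
            int rk = right.get(j).key;
            if (lk < rk) {
                i++;
            } else if (lk > rk) {
                j++;
            } else {
                // Find the run of right rows sharing this key.
                int jEnd = j;
                while (jEnd < right.size() && right.get(jEnd).key == lk) jEnd++;
                // Pair every left row with this key against that run.
                while (i < left.size() && left.get(i).key == lk) {
                    for (int k = j; k < jEnd; k++) {
                        out.add(left.get(i).value + "," + right.get(k).value);
                    }
                    i++;
                }
                j = jEnd;
            }
        }
        return out;
    }

    /** Map join: hash the small input in memory, then stream the big input. */
    static List<String> mapJoin(List<Row> big, List<Row> small) {
        Map<Integer, List<String>> table = new HashMap<>();
        for (Row r : small) {
            table.computeIfAbsent(r.key, k -> new ArrayList<>()).add(r.value);
        }
        List<String> out = new ArrayList<>();
        for (Row b : big) {
            List<String> matches = table.getOrDefault(b.key, Collections.<String>emptyList());
            for (String v : matches) {
                out.add(b.value + "," + v);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Row> left = Arrays.asList(
            new Row(1, "a"), new Row(2, "b"), new Row(2, "c"), new Row(4, "d"));
        List<Row> right = Arrays.asList(
            new Row(2, "x"), new Row(2, "y"), new Row(3, "z"), new Row(4, "w"));
        System.out.println(sortMergeJoin(left, right));
        System.out.println(mapJoin(left, right));
    }
}
```

      The map join needs no sorted input and no per-key coordination between readers, which is why it can win when each task would otherwise read only a tiny single-key chunk; the trade-off is that the small side must fit in memory.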

      Attachments

        1. HIVE-8639.1-spark.patch
          221 kB
          Szehon Ho
        2. HIVE-8639.2-spark.patch
          281 kB
          Szehon Ho
        3. HIVE-8639.3-spark.patch
          281 kB
          Szehon Ho
        4. HIVE-8639.3-spark.patch
          281 kB
          Szehon Ho
        5. HIVE-8639.4-spark.patch
          389 kB
          Szehon Ho

            People

              Assignee: Szehon Ho
              Reporter: Szehon Ho
              Votes: 0
              Watchers: 6