  1. Hive
  2. HIVE-7292 Hive on Spark
  3. HIVE-8639

Convert SMBJoin to MapJoin [Spark Branch]


    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: spark-branch
    • Fix Version/s: 1.1.0
    • Component/s: Spark
    • Labels:
      None

      Description

      HIVE-8202 supports auto-conversion of SMB join. However, if the tables are partitioned, there can be a slowdown, because each mapper has to fetch a very small chunk of a partition that holds a single key. Thus, in some scenarios it is beneficial to convert the SMB join to a map join.

      The task is to research and support the conversion from SMB join to map join for the Spark execution engine. See the MapReduce equivalent in SortMergeJoinResolver.
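
      To make the trade-off concrete, here is a minimal, self-contained sketch of the two join strategies on in-memory rows. It is not Hive's implementation and does not reproduce SortMergeJoinResolver; the class JoinSketch, the Row type, and both method names are hypothetical and exist only for illustration. The sort-merge variant assumes both inputs are already sorted by key, while the map-join variant assumes the small side fits in memory as a hash table.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: hypothetical names, not Hive classes.
class JoinSketch {
    /** A (key, value) row. */
    public static final class Row {
        public final int key;
        public final String value;
        public Row(int key, String value) { this.key = key; this.value = value; }
    }

    /** Sort-merge join: both inputs must already be sorted by key. */
    public static List<String> sortMergeJoin(List<Row> left, List<Row> right) {
        List<String> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < left.size() && j < right.size()) {
            int lk = left.get(i).key, rk = right.get(j).key;
            if (lk < rk) { i++; }
            else if (lk > rk) { j++; }
            else {
                // Find the end of the run of equal keys on the right side.
                int jEnd = j;
                while (jEnd < right.size() && right.get(jEnd).key == lk) jEnd++;
                // Pair every left row in the run with every right row in the run.
                while (i < left.size() && left.get(i).key == lk) {
                    for (int k = j; k < jEnd; k++) {
                        out.add(lk + ":" + left.get(i).value + "," + right.get(k).value);
                    }
                    i++;
                }
                j = jEnd;
            }
        }
        return out;
    }

    /** Map join: load the small side into a hash table, then stream the big side. */
    public static List<String> mapJoin(List<Row> big, List<Row> small) {
        Map<Integer, List<String>> table = new HashMap<>();
        for (Row r : small) {
            table.computeIfAbsent(r.key, k -> new ArrayList<>()).add(r.value);
        }
        List<String> out = new ArrayList<>();
        for (Row b : big) {
            List<String> matches = table.get(b.key);
            if (matches == null) continue; // no partner on the small side
            for (String v : matches) out.add(b.key + ":" + b.value + "," + v);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Row> big = Arrays.asList(
            new Row(1, "a"), new Row(2, "b"), new Row(2, "c"), new Row(4, "d"));
        List<Row> small = Arrays.asList(
            new Row(2, "x"), new Row(3, "y"), new Row(4, "z"));
        System.out.println(sortMergeJoin(big, small)); // [2:b,x, 2:c,x, 4:d,z]
        System.out.println(mapJoin(big, small));       // [2:b,x, 2:c,x, 4:d,z]
    }
}
```

      Both methods return the same rows. The difference is in how the input is read: the merge variant walks both sides in lockstep, whereas the map-join variant reads the small side once and then needs only a single pass over the big side. That is roughly why a map join can be preferable when each task would otherwise fetch many tiny sorted chunks.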

        Attachments

        1. HIVE-8639.1-spark.patch
          221 kB
          Szehon Ho
        2. HIVE-8639.2-spark.patch
          281 kB
          Szehon Ho
        3. HIVE-8639.3-spark.patch
          281 kB
          Szehon Ho
        4. HIVE-8639.3-spark.patch
          281 kB
          Szehon Ho
        5. HIVE-8639.4-spark.patch
          389 kB
          Szehon Ho


