  1. Hive
  2. HIVE-7292 Hive on Spark
  3. HIVE-8202

Support SMB Join for Hive on Spark [Spark Branch]

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.1.0
    • Component/s: Spark
    • Labels: None

    Description

      SMB (sort-merge-bucket) joins are used wherever the joined tables are sorted and bucketed. An SMB join is a map-side join: it boils down to merging the already sorted tables, which allows it to be faster than an ordinary map join.

      The task is to research and support the conversion from regular SMB join to SMB map join for the Spark execution engine.
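For readers unfamiliar with the technique, the merge step described above can be sketched in a few lines of Python. This is only an illustration of the general sort-merge-bucket idea, not the code in the attached patches: the helper names (`bucket_and_sort`, `merge_join`, `smb_join`), the use of non-negative integer keys with `key % num_buckets` as the bucketing function, and the requirement that both sides use the same bucket count are all assumptions made for the example.

```python
from operator import itemgetter


def bucket_and_sort(rows, num_buckets):
    """Hypothetical helper: bucket (key, value) rows by key, then sort each bucket by key.

    Assumes non-negative integer keys; real tables would be bucketed and
    sorted at write time rather than at join time.
    """
    buckets = [[] for _ in range(num_buckets)]
    for key, value in rows:
        buckets[key % num_buckets].append((key, value))
    return [sorted(bucket, key=itemgetter(0)) for bucket in buckets]


def merge_join(left, right):
    """Inner-join two lists of (key, value) rows that are already sorted by key."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        left_key, right_key = left[i][0], right[j][0]
        if left_key < right_key:
            i += 1
        elif left_key > right_key:
            j += 1
        else:
            # Find the run of rows sharing this key on each side.
            i_end = i
            while i_end < len(left) and left[i_end][0] == left_key:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end][0] == left_key:
                j_end += 1
            # Emit the cross product of the two runs.
            for _, left_value in left[i:i_end]:
                for _, right_value in right[j:j_end]:
                    out.append((left_key, left_value, right_value))
            i, j = i_end, j_end
    return out


def smb_join(left_buckets, right_buckets):
    """Join bucket n of the left side only with bucket n of the right side."""
    out = []
    for left_bucket, right_bucket in zip(left_buckets, right_buckets):
        out.extend(merge_join(left_bucket, right_bucket))
    return out


orders = [(3, "o3"), (1, "o1"), (4, "o4"), (1, "o1b")]
users = [(1, "alice"), (2, "bob"), (3, "carol")]
joined = smb_join(bucket_and_sort(orders, 2), bucket_and_sort(users, 2))
```

Because every bucket is already sorted, each pair of buckets is joined in a single forward pass with no hash table, and matching keys can only ever appear in buckets with the same index. That is the property that makes the map-side merge possible.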

      Attachments

        1. HIVE-8202.9-spark.patch
          549 kB
          Szehon Ho
        2. HIVE-8202.8-spark.patch
          551 kB
          Szehon Ho
        3. HIVE-8202.7-spark.patch
          550 kB
          Szehon Ho
        4. HIVE-8202.6-spark.patch
          787 kB
          Szehon Ho
        5. HIVE-8202.5-spark.patch
          786 kB
          Szehon Ho
        6. HIVE-8202.4-spark.patch
          1.22 MB
          Szehon Ho
        7. HIVE-8202.3-spark.patch
          633 kB
          Szehon Ho
        8. HIVE-8202.2-spark.patch
          633 kB
          Szehon Ho
        9. HIVE-8202.1-spark.patch
          572 kB
          Szehon Ho
        10. Hive on Spark SMB Join.pdf
          111 kB
          Szehon Ho
        11. Hive on Spark SMB Join.docx
          120 kB
          Szehon Ho

            People

              Assignee: Szehon Ho (szehon)
              Reporter: Xuefu Zhang (xuefuz)
              Votes: 0
              Watchers: 5
