HIVE-8202 (sub-task of HIVE-7292 Hive on Spark)

Support SMB Join for Hive on Spark [Spark Branch]

    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.1.0
    • Component/s: Spark
    • Labels: None

      Description

      SMB (sort-merge-bucket) joins apply when the tables being joined are both bucketed and sorted on the join key. An SMB join is a map-side join: it reduces to merging the already-sorted buckets, which can make it faster than an ordinary map join.
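
      The merge step described above can be sketched as follows. This is only an illustration of the general sort-merge-bucket idea, not the Hive or Spark implementation: the names `merge_join` and `smb_join` are hypothetical, rows are modeled as `(key, value)` tuples, and both tables are assumed to have the same number of buckets, each already sorted by key.

```python
from typing import Iterator, List, Tuple

Row = Tuple[int, str]  # (join key, payload); a simplified, assumed row shape


def merge_join(left: List[Row], right: List[Row]) -> Iterator[Tuple[int, str, str]]:
    """Inner-join two lists that are already sorted by key, in one pass."""
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1  # left key has no partner yet, advance left
        elif lk > rk:
            j += 1  # right key has no partner yet, advance right
        else:
            # Find the run of rows sharing this key on each side.
            i_end = i
            while i_end < len(left) and left[i_end][0] == lk:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end][0] == lk:
                j_end += 1
            # Emit every pairing of the two runs.
            for l_row in left[i:i_end]:
                for r_row in right[j:j_end]:
                    yield (lk, l_row[1], r_row[1])
            i, j = i_end, j_end


def smb_join(left_buckets: List[List[Row]],
             right_buckets: List[List[Row]]) -> List[Tuple[int, str, str]]:
    """Join bucket k of the left table with bucket k of the right table.

    Assumes equal bucket counts and the same bucketing function on both sides.
    """
    out: List[Tuple[int, str, str]] = []
    for lb, rb in zip(left_buckets, right_buckets):
        out.extend(merge_join(lb, rb))
    return out


# Toy data: two buckets per table (keys bucketed by key % 2 for illustration).
left_buckets = [[(2, "a"), (4, "b")], [(1, "c"), (3, "d"), (3, "e")]]
right_buckets = [[(4, "x")], [(3, "y"), (5, "z")]]
result = smb_join(left_buckets, right_buckets)
```

      Because each bucket pair is merged independently and neither side has to be loaded into an in-memory hash table, the work per bucket is a single linear scan of both inputs.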

      The task is to research and support the conversion from regular SMB join to SMB map join for the Spark execution engine.

        Attachments

        1. HIVE-8202.1-spark.patch
          572 kB
          Szehon Ho
        2. HIVE-8202.2-spark.patch
          633 kB
          Szehon Ho
        3. HIVE-8202.3-spark.patch
          633 kB
          Szehon Ho
        4. HIVE-8202.4-spark.patch
          1.22 MB
          Szehon Ho
        5. HIVE-8202.5-spark.patch
          786 kB
          Szehon Ho
        6. HIVE-8202.6-spark.patch
          787 kB
          Szehon Ho
        7. HIVE-8202.7-spark.patch
          550 kB
          Szehon Ho
        8. HIVE-8202.8-spark.patch
          551 kB
          Szehon Ho
        9. HIVE-8202.9-spark.patch
          549 kB
          Szehon Ho
        10. Hive on Spark SMB Join.docx
          120 kB
          Szehon Ho
        11. Hive on Spark SMB Join.pdf
          111 kB
          Szehon Ho

              People

              • Assignee: Szehon Ho (szehon)
              • Reporter: Xuefu Zhang (xuefuz)
              • Votes: 0
              • Watchers: 5

                Dates

                • Created:
                  Updated:
                  Resolved: