Hive / HIVE-7526

Research using a groupBy transformation to replace Hive's existing partitionByKey and SparkCollector combination

    Details

    • Type: Task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: spark-branch
    • Component/s: Spark
    • Labels:
      None

      Description

      Currently, SparkClient shuffles data by calling partitionByKey(). This transformation outputs <key, value> tuples. However, Hive's ExecMapper expects <key, iterator<value>> tuples, and Spark's groupByKey() appears to output tuples of this shape directly. Thus, by using groupByKey(), we may be able to avoid Hive's own key clustering mechanism (in HiveReduceFunction). This task is to investigate whether that approach works.
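
To make the difference between the two output shapes concrete, here is a minimal sketch in plain Python. It does not use Spark or Hive APIs; `shuffled`, `cluster_by_key`, and `group_by_key` are hypothetical names introduced only for illustration, and the sketch assumes the flat <key, value> stream arrives with equal keys adjacent to each other. The first function stands in for the kind of manual key clustering a reduce function has to do over flat tuples; the second shows the <key, values> shape that a groupByKey()-style transformation would hand over directly.

```python
# Hypothetical shuffle output: flat <key, value> tuples, with equal keys adjacent.
shuffled = [("a", 1), ("a", 2), ("b", 3), ("c", 4), ("c", 5)]


def cluster_by_key(pairs):
    """Cluster a stream of (key, value) pairs into (key, [values]) by
    watching for key changes. Assumes equal keys are adjacent."""
    clustered = []
    current_key, current_values = None, []
    for key, value in pairs:
        # A key change closes the group collected so far.
        if current_values and key != current_key:
            clustered.append((current_key, current_values))
            current_values = []
        current_key = key
        current_values.append(value)
    # Flush the final group, if any.
    if current_values:
        clustered.append((current_key, current_values))
    return clustered


def group_by_key(pairs):
    """Return (key, [values]) tuples directly, the shape a
    groupByKey()-style transformation produces. Uses hash grouping,
    so it does not depend on equal keys being adjacent."""
    groups = {}
    for key, value in pairs:
        groups.setdefault(key, []).append(value)
    return list(groups.items())


if __name__ == "__main__":
    print(cluster_by_key(shuffled))
    print(group_by_key(shuffled))
```

If the shuffle already delivers the grouped shape, the consumer can iterate over each key's values without tracking key boundaries itself, which is the simplification this issue sets out to evaluate.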

      1. HIVE-7526.2.patch
        3 kB
        Chao
      2. HIVE-7526.3.patch
        11 kB
        Chao
      3. HIVE-7526.4-spark.patch
        11 kB
        Chao
      4. HIVE-7526.5-spark.patch
        13 kB
        Xuefu Zhang
      5. HIVE-7526.patch
        4 kB
        Chao

        Issue Links

          Activity

          Xuefu Zhang created issue
          Xuefu Zhang made changes -
          Field Original Value New Value
          Link This issue is part of HIVE-7292 [ HIVE-7292 ]
          Xuefu Zhang made changes -
          Link This issue is related to HIVE-7493 [ HIVE-7493 ]
          Xuefu Zhang made changes -
          Assignee Chao [ csun ]
          Chao made changes -
          Attachment HIVE-7526.patch [ 12658278 ]
          Chao made changes -
          Attachment HIVE-7526.2.patch [ 12658587 ]
          Chao made changes -
          Attachment HIVE-7526.3.patch [ 12658714 ]
          Chao made changes -
          Attachment HIVE-7526.4-spark.patch [ 12658841 ]
          Xuefu Zhang made changes -
          Attachment HIVE-7546.5-spark.patch [ 12658864 ]
          Xuefu Zhang made changes -
          Attachment HIVE-7526.5-spark.patch [ 12658865 ]
          Xuefu Zhang made changes -
          Attachment HIVE-7546.5-spark.patch [ 12658864 ]
          Xuefu Zhang made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Xuefu Zhang made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Fix Version/s spark-branch [ 12327352 ]
          Resolution Fixed [ 1 ]

            People

            • Assignee:
              Chao
              Reporter:
              Xuefu Zhang
            • Votes:
              0
              Watchers:
              5

              Dates

              • Created:
                Updated:
                Resolved:
