Hive / HIVE-4241

optimize hive.enforce.sorting and hive.enforce.bucketing join

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.12.0
    • Component/s: Query Processor
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      Consider the following scenario:

      T1: sorted and bucketed by key into 2 buckets
      T2: sorted and bucketed by key into 2 buckets
      T3: sorted and bucketed by key into 2 buckets

      set hive.enforce.sorting=true;
      set hive.enforce.bucketing=true;
      insert overwrite table T3
      select .. from T1 join T2 on T1.key = T2.key;

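      Since T1, T2 and T3 are all sorted and bucketed by the join key into the
      same number of buckets, and the above join is performed as a sort-merge
      join, the join output is already bucketed and sorted the way T3 requires.
      T3 should therefore be written without the need for an extra reducer.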
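
      The scenario above can be sketched in HiveQL roughly as follows. This is
      an illustration, not the test case from the patch: the column names and
      types, and the select list, are assumptions (the description elides
      them), and the exact settings needed to get a sort-merge join depend on
      the Hive version in use. Running the statement under EXPLAIN is one way
      to check whether the plan still contains a reduce stage for enforcing
      bucketing and sorting on T3.

      ```sql
      -- Hypothetical schemas: (key INT, value STRING) is assumed for illustration.
      CREATE TABLE T1 (key INT, value STRING)
        CLUSTERED BY (key) SORTED BY (key ASC) INTO 2 BUCKETS;
      CREATE TABLE T2 (key INT, value STRING)
        CLUSTERED BY (key) SORTED BY (key ASC) INTO 2 BUCKETS;
      CREATE TABLE T3 (key INT, value1 STRING, value2 STRING)
        CLUSTERED BY (key) SORTED BY (key ASC) INTO 2 BUCKETS;

      -- Settings from the description.
      SET hive.enforce.sorting=true;
      SET hive.enforce.bucketing=true;

      -- Assumed settings to let the join run as a sort-merge bucket join;
      -- which of these are required varies by Hive version.
      SET hive.optimize.bucketmapjoin=true;
      SET hive.optimize.bucketmapjoin.sortedmerge=true;
      SET hive.input.format=org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;
      SET hive.auto.convert.sortmerge.join=true;

      -- Inspect the plan: with the optimization, no extra reducer should be
      -- needed to bucket and sort the rows written to T3.
      EXPLAIN
      INSERT OVERWRITE TABLE T3
      SELECT T1.key, T1.value, T2.value
      FROM T1 JOIN T2 ON T1.key = T2.key;
      ```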

        Attachments

        1. hive.4241.4.patch
          382 kB
          Namit Jain
        2. hive.4241.3.patch
          375 kB
          Namit Jain
        3. hive.4241.2.patch-nohcat
          382 kB
          Namit Jain
        4. hive.4241.1.patch-nohcat
          314 kB
          Namit Jain
        5. hive.4241.1.patch
          308 kB
          Namit Jain

              People

              • Assignee: Namit Jain (namit)
              • Reporter: Namit Jain (namit)
              • Votes: 0
              • Watchers: 3