HIVE-4241: optimize hive.enforce.sorting and hive.enforce.bucketing join

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.12.0
    • Component/s: Query Processor
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      Consider the following scenario:

      T1: sorted and bucketed by key into 2 buckets
      T2: sorted and bucketed by key into 2 buckets
      T3: sorted and bucketed by key into 2 buckets

      set hive.enforce.sorting=true;
      set hive.enforce.bucketing=true;
      insert overwrite table T3
      select .. from T1 join T2 on T1.key = T2.key;

      Since T1, T2 and T3 are all sorted and bucketed by the join key, and the
      join above is performed as a sort-merge join, T3 should come out
      bucketed and sorted without the need for an extra reducer.
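
      A minimal HiveQL sketch of the scenario above, for illustration only.
      The column names and types (key STRING, value STRING) and the aliases
      a and b are assumptions, since the description does not give the table
      schemas. The three sort-merge join settings are existing Hive options,
      but whether they match the session the reporter had in mind is also an
      assumption.

      ```sql
      -- Hypothetical schemas: the issue only states that each table is
      -- sorted and bucketed by key into 2 buckets.
      CREATE TABLE T1 (key STRING, value STRING)
        CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS;
      CREATE TABLE T2 (key STRING, value STRING)
        CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS;
      CREATE TABLE T3 (key STRING, value STRING)
        CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS;

      -- Assumed session settings that let the join run as a sort-merge
      -- bucket join on the map side.
      set hive.optimize.bucketmapjoin=true;
      set hive.optimize.bucketmapjoin.sortedmerge=true;
      set hive.input.format=org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;

      -- Settings from the description.
      set hive.enforce.sorting=true;
      set hive.enforce.bucketing=true;

      -- EXPLAIN prints the plan, so one can check whether a reduce stage
      -- is still scheduled after the map-side join.
      EXPLAIN
      INSERT OVERWRITE TABLE T3
      SELECT a.key, a.value FROM T1 a JOIN T2 b ON a.key = b.key;
      ```

      In this sketch the selected key column feeds T3's bucketing and sort
      column directly, which is the case the description argues should not
      need a reducer.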

      Attachments

        1. hive.4241.1.patch
          308 kB
          Namit Jain
        2. hive.4241.1.patch-nohcat
          314 kB
          Namit Jain
        3. hive.4241.2.patch-nohcat
          382 kB
          Namit Jain
        4. hive.4241.3.patch
          375 kB
          Namit Jain
        5. hive.4241.4.patch
          382 kB
          Namit Jain

            People

              Assignee: Namit Jain
              Reporter: Namit Jain
              Votes: 0
              Watchers: 3
