Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-3633

sort-merge join does not work with sub-queries

Log workAgile BoardRank to TopRank to BottomBulk Copy AttachmentsBulk Move AttachmentsVotersWatch issueWatchersCreate sub-taskConvert to sub-taskMoveLinkCloneLabelsUpdate Comment AuthorReplace String in CommentUpdate Comment VisibilityDelete Comments
    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Closed
    • Major
    • Resolution: Fixed
    • None
    • 0.11.0
    • Query Processor
    • None

    Description

      Consider the following query:

      create table smb_bucket_1(key int, value string) CLUSTERED BY (key) SORTED BY (key) INTO 6 BUCKETS STORED AS TEXTFILE;
      create table smb_bucket_2(key int, value string) CLUSTERED BY (key) SORTED BY (key) INTO 6 BUCKETS STORED AS TEXTFILE;

      – load the above tables

      set hive.optimize.bucketmapjoin = true;
      set hive.optimize.bucketmapjoin.sortedmerge = true;
      set hive.input.format = org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;

      explain
      select count from
      (
      select /+mapjoin(a)/ a.key as key1, b.key as key2, a.value as value1, b.value as value2
      from smb_bucket_1 a join smb_bucket_2 b on a.key = b.key)
      subq;

      The above query does not use sort-merge join. This would be very useful as we automatically convert the queries to use sorting and bucketing properties for join.

      Attachments

        1. hive.3633.1.patch
          9 kB
          Namit Jain
        2. hive.3633.2.patch
          42 kB
          Namit Jain
        3. hive.3633.3.patch
          48 kB
          Namit Jain
        4. hive.3633.4.patch
          59 kB
          Namit Jain
        5. hive.3633.5.patch
          105 kB
          Namit Jain
        6. hive.3633.6.patch
          103 kB
          Namit Jain
        7. hive.3633.7.patch
          132 kB
          Namit Jain
        8. hive.3633.8.patch
          140 kB
          Namit Jain
        9. hive.3633.9.patch
          140 kB
          Namit Jain
        10. hive.3633.10.patch
          147 kB
          Namit Jain
        11. hive.3633.11.patch
          146 kB
          Namit Jain

        Issue Links

        Activity

          This comment will be Viewable by All Users Viewable by All Users
          Cancel

          People

            namit Namit Jain Assign to me
            namit Namit Jain
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Slack

                Issue deployment