HIVE-24690

GlobalLimitOptimizer Fails To Identify Some Queries With LIMIT Operator

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.1.0, 2.1.0, 3.1.0
    • Fix Version/s: None
    • Component/s: Query Planning
    • Labels: None

    Description

      According to https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GlobalLimitOptimizer.java#L88, queries like

      CREATE TABLE ... AS SELECT col1, col2 FROM tbl LIMIT ..
      INSERT OVERWRITE TABLE ... SELECT col1, hash(col2), split(col1) FROM ... LIMIT...
      

      fall into the list of qualified queries, but since HIVE-9444 they no longer do.

      On investigating this issue, it was found that for the

      CREATE TABLE ... AS SELECT col1, col2 FROM tbl LIMIT 
      

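      query, the operator tree looks like TS -> SEL -> LIM -> RS -> SEL -> LIM -> FS (TableScan, Select, Limit, ReduceSink, Select, Limit, FileSink), so it contains two LIMIT operators.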

      Since only one LIMIT operator is allowed, as per https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GlobalLimitOptimizer.java#L196, the GlobalLimitOptimizer fails to identify such queries.
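
      The effect of a single-LIMIT rule on the operator tree above can be illustrated with a minimal sketch. This is not the Hive implementation: the class LimitCheckSketch, its methods, and the representation of the tree as a flat list of operator abbreviations are all hypothetical, and the single-LIMIT chain used for comparison is an assumed example, not taken from a real plan.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model; NOT the actual GlobalLimitOptimizer code.
public class LimitCheckSketch {

    // Counts the LIM entries in a linear operator chain.
    public static long countLimits(List<String> chain) {
        return chain.stream().filter("LIM"::equals).count();
    }

    // Assumed stand-in for the rule described above:
    // a query qualifies only if its tree has exactly one LIMIT operator.
    public static boolean qualifiesForGlobalLimit(List<String> chain) {
        return countLimits(chain) == 1;
    }

    public static void main(String[] args) {
        // Operator tree reported in this issue for the CTAS query.
        List<String> ctas =
            Arrays.asList("TS", "SEL", "LIM", "RS", "SEL", "LIM", "FS");
        // Assumed chain with a single LIMIT, for comparison only.
        List<String> singleLimit = Arrays.asList("TS", "SEL", "LIM", "FS");

        System.out.println("CTAS chain qualifies: " + qualifiesForGlobalLimit(ctas));
        System.out.println("Single-LIMIT chain qualifies: " + qualifiesForGlobalLimit(singleLimit));
    }
}
```

      Under these assumptions the CTAS chain is rejected because it carries two LIM operators, which matches the behaviour reported here.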

      Steps To Reproduce

      set hive.limit.optimize.enable=true;
      create table t1 (a int);
      create table t2 as select * from t1 LIMIT 10;
      

            People

              Assignee: Syed Shameerur Rahman (srahman)
              Reporter: Syed Shameerur Rahman (srahman)
              Votes: 0
              Watchers: 3