Hive
  1. Hive
  2. HIVE-3107

Improve semantic analyzer to better handle column name references in group by/sort by clauses

    Details

    • Type: Improvement Improvement
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.9.0, 0.10.0, 0.11.0, 0.12.0
    • Fix Version/s: 0.13.0
    • Component/s: Query Processor
    • Labels:
      None

      Description

      This is related to HIVE-1922.

      Following queries all fail with various SemanticExceptions:

      explain select t.c from t group by c;
      explain select t.c from t group by c sort by t.c; 
      explain select t.c as c0 from t group by c0;
      explain select t.c from t group by t.c sort by t.c; 
      

      It is true that one could always find a version of any of above queries that works. But one has to try to find out and it doesn't work well with machine generated SQL queries.

      1. HIVE-3107.1.patch
        30 kB
        Harish Butani
      2. HIVE-3107.2.patch
        36 kB
        Harish Butani
      3. HIVE-3107.3.patch
        46 kB
        Harish Butani
      4. HIVE-3107.4.patch
        47 kB
        Harish Butani
      5. HIVE-3107.5.patch
        48 kB
        Harish Butani
      6. HIVE-3107.6.patch
        51 kB
        Harish Butani

        Issue Links

          Activity

          Hide
          Ashutosh Chauhan added a comment -

          Committed to trunk. Thanks, Harish!

          Show
          Ashutosh Chauhan added a comment - Committed to trunk. Thanks, Harish!
          Hide
          Hive QA added a comment -

          Overall: -1 at least one tests failed

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12614463/HIVE-3107.6.patch

          ERROR: -1 due to 1 failed/errored test(s), 4617 tests executed
          Failed tests:

          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_partition_skip_default
          

          Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/352/testReport
          Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/352/console

          Messages:

          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          Tests failed with: TestsFailedException: 1 tests failed
          

          This message is automatically generated.

          ATTACHMENT ID: 12614463

          Show
          Hive QA added a comment - Overall : -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12614463/HIVE-3107.6.patch ERROR: -1 due to 1 failed/errored test(s), 4617 tests executed Failed tests: org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_partition_skip_default Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/352/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/352/console Messages: Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests failed with: TestsFailedException: 1 tests failed This message is automatically generated. ATTACHMENT ID: 12614463
          Hide
          Harish Butani added a comment -

          remove -ve test clustern.q;
          move query to gby_resolution.q as a +ve test case

          Show
          Harish Butani added a comment - remove -ve test clustern.q; move query to gby_resolution.q as a +ve test case
          Hide
          Hive QA added a comment -

          Overall: -1 at least one tests failed

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12614330/HIVE-3107.5.patch

          ERROR: -1 due to 2 failed/errored test(s), 4610 tests executed
          Failed tests:

          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_clustern1
          org.apache.hive.hcatalog.pig.TestHCatLoader.testProjectionsBasic
          

          Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/336/testReport
          Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/336/console

          Messages:

          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          Tests failed with: TestsFailedException: 2 tests failed
          

          This message is automatically generated.

          ATTACHMENT ID: 12614330

          Show
          Hive QA added a comment - Overall : -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12614330/HIVE-3107.5.patch ERROR: -1 due to 2 failed/errored test(s), 4610 tests executed Failed tests: org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_clustern1 org.apache.hive.hcatalog.pig.TestHCatLoader.testProjectionsBasic Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/336/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/336/console Messages: Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests failed with: TestsFailedException: 2 tests failed This message is automatically generated. ATTACHMENT ID: 12614330
          Hide
          Hive QA added a comment -

          Overall: -1 at least one tests failed

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12614190/HIVE-3107.4.patch

          ERROR: -1 due to 30 failed/errored test(s), 4615 tests executed
          Failed tests:

          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_char_2
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer14
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_3
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_5
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_udf
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_global_limit
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_cube1
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_rollup1
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_limit_pushdown
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mapjoin_distinct
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_merge_dynamic_partition3
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_multiMapJoin2
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ql_rewrite_gbtoidx
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_rcfile_lazydecompress
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_reduce_deduplicate_extended
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample10
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_collect_set
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_corr
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_covar_pop
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_covar_samp
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udtf_json_tuple
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union26
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_varchar_2
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_12
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_14
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_limit
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_short_regress
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_virtual_column
          org.apache.hadoop.hive.cli.TestHBaseMinimrCliDriver.testCliDriver_hbase_bulk
          org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_infer_bucket_sort_bucketed_table
          

          Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/330/testReport
          Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/330/console

          Messages:

          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          Tests failed with: TestsFailedException: 30 tests failed
          

          This message is automatically generated.

          ATTACHMENT ID: 12614190

          Show
          Hive QA added a comment - Overall : -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12614190/HIVE-3107.4.patch ERROR: -1 due to 30 failed/errored test(s), 4615 tests executed Failed tests: org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_char_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer14 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_udf org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_global_limit org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_cube1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_rollup1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_limit_pushdown org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mapjoin_distinct org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_merge_dynamic_partition3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_multiMapJoin2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ql_rewrite_gbtoidx org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_rcfile_lazydecompress org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_reduce_deduplicate_extended org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample10 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_collect_set org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_corr org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_covar_pop org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_covar_samp org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udtf_json_tuple org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union26 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_varchar_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_12 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_14 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_limit org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_short_regress org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_virtual_column org.apache.hadoop.hive.cli.TestHBaseMinimrCliDriver.testCliDriver_hbase_bulk org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_infer_bucket_sort_bucketed_table Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/330/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/330/console Messages: Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests failed with: TestsFailedException: 30 tests failed This message is automatically generated. ATTACHMENT ID: 12614190
          Hide
          Ashutosh Chauhan added a comment -

          +1

          Show
          Ashutosh Chauhan added a comment - +1
          Hide
          Ashutosh Chauhan added a comment -

          Also, can you take a look at HIVE-1922 there are few queries listed there which we might be able to support with this work. It will be good to add those in testcases if you haven't covered a variant of those already.

          Show
          Ashutosh Chauhan added a comment - Also, can you take a look at HIVE-1922 there are few queries listed there which we might be able to support with this work. It will be good to add those in testcases if you haven't covered a variant of those already.
          Hide
          Ashutosh Chauhan added a comment -

          Patch looks good. Left one comment on RB. Tests looks good.

          Show
          Ashutosh Chauhan added a comment - Patch looks good. Left one comment on RB. Tests looks good.
          Hide
          Hive QA added a comment -

          Overall: +1 all checks pass

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12613898/HIVE-3107.3.patch

          SUCCESS: +1 4614 tests passed

          Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/300/testReport
          Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/300/console

          Messages:

          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          

          This message is automatically generated.

          ATTACHMENT ID: 12613898

          Show
          Hive QA added a comment - Overall : +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12613898/HIVE-3107.3.patch SUCCESS: +1 4614 tests passed Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/300/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/300/console Messages: Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase This message is automatically generated. ATTACHMENT ID: 12613898
          Hide
          Harish Butani added a comment -
          Show
          Harish Butani added a comment - Review at https://reviews.apache.org/r/15525/
          Hide
          Hive QA added a comment -

          Overall: -1 at least one tests failed

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12612892/HIVE-3107.1.patch

          ERROR: -1 due to 6 failed/errored test(s), 4599 tests executed
          Failed tests:

          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ctas_colname
          org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucket_num_reducers
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_notable_alias3
          org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby1
          org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby5
          org.apache.hadoop.hive.ql.parse.TestParseNegative.testParseNegative_nonkey_groupby
          

          Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/222/testReport
          Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/222/console

          Messages:

          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          Tests failed with: TestsFailedException: 6 tests failed
          

          This message is automatically generated.

          ATTACHMENT ID: 12612892

          Show
          Hive QA added a comment - Overall : -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12612892/HIVE-3107.1.patch ERROR: -1 due to 6 failed/errored test(s), 4599 tests executed Failed tests: org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ctas_colname org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucket_num_reducers org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_notable_alias3 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby1 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby5 org.apache.hadoop.hive.ql.parse.TestParseNegative.testParseNegative_nonkey_groupby Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/222/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/222/console Messages: Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests failed with: TestsFailedException: 6 tests failed This message is automatically generated. ATTACHMENT ID: 12612892
          Hide
          Harish Butani added a comment -
          Show
          Harish Butani added a comment - Review request: https://reviews.apache.org/r/15361/
          Hide
          Xuefu Zhang added a comment -

          I plan to make this improvement, so assign it to me. Please feel free to take it from me if anyone likes to work on it.

          Show
          Xuefu Zhang added a comment - I plan to make this improvement, so assign it to me. Please feel free to take it from me if anyone likes to work on it.
          Hide
          Richard Ding added a comment -

          Thanks. I changed this to improvement.

          Show
          Richard Ding added a comment - Thanks. I changed this to improvement.
          Hide
          Edward Capriolo added a comment -

          "By design" I mean to say that it has always been that way. We have unit tests and as far as I know we have had no regressions. If we want semantic analysis to be easier or more intelligent it is a "feature request" not an "error".

          Show
          Edward Capriolo added a comment - "By design" I mean to say that it has always been that way. We have unit tests and as far as I know we have had no regressions. If we want semantic analysis to be easier or more intelligent it is a "feature request" not an "error".
          Hide
          Richard Ding added a comment -

          @Edward, why do you say this is by design?

          Why does the following statement is legal?

          select t.c from t group by t.c sort by c;
          

          while this one is not?

          select t.c from t group by t.c sort by t.c;
          
          Show
          Richard Ding added a comment - @Edward, why do you say this is by design? Why does the following statement is legal? select t.c from t group by t.c sort by c; while this one is not? select t.c from t group by t.c sort by t.c;
          Hide
          Richard Ding added a comment -

          These are issues for us since many of our SQL queries are machine generated. If this is by design, then it is not documented (i.e., what are invalid statements). I think these are just defects and not by design. The main problem with the above queries is that the column names are not in canonical forms, so it's hard for semantic analyzer to correlate column names in different clauses.

          Here are some stack traces:

          1. explain select t.c from t group by c;

          FAILED: Error in semantic analysis: Line 1:15 Expression not in GROUP BY key 'c'
          12/06/11 09:36:56 ERROR ql.Driver: FAILED: Error in semantic analysis: Line 1:15 Expression not in GROUP BY key 'c'
          org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:15 Expression not in GROUP BY key 'c'
          	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:7510)
          	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:2256)
          	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:2058)
          	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:5921)
          	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:6524)
          	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7282)
          	at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243)
          	at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:48)
          	at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243)
          	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:430)
          	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337)
          	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:889)
          	at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255)
          	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212)
          	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
          	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671)
          	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554)
          	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
          	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
          	at java.lang.reflect.Method.invoke(Method.java:597)
          	at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
          

          2. explain select t.c as c0 from t group by c0;

          FAILED: Error in semantic analysis: Line 1:41 Invalid table alias or column reference 'c0': (possible column names are: c)
          12/06/11 09:50:18 ERROR ql.Driver: FAILED: Error in semantic analysis: Line 1:41 Invalid table alias or column reference 'c0': (possible column names are: c)
          org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:41 Invalid table alias or column reference 'c0': (possible column names are: c)
          	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:7510)
          	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:7464)
          	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanMapGroupByOperator(SemanticAnalyzer.java:2739)
          	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanMapAggr1MR(SemanticAnalyzer.java:3405)
          	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:5902)
          	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:6524)
          	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7282)
          	at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243)
          	at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:48)
          	at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243)
          	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:430)
          	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337)
          	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:889)
          	at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255)
          	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212)
          	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
          	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671)
          	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554)
          	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
          	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
          	at java.lang.reflect.Method.invoke(Method.java:597)
          	at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
          

          3. explain select t.c from t group by t.c sort by t.c;

          FAILED: Error in semantic analysis: Line 1:47 Invalid table alias or column reference 't': (possible column names are: _col0)
          12/06/11 09:52:00 ERROR ql.Driver: FAILED: Error in semantic analysis: Line 1:47 Invalid table alias or column reference 't': (possible column names are: _col0)
          org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:47 Invalid table alias or column reference 't': (possible column names are: _col0)
          	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:7510)
          	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:7464)
          	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genReduceSinkPlan(SemanticAnalyzer.java:4562)
          	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:5936)
          	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:6524)
          	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7282)
          	at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243)
          	at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:48)
          	at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243)
          	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:430)
          	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337)
          	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:889)
          	at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255)
          	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212)
          	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
          	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671)
          	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554)
          	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
          	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
          	at java.lang.reflect.Method.invoke(Method.java:597)
          	at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
          

          The last case is interesting because all the column names are of the same form (but it still fails).

          Show
          Richard Ding added a comment - These are issues for us since many of our SQL queries are machine generated. If this is by design, then it is not documented (i.e., what are invalid statements). I think these are just defects and not by design. The main problem with the above queries is that the column names are not in canonical forms, so it's hard for semantic analyzer to correlate column names in different clauses. Here are some stack traces: 1. explain select t.c from t group by c; FAILED: Error in semantic analysis: Line 1:15 Expression not in GROUP BY key 'c' 12/06/11 09:36:56 ERROR ql.Driver: FAILED: Error in semantic analysis: Line 1:15 Expression not in GROUP BY key 'c' org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:15 Expression not in GROUP BY key 'c' at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:7510) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:2256) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:2058) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:5921) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:6524) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7282) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243) at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:48) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:430) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:889) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:156) 2. explain select t.c as c0 from t group by c0; FAILED: Error in semantic analysis: Line 1:41 Invalid table alias or column reference 'c0': (possible column names are: c) 12/06/11 09:50:18 ERROR ql.Driver: FAILED: Error in semantic analysis: Line 1:41 Invalid table alias or column reference 'c0': (possible column names are: c) org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:41 Invalid table alias or column reference 'c0': (possible column names are: c) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:7510) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:7464) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanMapGroupByOperator(SemanticAnalyzer.java:2739) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanMapAggr1MR(SemanticAnalyzer.java:3405) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:5902) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:6524) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7282) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243) at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:48) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:430) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:889) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:156) 3. explain select t.c from t group by t.c sort by t.c; FAILED: Error in semantic analysis: Line 1:47 Invalid table alias or column reference 't': (possible column names are: _col0) 12/06/11 09:52:00 ERROR ql.Driver: FAILED: Error in semantic analysis: Line 1:47 Invalid table alias or column reference 't': (possible column names are: _col0) org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:47 Invalid table alias or column reference 't': (possible column names are: _col0) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:7510) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:7464) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genReduceSinkPlan(SemanticAnalyzer.java:4562) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:5936) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:6524) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7282) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243) at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:48) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:430) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:889) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:156) The last case is interesting because all the column names are of the same form (but it still fails).
          Hide
          Richard Ding added a comment -
          FAILED: Error in semantic analysis: Line 1:15 Expression not in GROUP BY key 'c'
          12/06/11 09:36:56 ERROR ql.Driver: FAILED: Error in semantic analysis: Line 1:15 Expression not in GROUP BY key 'c'
          org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:15 Expression not in GROUP BY key 'c'
          	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:7510)
          	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:2256)
          	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:2058)
          	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:5921)
          	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:6524)
          	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7282)
          	at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243)
          	at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:48)
          	at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243)
          	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:430)
          	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337)
          	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:889)
          	at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255)
          	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212)
          	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
          	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671)
          	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554)
          	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
          	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
          	at java.lang.reflect.Method.invoke(Method.java:597)
          	at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
          
          Show
          Richard Ding added a comment - FAILED: Error in semantic analysis: Line 1:15 Expression not in GROUP BY key 'c' 12/06/11 09:36:56 ERROR ql.Driver: FAILED: Error in semantic analysis: Line 1:15 Expression not in GROUP BY key 'c' org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:15 Expression not in GROUP BY key 'c' at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:7510) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:2256) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:2058) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:5921) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:6524) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7282) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243) at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:48) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:430) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:889) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
          Hide
          Edward Capriolo added a comment -

          While this can be annoying, it is not an issue. It is the way the language is designed. I believe we have other jira issues that this essentially duplicates

          Show
          Edward Capriolo added a comment - While this can be annoying, it is not an issue. It is the way the language is designed. I believe we have other jira issues that this essentially duplicates

            People

            • Assignee:
              Harish Butani
              Reporter:
              Richard Ding
            • Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development