Hive / HIVE-25728

ParseException while gathering Column Stats

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed

    Description

      In ColumnStatsSemanticAnalyzer (line 261), columnName is escaped twice, which can cause a ParseException. A potential solution is simply not to escape it the second time.
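
      To illustrate why a second escape is harmful, here is a minimal sketch. The quote helper is hypothetical and is not Hive's actual code; it assumes identifiers are quoted by wrapping them in backticks and doubling any backtick already present. Applying it to an already-quoted name produces a string that is no longer a single, cleanly quoted identifier.

```java
// Minimal sketch of double escaping. DoubleEscapeDemo and quote() are
// hypothetical names used for illustration only; they are not part of Hive.
class DoubleEscapeDemo {

    // Assumed quoting scheme: double any embedded backtick, then wrap in backticks.
    static String quote(String identifier) {
        return "`" + identifier.replace("`", "``") + "`";
    }

    public static void main(String[] args) {
        String column = "t2_col2";

        // Escaped once: a well-formed quoted identifier.
        String once = quote(column);

        // Escaped twice: the backticks added by the first pass are themselves
        // escaped, so the text no longer names the original column.
        String twice = quote(once);

        System.out.println(once);   // `t2_col2`
        System.out.println(twice);  // ```t2_col2```
    }
}
```

      Under these assumptions, escaping exactly once at a single point in the pipeline keeps the generated query text parseable, which matches the fix proposed above.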

      This can be reproduced as follows:

      CREATE TABLE table1(
        t1_col1 bigint);

      CREATE TABLE table2(
        t2_col1 bigint,
        t2_col2 int)
      PARTITIONED BY (
        t2_col3 date);

      insert into table1 values(1);
      insert into table2 values("1","1","1");

      --set hive.stats.autogather=false;
      -- With quoted identifiers disabled, the backticked column spec below is interpreted as a regular expression.
      set hive.support.quoted.identifiers=none;

      create external table ext_table STORED AS ORC tblproperties('compression'='snappy','external.table.purge'='true') as
      SELECT a.*, d.`(t2_col1|t2_col3)?+.+`
      FROM table1 a
      LEFT JOIN (SELECT * FROM table2 where t2_col3 like '2021-01-%') d
      on a.t1_col1 = d.t2_col1;

      It fails with the following stack trace:

      See ./ql/target/tmp/log/hive.log or ./itests/qtest/target/tmp/log/hive.log, or check ./ql/target/surefire-reports or ./itests/qtest/target/surefire-reports/ for specific test cases logs.
       org.apache.hadoop.hive.ql.parse.SemanticException: org.apache.hadoop.hive.ql.parse.ParseException: line 1:772 rule Identifier failed predicate: {allowQuotedId() != Quotation.NONE}?
      line 1:778 rule Identifier failed predicate: {allowQuotedId() != Quotation.NONE}?
      line 1:782 rule Identifier failed predicate: {allowQuotedId() != Quotation.NONE}?
      line 1:807 character '<EOF>' not supported here
          at org.apache.hadoop.hive.ql.parse.ColumnStatsAutoGatherContext.insertAnalyzePipeline(ColumnStatsAutoGatherContext.java:144)
          at org.apache.hadoop.hive.ql.parse.ColumnStatsAutoGatherContext.insertTableValuesAnalyzePipeline(ColumnStatsAutoGatherContext.java:135)
          at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAutoColumnStatsGatheringPipeline(SemanticAnalyzer.java:8380)
          at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFileSinkPlan(SemanticAnalyzer.java:7915)
          at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:11064)
          at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:10939)
          at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11854)
          at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11724)
          at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:625)
          at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12557)
          at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:455)
          at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:317)
          at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:223)
          at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:105)
          at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:500)
          at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:453)
          at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:417)
          at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:411)
          at org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:125)
          at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:229)
          at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:256)
          at org.apache.hadoop.hive.cli.CliDriver.processCmd1(CliDriver.java:201)
          at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:127)
          at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:422)
          at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:353)
          at org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:783)
          at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:753)
          at org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:142)
          at org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:157)
          at org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver(TestMiniLlapLocalCliDriver.java:62)
          at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
          at java.lang.reflect.Method.invoke(Method.java:498)
          at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
          at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
          at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
          at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
          at org.apache.hadoop.hive.cli.control.CliAdapter$2$1.evaluate(CliAdapter.java:135)
          at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
          at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
          at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
          at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
          at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
          at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
          at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
          at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
          at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
          at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
          at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
          at org.junit.runners.Suite.runChild(Suite.java:128)
          at org.junit.runners.Suite.runChild(Suite.java:27)
          at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
          at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
          at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
          at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
          at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
          at org.apache.hadoop.hive.cli.control.CliAdapter$1$1.evaluate(CliAdapter.java:95)
          at org.junit.rules.RunRules.evaluate(RunRules.java:20)
          at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
          at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
          at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
          at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
          at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
          at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
          at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:377)
          at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:138)
          at org.apache.maven.surefire.booter.ForkedBooter.run(ForkedBooter.java:465)
          at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:451)
      Caused by: org.apache.hadoop.hive.ql.parse.ParseException: line 1:772 rule Identifier failed predicate: {allowQuotedId() != Quotation.NONE}?
      line 1:778 rule Identifier failed predicate: {allowQuotedId() != Quotation.NONE}?
      line 1:782 rule Identifier failed predicate: {allowQuotedId() != Quotation.NONE}?
      line 1:807 character '<EOF>' not supported here
          at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:131)
          at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:93)
          at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:85)
          at org.apache.hadoop.hive.ql.parse.ColumnStatsAutoGatherContext.genSelOp(ColumnStatsAutoGatherContext.java:179)
          at org.apache.hadoop.hive.ql.parse.ColumnStatsAutoGatherContext.insertAnalyzePipeline(ColumnStatsAutoGatherContext.java:142)
          ... 68 more 

            People

              Assignee: Soumyakanti Das
              Reporter: Soumyakanti Das

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 1h