Uploaded image for project: 'Calcite'
  1. Calcite
  2. CALCITE-459

TokenMgrError parsing error when the query terminates a single-line comment without newline

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.6.0
    • Component/s: None
    • Labels:

      Description

      Here's the test (based on the bug report in https://github.com/vlsi/mat-calcite-plugin/issues/1):

        /** Tests comment without ending newline */
        @Test public void testCommentWithoutNewLine() {
          OptiqAssert.that()
              .with(OptiqAssert.Config.REGULAR)
              .query(
                  "select \"empid\" from \"hr\".\"emps\"\n-- select")
              .typeIs(
                  "[empid INTEGER NOT NULL]")
              .returnsUnordered(
                  "empid=100",
                  "empid=110",
                  "empid=150",
                  "empid=200");
        }
      
      java.lang.RuntimeException: exception while executing [select "empid" from "hr"."emps"
      -- select]
      	at net.hydromatic.optiq.test.OptiqAssert$AssertQuery.typeIs(OptiqAssert.java:1088)
      	at net.hydromatic.optiq.test.JdbcTest.testCommentWithoutNewLine(JdbcTest.java:3812)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
      	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
      	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
      	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
      	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
      	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
      	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
      	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
      	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
      	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
      	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
      	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
      	at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
      	at org.junit.runner.JUnitCore.run(JUnitCore.java:160)
      	at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:74)
      	at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:211)
      	at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:67)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      	at com.intellij.rt.execution.application.AppMain.main(AppMain.java:134)
      Caused by: java.lang.RuntimeException: With materializationsEnabled=false, limit=0
      	at net.hydromatic.optiq.test.OptiqAssert.assertQuery(OptiqAssert.java:419)
      	at net.hydromatic.optiq.test.OptiqAssert$AssertQuery.typeIs(OptiqAssert.java:1084)
      	... 27 more
      Caused by: java.sql.SQLException: error while executing SQL "select "empid" from "hr"."emps"
      -- select": parse failed: Lexical error at line 2, column 10.  Encountered: <EOF> after : ""
      	at net.hydromatic.avatica.Helper.createException(Helper.java:39)
      	at net.hydromatic.avatica.AvaticaStatement.executeQuery(AvaticaStatement.java:79)
      	at net.hydromatic.optiq.test.OptiqAssert.assertQuery(OptiqAssert.java:394)
      	... 28 more
      Caused by: java.lang.RuntimeException: parse failed: Lexical error at line 2, column 10.  Encountered: <EOF> after : ""
      	at net.hydromatic.optiq.prepare.OptiqPrepareImpl.prepare2_(OptiqPrepareImpl.java:395)
      	at net.hydromatic.optiq.prepare.OptiqPrepareImpl.prepare_(OptiqPrepareImpl.java:322)
      	at net.hydromatic.optiq.prepare.OptiqPrepareImpl.prepareSql(OptiqPrepareImpl.java:291)
      	at net.hydromatic.optiq.jdbc.OptiqConnectionImpl.parseQuery(OptiqConnectionImpl.java:170)
      	at net.hydromatic.optiq.jdbc.MetaImpl.prepare(MetaImpl.java:622)
      	at net.hydromatic.avatica.AvaticaStatement.executeQuery(AvaticaStatement.java:76)
      	... 29 more
      Caused by: org.eigenbase.sql.parser.SqlParseException: Lexical error at line 2, column 10.  Encountered: <EOF> after : ""
      	at org.eigenbase.sql.parser.impl.SqlParserImpl.convertException(SqlParserImpl.java:327)
      	at org.eigenbase.sql.parser.impl.SqlParserImpl.normalizeException(SqlParserImpl.java:68)
      	at org.eigenbase.sql.parser.SqlParser.parseStmt(SqlParser.java:149)
      	at net.hydromatic.optiq.prepare.OptiqPrepareImpl.prepare2_(OptiqPrepareImpl.java:392)
      	... 34 more
      Caused by: org.eigenbase.sql.parser.impl.TokenMgrError: Lexical error at line 2, column 10.  Encountered: <EOF> after : ""
      	at org.eigenbase.sql.parser.impl.SqlParserImplTokenManager.getNextToken(SqlParserImplTokenManager.java:13715)
      	at org.eigenbase.sql.parser.impl.SqlParserImpl.jj_scan_token(SqlParserImpl.java:15844)
      	at org.eigenbase.sql.parser.impl.SqlParserImpl.jj_3_273(SqlParserImpl.java:15071)
      	at org.eigenbase.sql.parser.impl.SqlParserImpl.jj_2_273(SqlParserImpl.java:6888)
      	at org.eigenbase.sql.parser.impl.SqlParserImpl.CompoundIdentifier(SqlParserImpl.java:3348)
      	at org.eigenbase.sql.parser.impl.SqlParserImpl.TableRef(SqlParserImpl.java:1440)
      	at org.eigenbase.sql.parser.impl.SqlParserImpl.FromClause(SqlParserImpl.java:1353)
      	at org.eigenbase.sql.parser.impl.SqlParserImpl.SqlSelect(SqlParserImpl.java:752)
      	at org.eigenbase.sql.parser.impl.SqlParserImpl.LeafQuery(SqlParserImpl.java:516)
      	at org.eigenbase.sql.parser.impl.SqlParserImpl.LeafQueryOrExpr(SqlParserImpl.java:2151)
      	at org.eigenbase.sql.parser.impl.SqlParserImpl.QueryOrExpr(SqlParserImpl.java:2070)
      	at org.eigenbase.sql.parser.impl.SqlParserImpl.OrderedQueryOrExpr(SqlParserImpl.java:434)
      	at org.eigenbase.sql.parser.impl.SqlParserImpl.SqlStmt(SqlParserImpl.java:692)
      	at org.eigenbase.sql.parser.impl.SqlParserImpl.SqlStmtEof(SqlParserImpl.java:718)
      	at org.eigenbase.sql.parser.impl.SqlParserImpl.parseSqlStmtEof(SqlParserImpl.java:121)
      	at org.eigenbase.sql.parser.SqlParser.parseStmt(SqlParser.java:142)
      	... 35 more
      
      
      Process finished with exit code 255
      

        Issue Links

          Activity

          Hide
          julianhyde Julian Hyde added a comment -

          Resolved in release 1.6.0 (2016-01-22).

          Show
          julianhyde Julian Hyde added a comment - Resolved in release 1.6.0 (2016-01-22).
          Hide
          julianhyde Julian Hyde added a comment -
          Show
          julianhyde Julian Hyde added a comment - Fixed in http://git-wip-us.apache.org/repos/asf/calcite/commit/fa08c46d ; thanks for the PR zhen wang !
          Hide
          zhenw zhen wang added a comment -

          1. okay with the name you put there.
          2. matching regular expressions and lexical states are both NFAs, but I can imagine the new regular expression is a bit more complex than the original lexical state.
          lastly, this is from the FAQ, which I see quite some references.

          Show
          zhenw zhen wang added a comment - 1. okay with the name you put there. 2. matching regular expressions and lexical states are both NFAs, but I can imagine the new regular expression is a bit more complex than the original lexical state. lastly, this is from the FAQ, which I see quite some references.
          Hide
          julianhyde Julian Hyde added a comment -

          I've created CALCITE-1012. Let's continue discussion there.

          Show
          julianhyde Julian Hyde added a comment - I've created CALCITE-1012 . Let's continue discussion there.
          Hide
          vladimirsitnikov Vladimir Sitnikov added a comment -

          I've created CALCITE-1011

          Show
          vladimirsitnikov Vladimir Sitnikov added a comment - I've created CALCITE-1011
          Hide
          vladimirsitnikov Vladimir Sitnikov added a comment -

          Julian Hyde, what is the recommended way to parse queries?
          I've crafted a first draft and it looks like SqlParser.create(sql).parseQuery() is taking quite noticeable time:
          https://github.com/apache/calcite/pull/176

          Even parsing a trivial query takes something like 20-40 us and allocates 70KiB of java objects.
          It looks like I'm using the wrong API.

          Benchmark                                 (comments)  (length)  Mode  Cnt        Score        Error   Units
          ParserBenchmark.parse                           true        10  avgt    5       46,718 ±     16,213   us/op
          ParserBenchmark.parse:·gc.alloc.rate.norm       true        10  avgt    5    74487,751 ±    177,889    B/op
          ParserBenchmark.parse                           true       100  avgt    5       54,542 ±     24,144   us/op
          ParserBenchmark.parse:·gc.alloc.rate.norm       true       100  avgt    5    76262,748 ±    162,214    B/op
          ParserBenchmark.parse                           true      1000  avgt    5      250,789 ±    180,010   us/op
          ParserBenchmark.parse:·gc.alloc.rate.norm       true      1000  avgt    5    96880,116 ±      0,083    B/op
          ParserBenchmark.parse                           true    100000  avgt    5    12608,364 ±   1293,125   us/op
          ParserBenchmark.parse:·gc.alloc.rate.norm       true    100000  avgt    5  2396007,715 ±     16,160    B/op
          ParserBenchmark.parse                          false        10  avgt    5       46,253 ±      9,299   us/op
          ParserBenchmark.parse:·gc.alloc.rate.norm      false        10  avgt    5    74408,256 ±    137,292    B/op
          ParserBenchmark.parse                          false       100  avgt    5       54,753 ±      8,506   us/op
          ParserBenchmark.parse:·gc.alloc.rate.norm      false       100  avgt    5    77034,399 ±    152,759    B/op
          ParserBenchmark.parse                          false      1000  avgt    5      227,980 ±     23,439   us/op
          ParserBenchmark.parse:·gc.alloc.rate.norm      false      1000  avgt    5   105440,105 ±      0,010    B/op
          ParserBenchmark.parse                          false    100000  avgt    5    18675,561 ±    242,296   us/op
          ParserBenchmark.parse:·gc.alloc.rate.norm      false    100000  avgt    5  3202680,561 ±      0,269    B/op
          

          Query of length 10 is something like:

          select 1, '7935759579887025813642400976320251869'
           from dual

          100:

          select 1, ?, 1910525591, -33731156, -140885363, '-28690016290210702076711497928626237037', '-76148433530322602893458688481236178054'
           from dual

          1000:

          select 1, 1482506249, ?, 932852513, 540573036, -527274831, ?, ?// sb.append('\'').append(rnd.nextLong()).append(rnd.nextLong()).append(rnd.nextLong())
          , 1692895102, '50951735439143566-7690637232772497893', ?, ?, ?, ?, ?// sb.append('\'').append(rnd.nextLong()).append(rnd.nextLong()).append(rnd.nextLong())
          , '4380376358709010212-1173038460074161045', -1918523131, '66668745555380426053762426347771867584', '-4774732703008184939-3817796209798356166', '89039746234244470812022748182366240377', '-12480157603806734075299113791518246288', 1171721598// sb.append('\'').append(rnd.nextLong()).append(rnd.nextLong()).append(rnd.nextLong())
          , '47274140693996363363917338148185131534', -80138859, ?, ?, 1676036541, ?, -983339282// sb.append('\'').append(rnd.nextLong()).append(rnd.nextLong()).append(rnd.nextLong())
          , '70734587149480568336941566409423582112', '-8847237352737739855-2917641826110218142', '4087359824551837905-5660719534397547355', ?, ?, 1192173528, '-4576995549188621938-8764240972870282558'// sb.append('\'').append(rnd.nextLong()).append(rnd.nextLong()).append(rnd.nextLong())
           from dual

          100'000:

          you've got the idea
          Show
          vladimirsitnikov Vladimir Sitnikov added a comment - Julian Hyde , what is the recommended way to parse queries? I've crafted a first draft and it looks like SqlParser.create(sql).parseQuery() is taking quite noticeable time: https://github.com/apache/calcite/pull/176 Even parsing a trivial query takes something like 20-40 us and allocates 70KiB of java objects. It looks like I'm using the wrong API. Benchmark (comments) (length) Mode Cnt Score Error Units ParserBenchmark.parse true 10 avgt 5 46,718 ± 16,213 us/op ParserBenchmark.parse:·gc.alloc.rate.norm true 10 avgt 5 74487,751 ± 177,889 B/op ParserBenchmark.parse true 100 avgt 5 54,542 ± 24,144 us/op ParserBenchmark.parse:·gc.alloc.rate.norm true 100 avgt 5 76262,748 ± 162,214 B/op ParserBenchmark.parse true 1000 avgt 5 250,789 ± 180,010 us/op ParserBenchmark.parse:·gc.alloc.rate.norm true 1000 avgt 5 96880,116 ± 0,083 B/op ParserBenchmark.parse true 100000 avgt 5 12608,364 ± 1293,125 us/op ParserBenchmark.parse:·gc.alloc.rate.norm true 100000 avgt 5 2396007,715 ± 16,160 B/op ParserBenchmark.parse false 10 avgt 5 46,253 ± 9,299 us/op ParserBenchmark.parse:·gc.alloc.rate.norm false 10 avgt 5 74408,256 ± 137,292 B/op ParserBenchmark.parse false 100 avgt 5 54,753 ± 8,506 us/op ParserBenchmark.parse:·gc.alloc.rate.norm false 100 avgt 5 77034,399 ± 152,759 B/op ParserBenchmark.parse false 1000 avgt 5 227,980 ± 23,439 us/op ParserBenchmark.parse:·gc.alloc.rate.norm false 1000 avgt 5 105440,105 ± 0,010 B/op ParserBenchmark.parse false 100000 avgt 5 18675,561 ± 242,296 us/op ParserBenchmark.parse:·gc.alloc.rate.norm false 100000 avgt 5 3202680,561 ± 0,269 B/op Query of length 10 is something like: select 1, '7935759579887025813642400976320251869' from dual 100: select 1, ?, 1910525591, -33731156, -140885363, '-28690016290210702076711497928626237037', '-76148433530322602893458688481236178054' from dual 1000: select 1, 1482506249, ?, 932852513, 540573036, -527274831, ?, ?// sb.append('\'').append(rnd.nextLong()).append(rnd.nextLong()).append(rnd.nextLong()) , 1692895102, '50951735439143566-7690637232772497893', ?, ?, ?, ?, ?// sb.append('\'').append(rnd.nextLong()).append(rnd.nextLong()).append(rnd.nextLong()) , '4380376358709010212-1173038460074161045', -1918523131, '66668745555380426053762426347771867584', '-4774732703008184939-3817796209798356166', '89039746234244470812022748182366240377', '-12480157603806734075299113791518246288', 1171721598// sb.append('\'').append(rnd.nextLong()).append(rnd.nextLong()).append(rnd.nextLong()) , '47274140693996363363917338148185131534', -80138859, ?, ?, 1676036541, ?, -983339282// sb.append('\'').append(rnd.nextLong()).append(rnd.nextLong()).append(rnd.nextLong()) , '70734587149480568336941566409423582112', '-8847237352737739855-2917641826110218142', '4087359824551837905-5660719534397547355', ?, ?, 1192173528, '-4576995549188621938-8764240972870282558'// sb.append('\'').append(rnd.nextLong()).append(rnd.nextLong()).append(rnd.nextLong()) from dual 100'000: you've got the idea
          Hide
          vladimirsitnikov Vladimir Sitnikov added a comment -

          I've once saw a hand-written 600KiB SQL report.
          It obviously had some copy&paste issues, however someone did put that into production.

          Show
          vladimirsitnikov Vladimir Sitnikov added a comment - I've once saw a hand-written 600KiB SQL report. It obviously had some copy&paste issues, however someone did put that into production.
          Hide
          julianhyde Julian Hyde added a comment -

          Vladimir Sitnikov, I agree. Can you log a jira case for it? I was also wondering whether there were examples of large, hand-written SQL queries in the bug histories of Hive and Postgres.

          Show
          julianhyde Julian Hyde added a comment - Vladimir Sitnikov , I agree. Can you log a jira case for it? I was also wondering whether there were examples of large, hand-written SQL queries in the bug histories of Hive and Postgres.
          Hide
          vladimirsitnikov Vladimir Sitnikov added a comment -

          Julian Hyde, it would be nice to implement "random query generator" for testing purposes: it could catch some edge cases in parser and it can be used for benchmarking as well.

          Recent SqlNode to String unparser might suit for that generator.
          I mean something like https://github.com/anse1/sqlsmith

          Show
          vladimirsitnikov Vladimir Sitnikov added a comment - Julian Hyde , it would be nice to implement "random query generator" for testing purposes: it could catch some edge cases in parser and it can be used for benchmarking as well. Recent SqlNode to String unparser might suit for that generator. I mean something like https://github.com/anse1/sqlsmith
          Hide
          julianhyde Julian Hyde added a comment - - edited

          zhen wang, Do you know what are the performance implications of your change? I don't know whether removing a lexical state and making a token a more complex regular expression is likely to be a win or a loss. To be clear, I'm not seeing any performance change, but I want to be sure I'm not making anything worse.

          Show
          julianhyde Julian Hyde added a comment - - edited zhen wang , Do you know what are the performance implications of your change? I don't know whether removing a lexical state and making a token a more complex regular expression is likely to be a win or a loss. To be clear, I'm not seeing any performance change, but I want to be sure I'm not making anything worse.
          Hide
          julianhyde Julian Hyde added a comment -

          Reviewing now. How do you want your name to appear in the check-in comment? Right now I have "awang".

          Show
          julianhyde Julian Hyde added a comment - Reviewing now. How do you want your name to appear in the check-in comment? Right now I have "awang".
          Hide
          zhenw zhen wang added a comment -
          Show
          zhenw zhen wang added a comment - Julian Hyde please help review https://github.com/apache/calcite/pull/175

            People

            • Assignee:
              julianhyde Julian Hyde
              Reporter:
              vladimirsitnikov Vladimir Sitnikov
            • Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development