IMPALA-4529: Inefficient use of try/catch in parser

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: Impala 2.8.0
    • Fix Version/s: Impala 2.8.0
    • Component/s: Frontend

      Description

      I was experimenting with some queries with very large case statements and found that the parser was spending a lot of time in this callstack:

      "Thread-8" #22 prio=5 os_prio=0 tid=0x0000000009dcc800 nid=0x700f runnable [0x00007f8d95520000]
         java.lang.Thread.State: RUNNABLE
              at java.lang.Integer.parseInt(Integer.java:615)
              at org.apache.impala.analysis.ToSqlUtils.getIdentSql(ToSqlUtils.java:95)
              at org.apache.impala.analysis.ToSqlUtils.getPathSql(ToSqlUtils.java:117)
              at org.apache.impala.analysis.SlotRef.<init>(SlotRef.java:51)
              at org.apache.impala.analysis.CUP$SqlParser$actions.case552(SqlParser.java:13790)
              at org.apache.impala.analysis.CUP$SqlParser$actions.CUP$SqlParser$do_action(SqlParser.java:3828)
              at org.apache.impala.analysis.SqlParser.do_action(SqlParser.java:1228)
              at java_cup.runtime.lr_parser.parse(lr_parser.java:587)
              at org.apache.impala.analysis.AnalysisContext.analyze(AnalysisContext.java:375)
              at org.apache.impala.analysis.AnalysisContext.analyze(AnalysisContext.java:360)
              at org.apache.impala.service.Frontend.analyzeStmt(Frontend.java:891)
              at org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1028)
              at org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:159)
      

      It looks like we're using a try/catch to check if the first character is a digit, which is quite inefficient.

        Activity

        tarmstrong Tim Armstrong added a comment -

        IMPALA-4529: speed up parsing of identifiers

        Instead of using substring(), parseInt() and a try/catch, directly
        check the character.

        Change-Id: Iebef43a6a2f7923ca0e9c158d83f5c06f26da0cd
        Reviewed-on: http://gerrit.cloudera.org:8080/5210
        Reviewed-by: Alex Behm <alex.behm@cloudera.com>
        Tested-by: Internal Jenkins
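For illustration, the change described in the commit message can be sketched as below. This is a minimal sketch, not the actual Impala code: the class name IdentDigitCheck, both method names, and the empty-string guards are assumptions made for this example. Only the general pattern (substring() + parseInt() + try/catch versus a direct character check) comes from the description above.

```java
// Hypothetical helper class; names are illustrative, not taken from ToSqlUtils.
final class IdentDigitCheck {
    // Slow pattern: parse the first character and use the exception as the
    // "not a digit" signal. An exception is constructed and thrown for every
    // identifier that does not start with a digit.
    static boolean startsWithDigitSlow(String ident) {
        if (ident.isEmpty()) return false;  // guard added for this sketch
        try {
            Integer.parseInt(ident.substring(0, 1));
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // Direct check: inspect the first character, no allocation and no exception.
    static boolean startsWithDigitFast(String ident) {
        return !ident.isEmpty() && Character.isDigit(ident.charAt(0));
    }

    public static void main(String[] args) {
        String[] samples = {"col1", "1col", "_x", "9", "t"};
        for (String s : samples) {
            System.out.println(s + " slow=" + startsWithDigitSlow(s)
                + " fast=" + startsWithDigitFast(s));
        }
    }
}
```

Both methods should agree on any non-empty identifier, since Integer.parseInt() on a single character and Character.isDigit() both accept exactly the characters that are decimal digits; a lone "+" or "-" is rejected by both.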


          People

          • Assignee: Tim Armstrong (tarmstrong)
          • Reporter: Tim Armstrong (tarmstrong)