Uploaded image for project: 'Calcite'
  1. Calcite
  2. CALCITE-464

Make parser accept configurable max length for SQL identifier

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.0.0-incubating
    • Component/s: None
    • Labels:
      None

      Description

      Calcite currently uses 128 as the maximum length for sql identifier. We would like to enhance the SQL parser, so that each system should be able to choose its own identifier max length.

      The following is the discussion copied from the dev mailing list:

      ===============================================================
      [Jinfeng]

      Calcite (formly Optiq) has set the maximum length for an SQL identifier to 128. This sounds quite reasonable for an ordinary sql identifier. However, for system like Drill which allows user to query a file directly, it's quite likely to run out of the allowed identifier length.

      Drill allows the use case of :

      select * from dfs.schema.`directory1/directory2/filename.json`;

      The directory plus the filename is parsed as a SQL identifer. Since many file system would allow file name up to 255 bytes [1], it's likely user could use an identifier with more than 128 bytes when query the file directly.

      I would like to ask the Calcite community whether there is any performance impact if we bump up the maximum length for a SQL identifier. If there is negligible performance impact, then, can we make the maximum length configurable in the SQL parser? We could make slightly modification to CombinedParser.jj template, to allow each system to set its own max length, with default still being 128.

      The approach to allow a configurable setting seems to be match what Postgres [2] is doing, which allows the length to be "raised by changing the NAMEDATALEN constant in src/include/pg_config_manual.h." [2]

      [Julian]

      Each system should be able to choose its own identifier max length.

      I’d prefer to make it configurable at runtime. Then we can test it. Add an extra parameter to SqlParser’s constructor, and have it call parser.setIdentifierMaxLength, similar to how it calls setQuotedCasing right now.

      Or if you want to go further, create

      interface SqlParser.Config {
      int identifierMaxLength();
      Casing quotedCasing();
      Casing unquotedCasing();
      }

      and replace all of the parameters of SqlParser’s constructor with one Config parameter.

        Activity

        Hide
        jni Jinfeng Ni added a comment -

        I' working on a patch to allow dynamically set the max length for identifier. Will submit for review once it's done.

        Show
        jni Jinfeng Ni added a comment - I' working on a patch to allow dynamically set the max length for identifier. Will submit for review once it's done.
        Hide
        julianhyde Julian Hyde added a comment -

        Thanks for the patch. Checked in (with a little fix-up) as http://git-wip-us.apache.org/repos/asf/incubator-calcite/commit/a5584ea7.

        Show
        julianhyde Julian Hyde added a comment - Thanks for the patch. Checked in (with a little fix-up) as http://git-wip-us.apache.org/repos/asf/incubator-calcite/commit/a5584ea7 .
        Hide
        julianhyde Julian Hyde added a comment -

        Closing now that 1.0.0-incubating has been released.

        Show
        julianhyde Julian Hyde added a comment - Closing now that 1.0.0-incubating has been released.

          People

          • Assignee:
            julianhyde Julian Hyde
            Reporter:
            jni Jinfeng Ni
          • Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development