IMPALA / IMPALA-11587

Improve handling of special chars in column names

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Component/s: Catalog, Frontend

    Description

      Hive accepts several special characters in a column name if the name is quoted with backticks (` `):
      create table tspeccharcol (`@"!£"!$%^=&)(-` int);

      The table above can be used by Impala, but there are some caveats:

      • Impala returns an error when it is asked to create a similar column name itself: Invalid column/field name
      • SHOW CREATE TABLE in Impala does not quote the column name, so it returns a statement that cannot be executed in Hive (Hive's own SHOW CREATE TABLE quotes it correctly)
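
      To illustrate the second caveat, here is a minimal sketch of the quoting that SHOW CREATE TABLE output would need. It is not Impala's code: quote_identifier and column_ddl are hypothetical helpers, and the sketch assumes that only letters, digits and underscores are safe unquoted and that an embedded backtick is escaped by doubling it. Reserved words are not handled.

```python
import re

# Assumption: only [A-Za-z0-9_] names (not starting with a digit) are safe unquoted.
_PLAIN_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\Z")


def quote_identifier(name: str) -> str:
    """Return the name unchanged if it is plain, otherwise wrap it in backticks."""
    if _PLAIN_IDENT.match(name):
        return name
    # Assumption: a backtick inside a quoted identifier is escaped by doubling it.
    return "`" + name.replace("`", "``") + "`"


def column_ddl(name: str, col_type: str) -> str:
    """Render one column definition for a CREATE TABLE statement (hypothetical helper)."""
    return f"{quote_identifier(name)} {col_type}"


if __name__ == "__main__":
    print(column_ddl("id", "int"))
    print(column_ddl('@"!£"!$%^=&)(-', "int"))
```

      With quoting like this, the column from the example table would round-trip as a backtick-quoted name instead of a bare one.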

      I am not sure whether we should accept all of these special characters; the question that prompted this investigation was only about @.

      The error is returned due to a Hive function:
      https://github.com/apache/impala/blob/cfd79b40beab86f08ad72e0bea41eabf736d0a99/fe/src/main/java/org/apache/impala/catalog/Hive3MetastoreShimBase.java#L166
      The second parameter should be a HiveConf, which determines whether special characters are accepted:
      https://github.com/apache/hive/blob/293a448296933b7498a91e7eeb91edc88dfaa07e/standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java#L219
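
      The effect of the configuration argument can be modeled roughly as follows. This is a simplified sketch, not Hive's implementation: validate_name, the conf key "allow.special.chars" and the EXTRA_CHARS set are all hypothetical, chosen only to show how a missing configuration falls back to the strict rule.

```python
import re
from typing import Mapping, Optional

# Hypothetical set of extra characters; the real list lives in Hive and is not reproduced here.
EXTRA_CHARS = "@$%"


def validate_name(name: str, conf: Optional[Mapping[str, bool]] = None) -> bool:
    """Accept word characters only, unless the (hypothetical) conf flag widens the set."""
    allowed = r"\w"
    # With no conf at all, the strict default applies.
    if conf is not None and conf.get("allow.special.chars", False):
        allowed += re.escape(EXTRA_CHARS)
    return re.fullmatch(f"[{allowed}]+", name) is not None


if __name__ == "__main__":
    print(validate_name("col_1"))
    print(validate_name("a@b"))
    print(validate_name("a@b", {"allow.special.chars": True}))
```

      In this model, passing no configuration always yields the strict behavior, which matches the symptom described above.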

      Besides this function, Hive seems to apply other rules as well; for example, : is not accepted:
      create table if not exists tspeccharcol (`:` int);
      Error: Error while compiling statement: FAILED: ParseException line 1:48 Failed to recognize predicate ')'. Failed rule: '[., :] can not be used in column name in create table statement.' in column specification (state=42000,code=40000)

      I also noticed some odd behavior in Hive / beeline:
      While this is accepted:
      create table if not exists tspeccharcol (`""` int);
      these are not:
      create table if not exists tspeccharcol (`"` int);
      create table if not exists tspeccharcol (`"\"` int);

      Both fail with: Error: Error while compiling statement: FAILED: ParseException line 1:49 extraneous input ';' expecting EOF near '<EOF>' (state=42000,code=40000)

      Some part of the client/parser does not seem to notice the ` ` quoting and applies escaping / quoting rules to the text inside.
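
      One way this could happen is sketched below. It is a hypothetical model, not beeline's actual code: split_statements splits on ';' while tracking double quotes and backslash escapes, but knows nothing about backticks. Under that assumption, an unbalanced " inside backticks keeps the splitter "inside a string", so the trailing ';' is never treated as a terminator and is passed on with the statement.

```python
from typing import List


def split_statements(line: str) -> List[str]:
    """Split on ';' outside double quotes; backticks are deliberately NOT tracked."""
    statements: List[str] = []
    current: List[str] = []
    in_double = False
    i = 0
    while i < len(line):
        ch = line[i]
        # Inside a double-quoted string, a backslash escapes the next character.
        if in_double and ch == "\\" and i + 1 < len(line):
            current.append(line[i:i + 2])
            i += 2
            continue
        if ch == '"':
            in_double = not in_double
        if ch == ";" and not in_double:
            statements.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
        i += 1
    tail = "".join(current).strip()
    if tail:
        statements.append(tail)
    return statements


if __name__ == "__main__":
    for stmt in ('create table t (`""` int);',
                 'create table t (`"` int);',
                 'create table t (`"\\"` int);'):
        print(split_statements(stmt))
```

      In this model only the first statement has its ';' stripped; the other two keep it, which would be consistent with the "extraneous input ';'" error, though the real cause has not been confirmed.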

          People

            Assignee: Unassigned
            Reporter: Csaba Ringhofer (csringhofer)