IMPALA-3777

SqlParser parsed error for unicode

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Cannot Reproduce
    • Affects Version/s: Impala 2.2.4
    • Fix Version/s: None
    • Component/s: Frontend
    • Environment: CentOS 6.7 64-bit, impalad version 2.7.0-cdh5-INTERNAL DEBUG

    Description

      When I run the query create table unicode_parse_error(id int) row format delimited fields terminated by '\u0023##'; the field delimiter is stored as '\u0017##' instead of '\u0023##'.

      Logs:

      [nobida147:21000] > create table unicode_parse_error(id int) row format delimited fields terminated by '\u0023##';
      Query: create table unicode_parse_error(id int) row format delimited fields terminated by '\u0023##'
      
      Fetched 0 row(s) in 242.44s
      [nobida147:21000] > describe extended unicode_parse_error;
      Query: describe extended unicode_parse_error
      +------------------------------+------------------------------------------------------------------+----------------------+
      | name                         | type                                                             | comment              |
      +------------------------------+------------------------------------------------------------------+----------------------+
      | # col_name                   | data_type                                                        | comment              |
      |                              | NULL                                                             | NULL                 |
      | id                           | int                                                              | NULL                 |
      |                              | NULL                                                             | NULL                 |
      | # Detailed Table Information | NULL                                                             | NULL                 |
      | Database:                    | db1                                                              | NULL                 |
      | Owner:                       | root                                                             | NULL                 |
      | CreateTime:                  | Thu Jun 23 15:54:20 CST 2016                                     | NULL                 |
      | LastAccessTime:              | UNKNOWN                                                          | NULL                 |
      | Protect Mode:                | None                                                             | NULL                 |
      | Retention:                   | 0                                                                | NULL                 |
      | Location:                    | hdfs://localhost:20500/test-warehouse/db1.db/unicode_parse_error | NULL                 |
      | Table Type:                  | MANAGED_TABLE                                                    | NULL                 |
      | Table Parameters:            | NULL                                                             | NULL                 |
      |                              | transient_lastDdlTime                                            | 1466668460           |
      |                              | NULL                                                             | NULL                 |
      | # Storage Information        | NULL                                                             | NULL                 |
      | SerDe Library:               | org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe               | NULL                 |
      | InputFormat:                 | org.apache.hadoop.mapred.TextInputFormat                         | NULL                 |
      | OutputFormat:                | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat       | NULL                 |
      | Compressed:                  | No                                                               | NULL                 |
      | Num Buckets:                 | 0                                                                | NULL                 |
      | Bucket Columns:              | []                                                               | NULL                 |
      | Sort Columns:                | []                                                               | NULL                 |
      | Storage Desc Params:         | NULL                                                             | NULL                 |
      |                              | field.delim                                                      | \u0017##             |
      |                              | serialization.format                                             | \u0017##             |
      +------------------------------+------------------------------------------------------------------+----------------------+
      Fetched 27 row(s) in 4.77s
      
      

      After debugging, it seems that SqlParser.parse() is where this goes wrong. As the attachments show, before SqlParser.parse() is called the statement contains fields terminated by '\u0023##', but after parsing the delimiter has become '\u0017##'.
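
A possible explanation, not confirmed against the parser source: 0x17 is decimal 23, which is the value the digits 0023 produce when they are read in base 10 rather than base 16. The short Python sketch below only checks that arithmetic. It is not Impala's parser code, and the helper name decode_escape is hypothetical.

```python
def decode_escape(digits: str, base: int) -> str:
    # digits: the four characters that follow the backslash-u in the escape.
    # base: the radix used to interpret them (16 is the intended one).
    return chr(int(digits, base))


digits = "0023"  # taken from the delimiter in the CREATE TABLE statement

as_hex = decode_escape(digits, 16)      # intended reading: U+0023, i.e. '#'
as_decimal = decode_escape(digits, 10)  # base-10 reading: code point 23, i.e. U+0017

print(f"base 16: U+{ord(as_hex):04X} {as_hex!r}")          # base 16: U+0023 '#'
print(f"base 10: U+{ord(as_decimal):04X} {as_decimal!r}")  # base 10: U+0017 '\x17'
```

The base-10 result matches the \u0017 reported by describe extended, so a decimal conversion of the escape digits somewhere between parsing and the metastore write would be one place to look.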

      Attachments

        1. After calling SqlParser.parse.JPG
          88 kB
          Yuanhao Luo
        2. Before calling SqlParser.parse.JPG
          69 kB
          Yuanhao Luo

          People

            Assignee: Unassigned
            Reporter: Yuanhao Luo (yhluo_impala_39a4)
            Votes: 0
            Watchers: 3