[IMPALA-3777] SqlParser parsed error for unicode - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Minor
Resolution: Cannot Reproduce
Affects Version/s: Impala 2.2.4
Fix Version/s: None
Component/s: Frontend
Labels:
- correctness
- downgraded
Environment:
CentOS 6.7 64 bit. impalad version 2.7.0-cdh5-INTERNAL DEBUG

Target Version: Impala 2.2.4

Description

When I run the query create table unicode_parse_error(id int) row format delimited fields terminated by '\u0023##'; the stored field delimiter becomes '\u0017##' instead of '\u0023##'.

Logs:

[nobida147:21000] > create table unicode_parse_error(id int) row format delimited fields terminated by '\u0023##';
Query: create table unicode_parse_error(id int) row format delimited fields terminated by '\u0023##'

Fetched 0 row(s) in 242.44s
[nobida147:21000] > describe extended unicode_parse_error;
Query: describe extended unicode_parse_error
+------------------------------+------------------------------------------------------------------+----------------------+
| name                         | type                                                             | comment              |
+------------------------------+------------------------------------------------------------------+----------------------+
| # col_name                   | data_type                                                        | comment              |
|                              | NULL                                                             | NULL                 |
| id                           | int                                                              | NULL                 |
|                              | NULL                                                             | NULL                 |
| # Detailed Table Information | NULL                                                             | NULL                 |
| Database:                    | db1                                                              | NULL                 |
| Owner:                       | root                                                             | NULL                 |
| CreateTime:                  | Thu Jun 23 15:54:20 CST 2016                                     | NULL                 |
| LastAccessTime:              | UNKNOWN                                                          | NULL                 |
| Protect Mode:                | None                                                             | NULL                 |
| Retention:                   | 0                                                                | NULL                 |
| Location:                    | hdfs://localhost:20500/test-warehouse/db1.db/unicode_parse_error | NULL                 |
| Table Type:                  | MANAGED_TABLE                                                    | NULL                 |
| Table Parameters:            | NULL                                                             | NULL                 |
|                              | transient_lastDdlTime                                            | 1466668460           |
|                              | NULL                                                             | NULL                 |
| # Storage Information        | NULL                                                             | NULL                 |
| SerDe Library:               | org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe               | NULL                 |
| InputFormat:                 | org.apache.hadoop.mapred.TextInputFormat                         | NULL                 |
| OutputFormat:                | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat       | NULL                 |
| Compressed:                  | No                                                               | NULL                 |
| Num Buckets:                 | 0                                                                | NULL                 |
| Bucket Columns:              | []                                                               | NULL                 |
| Sort Columns:                | []                                                               | NULL                 |
| Storage Desc Params:         | NULL                                                             | NULL                 |
|                              | field.delim                                                      | \u0017##             |
|                              | serialization.format                                             | \u0017##             |
+------------------------------+------------------------------------------------------------------+----------------------+
Fetched 27 row(s) in 4.77s

After debugging, it appears that SqlParser.parse() is at fault. As the attachment shows, before SqlParser.parse() is called the statement contains fields terminated by '\u0023##', but after parsing the delimiter has become '\u0017##'.
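
A possible reading of the values reported above, offered only as a hypothesis that this issue does not confirm: decimal 23 equals hexadecimal 0x17, so the observed '\u0017' is what results if the four digits 0023 are read as a decimal number rather than as hexadecimal. The Python sketch below only reproduces that arithmetic. decode_unicode_escape is a hypothetical helper written for illustration; it is not part of Impala or SqlParser and says nothing about where in the code path the conversion happens.

```python
def decode_unicode_escape(digits: str, base: int) -> str:
    """Read the digits of a \\uXXXX escape in the given base and
    return the resulting code point in \\uXXXX (hex) notation."""
    code_point = int(digits, base)
    return "\\u{:04x}".format(code_point)


digits = "0023"

# Intended reading: the digits are hexadecimal, giving U+0023 ('#').
as_hex = decode_unicode_escape(digits, 16)

# Hypothesized misreading: the digits are treated as decimal 23, which is U+0017.
as_decimal = decode_unicode_escape(digits, 10)

print("hex reading:    ", as_hex)
print("decimal reading:", as_decimal)
```

If this reading is right, escapes whose digits are all below 10 and whose value differs between bases (such as \u0023) would be affected, while an escape such as \u0001 would look correct either way.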
