Details
- Type: Bug
- Status: Open
- Priority: Critical
- Resolution: Unresolved
Description
Inserting strings that contain "\0" characters into partition columns leads to errors in both Iceberg and Hive tables.
The issue is more severe for Iceberg tables, because after the insert the table can no longer be read in Impala or Hive:
create table iceberg_unicode (s string, p string) partitioned by spec (identity(p)) stored as iceberg;
insert into iceberg_unicode select "a", "a\0a";
ERROR: IcebergTableLoadingException: Error loading metadata for Iceberg table hdfs://localhost:20500/test-warehouse/iceberg_unicode
CAUSED BY: TableLoadingException: Refreshing file and block metadata for 1 paths for table default.iceberg_unicode: failed to load 1 paths. Check the catalog server log for more details.
The partition directory created by the insert above appears to be truncated at the "\0" character:
hdfs://localhost:20500/test-warehouse/iceberg_unicode/data/p=a
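The truncated directory name is what one would see if some layer treated the partition value as a NUL-terminated string. This is only one possible explanation and is not confirmed by this report. The sketch below is purely illustrative: the class and the helper `truncateAtNul` are hypothetical and are not part of Impala, Hive, or Iceberg.

```java
public class NulTruncationSketch {
    // Hypothetical helper: mimics C-string semantics by cutting the value
    // at the first NUL character. Not an Impala/Hive/Iceberg API.
    static String truncateAtNul(String value) {
        int nul = value.indexOf('\0');
        return nul < 0 ? value : value.substring(0, nul);
    }

    public static void main(String[] args) {
        String partitionValue = "a\0a"; // the value inserted in the report
        // Builds the directory name the way the truncated path suggests.
        System.out.println("p=" + truncateAtNul(partitionValue));
    }
}
```

If something like this happens on the write path, the metadata would reference a partition value that no longer matches the directory on disk, which would be consistent with the loading failure above.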
In partitioned Hive tables the insert also returns an error, but the new partition is not created and the table remains usable. The error is similar to the one in IMPALA-11499.
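Note that Java handles \0 characters in a special way in its modified UTF-8 encoding (the one used by JNI), where the null character is encoded with two bytes instead of one. This may be related: https://docs.oracle.com/javase/1.5.0/docs/guide/jni/spec/types.html#wp16542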
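To make the encoding difference concrete, here is a minimal sketch that compares standard UTF-8 with Java's modified UTF-8 for the value used in the report. It uses `DataOutputStream.writeUTF`, which writes modified UTF-8 preceded by a 2-byte length. The class name and helper methods are illustrative only, and nothing here shows that this code path is the one involved in the bug.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ModifiedUtf8Demo {
    // Standard UTF-8: U+0000 is encoded as the single byte 0x00.
    static byte[] standardUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Modified UTF-8 as written by DataOutputStream.writeUTF:
    // U+0000 is encoded as the two bytes 0xC0 0x80.
    static byte[] modifiedUtf8(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] all = buf.toByteArray();
        // Skip the 2-byte length prefix that writeUTF emits first.
        return Arrays.copyOfRange(all, 2, all.length);
    }

    // Formats bytes as space-separated lowercase hex.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x ", b));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        String value = "a\0a";
        System.out.println("standard UTF-8: " + hex(standardUtf8(value))); // 61 00 61
        System.out.println("modified UTF-8: " + hex(modifiedUtf8(value))); // 61 c0 80 61
    }
}
```

A component that expects one encoding but receives the other would see different bytes for the same partition value, so this is one place to look when comparing what the frontend and backend write.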
Issue Links
- relates to IMPALA-11499 Refactor UrlEncode function to handle special characters (Resolved)