Details
-
Bug
-
Status: Open
-
Major
-
Resolution: Unresolved
-
1.0.0
-
None
-
None
Description
The below use-case no longer works (tested on a PostgresQL backed HMS using JDO as well as on a MySQL backed HMS with DirectSQL):
CREATE TABLE foo (a STRING) PARTITIONED BY (b INT, c INT); ALTER TABLE foo ADD PARTITION (b='07', c='08'); LOAD DATA LOCAL INPATH '/etc/hostname' INTO TABLE foo PARTITION(b='07', c='08'); -- Does not work if you provide a string IN variable: SELECT a, c FROM foo WHERE b IN ('07'); (No rows selected) -- Works if you provide it in integer forms or canonical integer strings: SELECT a, c FROM foo WHERE b IN (07); (1 row(s) selected) SELECT a, c FROM foo WHERE b IN (7); (1 row(s) selected) SELECT a, c FROM foo WHERE b IN ('7'); (1 row(s) selected)
This worked fine prior to HIVE-8099. The change of HIVE-8099 is inducing a double conversion on the partition column input, such that the IN GenericUDFIn now receives b's value as a column type converted canonical integer 7, as opposed to an as-is DB stored non-canonical value 07. Subsequently the GenericUDFIn again up-converts the b's value to match its argument's value types instead, making 7 (int) into a string "7". Then, "7" is compared against "07" which naturally never matches.
As a regression, this breaks anyone upgrading pre-1.0 to 1.0 or higher.
Attachments
Issue Links
- is broken by
-
HIVE-8099 IN operator for partition column fails when the partition column type is DATE
- Closed