Details
Type: Bug
Status: Resolved
Priority: Critical
Resolution: Cannot Reproduce
Description
Hive's casting behavior is inconsistent, and what happens when the value being cast is out of range for the target type is currently undocumented. Such casts can silently produce incorrect results.
Examples:
1. select cast('1000' as tinyint) from t1;
NULL
2. select 1000Y from t1;
FAILED: SemanticException [Error 10029]: Line 1:7 Invalid numerical constant '1000Y'
3. select cast(1000 as tinyint) from t1;
-24
4. select cast(1.1e3-1000/0 as tinyint) from t1;
0
5. select cast(10/0 as tinyint) from pw18;
-1
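The values returned in examples 3, 4 and 5 match what Java's narrowing primitive conversions produce for the same inputs. The sketch below reproduces them in plain Java. It assumes that Hive evaluates these casts with an ordinary Java narrowing cast and that division by zero yields a floating-point infinity; neither assumption is confirmed in this ticket, and the class and method names are illustrative only.

```java
// Illustrative only: plain Java, not Hive code.
class NarrowingDemo {
    // int -> byte keeps only the low 8 bits of the value.
    static byte intToByte(int v) {
        return (byte) v;
    }

    // double -> byte first converts to int (NaN becomes 0, +Infinity becomes
    // Integer.MAX_VALUE, -Infinity becomes Integer.MIN_VALUE), then keeps
    // only the low 8 bits.
    static byte doubleToByte(double v) {
        return (byte) v;
    }

    public static void main(String[] args) {
        System.out.println(intToByte(1000));                  // -24, as in example 3
        System.out.println(doubleToByte(1.1e3 - 1000 / 0.0)); // 0,   as in example 4
        System.out.println(doubleToByte(10 / 0.0));           // -1,  as in example 5
    }
}
```

If the assumption holds, the results are deterministic wrap-around values rather than random ones, but they are still meaningless to the user.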
A Hive user can accidentally cast an out-of-range value. In examples 4 and 5, for instance, the intermediate result is not a finite number (NaN or infinity), yet Hive casts it to an arbitrary-looking value. Either we should document that users must guard against overflow, underflow, division by zero, etc. themselves, or we should return NULL when the final result is out of range.
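The second option, returning NULL for out-of-range values, could look roughly like the following range-checked conversion. This is a minimal sketch in plain Java, not Hive's implementation; `toTinyintOrNull` is a hypothetical helper name, and rejecting every value outside [-128, 127] before truncation is a design choice of this sketch.

```java
// Illustrative only: a hypothetical helper, not part of Hive.
class CheckedCast {
    // Returns null (standing in for SQL NULL) when the value is NaN,
    // infinite, or outside the tinyint range; otherwise truncates toward zero.
    static Byte toTinyintOrNull(double v) {
        if (Double.isNaN(v) || v < Byte.MIN_VALUE || v > Byte.MAX_VALUE) {
            return null;
        }
        return (byte) v; // safe: the value is known to be within [-128, 127]
    }

    public static void main(String[] args) {
        System.out.println(toTinyintOrNull(100));                // 100
        System.out.println(toTinyintOrNull(1000));               // null
        System.out.println(toTinyintOrNull(1.1e3 - 1000 / 0.0)); // null
        System.out.println(toTinyintOrNull(10 / 0.0));           // null
    }
}
```

With a check of this kind, examples 3, 4 and 5 would all return NULL, which would be consistent with example 1.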