Details
- Type: Improvement
- Status: Open
- Priority: Major
- Resolution: Unresolved
- Affects Version/s: 3.1.1
- Fix Version/s: None
- Component/s: None
Description
When using the Hive JDBC handler to query an external MySQL data source with a column of type decimal, it throws a ClassCastException:
2019-07-08T11:11:50,424 ERROR [7787918f-3111-4706-a3b3-0097fa1bc117 main] CliDriver: Failed with exception java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassCastException: java.math.BigDecimal cannot be cast to org.apache.hadoop.hive.common.type.HiveDecimal
java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassCastException: java.math.BigDecimal cannot be cast to org.apache.hadoop.hive.common.type.HiveDecimal
	at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:162)
	at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2691)
	at org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:229)
	at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402)
	at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.util.RunJar.run(RunJar.java:226)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:141)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassCastException: java.math.BigDecimal cannot be cast to org.apache.hadoop.hive.common.type.HiveDecimal
	at org.apache.hadoop.hive.ql.exec.ListSinkOperator.process(ListSinkOperator.java:98)
	at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:995)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:941)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:928)
	at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
	at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:995)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:941)
	at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125)
	at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:519)
	at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:511)
	at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146)
	... 14 more
Caused by: java.lang.ClassCastException: java.math.BigDecimal cannot be cast to org.apache.hadoop.hive.common.type.HiveDecimal
	at org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaHiveDecimalObjectInspector.getPrimitiveJavaObject(JavaHiveDecimalObjectInspector.java:55)
	at org.apache.hadoop.hive.serde2.lazy.LazyUtils.writePrimitiveUTF8(LazyUtils.java:329)
	at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:292)
	at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serializeField(LazySimpleSerDe.java:247)
	at org.apache.hadoop.hive.serde2.DelimitedJSONSerDe.serializeField(DelimitedJSONSerDe.java:72)
	at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.doSerialize(LazySimpleSerDe.java:231)
	at org.apache.hadoop.hive.serde2.AbstractEncodingAwareSerDe.serialize(AbstractEncodingAwareSerDe.java:55)
	at org.apache.hadoop.hive.serde2.DefaultFetchFormatter.convert(DefaultFetchFormatter.java:67)
	at org.apache.hadoop.hive.serde2.DefaultFetchFormatter.convert(DefaultFetchFormatter.java:36)
	at org.apache.hadoop.hive.ql.exec.ListSinkOperator.process(ListSinkOperator.java:94)
	... 24 more
The same problem occurs with the date and timestamp types.
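The date/timestamp variants fail for the same structural reason: the JDBC driver hands back java.sql.Date and java.sql.Timestamp, while Hive's object inspectors expect Hive's own org.apache.hadoop.hive.common.type classes. A minimal sketch of the stdlib side of that conversion (the Hive-side wrapping is omitted here since it needs Hive on the classpath; class and method names are illustrative):

```java
import java.sql.Timestamp;
import java.time.LocalDate;
import java.time.LocalDateTime;

public class JdbcTemporalDemo {
    // Convert the JDBC temporal types to java.time values; in the jdbc-handler
    // these intermediate values would then be wrapped into Hive's own
    // Date/Timestamp types before serialization.
    static LocalDate toLocalDate(java.sql.Date d) {
        return d.toLocalDate();
    }

    static LocalDateTime toLocalDateTime(Timestamp ts) {
        return ts.toLocalDateTime();
    }

    public static void main(String[] args) {
        java.sql.Date d = java.sql.Date.valueOf("2019-07-08");
        Timestamp ts = Timestamp.valueOf("2019-07-08 11:11:50.424");
        System.out.println(toLocalDate(d));       // 2019-07-08
        System.out.println(toLocalDateTime(ts));
    }
}
```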
I debugged the code and found that this is caused by the hive-jdbc-handler returning results with type java.math.BigDecimal; when the result row is serialized by calling JavaHiveDecimalObjectInspector#getPrimitiveJavaObject, java.math.BigDecimal cannot be cast to org.apache.hadoop.hive.common.type.HiveDecimal.
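The root cause can be reproduced in isolation: a runtime cast between two unrelated classes fails even when both represent decimal values. A self-contained sketch, where FakeHiveDecimal is a hypothetical stand-in for org.apache.hadoop.hive.common.type.HiveDecimal (the real class needs Hive on the classpath):

```java
import java.math.BigDecimal;

public class CastDemo {
    // Hypothetical stand-in for org.apache.hadoop.hive.common.type.HiveDecimal:
    // a wrapper class unrelated to java.math.BigDecimal in the type hierarchy.
    static class FakeHiveDecimal {
        final BigDecimal v;
        FakeHiveDecimal(BigDecimal v) { this.v = v; }
    }

    public static void main(String[] args) {
        // What the JDBC driver actually returns for a decimal column.
        Object fromJdbc = new BigDecimal("12.34");
        try {
            // Mirrors the failing cast inside getPrimitiveJavaObject:
            // BigDecimal is not a FakeHiveDecimal, so this throws.
            FakeHiveDecimal d = (FakeHiveDecimal) fromJdbc;
            System.out.println(d);
        } catch (ClassCastException e) {
            System.out.println("ClassCastException: " + e.getMessage());
        }
    }
}
```

The fix is therefore to construct the Hive wrapper explicitly (e.g. HiveDecimal.create(bigDecimal)) instead of relying on a cast.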
So, IMO, this is a bug in the jdbc-handler code when processing the decimal, date and timestamp types. I think the jdbc-handler should convert the results to the expected Hive types, like this:
@Override
public Map<String, Object> next() {
  try {
    ResultSetMetaData metadata = rs.getMetaData();
    int numColumns = metadata.getColumnCount();
    Map<String, Object> record = new HashMap<String, Object>(numColumns);
    for (int i = 0; i < numColumns; i++) {
      String key = metadata.getColumnName(i + 1);
      Object value;
      if (columnTypes != null && columnTypes.get(i) instanceof PrimitiveTypeInfo) {
        // This is not a complete list, barely make information schema work
        switch (((PrimitiveTypeInfo) columnTypes.get(i)).getTypeName()) {
        case "int":
        case "smallint":
        case "tinyint":
          value = rs.getInt(i + 1);
          break;
        case "bigint":
          value = rs.getLong(i + 1);
          break;
        case "float":
          value = rs.getFloat(i + 1);
          break;
        case "double":
          value = rs.getDouble(i + 1);
          break;
        case "decimal":
        case "bigdecimal":
          value = HiveDecimal.create(rs.getBigDecimal(i + 1));
          break;
        case "boolean":
          value = rs.getBoolean(i + 1);
          break;
        case "string":
        case "char":
        case "varchar":
          value = rs.getString(i + 1);
          break;
        case "date":
        case "datetime":
          value = new Date(rs.getDate(i + 1).toLocalDate());
          break;
        case "timestamp":
          value = new Timestamp(rs.getTimestamp(i + 1).toLocalDateTime());
          break;
        default:
          value = rs.getObject(i + 1);
          break;
        }
      } else {
        value = rs.getObject(i + 1);
      }
      record.put(key, value);
    }
    return record;
  } catch (Exception e) {
    LOGGER.warn("next() threw exception", e);
    return null;
  }
}
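The proposed change boils down to a per-type conversion table keyed on the Hive type name. A runnable, self-contained sketch of that dispatch pattern using plain JDK types only (in the real patch the decimal branch would wrap the value with HiveDecimal.create, and all names here are illustrative, not Hive APIs):

```java
import java.math.BigDecimal;
import java.util.Map;
import java.util.function.Function;

public class TypeDispatchSketch {
    // Hypothetical conversion table: maps a Hive type name to a function that
    // converts the raw JDBC value into the Java type the serializer expects.
    static final Map<String, Function<Object, Object>> CONVERTERS = Map.of(
        // Real code would wrap this in HiveDecimal.create(...).
        "decimal", raw -> new BigDecimal(raw.toString()),
        "int",     raw -> Integer.valueOf(raw.toString()),
        "string",  raw -> raw.toString()
    );

    public static void main(String[] args) {
        // Pretend this came from ResultSet.getObject on a decimal column.
        Object raw = "12.34";
        Object converted = CONVERTERS.get("decimal").apply(raw);
        System.out.println(converted.getClass().getName()); // java.math.BigDecimal
    }
}
```

Keeping the conversions in one place like this makes it easier to audit which JDBC-to-Hive type mappings are covered and which still fall through to getObject.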
I would like to know whether this change has any other side effects.
Any suggestions are appreciated.