Description
We could get the Partitions Statistics Info.But in some specail case, The Info like totalSize,rawDataSize,rowCount maybe empty. When we do some ddls like
desc formatted partition
,the NumberFormatException is showed as below:
spark-sql> desc formatted table1 partition(year='2019', month='10', day='17', hour='23'); 19/10/19 00:02:40 ERROR SparkSQLDriver: Failed in [desc formatted table1 partition(year='2019', month='10', day='17', hour='23')] java.lang.NumberFormatException: Zero length BigInteger at java.math.BigInteger.(BigInteger.java:411) at java.math.BigInteger.(BigInteger.java:597) at scala.math.BigInt$.apply(BigInt.scala:77) at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$31.apply(HiveClientImpl.scala:1056) at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$31.apply(HiveClientImpl.scala:1056) at scala.Option.map(Option.scala:146) at org.apache.spark.sql.hive.client.HiveClientImpl$.org$apache$spark$sql$hive$client$HiveClientImpl$$readHiveStats(HiveClientImpl.scala:1056) at org.apache.spark.sql.hive.client.HiveClientImpl$.fromHivePartition(HiveClientImpl.scala:1048) at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$getPartitionOption$1$$anonfun$apply$16.apply(HiveClientImpl.scala:659) at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$getPartitionOption$1$$anonfun$apply$16.apply(HiveClientImpl.scala:659) at scala.Option.map(Option.scala:146) at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$getPartitionOption$1.apply(HiveClientImpl.scala:659) at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$getPartitionOption$1.apply(HiveClientImpl.scala:656) at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:281) at org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:219) at org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:218) at org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:264) at org.apache.spark.sql.hive.client.HiveClientImpl.getPartitionOption(HiveClientImpl.scala:656) at org.apache.spark.sql.hive.client.HiveClient$class.getPartitionOption(HiveClient.scala:194) at org.apache.spark.sql.hive.client.HiveClientImpl.getPartitionOption(HiveClientImpl.scala:84) at org.apache.spark.sql.hive.client.HiveClient$class.getPartition(HiveClient.scala:174) at org.apache.spark.sql.hive.client.HiveClientImpl.getPartition(HiveClientImpl.scala:84) at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$getPartition$1.apply(HiveExternalCatalog.scala:1125) at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$getPartition$1.apply(HiveExternalCatalog.scala:1124)
Although we can use 'Analyze table partition ' to update the totalSize,rawDataSize or rowCount, it's unresonable for normal SQL to throw NumberFormatException for Empty totalSize.We should fix the empty case when readHiveStats.
Here is the empty case:
Attachments
Attachments
Issue Links
- links to