Details
- Bug
- Status: Resolved
- Critical
- Resolution: Fixed
- 3.2.1
- Reviewed
Description
Hive on Tez applications fail occasionally after the observer is enabled. The log shows the following:
2022-08-18 15:22:06,914 [ERROR] [Dispatcher thread {Central}] |impl.VertexImpl|: Vertex Input: namenodeinfo_stg initializer failed, vertex=vertex_1660618571916_4839_1_00 [Map 1]
org.apache.tez.dag.app.dag.impl.AMUserCodeException: java.lang.ArrayIndexOutOfBoundsException: 0
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallback.onFailure(RootInputInitializerManager.java:329)
    at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1056)
    at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30)
    at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1138)
    at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:958)
    at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:748)
    at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.afterRanInterruptibly(TrustedListenableFutureTask.java:133)
    at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:80)
    at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 0
    at org.apache.hadoop.mapred.FileInputFormat.identifyHosts(FileInputFormat.java:748)
    at org.apache.hadoop.mapred.FileInputFormat.getSplitHostsAndCachedHosts(FileInputFormat.java:714)
    at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:378)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:306)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:408)
    at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:159)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:279)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:270)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:270)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:254)
    at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
    at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:57)
    ... 4 more
As described in MAPREDUCE-7082, this exception is thrown when a block is missing, but my cluster had no missing blocks.
In this case, I found that getListing returns location information. When the observer's block report is delayed, getListing returns the block without any location.
HDFS-13924 was introduced to solve this problem, but it only covers getBlockLocations.
On the observer node, every method that may return block locations should check whether the locations are empty.
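The check proposed above can be sketched roughly as follows. This is only an illustration and not the actual NameNode code: the class ObserverLocationCheck, the Block type, and RetryOnActiveException are hypothetical stand-ins invented for this example. It also assumes that a null block list (for instance, a directory or a path that does not exist) should be skipped rather than dereferenced.

```java
import java.util.Arrays;
import java.util.List;

public class ObserverLocationCheck {

    /** Hypothetical stand-in for a located block: an id plus the hosts known to hold it. */
    static final class Block {
        final long id;
        final String[] hosts;

        Block(long id, String... hosts) {
            this.id = id;
            this.hosts = hosts;
        }
    }

    /** Hypothetical signal telling the client to retry the call on the active NameNode. */
    static final class RetryOnActiveException extends RuntimeException {
        RetryOnActiveException(String message) {
            super(message);
        }
    }

    /**
     * Rejects a response on the observer if any block in it has no location,
     * which is what a delayed block report would produce.
     */
    static void checkLocations(boolean isObserver, List<Block> blocks) {
        // Nothing to verify on a non-observer node, or when there are no blocks at all.
        if (!isObserver || blocks == null) {
            return;
        }
        for (Block block : blocks) {
            if (block.hosts == null || block.hosts.length == 0) {
                throw new RetryOnActiveException(
                    "Block " + block.id + " has no locations on the observer yet");
            }
        }
    }

    public static void main(String[] args) {
        List<Block> reported = Arrays.asList(new Block(1L, "dn1", "dn2"));
        List<Block> delayed = Arrays.asList(new Block(2L)); // no hosts reported yet

        checkLocations(true, reported); // passes: every block has a location
        try {
            checkLocations(true, delayed);
        } catch (RetryOnActiveException e) {
            System.out.println("Retry on active: " + e.getMessage());
        }
    }
}
```

Under these assumptions, the same check would be called from every RPC path that can return locations (getListing as well as getBlockLocations), so a client never receives a block with an empty host array.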
Issue Links
- causes
  - HDFS-16923 The getListing RPC will throw NPE if the path does not exist (Resolved)
  - HDFS-16832 [SBN READ] Fix NPE when check the block location of empty directory (Resolved)
- links to