Details
- Bug
- Status: Resolved
- Blocker
- Resolution: Duplicate
- 1.6.0, 1.6.1
- None
- None
- rhel6 linux 2.6.32-279 (x86_64)
  java 1.7.0_67-b01
  hadoop CDH5.1.2, HA (2) federated (2) NN configuration
  large production cluster
Description
On large clusters we are seeing various forms of HDFS reads hanging:
- Queries that never return.
- Major compactions that hang.
Accumulo 1.6.1 incorporates detectors that report hanging major compactions and a monitor display that reports scans by age.
Stack traces show readers in sun.nio.ch.EPollArrayWrapper.epollWait and in org.apache.hadoop.ipc.Client.call(Client.java:1362).
Netstat output for the tablet server shows many connections with a single byte waiting in the process's Recv-Q and no bytes waiting in the Send-Q.
strace of the JVM shows the typical JVM thread noise (futex calls).
jstack shows many read requests to the NameNode (NN).
Long-running major compactions (MajCs) do complete, albeit slowly.
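The netstat observation above can be checked mechanically. The following is a minimal sketch, assuming output in the usual Linux `netstat -tn` column layout (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State). The helper names (`parse_netstat`, `suspicious`) and the sample addresses are invented for illustration and are not taken from the affected cluster.

```python
from typing import List, NamedTuple


class Conn(NamedTuple):
    recv_q: int
    send_q: int
    local: str
    remote: str
    state: str


def parse_netstat(text: str) -> List[Conn]:
    """Parse TCP rows from `netstat -tn`-style output (assumed layout)."""
    conns = []
    for line in text.splitlines():
        fields = line.split()
        # Skip banners, headers, and non-TCP rows.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        try:
            recv_q, send_q = int(fields[1]), int(fields[2])
        except ValueError:
            continue
        conns.append(Conn(recv_q, send_q, fields[3], fields[4], fields[5]))
    return conns


def suspicious(conns: List[Conn], recv_bytes: int = 1) -> List[Conn]:
    """Connections with exactly `recv_bytes` unread bytes and nothing queued to send."""
    return [c for c in conns if c.recv_q == recv_bytes and c.send_q == 0]


# Invented sample data, shaped like the pattern described in this report.
SAMPLE = """\
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        1      0 10.0.0.5:41234          10.0.0.9:50010          ESTABLISHED
tcp        0      0 10.0.0.5:41236          10.0.0.7:8020           ESTABLISHED
tcp        0     52 10.0.0.5:22             10.0.0.2:51515          ESTABLISHED
tcp        1      0 10.0.0.5:41240          10.0.0.8:50010          ESTABLISHED
"""

if __name__ == "__main__":
    for conn in suspicious(parse_netstat(SAMPLE)):
        print(conn.local, "->", conn.remote, conn.state)
```

Grouping the matches by remote address may help show whether the stuck connections cluster on particular hosts.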
Issue Links
- duplicates HDFS-7005 DFS input streams do not timeout (Closed)
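As a generic illustration of why a missing read timeout produces hangs like those described here, the sketch below connects to a local peer that accepts the connection and then stays silent. This is plain Python socket code, not HDFS or Accumulo code, and the helper names (`start_silent_server`, `read_with_timeout`) are hypothetical. With a timeout set, the read fails after a bounded wait; with `timeout_s=None` the same `recv` call would block until the peer sends data or closes.

```python
import socket
import threading


def start_silent_server():
    """Start a local TCP server that accepts one connection and never sends."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    srv.listen(1)
    stop = threading.Event()

    def serve():
        conn, _ = srv.accept()
        stop.wait()  # hold the connection open without writing anything
        conn.close()
        srv.close()

    threading.Thread(target=serve, daemon=True).start()
    return srv.getsockname()[1], stop


def read_with_timeout(host, port, timeout_s):
    """Try to read one byte; report whether the read timed out."""
    with socket.create_connection((host, port), timeout=timeout_s) as sock:
        try:
            data = sock.recv(1)
            return "data" if data else "eof"
        except socket.timeout:
            return "timed out"


if __name__ == "__main__":
    port, stop = start_silent_server()
    print(read_with_timeout("127.0.0.1", port, 0.2))
    stop.set()
```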