Details
- Type: Bug
- Status: Open
- Priority: Major
- Resolution: Unresolved
- Affects Version/s: 1.4.0
- Fix Version/s: None
- Component/s: None
- Environment: CDH 4.5 1.4.0+23
Description
This looks related to FLUME-2228 (https://issues.apache.org/jira/browse/FLUME-2228).
We have a 3-node cluster with NameNode HA. The active NameNode occasionally fails over, and the Flume HDFS sink cannot handle this situation correctly.
Here is the log:
09 Dec 2013 22:14:49,175 INFO [pool-6-thread-1] (org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.handleUpstream:171) - [id: 0x1b2ae35d, /176.9.1.174:37697 :> /88.198.23.238:60011] DISCONNECTED
09 Dec 2013 22:14:49,175 INFO [pool-6-thread-1] (org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.handleUpstream:171) - [id: 0x1b2ae35d, /176.9.1.174:37697 :> /88.198.23.238:60011] UNBOUND
09 Dec 2013 22:14:49,175 INFO [pool-6-thread-1] (org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.handleUpstream:171) - [id: 0x1b2ae35d, /176.9.1.174:37697 :> /88.198.23.238:60011] CLOSED
09 Dec 2013 22:14:49,175 INFO [pool-6-thread-1] (org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.channelClosed:209) - Connection to /176.9.1.174:37697 disconnected.
09 Dec 2013 22:14:49,956 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.AbstractHDFSWriter.isUnderReplicated:96) - Unexpected error while checking replication factor
java.lang.reflect.InvocationTargetException
    at sun.reflect.GeneratedMethodAccessor21.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.flume.sink.hdfs.AbstractHDFSWriter.getNumCurrentReplicas(AbstractHDFSWriter.java:162)
    at org.apache.flume.sink.hdfs.AbstractHDFSWriter.isUnderReplicated(AbstractHDFSWriter.java:82)
    at org.apache.flume.sink.hdfs.BucketWriter.shouldRotate(BucketWriter.java:452)
    at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:387)
    at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:392)
    at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
    at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: Failed to add a datanode. User may turn off this feature by setting dfs.client.block.write.replace-datanode-on-failure.policy in configuration, where the current policy is DEFAULT. (Nodes: current=[88.198.23.238:50010, 176.9.1.174:50010], original=[88.198.23.238:50010, 176.9.1.174:50010])
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:817)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:877)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:983)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:780)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
09 Dec 2013 22:14:49,956 WARN [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.BucketWriter.append:424) - Caught IOException writing to HDFSWriter (Failed to add a datanode. User may turn off this feature by setting dfs.client.block.write.replace-datanode-on-failure.policy in configuration, where the current policy is DEFAULT. (Nodes: current=[88.198.23.238:50010, 176.9.1.174:50010], original=[88.198.23.238:50010, 176.9.1.174:50010])). Closing file (/staging/landing/stream/js_tracker/visit/2013/12/09/07/visit.1386560349521.bz2.tmp) and rethrowing exception.
09 Dec 2013 22:14:49,957 WARN [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.BucketWriter.append:430) - Caught IOException while closing file (/staging/landing/stream/js_tracker/visit/2013/12/09/07/visit.1386560349521.bz2.tmp). Exception follows.
java.io.IOException: Failed to add a datanode. User may turn off this feature by setting dfs.client.block.write.replace-datanode-on-failure.policy in configuration, where the current policy is DEFAULT. (Nodes: current=[88.198.23.238:50010, 176.9.1.174:50010], original=[88.198.23.238:50010, 176.9.1.174:50010])
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:817)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:877)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:983)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:780)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
09 Dec 2013 22:14:49,957 WARN [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.HDFSEventSink.process:418) - HDFS IO error
java.io.IOException: Failed to add a datanode. User may turn off this feature by setting dfs.client.block.write.replace-datanode-on-failure.policy in configuration, where the current policy is DEFAULT. (Nodes: current=[88.198.23.238:50010, 176.9.1.174:50010], original=[88.198.23.238:50010, 176.9.1.174:50010])
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:817)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:877)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:983)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:780)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
09 Dec 2013 22:14:54,957 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.AbstractHDFSWriter.isUnderReplicated:96) - Unexpected error while checking replication factor
java.lang.reflect.InvocationTargetException
    at sun.reflect.GeneratedMethodAccessor21.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.flume.sink.hdfs.AbstractHDFSWriter.getNumCurrentReplicas(AbstractHDFSWriter.java:162)
    at org.apache.flume.sink.hdfs.AbstractHDFSWriter.isUnderReplicated(AbstractHDFSWriter.java:82)
    at org.apache.flume.sink.hdfs.BucketWriter.shouldRotate(BucketWriter.java:452)
    at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:387)
    at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:392)
    at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
    at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: Failed to add a datanode. User may turn off this feature by setting dfs.client.block.write.replace-datanode-on-failure.policy in configuration, where the current policy is DEFAULT. (Nodes: current=[88.198.23.238:50010, 176.9.1.174:50010], original=[88.198.23.238:50010, 176.9.1.174:50010])
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:817)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:877)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:983)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:780)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
09 Dec 2013 22:14:54,958 WARN [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.BucketWriter.append:424) - Caught IOException writing to HDFSWriter (Failed to add a datanode. User may turn off this feature by setting dfs.client.block.write.replace-datanode-on-failure.policy in configuration, where the current policy is DEFAULT. (Nodes: current=[88.198.23.238:50010, 176.9.1.174:50010], original=[88.198.23.238:50010, 176.9.1.174:50010])). Closing file (/staging/landing/stream/js_tracker/visit/2013/12/09/07/visit.1386560349521.bz2.tmp) and rethrowing exception.
09 Dec 2013 22:14:54,958 WARN [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.BucketWriter.append:430) - Caught IOException while closing file (/staging/landing/strea
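The IOException in the log above points at the HDFS client's replace-datanode-on-failure feature: on a 3-node cluster with replication factor 3, every live datanode is already in the write pipeline, so the DEFAULT policy cannot find a replacement and fails the write. A commonly suggested client-side workaround is to relax this policy in hdfs-site.xml on the Flume host. The property names come from the Hadoop documentation; the values below are only an illustration, not a recommendation for every cluster:

```xml
<!-- hdfs-site.xml on the Flume agent host (illustrative values) -->
<property>
  <name>dfs.client.block.write.replace-datanode-on-failure.enable</name>
  <value>true</value>
</property>
<property>
  <!-- NEVER: keep writing to the remaining datanodes instead of failing
       when no replacement can be found; plausible on a small cluster
       where every node is already in the pipeline -->
  <name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
  <value>NEVER</value>
</property>
```

Note this only silences the "Failed to add a datanode" error; it does not address the sink failing to recover afterwards.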
Flume stops writing data to HDFS and cannot close the open file. This is a disaster for us because we get no notification of the failure (is there any way to get one?).
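On the notification question: Flume can expose sink counters as JSON when the agent is started with `-Dflume.monitoring.type=http -Dflume.monitoring.port=<port>`. An external watchdog could poll that endpoint and alert when the sink's drain counter stops moving. This is only a sketch; the port, the sink name `hdfs_visit_sink`, and the polling interval are assumptions for illustration:

```python
import json
import time
import urllib.request

METRICS_URL = "http://localhost:34545/metrics"  # flume.monitoring.port (assumed)
SINK = "SINK.hdfs_visit_sink"                   # sink component name (assumed)

def drain_count(metrics: dict) -> int:
    """Extract EventDrainSuccessCount for the watched sink (0 if absent)."""
    return int(metrics.get(SINK, {}).get("EventDrainSuccessCount", 0))

def is_stalled(previous: dict, current: dict) -> bool:
    """Consider the sink stalled when the drain counter has not advanced."""
    return drain_count(current) <= drain_count(previous)

def watch(interval_s: int = 60) -> None:
    """Poll the Flume JSON metrics endpoint and print an alert on stall."""
    previous: dict = {}
    while True:
        with urllib.request.urlopen(METRICS_URL) as resp:
            current = json.load(resp)
        if previous and is_stalled(previous, current):
            print(f"ALERT: HDFS sink drained no events in the last {interval_s}s")
        previous = current
        time.sleep(interval_s)
```

In a real deployment the `print` would be replaced by whatever alerting channel is available (Nagios, email, etc.).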
UPD
Here is the exact place of the failure. Flume did not recover after this:
09 Dec 2013 07:33:13,617 INFO [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.HDFSSequenceFile.configure:63) - writeFormat = Text, UseRawLocalFileSystem = false
09 Dec 2013 07:33:13,630 INFO [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.BucketWriter.open:219) - Creating /staging/landing/stream/js_tracker/visit/2013/12/09/07/visit.1386559993618.bz2.tmp
09 Dec 2013 07:38:04,167 INFO [hdfs-hdfs_visit_sink-roll-timer-0] (org.apache.flume.sink.hdfs.BucketWriter$4.call:329) - Closing idle bucketWriter /staging/landing/stream/js_tracker/visit/2013/12/09/07/visit.1386559993618.bz2.tmp
09 Dec 2013 07:38:10,071 INFO [hdfs-hdfs_visit_sink-call-runner-6] (org.apache.flume.sink.hdfs.BucketWriter$7.call:487) - Renaming /staging/landing/stream/js_tracker/visit/2013/12/09/07/visit.1386559993618.bz2.tmp to /staging/landing/stream/js_tracker/visit/2013/12/09/07/visit.1386559993618.bz2
09 Dec 2013 07:38:10,103 WARN [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.HDFSEventSink.process:418) - HDFS IO error
java.io.IOException: This bucket writer was closed due to idling and this handle is thus no longer valid
    at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:380)
    at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:392)
    at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
    at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
    at java.lang.Thread.run(Thread.java:662)
09 Dec 2013 07:38:15,103 INFO [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.HDFSSequenceFile.configure:63) - writeFormat = Text, UseRawLocalFileSystem = false
09 Dec 2013 07:38:15,116 INFO [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.BucketWriter.open:219) - Creating /staging/landing/stream/js_tracker/visit/2013/12/09/07/visit.1386560295104.bz2.tmp
09 Dec 2013 07:40:12,630 INFO [hdfs-hdfs_visit_sink-roll-timer-0] (org.apache.flume.sink.hdfs.BucketWriter$4.call:329) - Closing idle bucketWriter /staging/landing/stream/js_tracker/visit/2013/12/09/07/visit.1386560295104.bz2.tmp
09 Dec 2013 07:40:12,939 INFO [hdfs-hdfs_visit_sink-call-runner-0] (org.apache.flume.sink.hdfs.BucketWriter$7.call:487) - Renaming /staging/landing/stream/js_tracker/visit/2013/12/09/07/visit.1386560295104.bz2.tmp to /staging/landing/stream/js_tracker/visit/2013/12/09/07/visit.1386560295104.bz2
09 Dec 2013 07:40:14,634 INFO [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.HDFSSequenceFile.configure:63) - writeFormat = Text, UseRawLocalFileSystem = false
09 Dec 2013 07:40:14,647 INFO [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.BucketWriter.open:219) - Creating /staging/landing/stream/js_tracker/visit/2013/12/09/07/visit.1386560414635.bz2.tmp
09 Dec 2013 07:41:16,409 WARN [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.HDFSEventSink.process:418) - HDFS IO error
java.io.IOException: Callable timed out after 10000 ms on file: /staging/landing/stream/js_tracker/visit/2013/12/09/07/visit.1386560414635.bz2.tmp
    at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:550)
    at org.apache.flume.sink.hdfs.BucketWriter.doFlush(BucketWriter.java:353)
    at org.apache.flume.sink.hdfs.BucketWriter.flush(BucketWriter.java:319)
    at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:442)
    at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:392)
    at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
    at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.util.concurrent.TimeoutException
    at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:228)
    at java.util.concurrent.FutureTask.get(FutureTask.java:91)
    at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:543)
    ... 7 more
09 Dec 2013 07:41:20,122 INFO [hdfs-hdfs_visit_sink-roll-timer-0] (org.apache.flume.sink.hdfs.BucketWriter$4.call:329) - Closing idle bucketWriter /staging/landing/stream/js_tracker/visit/2013/12/09/07/visit.1386560414635.bz2.tmp
09 Dec 2013 07:41:30,123 ERROR [hdfs-hdfs_visit_sink-roll-timer-0] (org.apache.flume.sink.hdfs.BucketWriter$4.call:336) - Unexpected error
java.io.IOException: Callable timed out after 10000 ms on file: /staging/landing/stream/js_tracker/visit/2013/12/09/07/visit.1386560414635.bz2.tmp
    at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:550)
    at org.apache.flume.sink.hdfs.BucketWriter.doFlush(BucketWriter.java:353)
    at org.apache.flume.sink.hdfs.BucketWriter.flush(BucketWriter.java:319)
    at org.apache.flume.sink.hdfs.BucketWriter.close(BucketWriter.java:277)
    at org.apache.flume.sink.hdfs.BucketWriter$4.call(BucketWriter.java:331)
    at org.apache.flume.sink.hdfs.BucketWriter$4.call(BucketWriter.java:325)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:206)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.util.concurrent.TimeoutException
    at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:228)
    at java.util.concurrent.FutureTask.get(FutureTask.java:91)
    at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:543)
    ... 12 more
09 Dec 2013 07:41:40,132 WARN [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.HDFSEventSink.process:418) - HDFS IO error
java.io.IOException: Callable timed out after 10000 ms on file: /staging/landing/stream/js_tracker/visit/2013/12/09/07/visit.1386560414635.bz2.tmp
    at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:550)
    at org.apache.flume.sink.hdfs.BucketWriter.doFlush(BucketWriter.java:353)
    at org.apache.flume.sink.hdfs.BucketWriter.flush(BucketWriter.java:319)
    at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:405)
    at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
    at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.util.concurrent.TimeoutException
    at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:228)
    at java.util.concurrent.FutureTask.get(FutureTask.java:91)
    at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:543)
    ... 6 more
09 Dec 2013 07:41:55,140 WARN [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.HDFSEventSink.process:418) - HDFS IO error
java.io.IOException: Callable timed out after 10000 ms on file: /staging/landing/stream/js_tracker/visit/2013/12/09/07/visit.1386560414635.bz2.tmp
    at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:550)
    at org.apache.flume.sink.hdfs.BucketWriter.doFlush(BucketWriter.java:353)
    at org.apache.flume.sink.hdfs.BucketWriter.flush(BucketWriter.java:319)
    at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:405)
    at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
    at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.util.concurrent.TimeoutException
    at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:228)
    at java.util.concurrent.FutureTask.get(FutureTask.java:91)
    at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:543)
    ... 6 more
09 Dec 2013 07:42:01,709 WARN [ResponseProcessor for block BP-628993041-176.9.1.174-1384195296058:blk_-8982661908544496338_775853] (org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run:748) - DFSOutputStream ResponseProcessor exception for block BP-628993041-176.9.1.174-1384195296058:blk_-8982661908544496338_775853
java.io.IOException: Bad response ERROR for block BP-628993041-176.9.1.174-1384195296058:blk_-8982661908544496338_775853 from datanode 176.9.1.174:50010
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:706)
09 Dec 2013 07:42:01,710 WARN [DataStreamer for file /staging/landing/stream/js_tracker/visit/2013/12/09/07/visit.1386560414635.bz2.tmp block BP-628993041-176.9.1.174-1384195296058:blk_-8982661908544496338_775853] (org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery:965) - Error Recovery for block BP-628993041-176.9.1.174-1384195296058:blk_-8982661908544496338_775853 in pipeline 178.63.23.149:50010, 88.198.23.238:50010, 176.9.1.174:50010: bad datanode 176.9.1.174:50010
09 Dec 2013 07:42:01,741 WARN [DataStreamer for file /staging/landing/stream/js_tracker/visit/2013/12/09/07/visit.1386560414635.bz2.tmp block BP-628993041-176.9.1.174-1384195296058:blk_-8982661908544496338_775853] (org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run:587) - DataStreamer Exception
java.io.IOException: Failed to add a datanode. User may turn off this feature by setting dfs.client.block.write.replace-datanode-on-failure.policy in configuration, where the current policy is DEFAULT. (Nodes: current=[178.63.23.149:50010, 88.198.23.238:50010], original=[178.63.23.149:50010, 88.198.23.238:50010])
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:817)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:877)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:983)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:780)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
09 Dec 2013 07:42:01,742 WARN [hdfs-hdfs_visit_sink-call-runner-4] (org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync:1641) - Error while syncing
java.io.IOException: Failed to add a datanode. User may turn off this feature by setting dfs.client.block.write.replace-datanode-on-failure.policy in configuration, where the current policy is DEFAULT. (Nodes: current=[178.63.23.149:50010, 88.198.23.238:50010], original=[178.63.23.149:50010, 88.198.23.238:50010])
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:817)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:877)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:983)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:780)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
09 Dec 2013 07:42:01,743 WARN [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.HDFSEventSink.process:418) - HDFS IO error
java.io.IOException: Failed to add a datanode. User may turn off this feature by setting dfs.client.block.write.replace-datanode-on-failure.policy in configuration, where the current policy is DEFAULT. (Nodes: current=[178.63.23.149:50010, 88.198.23.238:50010], original=[178.63.23.149:50010, 88.198.23.238:50010])
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:817)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:877)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:983)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:780)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
09 Dec 2013 07:42:06,743 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.AbstractHDFSWriter.isUnderReplicated:96) - Unexpected error while checking replication factor
java.lang.reflect.InvocationTargetException
    at sun.reflect.GeneratedMethodAccessor404.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.flume.sink.hdfs.AbstractHDFSWriter.getNumCurrentReplicas(AbstractHDFSWriter.java:162)
    at org.apache.flume.sink.hdfs.AbstractHDFSWriter.isUnderReplicated(AbstractHDFSWriter.java:82)
    at org.apache.flume.sink.hdfs.BucketWriter.shouldRotate(BucketWriter.java:452)
    at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:387)
    at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:392)
    at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
    at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: Failed to add a datanode. User may turn off this feature by setting dfs.client.block.write.replace-datanode-on-failure.policy in configuration, where the current policy is DEFAULT. (Nodes: current=[178.63.23.149:50010, 88.198.23.238:50010], original=[178.63.23.149:50010, 88.198.23.238:50010])
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:817)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:877)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:983)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:780)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
09 Dec 2013 07:42:06,744 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.AbstractHDFSWriter.isUnderReplicated:96) - Unexpected error while checking replication factor
java.lang.reflect.InvocationTargetException
    at sun.reflect.GeneratedMethodAccessor404.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.flume.sink.hdfs.AbstractHDFSWriter.getNumCurrentReplicas(AbstractHDFSWriter.java:162)
    at org.apache.flume.sink.hdfs.AbstractHDFSWriter.isUnderReplicated(AbstractHDFSWriter.java:82)
    at org.apache.flume.sink.hdfs.BucketWriter.shouldRotate(BucketWriter.java:452)
    at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:387)
    at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:392)
    at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
    at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: Failed to add a datanode. User may turn off this feature by setting dfs.client.block.write.replace-datanode-on-failure.policy in configuration, where the current policy is DEFAULT. (Nodes: current=[178.63.23.149:50010, 88.198.23.238:50010], original=[178.63.23.149:50010, 88.198.23.238:50010])
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:817)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:877)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:983)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:780)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
09 Dec 2013 07:42:06,745 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.AbstractHDFSWriter.isUnderReplicated:96) - Unexpected error while checking replication factor
java.lang.reflect.InvocationTargetException
    at sun.reflect.GeneratedMethodAccessor404.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.flume.sink.hdfs.AbstractHDFSWriter.getNumCurrentReplicas(AbstractHDFSWriter.java:162)
    at org.apache.flume.sink.hdfs.AbstractHDFSWriter.isUnderReplicated(AbstractHDFSWriter.java:82)
    at org.apache.flume.sink.hdfs.BucketWriter.shouldRotate(BucketWriter.java:452)
    at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:387)
    at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:392)
    at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
    at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: Failed to add a datanode. User may turn off this feature by setting dfs.client.block.write.replace-datanode-on-failure.policy in configuration, where the current policy is DEFAULT. (Nodes: current=[178.63.23.149:50010, 88.198.23.238:50010], original=[178.63.23.149:50010, 88.198.23.238:50010])
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:817)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:877)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:983)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:780)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
09 Dec 2013 07:42:06,745 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.AbstractHDFSWriter.isUnderReplicated:96) - Unexpected error while checking replication factor
java.lang.reflect.InvocationTargetException
    at sun.reflect.GeneratedMethodAccessor404.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.flume.sink.hdfs.AbstractHDFSWriter.getNumCurrentReplicas(AbstractHDFSWriter.java:162)
    at org.apache.flume.sink.hdfs.AbstractHDFSWriter.isUnderReplicated(AbstractHDFSWriter.java:82)
    at org.apache.flume.sink.hdfs.BucketWriter.shouldRotate(BucketWriter.java:452)
    at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:387)
    at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:392)
    at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
    at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: Failed to add a datanode. User may turn off this feature by setting dfs.client.block.write.replace-datanode-on-failure.policy in configuration, where the current policy is DEFAULT. (Nodes: current=[178.63.23.149:50010, 88.198.23.238:50010], original=[178.63.23.149:50010, 88.198.23.238:50010])
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:817)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:877)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:983)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:780)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
09 Dec 2013 07:42:06,746 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.AbstractHDFSWriter.isUnderReplicated:96) - Unexpected error while checking replication factor
java.lang.reflect.InvocationTargetException
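Two Flume HDFS sink settings are visible in this trace: the "Closing idle bucketWriter" messages come from `hdfs.idleTimeout`, and "Callable timed out after 10000 ms" matches the default `hdfs.callTimeout` of 10000 ms. A hypothetical flume.conf excerpt showing where these knobs live (the agent and sink names below are assumptions; the values are illustrative, and raising `hdfs.callTimeout` only gives pipeline recovery more time, it is not a fix for the non-recovery itself):

```properties
# flume.conf excerpt (agent/sink names are assumptions)
agent.sinks.hdfs_visit_sink.type = hdfs
# close files that receive no events for this many seconds (0 disables)
agent.sinks.hdfs_visit_sink.hdfs.idleTimeout = 300
# allow HDFS open/write/flush/close calls more than the 10000 ms default,
# so recovery after a NameNode failover has a chance to finish
agent.sinks.hdfs_visit_sink.hdfs.callTimeout = 60000
```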
Since we are using an NN HA configuration, I decided to check the ZooKeeper logs. There was a leader-following failure at that time, and a leader re-election happened.
04:39:18.417 INFO org.apache.zookeeper.server.ZooKeeperServer Established session 0x242a94989e94071 with negotiated timeout 30000 for client /176.9.1.174:55862
04:39:22.663 INFO org.apache.zookeeper.server.NIOServerCnxn Closed socket connection for client /176.9.1.174:55862 which had sessionid 0x242a94989e94071
04:39:51.810 WARN org.apache.zookeeper.server.quorum.Learner Exception when following the leader
java.io.EOFException
    at java.io.DataInputStream.readInt(DataInputStream.java:375)
    at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
    at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
    at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
    at org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:152)
    at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
    at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:740)
04:39:51.811 INFO org.apache.zookeeper.server.quorum.Learner shutdown called
java.lang.Exception: shutdown Follower
    at org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
    at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:744)
04:39:51.812 INFO org.apache.zookeeper.server.NIOServerCnxn Closed socket connection for client /176.9.1.174:46245 which had sessionid 0x242a94989e90144
04:39:51.813 INFO org.apache.zookeeper.server.NIOServerCnxn Closed socket connection for client /176.9.1.174:39346 which had sessionid 0x242a94989e900af
04:39:51.813 INFO org.apache.zookeeper.server.NIOServerCnxn Closed socket connection for client /176.9.1.174:59722 which had sessionid 0x242a94989e906a9
04:39:51.813 INFO org.apache.zookeeper.server.quorum.FollowerZooKeeperServer Shutting down
04:39:51.813 INFO org.apache.zookeeper.server.ZooKeeperServer shutting down
04:39:51.813 INFO org.apache.zookeeper.server.quorum.FollowerRequestProcessor Shutting down
04:39:51.813 INFO org.apache.zookeeper.server.quorum.CommitProcessor Shutting down
04:39:51.813 INFO org.apache.zookeeper.server.FinalRequestProcessor shutdown of request processor complete
04:39:51.813 INFO org.apache.zookeeper.server.quorum.FollowerRequestProcessor FollowerRequestProcessor exited loop!
04:39:51.814 INFO org.apache.zookeeper.server.quorum.CommitProcessor CommitProcessor exited loop!
04:39:51.814 INFO org.apache.zookeeper.server.SyncRequestProcessor Shutting down
04:39:51.814 INFO org.apache.zookeeper.server.SyncRequestProcessor SyncRequestProcessor exited!
04:39:51.815 INFO org.apache.zookeeper.server.quorum.QuorumPeer LOOKING
04:39:51.832 INFO org.apache.zookeeper.server.persistence.FileSnap Reading snapshot /var/lib/zookeeper/version-2/snapshot.700071453
04:39:52.231 INFO org.apache.zookeeper.server.quorum.FastLeaderElection New election. My id = 2, proposed zxid=0x700082c97
04:39:52.232 INFO org.apache.zookeeper.server.quorum.FastLeaderElection Notification: 2 (n.leader), 0x700082c97 (n.zxid), 0x8 (n.round), LOOKING (n.state), 2 (n.sid), 0x7 (n.peerEPoch), LOOKING (my state)
04:39:52.233 INFO org.apache.zookeeper.server.quorum.FastLeaderElection Notification: 3 (n.leader), 0x600000005 (n.zxid), 0x7 (n.round), FOLLOWING (n.state), 1 (n.sid), 0x6 (n.peerEPoch), LOOKING (my state)
04:39:52.433 INFO org.apache.zookeeper.server.quorum.FastLeaderElection Notification time out: 400
04:39:52.433 INFO org.apache.zookeeper.server.quorum.FastLeaderElection Notification: 2 (n.leader), 0x700082c97 (n.zxid), 0x8 (n.round), LOOKING (n.state), 2 (n.sid), 0x7 (n.peerEPoch), LOOKING (my state)
04:39:52.434 INFO org.apache.zookeeper.server.quorum.FastLeaderElection Notification: 3 (n.leader), 0x600000005 (n.zxid), 0x7 (n.round), FOLLOWING (n.state), 1 (n.sid), 0x6 (n.peerEPoch), LOOKING (my state)
04:39:52.571 INFO org.apache.zookeeper.server.quorum.FastLeaderElection Notification: 3 (n.leader), 0x600000005 (n.zxid), 0x7 (n.round), LEADING (n.state), 3 (n.sid), 0x6 (n.peerEPoch), LOOKING (my state)
04:39:52.571 INFO org.apache.zookeeper.server.quorum.QuorumPeer FOLLOWING
04:39:52.571 INFO org.apache.zookeeper.server.ZooKeeperServer Created server with tickTime 2000 minSessionTimeout 4000 maxSessionTimeout 60000 datadir /var/lib/zookeeper/version-2 snapdir /var/lib/zookeeper/version-2
04:39:52.572 INFO org.apache.zookeeper.server.quorum.Learner FOLLOWING - LEADER ELECTION TOOK - 757
04:39:55.091 INFO org.apache.zookeeper.server.quorum.FastLeaderElection Notification: 3 (n.leader), 0x600000005 (n.zxid), 0x7 (n.round), LEADING (n.state), 3 (n.sid), 0x6 (n.peerEPoch), FOLLOWING (my state)
04:40:02.582 WARN org.apache.zookeeper.server.quorum.Learner Unexpected exception, tries=0, connecting to node03.cluster.ru/88.198.23.238:3181
java.net.SocketTimeoutException: connect timed out
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
    at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213)
    at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
    at java.net.Socket.connect(Socket.java:529)
    at org.apache.zookeeper.server.quorum.Learner.connectToLeader(Learner.java:224)
    at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:71)
    at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:740)
04:40:05.958 INFO org.apache.zookeeper.server.quorum.Learner Getting a diff from the leader 0x700082c97
04:40:05.959 INFO org.apache.zookeeper.server.persistence.FileTxnSnapLog Snapshotting: 0x700082c97 to /var/lib/zookeeper/version-2/snapshot.700082c97
04:40:12.342 INFO org.apache.zookeeper.server.NIOServerCnxnFactory Accepted socket connection from /176.9.1.174:58483
04:40:12.343 INFO org.apache.zookeeper.server.ZooKeeperServer Client attempting to establish new session at /176.9.1.174:58483
04:40:12.481 WARN org.apache.zookeeper.server.quorum.Learner Got zxid 0x700082c98 expected 0x1
04:40:12.510 INFO org.apache.zookeeper.server.ZooKeeperServer Established session 0x242d570c1c60000 with negotiated timeout 30000 for client /176.9.1.174:58483
04:40:21.540 INFO org.apache.zookeeper.server.NIOServerCnxnFactory Accepted socket connection from /176.9.1.174:58915
04:40:21.540 INFO org.apache.zookeeper.server.ZooKeeperServer Client attempting to establish new session at /176.9.1.174:58915
04:40:24.469 WARN org.apache.zookeeper.server.quorum.Learner Exception when following the leader
java.io.EOFException
    at java.io.DataInputStream.readInt(DataInputStream.java:375)
    at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
    at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
    at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
    at org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:152)
    at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
    at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:740)
04:40:24.470 INFO org.apache.zookeeper.server.quorum.Learner shutdown called
java.lang.Exception: shutdown Follower
    at org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
    at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:744)
04:40:24.470 INFO org.apache.zookeeper.server.NIOServerCnxn Closed socket connection for client /176.9.1.174:58915 which had sessionid 0x242d570c1c60001
04:40:24.470 INFO org.apache.zookeeper.server.NIOServerCnxn Closed socket connection for client /176.9.1.174:58483 which had sessionid 0x242d570c1c60000
04:40:24.470 INFO org.apache.zookeeper.server.quorum.FollowerZooKeeperServer Shutting down
04:40:24.471 INFO org.apache.zookeeper.server.ZooKeeperServer shutting down
04:40:24.471 INFO org.apache.zookeeper.server.quorum.FollowerRequestProcessor Shutting down
04:40:24.471 INFO org.apache.zookeeper.server.quorum.CommitProcessor Shutting down
04:40:24.471 INFO
org.apache.zookeeper.server.quorum.FollowerRequestProcessor FollowerRequestProcessor exited loop! 04:40:24.471 INFO org.apache.zookeeper.server.quorum.CommitProcessor CommitProcessor exited loop! 04:40:24.471 INFO org.apache.zookeeper.server.FinalRequestProcessor shutdown of request processor complete 04:40:24.472 INFO org.apache.zookeeper.server.SyncRequestProcessor Shutting down 04:40:24.472 INFO org.apache.zookeeper.server.SyncRequestProcessor SyncRequestProcessor exited! 04:40:24.472 INFO org.apache.zookeeper.server.quorum.QuorumPeer LOOKING 04:40:24.473 INFO org.apache.zookeeper.server.persistence.FileSnap Reading snapshot /var/lib/zookeeper/version-2/snapshot.700082c97 04:40:24.550 INFO org.apache.zookeeper.server.quorum.FastLeaderElection New election. My id = 2, proposed zxid=0x700082c9b 04:40:24.550 INFO org.apache.zookeeper.server.quorum.FastLeaderElection Notification: 2 (n.leader), 0x700082c9b (n.zxid), 0x8 (n.round), LOOKING (n.state), 2 (n.sid), 0x7 (n.peerEPoch), LOOKING (my state) 04:40:24.551 INFO org.apache.zookeeper.server.quorum.FastLeaderElection Notification: 3 (n.leader), 0x600000005 (n.zxid), 0x7 (n.round), FOLLOWING (n.state), 1 (n.sid), 0x6 (n.peerEPoch), LOOKING (my state) 04:40:24.752 INFO org.apache.zookeeper.server.quorum.FastLeaderElection Notification time out: 400 04:40:24.752 INFO org.apache.zookeeper.server.quorum.FastLeaderElection Notification: 2 (n.leader), 0x700082c9b (n.zxid), 0x8 (n.round), LOOKING (n.state), 2 (n.sid), 0x7 (n.peerEPoch), LOOKING (my state) 04:40:24.753 INFO org.apache.zookeeper.server.quorum.FastLeaderElection Notification: 3 (n.leader), 0x600000005 (n.zxid), 0x7 (n.round), FOLLOWING (n.state), 1 (n.sid), 0x6 (n.peerEPoch), LOOKING (my state) 04:40:25.153 INFO org.apache.zookeeper.server.quorum.FastLeaderElection Notification time out: 800 04:40:25.153 INFO org.apache.zookeeper.server.quorum.FastLeaderElection Notification: 2 (n.leader), 0x700082c9b (n.zxid), 0x8 (n.round), LOOKING (n.state), 2 (n.sid), 0x7 
(n.peerEPoch), LOOKING (my state) 04:40:25.154 INFO org.apache.zookeeper.server.quorum.FastLeaderElection Notification: 3 (n.leader), 0x600000005 (n.zxid), 0x7 (n.round), FOLLOWING (n.state), 1 (n.sid), 0x6 (n.peerEPoch), LOOKING (my state) 04:40:25.954 INFO org.apache.zookeeper.server.quorum.FastLeaderElection Notification time out: 1600 04:40:25.955 INFO org.apache.zookeeper.server.quorum.FastLeaderElection Notification: 2 (n.leader), 0x700082c9b (n.zxid), 0x8 (n.round), LOOKING (n.state), 2 (n.sid), 0x7 (n.peerEPoch), LOOKING (my state) 04:40:25.956 INFO org.apache.zookeeper.server.quorum.FastLeaderElection Notification: 3 (n.leader), 0x600000005 (n.zxid), 0x7 (n.round), FOLLOWING (n.state), 1 (n.sid), 0x6 (n.peerEPoch), LOOKING (my state) 04:40:26.133 INFO org.apache.zookeeper.server.NIOServerCnxnFactory Accepted socket connection from /176.9.1.174:59112 04:40:26.134 WARN org.apache.zookeeper.server.NIOServerCnxn Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running 04:40:26.134 INFO org.apache.zookeeper.server.NIOServerCnxn Closed socket connection for client /176.9.1.174:59112 (no session established for client) 04:40:27.556 INFO org.apache.zookeeper.server.quorum.FastLeaderElection Notification time out: 3200 04:40:27.556 INFO org.apache.zookeeper.server.quorum.FastLeaderElection Notification: 2 (n.leader), 0x700082c9b (n.zxid), 0x8 (n.round), LOOKING (n.state), 2 (n.sid), 0x7 (n.peerEPoch), LOOKING (my state) 04:40:27.557 INFO org.apache.zookeeper.server.quorum.FastLeaderElection Notification: 3 (n.leader), 0x600000005 (n.zxid), 0x7 (n.round), FOLLOWING (n.state), 1 (n.sid), 0x6 (n.peerEPoch), LOOKING (my state) 04:40:27.639 INFO org.apache.zookeeper.server.NIOServerCnxnFactory Accepted socket connection from /176.9.1.174:59208 04:40:27.639 WARN org.apache.zookeeper.server.NIOServerCnxn Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running 04:40:27.639 INFO 
org.apache.zookeeper.server.NIOServerCnxn Closed socket connection for client /176.9.1.174:59208 (no session established for client) 04:40:28.970 INFO org.apache.zookeeper.server.NIOServerCnxnFactory Accepted socket connection from /176.9.1.174:59254 04:40:28.970 WARN org.apache.zookeeper.server.NIOServerCnxn Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running 04:40:28.970 INFO org.apache.zookeeper.server.NIOServerCnxn Closed socket connection for client /176.9.1.174:59254 (no session established for client) 04:40:30.179 INFO org.apache.zookeeper.server.NIOServerCnxnFactory Accepted socket connection from /88.198.23.238:55236 04:40:30.180 WARN org.apache.zookeeper.server.NIOServerCnxn Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running 04:40:30.180 INFO org.apache.zookeeper.server.NIOServerCnxn Closed socket connection for client /88.198.23.238:55236 (no session established for client) 04:40:30.757 INFO org.apache.zookeeper.server.quorum.FastLeaderElection Notification time out: 6400 04:40:30.758 INFO org.apache.zookeeper.server.quorum.FastLeaderElection Notification: 2 (n.leader), 0x700082c9b (n.zxid), 0x8 (n.round), LOOKING (n.state), 2 (n.sid), 0x7 (n.peerEPoch), LOOKING (my state) 04:40:30.758 INFO org.apache.zookeeper.server.quorum.FastLeaderElection Notification: 3 (n.leader), 0x600000005 (n.zxid), 0x7 (n.round), FOLLOWING (n.state), 1 (n.sid), 0x6 (n.peerEPoch), LOOKING (my state) 04:40:30.883 INFO org.apache.zookeeper.server.NIOServerCnxnFactory Accepted socket connection from /176.9.1.174:59350 04:40:30.883 WARN org.apache.zookeeper.server.NIOServerCnxn Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running