Flume / FLUME-2261

Flume HDFS sinks can't overcome NN HA switching


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.4.0
    • Fix Version/s: None
    • Component/s: Sinks+Sources
    • Labels: None
    • Environment: CDH 4.5 (Flume 1.4.0+23)

    Description

      Looks like this is related to FLUME-2228 (https://issues.apache.org/jira/browse/FLUME-2228).

      We have a 3-node cluster with NameNode HA, and the NameNodes sometimes fail over.
      The Flume HDFS sink cannot handle such a failover correctly.
      Here is the log:

      09 Dec 2013 22:14:49,175 INFO  [pool-6-thread-1] (org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.handleUpstream:171)  - [id: 0x1b2ae35d, /176.9.1.174:37697 :> /88.198.23.238:60011] DISCONNECTED
      09 Dec 2013 22:14:49,175 INFO  [pool-6-thread-1] (org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.handleUpstream:171)  - [id: 0x1b2ae35d, /176.9.1.174:37697 :> /88.198.23.238:60011] UNBOUND
      09 Dec 2013 22:14:49,175 INFO  [pool-6-thread-1] (org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.handleUpstream:171)  - [id: 0x1b2ae35d, /176.9.1.174:37697 :> /88.198.23.238:60011] CLOSED
      09 Dec 2013 22:14:49,175 INFO  [pool-6-thread-1] (org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.channelClosed:209)  - Connection to /176.9.1.174:37697 disconnected.
      09 Dec 2013 22:14:49,956 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.AbstractHDFSWriter.isUnderReplicated:96)  - Unexpected error while checking replication factor
      java.lang.reflect.InvocationTargetException
      	at sun.reflect.GeneratedMethodAccessor21.invoke(Unknown Source)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      	at java.lang.reflect.Method.invoke(Method.java:597)
      	at org.apache.flume.sink.hdfs.AbstractHDFSWriter.getNumCurrentReplicas(AbstractHDFSWriter.java:162)
      	at org.apache.flume.sink.hdfs.AbstractHDFSWriter.isUnderReplicated(AbstractHDFSWriter.java:82)
      	at org.apache.flume.sink.hdfs.BucketWriter.shouldRotate(BucketWriter.java:452)
      	at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:387)
      	at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:392)
      	at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
      	at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
      	at java.lang.Thread.run(Thread.java:662)
      Caused by: java.io.IOException: Failed to add a datanode.  User may turn off this feature by setting dfs.client.block.write.replace-datanode-on-failure.policy in configuration, where the current policy is DEFAULT.  (Nodes: current=[88.198.23.238:50010, 176.9.1.174:50010], original=[88.198.23.238:50010, 176.9.1.174:50010])
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:817)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:877)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:983)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:780)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
      09 Dec 2013 22:14:49,956 WARN  [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.BucketWriter.append:424)  - Caught IOException writing to HDFSWriter (Failed to add a datanode.  User may turn off this feature by setting dfs.client.block.write.replace-datanode-on-failure.policy in configuration, where the current policy is DEFAULT.  (Nodes: current=[88.198.23.238:50010, 176.9.1.174:50010], original=[88.198.23.238:50010, 176.9.1.174:50010])). Closing file (/staging/landing/stream/js_tracker/visit/2013/12/09/07/visit.1386560349521.bz2.tmp) and rethrowing exception.
      09 Dec 2013 22:14:49,957 WARN  [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.BucketWriter.append:430)  - Caught IOException while closing file (/staging/landing/stream/js_tracker/visit/2013/12/09/07/visit.1386560349521.bz2.tmp). Exception follows.
      java.io.IOException: Failed to add a datanode.  User may turn off this feature by setting dfs.client.block.write.replace-datanode-on-failure.policy in configuration, where the current policy is DEFAULT.  (Nodes: current=[88.198.23.238:50010, 176.9.1.174:50010], original=[88.198.23.238:50010, 176.9.1.174:50010])
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:817)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:877)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:983)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:780)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
      09 Dec 2013 22:14:49,957 WARN  [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.HDFSEventSink.process:418)  - HDFS IO error
      java.io.IOException: Failed to add a datanode.  User may turn off this feature by setting dfs.client.block.write.replace-datanode-on-failure.policy in configuration, where the current policy is DEFAULT.  (Nodes: current=[88.198.23.238:50010, 176.9.1.174:50010], original=[88.198.23.238:50010, 176.9.1.174:50010])
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:817)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:877)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:983)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:780)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
      09 Dec 2013 22:14:54,957 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.AbstractHDFSWriter.isUnderReplicated:96)  - Unexpected error while checking replication factor
      java.lang.reflect.InvocationTargetException
      	at sun.reflect.GeneratedMethodAccessor21.invoke(Unknown Source)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      	at java.lang.reflect.Method.invoke(Method.java:597)
      	at org.apache.flume.sink.hdfs.AbstractHDFSWriter.getNumCurrentReplicas(AbstractHDFSWriter.java:162)
      	at org.apache.flume.sink.hdfs.AbstractHDFSWriter.isUnderReplicated(AbstractHDFSWriter.java:82)
      	at org.apache.flume.sink.hdfs.BucketWriter.shouldRotate(BucketWriter.java:452)
      	at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:387)
      	at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:392)
      	at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
      	at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
      	at java.lang.Thread.run(Thread.java:662)
      Caused by: java.io.IOException: Failed to add a datanode.  User may turn off this feature by setting dfs.client.block.write.replace-datanode-on-failure.policy in configuration, where the current policy is DEFAULT.  (Nodes: current=[88.198.23.238:50010, 176.9.1.174:50010], original=[88.198.23.238:50010, 176.9.1.174:50010])
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:817)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:877)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:983)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:780)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
      09 Dec 2013 22:14:54,958 WARN  [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.BucketWriter.append:424)  - Caught IOException writing to HDFSWriter (Failed to add a datanode.  User may turn off this feature by setting dfs.client.block.write.replace-datanode-on-failure.policy in configuration, where the current policy is DEFAULT.  (Nodes: current=[88.198.23.238:50010, 176.9.1.174:50010], original=[88.198.23.238:50010, 176.9.1.174:50010])). Closing file (/staging/landing/stream/js_tracker/visit/2013/12/09/07/visit.1386560349521.bz2.tmp) and rethrowing exception.
      09 Dec 2013 22:14:54,958 WARN  [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.BucketWriter.append:430)  - Caught IOException while closing file (/staging/landing/strea
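
      The repeated "Failed to add a datanode" errors above come from the HDFS client's pipeline-recovery policy: on a 3-node cluster with one bad datanode, the DEFAULT policy cannot find a replacement node and the write fails permanently. As the exception message itself suggests, the policy can be relaxed on the Flume hosts' hdfs-site.xml. This is a possible workaround for small clusters, not a verified fix for this issue:

```xml
<!-- hdfs-site.xml on the Flume agent hosts.
     NEVER tells the client to keep writing with the remaining
     datanodes instead of failing when no replacement can be found.
     Reasonable only on small (about 3-node) clusters, since it
     accepts temporarily reduced replication during recovery. -->
<property>
  <name>dfs.client.block.write.replace-datanode-on-failure.enable</name>
  <value>true</value>
</property>
<property>
  <name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
  <value>NEVER</value>
</property>
```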
      

      Flume stops writing data to HDFS and cannot close the open file. This is a disaster for us, because we do not get any notification (is it even possible to get one)?
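
      On the notification question: Flume 1.4 can expose its internal counters as JSON over HTTP when the agent is started with -Dflume.monitoring.type=http -Dflume.monitoring.port=<port>. An external watchdog can then detect a stalled sink by checking whether EventDrainSuccessCount stops advancing. A minimal sketch (the sink name hdfs_visit_sink is taken from the log above; the port and the polling logic are illustrative assumptions, not part of Flume):

```python
import json
import urllib.request

def fetch_metrics(url="http://localhost:34545/metrics"):
    """Fetch the agent's JSON counters; requires the agent to be
    started with -Dflume.monitoring.type=http."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

def sink_stalled(prev, curr, sink="hdfs_visit_sink"):
    """True if the sink drained no events between two metric snapshots."""
    key = "SINK." + sink
    before = int(prev[key]["EventDrainSuccessCount"])
    after = int(curr[key]["EventDrainSuccessCount"])
    return after <= before
```

      Polling this every minute and alerting when sink_stalled() stays true would at least surface the kind of silent hang described above.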

      Update:
      Here is the exact place of the failure. Flume did not recover after this:

      09 Dec 2013 07:33:13,617 INFO  [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.HDFSSequenceFile.configure:63)  - writeFormat = Text, UseRawLocalFileSystem = false
      09 Dec 2013 07:33:13,630 INFO  [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.BucketWriter.open:219)  - Creating /staging/landing/stream/js_tracker/visit/2013/12/09/07/visit.1386559993618.bz2.tmp
      09 Dec 2013 07:38:04,167 INFO  [hdfs-hdfs_visit_sink-roll-timer-0] (org.apache.flume.sink.hdfs.BucketWriter$4.call:329)  - Closing idle bucketWriter /staging/landing/stream/js_tracker/visit/2013/12/09/07/visit.1386559993618.bz2.tmp
      09 Dec 2013 07:38:10,071 INFO  [hdfs-hdfs_visit_sink-call-runner-6] (org.apache.flume.sink.hdfs.BucketWriter$7.call:487)  - Renaming /staging/landing/stream/js_tracker/visit/2013/12/09/07/visit.1386559993618.bz2.tmp to /staging/landing/stream/js_tracker/visit/2013/12/09/07/visit.1386559993618.bz2
      09 Dec 2013 07:38:10,103 WARN  [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.HDFSEventSink.process:418)  - HDFS IO error
      java.io.IOException: This bucket writer was closed due to idling and this handle is thus no longer valid
      	at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:380)
      	at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:392)
      	at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
      	at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
      	at java.lang.Thread.run(Thread.java:662)
      09 Dec 2013 07:38:15,103 INFO  [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.HDFSSequenceFile.configure:63)  - writeFormat = Text, UseRawLocalFileSystem = false
      09 Dec 2013 07:38:15,116 INFO  [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.BucketWriter.open:219)  - Creating /staging/landing/stream/js_tracker/visit/2013/12/09/07/visit.1386560295104.bz2.tmp
      09 Dec 2013 07:40:12,630 INFO  [hdfs-hdfs_visit_sink-roll-timer-0] (org.apache.flume.sink.hdfs.BucketWriter$4.call:329)  - Closing idle bucketWriter /staging/landing/stream/js_tracker/visit/2013/12/09/07/visit.1386560295104.bz2.tmp
      09 Dec 2013 07:40:12,939 INFO  [hdfs-hdfs_visit_sink-call-runner-0] (org.apache.flume.sink.hdfs.BucketWriter$7.call:487)  - Renaming /staging/landing/stream/js_tracker/visit/2013/12/09/07/visit.1386560295104.bz2.tmp to /staging/landing/stream/js_tracker/visit/2013/12/09/07/visit.1386560295104.bz2
      09 Dec 2013 07:40:14,634 INFO  [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.HDFSSequenceFile.configure:63)  - writeFormat = Text, UseRawLocalFileSystem = false
      09 Dec 2013 07:40:14,647 INFO  [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.BucketWriter.open:219)  - Creating /staging/landing/stream/js_tracker/visit/2013/12/09/07/visit.1386560414635.bz2.tmp
      09 Dec 2013 07:41:16,409 WARN  [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.HDFSEventSink.process:418)  - HDFS IO error
      java.io.IOException: Callable timed out after 10000 ms on file: /staging/landing/stream/js_tracker/visit/2013/12/09/07/visit.1386560414635.bz2.tmp
      	at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:550)
      	at org.apache.flume.sink.hdfs.BucketWriter.doFlush(BucketWriter.java:353)
      	at org.apache.flume.sink.hdfs.BucketWriter.flush(BucketWriter.java:319)
      	at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:442)
      	at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:392)
      	at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
      	at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
      	at java.lang.Thread.run(Thread.java:662)
      Caused by: java.util.concurrent.TimeoutException
      	at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:228)
      	at java.util.concurrent.FutureTask.get(FutureTask.java:91)
      	at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:543)
      	... 7 more
      09 Dec 2013 07:41:20,122 INFO  [hdfs-hdfs_visit_sink-roll-timer-0] (org.apache.flume.sink.hdfs.BucketWriter$4.call:329)  - Closing idle bucketWriter /staging/landing/stream/js_tracker/visit/2013/12/09/07/visit.1386560414635.bz2.tmp
      09 Dec 2013 07:41:30,123 ERROR [hdfs-hdfs_visit_sink-roll-timer-0] (org.apache.flume.sink.hdfs.BucketWriter$4.call:336)  - Unexpected error
      java.io.IOException: Callable timed out after 10000 ms on file: /staging/landing/stream/js_tracker/visit/2013/12/09/07/visit.1386560414635.bz2.tmp
      	at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:550)
      	at org.apache.flume.sink.hdfs.BucketWriter.doFlush(BucketWriter.java:353)
      	at org.apache.flume.sink.hdfs.BucketWriter.flush(BucketWriter.java:319)
      	at org.apache.flume.sink.hdfs.BucketWriter.close(BucketWriter.java:277)
      	at org.apache.flume.sink.hdfs.BucketWriter$4.call(BucketWriter.java:331)
      	at org.apache.flume.sink.hdfs.BucketWriter$4.call(BucketWriter.java:325)
      	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
      	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98)
      	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:206)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
      	at java.lang.Thread.run(Thread.java:662)
      Caused by: java.util.concurrent.TimeoutException
      	at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:228)
      	at java.util.concurrent.FutureTask.get(FutureTask.java:91)
      	at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:543)
      	... 12 more
      09 Dec 2013 07:41:40,132 WARN  [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.HDFSEventSink.process:418)  - HDFS IO error
      java.io.IOException: Callable timed out after 10000 ms on file: /staging/landing/stream/js_tracker/visit/2013/12/09/07/visit.1386560414635.bz2.tmp
      	at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:550)
      	at org.apache.flume.sink.hdfs.BucketWriter.doFlush(BucketWriter.java:353)
      	at org.apache.flume.sink.hdfs.BucketWriter.flush(BucketWriter.java:319)
      	at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:405)
      	at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
      	at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
      	at java.lang.Thread.run(Thread.java:662)
      Caused by: java.util.concurrent.TimeoutException
      	at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:228)
      	at java.util.concurrent.FutureTask.get(FutureTask.java:91)
      	at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:543)
      	... 6 more
      09 Dec 2013 07:41:55,140 WARN  [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.HDFSEventSink.process:418)  - HDFS IO error
      java.io.IOException: Callable timed out after 10000 ms on file: /staging/landing/stream/js_tracker/visit/2013/12/09/07/visit.1386560414635.bz2.tmp
      	at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:550)
      	at org.apache.flume.sink.hdfs.BucketWriter.doFlush(BucketWriter.java:353)
      	at org.apache.flume.sink.hdfs.BucketWriter.flush(BucketWriter.java:319)
      	at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:405)
      	at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
      	at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
      	at java.lang.Thread.run(Thread.java:662)
      Caused by: java.util.concurrent.TimeoutException
      	at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:228)
      	at java.util.concurrent.FutureTask.get(FutureTask.java:91)
      	at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:543)
      	... 6 more
      09 Dec 2013 07:42:01,709 WARN  [ResponseProcessor for block BP-628993041-176.9.1.174-1384195296058:blk_-8982661908544496338_775853] (org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run:748)  - DFSOutputStream ResponseProcessor exception  for block BP-628993041-176.9.1.174-1384195296058:blk_-8982661908544496338_775853
      java.io.IOException: Bad response ERROR for block BP-628993041-176.9.1.174-1384195296058:blk_-8982661908544496338_775853 from datanode 176.9.1.174:50010
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:706)
      09 Dec 2013 07:42:01,710 WARN  [DataStreamer for file /staging/landing/stream/js_tracker/visit/2013/12/09/07/visit.1386560414635.bz2.tmp block BP-628993041-176.9.1.174-1384195296058:blk_-8982661908544496338_775853] (org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery:965)  - Error Recovery for block BP-628993041-176.9.1.174-1384195296058:blk_-8982661908544496338_775853 in pipeline 178.63.23.149:50010, 88.198.23.238:50010, 176.9.1.174:50010: bad datanode 176.9.1.174:50010
      09 Dec 2013 07:42:01,741 WARN  [DataStreamer for file /staging/landing/stream/js_tracker/visit/2013/12/09/07/visit.1386560414635.bz2.tmp block BP-628993041-176.9.1.174-1384195296058:blk_-8982661908544496338_775853] (org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run:587)  - DataStreamer Exception
      java.io.IOException: Failed to add a datanode.  User may turn off this feature by setting dfs.client.block.write.replace-datanode-on-failure.policy in configuration, where the current policy is DEFAULT.  (Nodes: current=[178.63.23.149:50010, 88.198.23.238:50010], original=[178.63.23.149:50010, 88.198.23.238:50010])
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:817)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:877)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:983)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:780)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
      09 Dec 2013 07:42:01,742 WARN  [hdfs-hdfs_visit_sink-call-runner-4] (org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync:1641)  - Error while syncing
      java.io.IOException: Failed to add a datanode.  User may turn off this feature by setting dfs.client.block.write.replace-datanode-on-failure.policy in configuration, where the current policy is DEFAULT.  (Nodes: current=[178.63.23.149:50010, 88.198.23.238:50010], original=[178.63.23.149:50010, 88.198.23.238:50010])
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:817)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:877)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:983)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:780)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
      09 Dec 2013 07:42:01,743 WARN  [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.HDFSEventSink.process:418)  - HDFS IO error
      java.io.IOException: Failed to add a datanode.  User may turn off this feature by setting dfs.client.block.write.replace-datanode-on-failure.policy in configuration, where the current policy is DEFAULT.  (Nodes: current=[178.63.23.149:50010, 88.198.23.238:50010], original=[178.63.23.149:50010, 88.198.23.238:50010])
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:817)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:877)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:983)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:780)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
      09 Dec 2013 07:42:06,743 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.AbstractHDFSWriter.isUnderReplicated:96)  - Unexpected error while checking replication factor
      java.lang.reflect.InvocationTargetException
      	at sun.reflect.GeneratedMethodAccessor404.invoke(Unknown Source)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      	at java.lang.reflect.Method.invoke(Method.java:597)
      	at org.apache.flume.sink.hdfs.AbstractHDFSWriter.getNumCurrentReplicas(AbstractHDFSWriter.java:162)
      	at org.apache.flume.sink.hdfs.AbstractHDFSWriter.isUnderReplicated(AbstractHDFSWriter.java:82)
      	at org.apache.flume.sink.hdfs.BucketWriter.shouldRotate(BucketWriter.java:452)
      	at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:387)
      	at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:392)
      	at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
      	at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
      	at java.lang.Thread.run(Thread.java:662)
      Caused by: java.io.IOException: Failed to add a datanode.  User may turn off this feature by setting dfs.client.block.write.replace-datanode-on-failure.policy in configuration, where the current policy is DEFAULT.  (Nodes: current=[178.63.23.149:50010, 88.198.23.238:50010], original=[178.63.23.149:50010, 88.198.23.238:50010])
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:817)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:877)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:983)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:780)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
      09 Dec 2013 07:42:06,744 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.AbstractHDFSWriter.isUnderReplicated:96)  - Unexpected error while checking replication factor
      java.lang.reflect.InvocationTargetException
      	at sun.reflect.GeneratedMethodAccessor404.invoke(Unknown Source)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      	at java.lang.reflect.Method.invoke(Method.java:597)
      	at org.apache.flume.sink.hdfs.AbstractHDFSWriter.getNumCurrentReplicas(AbstractHDFSWriter.java:162)
      	at org.apache.flume.sink.hdfs.AbstractHDFSWriter.isUnderReplicated(AbstractHDFSWriter.java:82)
      	at org.apache.flume.sink.hdfs.BucketWriter.shouldRotate(BucketWriter.java:452)
      	at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:387)
      	at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:392)
      	at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
      	at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
      	at java.lang.Thread.run(Thread.java:662)
      Caused by: java.io.IOException: Failed to add a datanode.  User may turn off this feature by setting dfs.client.block.write.replace-datanode-on-failure.policy in configuration, where the current policy is DEFAULT.  (Nodes: current=[178.63.23.149:50010, 88.198.23.238:50010], original=[178.63.23.149:50010, 88.198.23.238:50010])
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:817)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:877)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:983)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:780)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
      09 Dec 2013 07:42:06,745 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.AbstractHDFSWriter.isUnderReplicated:96)  - Unexpected error while checking replication factor
      java.lang.reflect.InvocationTargetException
      	at sun.reflect.GeneratedMethodAccessor404.invoke(Unknown Source)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      	at java.lang.reflect.Method.invoke(Method.java:597)
      	at org.apache.flume.sink.hdfs.AbstractHDFSWriter.getNumCurrentReplicas(AbstractHDFSWriter.java:162)
      	at org.apache.flume.sink.hdfs.AbstractHDFSWriter.isUnderReplicated(AbstractHDFSWriter.java:82)
      	at org.apache.flume.sink.hdfs.BucketWriter.shouldRotate(BucketWriter.java:452)
      	at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:387)
      	at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:392)
      	at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
      	at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
      	at java.lang.Thread.run(Thread.java:662)
      Caused by: java.io.IOException: Failed to add a datanode.  User may turn off this feature by setting dfs.client.block.write.replace-datanode-on-failure.policy in configuration, where the current policy is DEFAULT.  (Nodes: current=[178.63.23.149:50010, 88.198.23.238:50010], original=[178.63.23.149:50010, 88.198.23.238:50010])
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:817)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:877)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:983)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:780)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
      09 Dec 2013 07:42:06,745 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.AbstractHDFSWriter.isUnderReplicated:96)  - Unexpected error while checking replication factor
      java.lang.reflect.InvocationTargetException
      	at sun.reflect.GeneratedMethodAccessor404.invoke(Unknown Source)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      	at java.lang.reflect.Method.invoke(Method.java:597)
      	at org.apache.flume.sink.hdfs.AbstractHDFSWriter.getNumCurrentReplicas(AbstractHDFSWriter.java:162)
      	at org.apache.flume.sink.hdfs.AbstractHDFSWriter.isUnderReplicated(AbstractHDFSWriter.java:82)
      	at org.apache.flume.sink.hdfs.BucketWriter.shouldRotate(BucketWriter.java:452)
      	at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:387)
      	at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:392)
      	at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
      	at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
      	at java.lang.Thread.run(Thread.java:662)
      Caused by: java.io.IOException: Failed to add a datanode.  User may turn off this feature by setting dfs.client.block.write.replace-datanode-on-failure.policy in configuration, where the current policy is DEFAULT.  (Nodes: current=[178.63.23.149:50010, 88.198.23.238:50010], original=[178.63.23.149:50010, 88.198.23.238:50010])
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:817)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:877)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:983)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:780)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
      09 Dec 2013 07:42:06,746 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.AbstractHDFSWriter.isUnderReplicated:96)  - Unexpected error while checking replication factor
      java.lang.reflect.InvocationTargetException
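
      The "Callable timed out after 10000 ms" lines match the HDFS sink's default hdfs.callTimeout (10 s), and the "Closing idle bucketWriter" lines come from a configured hdfs.idleTimeout. Raising the call timeout gives the HDFS client more room to ride out pipeline recovery during a failover, and disabling the idle close avoids the "handle is thus no longer valid" race seen above. A sketch of the relevant flume.conf fragment (agent and sink names are assumptions from the log; the values are illustrative, not a verified fix):

```properties
# Give HDFS calls a full minute instead of the 10 s default,
# so pipeline recovery during NN/DN failover can complete.
agent.sinks.hdfs_visit_sink.hdfs.callTimeout = 60000
# Disable idle-based closing; rely on rollInterval/rollSize instead.
agent.sinks.hdfs_visit_sink.hdfs.idleTimeout = 0
```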
      

      Since we are using an NN HA configuration, I decided to check the ZooKeeper logs. There was a leader-following failure at that time, and a leader re-election happened:

      04:39:18.417	INFO	org.apache.zookeeper.server.ZooKeeperServer	
      Established session 0x242a94989e94071 with negotiated timeout 30000 for client /176.9.1.174:55862
      04:39:22.663	INFO	org.apache.zookeeper.server.NIOServerCnxn	
      Closed socket connection for client /176.9.1.174:55862 which had sessionid 0x242a94989e94071
      04:39:51.810	WARN	org.apache.zookeeper.server.quorum.Learner	
      Exception when following the leader
      java.io.EOFException
      	at java.io.DataInputStream.readInt(DataInputStream.java:375)
      	at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
      	at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
      	at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
      	at org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:152)
      	at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
      	at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:740)
      04:39:51.811	INFO	org.apache.zookeeper.server.quorum.Learner	
      shutdown called
      java.lang.Exception: shutdown Follower
      	at org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
      	at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:744)
      04:39:51.812	INFO	org.apache.zookeeper.server.NIOServerCnxn	
      Closed socket connection for client /176.9.1.174:46245 which had sessionid 0x242a94989e90144
      04:39:51.813	INFO	org.apache.zookeeper.server.NIOServerCnxn	
      Closed socket connection for client /176.9.1.174:39346 which had sessionid 0x242a94989e900af
      04:39:51.813	INFO	org.apache.zookeeper.server.NIOServerCnxn	
      Closed socket connection for client /176.9.1.174:59722 which had sessionid 0x242a94989e906a9
      04:39:51.813	INFO	org.apache.zookeeper.server.quorum.FollowerZooKeeperServer	
      Shutting down
      04:39:51.813	INFO	org.apache.zookeeper.server.ZooKeeperServer	
      shutting down
      04:39:51.813	INFO	org.apache.zookeeper.server.quorum.FollowerRequestProcessor	
      Shutting down
      04:39:51.813	INFO	org.apache.zookeeper.server.quorum.CommitProcessor	
      Shutting down
      04:39:51.813	INFO	org.apache.zookeeper.server.FinalRequestProcessor	
      shutdown of request processor complete
      04:39:51.813	INFO	org.apache.zookeeper.server.quorum.FollowerRequestProcessor	
      FollowerRequestProcessor exited loop!
      04:39:51.814	INFO	org.apache.zookeeper.server.quorum.CommitProcessor	
      CommitProcessor exited loop!
      04:39:51.814	INFO	org.apache.zookeeper.server.SyncRequestProcessor	
      Shutting down
      04:39:51.814	INFO	org.apache.zookeeper.server.SyncRequestProcessor	
      SyncRequestProcessor exited!
      04:39:51.815	INFO	org.apache.zookeeper.server.quorum.QuorumPeer	
      LOOKING
      04:39:51.832	INFO	org.apache.zookeeper.server.persistence.FileSnap	
      Reading snapshot /var/lib/zookeeper/version-2/snapshot.700071453
      04:39:52.231	INFO	org.apache.zookeeper.server.quorum.FastLeaderElection	
      New election. My id =  2, proposed zxid=0x700082c97
      04:39:52.232	INFO	org.apache.zookeeper.server.quorum.FastLeaderElection	
      Notification: 2 (n.leader), 0x700082c97 (n.zxid), 0x8 (n.round), LOOKING (n.state), 2 (n.sid), 0x7 (n.peerEPoch), LOOKING (my state)
      04:39:52.233	INFO	org.apache.zookeeper.server.quorum.FastLeaderElection	
      Notification: 3 (n.leader), 0x600000005 (n.zxid), 0x7 (n.round), FOLLOWING (n.state), 1 (n.sid), 0x6 (n.peerEPoch), LOOKING (my state)
      04:39:52.433	INFO	org.apache.zookeeper.server.quorum.FastLeaderElection	
      Notification time out: 400
      04:39:52.433	INFO	org.apache.zookeeper.server.quorum.FastLeaderElection	
      Notification: 2 (n.leader), 0x700082c97 (n.zxid), 0x8 (n.round), LOOKING (n.state), 2 (n.sid), 0x7 (n.peerEPoch), LOOKING (my state)
      04:39:52.434	INFO	org.apache.zookeeper.server.quorum.FastLeaderElection	
      Notification: 3 (n.leader), 0x600000005 (n.zxid), 0x7 (n.round), FOLLOWING (n.state), 1 (n.sid), 0x6 (n.peerEPoch), LOOKING (my state)
      04:39:52.571	INFO	org.apache.zookeeper.server.quorum.FastLeaderElection	
      Notification: 3 (n.leader), 0x600000005 (n.zxid), 0x7 (n.round), LEADING (n.state), 3 (n.sid), 0x6 (n.peerEPoch), LOOKING (my state)
      04:39:52.571	INFO	org.apache.zookeeper.server.quorum.QuorumPeer	
      FOLLOWING
      04:39:52.571	INFO	org.apache.zookeeper.server.ZooKeeperServer	
      Created server with tickTime 2000 minSessionTimeout 4000 maxSessionTimeout 60000 datadir /var/lib/zookeeper/version-2 snapdir /var/lib/zookeeper/version-2
      04:39:52.572	INFO	org.apache.zookeeper.server.quorum.Learner	
      FOLLOWING - LEADER ELECTION TOOK - 757
      04:39:55.091	INFO	org.apache.zookeeper.server.quorum.FastLeaderElection	
      Notification: 3 (n.leader), 0x600000005 (n.zxid), 0x7 (n.round), LEADING (n.state), 3 (n.sid), 0x6 (n.peerEPoch), FOLLOWING (my state)
      04:40:02.582	WARN	org.apache.zookeeper.server.quorum.Learner	
      Unexpected exception, tries=0, connecting to node03.cluster.ru/88.198.23.238:3181
      java.net.SocketTimeoutException: connect timed out
      	at java.net.PlainSocketImpl.socketConnect(Native Method)
      	at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
      	at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213)
      	at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
      	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
      	at java.net.Socket.connect(Socket.java:529)
      	at org.apache.zookeeper.server.quorum.Learner.connectToLeader(Learner.java:224)
      	at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:71)
      	at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:740)
      04:40:05.958	INFO	org.apache.zookeeper.server.quorum.Learner	
      Getting a diff from the leader 0x700082c97
      04:40:05.959	INFO	org.apache.zookeeper.server.persistence.FileTxnSnapLog	
      Snapshotting: 0x700082c97 to /var/lib/zookeeper/version-2/snapshot.700082c97
      04:40:12.342	INFO	org.apache.zookeeper.server.NIOServerCnxnFactory	
      Accepted socket connection from /176.9.1.174:58483
      04:40:12.343	INFO	org.apache.zookeeper.server.ZooKeeperServer	
      Client attempting to establish new session at /176.9.1.174:58483
      04:40:12.481	WARN	org.apache.zookeeper.server.quorum.Learner	
      Got zxid 0x700082c98 expected 0x1
      04:40:12.510	INFO	org.apache.zookeeper.server.ZooKeeperServer	
      Established session 0x242d570c1c60000 with negotiated timeout 30000 for client /176.9.1.174:58483
      04:40:21.540	INFO	org.apache.zookeeper.server.NIOServerCnxnFactory	
      Accepted socket connection from /176.9.1.174:58915
      04:40:21.540	INFO	org.apache.zookeeper.server.ZooKeeperServer	
      Client attempting to establish new session at /176.9.1.174:58915
      04:40:24.469	WARN	org.apache.zookeeper.server.quorum.Learner	
      Exception when following the leader
      java.io.EOFException
      	at java.io.DataInputStream.readInt(DataInputStream.java:375)
      	at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
      	at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
      	at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
      	at org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:152)
      	at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
      	at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:740)
      04:40:24.470	INFO	org.apache.zookeeper.server.quorum.Learner	
      shutdown called
      java.lang.Exception: shutdown Follower
      	at org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
      	at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:744)
      04:40:24.470	INFO	org.apache.zookeeper.server.NIOServerCnxn	
      Closed socket connection for client /176.9.1.174:58915 which had sessionid 0x242d570c1c60001
      04:40:24.470	INFO	org.apache.zookeeper.server.NIOServerCnxn	
      Closed socket connection for client /176.9.1.174:58483 which had sessionid 0x242d570c1c60000
      04:40:24.470	INFO	org.apache.zookeeper.server.quorum.FollowerZooKeeperServer	
      Shutting down
      04:40:24.471	INFO	org.apache.zookeeper.server.ZooKeeperServer	
      shutting down
      04:40:24.471	INFO	org.apache.zookeeper.server.quorum.FollowerRequestProcessor	
      Shutting down
      04:40:24.471	INFO	org.apache.zookeeper.server.quorum.CommitProcessor	
      Shutting down
      04:40:24.471	INFO	org.apache.zookeeper.server.quorum.FollowerRequestProcessor	
      FollowerRequestProcessor exited loop!
      04:40:24.471	INFO	org.apache.zookeeper.server.quorum.CommitProcessor	
      CommitProcessor exited loop!
      04:40:24.471	INFO	org.apache.zookeeper.server.FinalRequestProcessor	
      shutdown of request processor complete
      04:40:24.472	INFO	org.apache.zookeeper.server.SyncRequestProcessor	
      Shutting down
      04:40:24.472	INFO	org.apache.zookeeper.server.SyncRequestProcessor	
      SyncRequestProcessor exited!
      04:40:24.472	INFO	org.apache.zookeeper.server.quorum.QuorumPeer	
      LOOKING
      04:40:24.473	INFO	org.apache.zookeeper.server.persistence.FileSnap	
      Reading snapshot /var/lib/zookeeper/version-2/snapshot.700082c97
      04:40:24.550	INFO	org.apache.zookeeper.server.quorum.FastLeaderElection	
      New election. My id =  2, proposed zxid=0x700082c9b
      04:40:24.550	INFO	org.apache.zookeeper.server.quorum.FastLeaderElection	
      Notification: 2 (n.leader), 0x700082c9b (n.zxid), 0x8 (n.round), LOOKING (n.state), 2 (n.sid), 0x7 (n.peerEPoch), LOOKING (my state)
      04:40:24.551	INFO	org.apache.zookeeper.server.quorum.FastLeaderElection	
      Notification: 3 (n.leader), 0x600000005 (n.zxid), 0x7 (n.round), FOLLOWING (n.state), 1 (n.sid), 0x6 (n.peerEPoch), LOOKING (my state)
      04:40:24.752	INFO	org.apache.zookeeper.server.quorum.FastLeaderElection	
      Notification time out: 400
      04:40:24.752	INFO	org.apache.zookeeper.server.quorum.FastLeaderElection	
      Notification: 2 (n.leader), 0x700082c9b (n.zxid), 0x8 (n.round), LOOKING (n.state), 2 (n.sid), 0x7 (n.peerEPoch), LOOKING (my state)
      04:40:24.753	INFO	org.apache.zookeeper.server.quorum.FastLeaderElection	
      Notification: 3 (n.leader), 0x600000005 (n.zxid), 0x7 (n.round), FOLLOWING (n.state), 1 (n.sid), 0x6 (n.peerEPoch), LOOKING (my state)
      04:40:25.153	INFO	org.apache.zookeeper.server.quorum.FastLeaderElection	
      Notification time out: 800
      04:40:25.153	INFO	org.apache.zookeeper.server.quorum.FastLeaderElection	
      Notification: 2 (n.leader), 0x700082c9b (n.zxid), 0x8 (n.round), LOOKING (n.state), 2 (n.sid), 0x7 (n.peerEPoch), LOOKING (my state)
      04:40:25.154	INFO	org.apache.zookeeper.server.quorum.FastLeaderElection	
      Notification: 3 (n.leader), 0x600000005 (n.zxid), 0x7 (n.round), FOLLOWING (n.state), 1 (n.sid), 0x6 (n.peerEPoch), LOOKING (my state)
      04:40:25.954	INFO	org.apache.zookeeper.server.quorum.FastLeaderElection	
      Notification time out: 1600
      04:40:25.955	INFO	org.apache.zookeeper.server.quorum.FastLeaderElection	
      Notification: 2 (n.leader), 0x700082c9b (n.zxid), 0x8 (n.round), LOOKING (n.state), 2 (n.sid), 0x7 (n.peerEPoch), LOOKING (my state)
      04:40:25.956	INFO	org.apache.zookeeper.server.quorum.FastLeaderElection	
      Notification: 3 (n.leader), 0x600000005 (n.zxid), 0x7 (n.round), FOLLOWING (n.state), 1 (n.sid), 0x6 (n.peerEPoch), LOOKING (my state)
      04:40:26.133	INFO	org.apache.zookeeper.server.NIOServerCnxnFactory	
      Accepted socket connection from /176.9.1.174:59112
      04:40:26.134	WARN	org.apache.zookeeper.server.NIOServerCnxn	
      Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
      04:40:26.134	INFO	org.apache.zookeeper.server.NIOServerCnxn	
      Closed socket connection for client /176.9.1.174:59112 (no session established for client)
      04:40:27.556	INFO	org.apache.zookeeper.server.quorum.FastLeaderElection	
      Notification time out: 3200
      04:40:27.556	INFO	org.apache.zookeeper.server.quorum.FastLeaderElection	
      Notification: 2 (n.leader), 0x700082c9b (n.zxid), 0x8 (n.round), LOOKING (n.state), 2 (n.sid), 0x7 (n.peerEPoch), LOOKING (my state)
      04:40:27.557	INFO	org.apache.zookeeper.server.quorum.FastLeaderElection	
      Notification: 3 (n.leader), 0x600000005 (n.zxid), 0x7 (n.round), FOLLOWING (n.state), 1 (n.sid), 0x6 (n.peerEPoch), LOOKING (my state)
      04:40:27.639	INFO	org.apache.zookeeper.server.NIOServerCnxnFactory	
      Accepted socket connection from /176.9.1.174:59208
      04:40:27.639	WARN	org.apache.zookeeper.server.NIOServerCnxn	
      Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
      04:40:27.639	INFO	org.apache.zookeeper.server.NIOServerCnxn	
      Closed socket connection for client /176.9.1.174:59208 (no session established for client)
      04:40:28.970	INFO	org.apache.zookeeper.server.NIOServerCnxnFactory	
      Accepted socket connection from /176.9.1.174:59254
      04:40:28.970	WARN	org.apache.zookeeper.server.NIOServerCnxn	
      Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
      04:40:28.970	INFO	org.apache.zookeeper.server.NIOServerCnxn	
      Closed socket connection for client /176.9.1.174:59254 (no session established for client)
      04:40:30.179	INFO	org.apache.zookeeper.server.NIOServerCnxnFactory	
      Accepted socket connection from /88.198.23.238:55236
      04:40:30.180	WARN	org.apache.zookeeper.server.NIOServerCnxn	
      Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
      04:40:30.180	INFO	org.apache.zookeeper.server.NIOServerCnxn	
      Closed socket connection for client /88.198.23.238:55236 (no session established for client)
      04:40:30.757	INFO	org.apache.zookeeper.server.quorum.FastLeaderElection	
      Notification time out: 6400
      04:40:30.758	INFO	org.apache.zookeeper.server.quorum.FastLeaderElection	
      Notification: 2 (n.leader), 0x700082c9b (n.zxid), 0x8 (n.round), LOOKING (n.state), 2 (n.sid), 0x7 (n.peerEPoch), LOOKING (my state)
      04:40:30.758	INFO	org.apache.zookeeper.server.quorum.FastLeaderElection	
      Notification: 3 (n.leader), 0x600000005 (n.zxid), 0x7 (n.round), FOLLOWING (n.state), 1 (n.sid), 0x6 (n.peerEPoch), LOOKING (my state)
      04:40:30.883	INFO	org.apache.zookeeper.server.NIOServerCnxnFactory	
      Accepted socket connection from /176.9.1.174:59350
      04:40:30.883	WARN	org.apache.zookeeper.server.NIOServerCnxn	
      Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
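
      As a possible mitigation (not a fix for the underlying bug), the sink can be pointed at the HA logical nameservice URI instead of a single NameNode host, so the HDFS client's failover proxy handles the active-NN switch. Below is a hedged config sketch; the agent/sink names (`agent1`, `k1`) and the nameservice name (`mycluster`) are placeholders, and `hdfs.closeTries` / `hdfs.retryInterval` only exist in later Flume releases than the 1.4.0 reported here:

      # Use the HA logical nameservice, not a specific NameNode host:port
      agent1.sinks.k1.type = hdfs
      agent1.sinks.k1.hdfs.path = hdfs://mycluster/flume/events/%Y-%m-%d
      # Keep retrying file close after a failover (later Flume versions only;
      # 0 means retry indefinitely, retryInterval is in seconds)
      agent1.sinks.k1.hdfs.closeTries = 0
      agent1.sinks.k1.hdfs.retryInterval = 180

      This assumes `dfs.nameservices` and the failover proxy provider are configured in the `hdfs-site.xml` visible on the Flume classpath; it does not address the `InvocationTargetException` thrown from `isUnderReplicated` during the switch itself.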
      

      People

        Assignee: Unassigned
        Reporter: serega_sheypak (Sergey)
        Votes: 2
        Watchers: 7