  2. FLUME-3080

Close failure in HDFS Sink might cause data loss



    • 1.7.0
    • 1.8.0
      If the HDFS Sink fails to close a file (e.g. because of a timeout), the last block might not reach the COMPLETE state. Block recovery should then take place, but because Flume still holds the lease, the NameNode starts the recovery only after the hard limit of 1 hour expires.
      Lease recovery can also be started manually with the hdfs debug recoverLease command (hdfs debug recoverLease -path <file>).

      To reproduce the issue, I removed the close call from BucketWriter (https://github.com/apache/flume/blob/trunk/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java#L380) to simulate the failure and used the following config:

      agent.sources = testsrc
      agent.sinks = testsink
      agent.channels = testch
      agent.sources.testsrc.type = netcat
      agent.sources.testsrc.bind = localhost
      agent.sources.testsrc.port = 9494
      agent.sources.testsrc.channels = testch
      agent.sinks.testsink.type = hdfs
      agent.sinks.testsink.hdfs.path = /user/flume/test
      agent.sinks.testsink.hdfs.rollInterval = 0
      agent.sinks.testsink.hdfs.rollCount = 3
      agent.sinks.testsink.serializer = avro_event
      agent.sinks.testsink.serializer.compressionCodec = snappy
      agent.sinks.testsink.hdfs.fileSuffix = .avro
      agent.sinks.testsink.hdfs.fileType = DataStream
      agent.sinks.testsink.hdfs.batchSize = 2
      agent.sinks.testsink.hdfs.writeFormat = Text
      agent.sinks.testsink.channel = testch
      agent.channels.testch.type = memory

      After ingesting 6 events ("a" to "f"), 2 files were created on HDFS, as expected. However, some events are missing when the contents are listed in the Spark shell:

      scala> sqlContext.read.avro("/user/flume/test/FlumeData.14908867127*.avro").collect().map(a => new String(a(1).asInstanceOf[Array[Byte]])).foreach(println)

      hdfs fsck also confirms that the blocks are still in UNDER_CONSTRUCTION state:

      $ hdfs fsck /user/flume/test/ -openforwrite -files -blocks
      FSCK started by root (auth:SIMPLE) from / for path /user/flume/test/ at Thu Mar 30 08:23:56 PDT 2017
      /user/flume/test/ <dir>
      /user/flume/test/FlumeData.1490887185312.avro 310 bytes, 1 block(s), OPENFORWRITE:  MISSING 1 blocks of total size 310 B
      0. BP-1285398861-{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-e0d04bef-a861-40b0-99dd-27bfb2871ecd:NORMAL:|RBW], ReplicaUnderConstruction[[DISK]DS-d1979e0c-db81-4790-b225-ae8a4cf42dd8:NORMAL:|RBW], ReplicaUnderConstruction[[DISK]DS-ca00550d-702e-4892-a54a-7105af0c19ee:NORMAL:|RBW]]} len=310 MISSING!
      /user/flume/test/FlumeData.1490887185313.avro 292 bytes, 1 block(s), OPENFORWRITE:  MISSING 1 blocks of total size 292 B
      0. BP-1285398861-{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-ca00550d-702e-4892-a54a-7105af0c19ee:NORMAL:|RBW], ReplicaUnderConstruction[[DISK]DS-e0d04bef-a861-40b0-99dd-27bfb2871ecd:NORMAL:|RBW], ReplicaUnderConstruction[[DISK]DS-d1979e0c-db81-4790-b225-ae8a4cf42dd8:NORMAL:|RBW]]} len=292 MISSING!

      These blocks have to be recovered by starting lease recovery on the NameNode, which then runs block recovery as well. Lease recovery can be triggered programmatically via the DFSClient.
      Adding this call to the code path taken when the close fails solves the issue.
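
      A minimal sketch of the close-then-recover fallback described above. It is not the actual patch: FileCloser, LeaseRecoverer, closeOrRecover, and the retry parameters are hypothetical names introduced for illustration, and the HDFS call is hidden behind an interface so that the example compiles without Hadoop on the classpath. In a real sink, LeaseRecoverer would presumably delegate to the HDFS client's recoverLease call, which returns true once the file is closed.

```java
import java.io.IOException;

final class CloseWithLeaseRecovery {
    private CloseWithLeaseRecovery() {}

    /** Hypothetical stand-in for closing the sink's open HDFS file. */
    interface FileCloser {
        void close() throws IOException;
    }

    /** Hypothetical stand-in for the HDFS client call that starts lease recovery. */
    interface LeaseRecoverer {
        /** Returns true once the file is closed and the lease has been released. */
        boolean recoverLease(String path) throws IOException;
    }

    /**
     * Tries to close the file. If the close fails, asks for lease recovery
     * up to maxAttempts times instead of waiting for the lease hard limit.
     * Returns true if the file ended up closed, false otherwise.
     */
    static boolean closeOrRecover(FileCloser closer, LeaseRecoverer recoverer,
                                  String path, int maxAttempts, long retryDelayMillis) {
        try {
            closer.close();
            return true; // normal case: close succeeded, nothing to recover
        } catch (IOException closeFailure) {
            // Close failed (e.g. timeout): fall through to lease recovery.
        }
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                if (recoverer.recoverLease(path)) {
                    return true; // recovery finished, file is closed
                }
            } catch (IOException recoveryFailure) {
                // Treated as transient in this sketch: retry below.
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(retryDelayMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve interrupt status
                    return false;
                }
            }
        }
        return false; // still not closed; caller should log and possibly reschedule
    }
}
```

      The retry loop is there because a recovery request may report that the file is not closed yet while recovery is still in progress, so a single call is not necessarily enough.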

      cc jojochuang


              denes Denes Arvay
