FLUME-2566: BucketWriter tries to close file endlessly

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 1.5.1
    • Fix Version/s: 1.6.0
    • Component/s: Sinks+Sources
    • Labels: None

      Description

      The following scenario causes BucketWriter to go into an endless loop trying to close a file:

      On the first call to close(), there is a timeout (due to HDFS being temporarily overloaded):

      12:53:57.363 [hdfs-hdfs_sink_3-roll-timer-0] WARN   o.a.flume.sink.hdfs.BucketWriter - failed to close() HDFSWriter for file (/rawdata/medusa/data/p_nl_omm_goat_medusa01/20141202/node1/FOO/medusa.1417521000000.1417521129207.avro.tmp). Exception follows.
      java.io.IOException: Callable timed out after 10000 ms on file: /rawdata/medusa/data/p_nl_omm_goat_medusa01/20141202/node1/FOO/medusa.1417521000000.1417521129207.avro.tmp
      	at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:736) ~[flume-hdfs-sink.jar:na]
      	at org.apache.flume.sink.hdfs.BucketWriter.close(BucketWriter.java:417) ~[flume-hdfs-sink.jar:na]
      	at org.apache.flume.sink.hdfs.BucketWriter$5.call(BucketWriter.java:476) [flume-hdfs-sink.jar:na]
      	at org.apache.flume.sink.hdfs.BucketWriter$5.call(BucketWriter.java:471) [flume-hdfs-sink.jar:na]
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_20]
      	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_20]
      	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [na:1.8.0_20]
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_20]
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_20]
      

      BucketWriter then schedules an operation to retry the close. This retry, and every retry after it, fails because the channel has by now been closed successfully and refuses to flush/close again. Instead it throws a ClosedChannelException, which causes BucketWriter to schedule yet another close retry (which will fail again), and so on indefinitely.

      14:32:58.793 [hdfs-hdfs_sink_3-roll-timer-0] WARN   o.a.flume.sink.hdfs.BucketWriter - Closing file: /rawdata/medusa/data/p_nl_omm_goat_medusa01/20141202/node1/FOO/medusa.1417521000000.1417521129207.avro.tmp failed. Will retry again in 180 seconds.
      java.nio.channels.ClosedChannelException: null
      	at org.apache.hadoop.hdfs.DFSOutputStream.checkClosed(DFSOutputStream.java:1527) ~[hadoop-hdfs.jar:na]
      	at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:1843) ~[hadoop-hdfs.jar:na]
      	at org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:1803) ~[hadoop-hdfs.jar:na]
      	at org.apache.hadoop.hdfs.DFSOutputStream.sync(DFSOutputStream.java:1788) ~[hadoop-hdfs.jar:na]
      	at org.apache.hadoop.fs.FSDataOutputStream.sync(FSDataOutputStream.java:120) ~[hadoop-common.jar:na]
      	at org.apache.flume.sink.hdfs.HDFSDataStream.close(HDFSDataStream.java:139) ~[flume-hdfs-sink.jar:na]
      	at org.apache.flume.sink.hdfs.BucketWriter$3.call(BucketWriter.java:341) ~[flume-hdfs-sink.jar:na]
      	at org.apache.flume.sink.hdfs.BucketWriter$3.call(BucketWriter.java:335) ~[flume-hdfs-sink.jar:na]
      	at org.apache.flume.sink.hdfs.BucketWriter$9$1.run(BucketWriter.java:722) ~[flume-hdfs-sink.jar:na]
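
      The loop described above could be broken by treating a ClosedChannelException on a close retry as evidence that the file is already closed, rather than as a failure worth rescheduling. A minimal sketch of that idea (illustrative only; `CloseRetrySketch` and `closeWithRetry` are hypothetical names, not part of Flume's actual BucketWriter API):

```java
import java.io.IOException;
import java.nio.channels.ClosedChannelException;
import java.util.concurrent.Callable;

public class CloseRetrySketch {

    /**
     * Retries closeAction up to maxRetries times. A ClosedChannelException
     * means the underlying stream was already closed (an earlier attempt
     * succeeded despite timing out), so it is treated as success instead of
     * triggering another retry - avoiding the endless loop in this report.
     */
    static boolean closeWithRetry(Callable<Void> closeAction, int maxRetries) {
        for (int attempt = 0; attempt < maxRetries; attempt++) {
            try {
                closeAction.call();
                return true;            // close succeeded
            } catch (ClosedChannelException e) {
                return true;            // already closed: do not reschedule
            } catch (Exception e) {
                // transient failure (e.g. timeout): fall through and retry
            }
        }
        return false;                   // gave up after maxRetries attempts
    }

    public static void main(String[] args) throws Exception {
        // Simulate the reported scenario: the stream is already closed, so
        // every subsequent close attempt throws ClosedChannelException.
        Callable<Void> alreadyClosed = () -> { throw new ClosedChannelException(); };
        System.out.println(closeWithRetry(alreadyClosed, 3)); // prints true, no endless loop
    }
}
```

      With this pattern the retry count is bounded and an already-closed channel terminates the loop immediately, which matches the resolution direction of the duplicate issue this report was closed against.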
      

      People

      • Assignee: jrufus (Johny Rufus)
      • Reporter: tychol (Tycho Lamerigts)
      • Votes: 0
      • Watchers: 5
