Details
- Bug
- Status: Resolved
- Critical
- Resolution: Fixed
- 1.7.0
- None
Description
HDFSEventSink.process() is already prepared for a rare race condition: the BucketWriter acquired in line 389 may be closed by another thread (e.g. because of the idleTimeout or the rollInterval) before append() is called in line 406.
In that case, the BucketWriter.append() call throws a BucketClosedException, and the sink creates a new BucketWriter instance and appends to it.
However, this newly created instance is not added to the writers list, so it is not flushed after the processing loop finishes: https://github.com/apache/flume/blob/trunk/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java#L429
This has multiple consequences:
- unflushed data might get lost
- the BucketWriter's idleAction won't be scheduled (https://github.com/apache/flume/blob/trunk/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java#L450), which means that the file will be neither closed nor renamed if the idle timeout is the only trigger for closing it.
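The control flow described above can be illustrated with a minimal, self-contained sketch. It is not the Flume code: `BucketRetrySketch`, `Writer`, `ClosedException`, `appendWithRetry`, `flushAll` and the `trackReplacement` flag are hypothetical stand-ins, the exception is unchecked only to keep the example short, and tracking the first writer before the append is an assumption of the sketch. It shows how a replacement writer misses the final flush when it is left out of the writers list, and how adding it to the list would presumably avoid that.

```java
import java.util.ArrayList;
import java.util.List;

class BucketRetrySketch {

    /** Hypothetical stand-in for BucketClosedException (unchecked here for brevity). */
    static class ClosedException extends RuntimeException {}

    /** Hypothetical stand-in for BucketWriter; models only closed/flushed state. */
    static class Writer {
        boolean closed;
        boolean flushed;
        final List<String> buffer = new ArrayList<>();

        void append(String event) {
            if (closed) {
                throw new ClosedException();
            }
            buffer.add(event);
        }

        void flush() {
            flushed = true;
        }
    }

    /**
     * Appends one event. If the writer was closed by another thread in the
     * meantime, a replacement writer is created and used instead.
     * trackReplacement == false models the reported behaviour (the replacement
     * is not added to the writers list); true models a possible fix.
     */
    static Writer appendWithRetry(Writer writer, String event,
                                  List<Writer> writers, boolean trackReplacement) {
        // Assumption: the writer is tracked before the append is attempted.
        if (!writers.contains(writer)) {
            writers.add(writer);
        }
        try {
            writer.append(event);
            return writer;
        } catch (ClosedException e) {
            Writer replacement = new Writer();
            replacement.append(event);
            if (trackReplacement && !writers.contains(replacement)) {
                writers.add(replacement);
            }
            return replacement;
        }
    }

    /** Models the flush over all tracked writers after the processing loop. */
    static void flushAll(List<Writer> writers) {
        for (Writer w : writers) {
            if (!w.closed) {
                w.flush();
            }
        }
    }
}
```

Calling `appendWithRetry` with an already closed writer and `trackReplacement` set to `false` leaves the replacement unflushed after `flushAll`, which corresponds to the first consequence listed above. With `true`, the replacement is flushed along with the other tracked writers.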