Flume / FLUME-3085

HDFS Sink can skip flushing some BucketWriters, might lead to data loss

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.7.0
    • Fix Version/s: 1.8.0
    • Component/s: Sinks+Sources
    • Labels:
      None

      Description

      The HDFSEventSink.process() method is already prepared for a rare race condition, namely when the BucketWriter acquired in line 389 gets closed by another thread (e.g. because of the idleTimeout or the rollInterval) before append() is called in line 406.
      If this is the case, the BucketWriter.append() call throws a BucketClosedException and the sink creates a new BucketWriter instance and appends to it.
      However, this newly created instance is not added to the writers list, which means that it won't be flushed after the processing loop has finished: https://github.com/apache/flume/blob/trunk/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java#L429

      This has multiple consequences, most notably that the transaction can be committed while the events appended to the replacement BucketWriter have not yet been flushed to HDFS, which might lead to data loss.
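      The race described above can be sketched with a minimal standalone model. This is not the actual Flume implementation: the class and field names (BucketWriter, sfWriters, writers, BucketClosedException) mirror the real sources, but the bodies are simplified stand-ins that only demonstrate why the replacement writer is skipped on flush.

      ```java
      import java.util.ArrayList;
      import java.util.HashMap;
      import java.util.List;
      import java.util.Map;

      // Hypothetical, simplified model of the HDFS sink race; not Flume code.
      class BucketClosedException extends RuntimeException {}

      class BucketWriter {
          private boolean closed = false;
          boolean flushed = false;

          void append(String event) {
              // A closed writer rejects appends, as in the real BucketWriter.
              if (closed) throw new BucketClosedException();
          }
          void close() { closed = true; }
          void flush() { flushed = true; }
      }

      public class SinkRaceSketch {
          // Path -> writer cache, analogous to the sink's sfWriters map.
          static Map<String, BucketWriter> sfWriters = new HashMap<>();
          // Writers touched in this transaction; only these are flushed.
          static List<BucketWriter> writers = new ArrayList<>();

          static BucketWriter appendEvent(String path, String event) {
              BucketWriter bw = sfWriters.computeIfAbsent(path, p -> new BucketWriter());
              if (!writers.contains(bw)) writers.add(bw);
              try {
                  bw.append(event);
              } catch (BucketClosedException e) {
                  // The bug: the replacement writer receives the event but is
                  // never added to 'writers', so the flush loop skips it.
                  bw = new BucketWriter();
                  sfWriters.put(path, bw);
                  bw.append(event);
              }
              return bw;
          }

          public static void main(String[] args) {
              BucketWriter first = new BucketWriter();
              sfWriters.put("/hdfs/path", first);
              first.close(); // e.g. closed concurrently by the idleTimeout logic

              BucketWriter actual = appendEvent("/hdfs/path", "event-1");

              // End of the processing loop: flush every tracked writer.
              for (BucketWriter w : writers) w.flush();

              // The writer that actually holds the event was never flushed.
              System.out.println(actual.flushed); // prints "false"
          }
      }
      ```

      The flush loop at the end stands in for the loop at HDFSEventSink.java line 429: it only iterates the writers list, so the event sitting in the replacement writer is committed without ever being flushed.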


               People

               • Assignee: Denes Arvay
               • Reporter: Denes Arvay
               • Votes: 0
               • Watchers: 6
