FLUME-1219

Race conditions in BucketWriter / HDFSEventSink

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: v1.2.0
    • Component/s: None
    • Labels: None

      Description

      BucketWriter has several race conditions that came up during my performance testing over the weekend. One issue that caused data loss was the lack of atomic close() and open() semantics related to the "retry" mechanism after the abort() call in HDFSEventSink.process().
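To illustrate what atomic close() and open() semantics mean here, the following is a minimal sketch, not the actual BucketWriter code. The class ReopenSketch and its methods are hypothetical names introduced only for this example; the point is that the close and the reopen happen inside one critical section, so a concurrent append can never observe the closed gap between them.

```java
// Hypothetical stand-in for a bucket writer; names are illustrative, not Flume's API.
class ReopenSketch {
    private final Object lock = new Object();
    private boolean open = false;
    private int generation = 0; // number of files opened so far
    private int appended = 0;   // number of events accepted

    // Close-then-open as a single critical section: no other thread
    // can run append() between the two steps.
    void reopen() {
        synchronized (lock) {
            open = false;   // stands in for close()
            generation++;   // stands in for open() on a new file
            open = true;
        }
    }

    // Appends take the same lock, so they never interleave with a reopen.
    void append() {
        synchronized (lock) {
            if (!open) {
                throw new IllegalStateException("append on closed writer");
            }
            appended++;
        }
    }

    int generation() {
        synchronized (lock) {
            return generation;
        }
    }

    int appended() {
        synchronized (lock) {
            return appended;
        }
    }
}
```

If close() and open() were instead two separately locked calls, a thread calling append() between them would hit the closed state, which is one way events could be dropped during a retry.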

      Another issue is the lack of clearly delineated responsibilities for calling open(), flush(), close(), etc. For example, HDFSEventSink.start() calls open(); HDFSEventSink.process() calls abort(), which in turn calls open(); and BucketWriter.append() also calls close() and open().

      There is another race condition related to the JVM shutdown hooks, which cause .tmp files not to be renamed.
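One common way to avoid this kind of shutdown race is to make close() idempotent and synchronized, so that it does not matter whether the sink or the JVM shutdown hook gets there first. The sketch below assumes a local file as a stand-in for an HDFS file; TmpFileWriter and its methods are hypothetical and are not taken from the patch.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

// Hypothetical local-filesystem stand-in for an HDFS bucket file.
class TmpFileWriter {
    private final Path tmpPath;
    private final Path finalPath;
    private boolean closed = false;

    TmpFileWriter(Path finalPath) {
        this.finalPath = finalPath;
        this.tmpPath = finalPath.resolveSibling(finalPath.getFileName() + ".tmp");
        try {
            Files.createFile(tmpPath);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    synchronized void append(String line) {
        if (closed) {
            throw new IllegalStateException("writer is closed");
        }
        try {
            Files.write(tmpPath, (line + "\n").getBytes(StandardCharsets.UTF_8),
                    StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Idempotent and synchronized: the rename happens exactly once,
    // whichever caller (sink thread or shutdown hook) arrives first.
    synchronized void close() {
        if (closed) {
            return;
        }
        try {
            Files.move(tmpPath, finalPath, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        closed = true;
    }

    // Registers a JVM shutdown hook that closes (and therefore renames) the file.
    Thread registerShutdownHook() {
        Thread hook = new Thread(this::close);
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }
}
```

A caller would construct the writer, call registerShutdownHook() once, and still call close() on the normal path; the second call, from whichever side, is a no-op.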

      These APIs need to be refactored and their responsibilities need to be clarified.

      1. FLUME-1219-1.patch
        29 kB
        Mike Percy
      2. FLUME-1219-3.patch
        34 kB
        Mike Percy

          Activity

          Hudson added a comment -

          Integrated in flume-trunk #210 (See https://builds.apache.org/job/flume-trunk/210/)
          FLUME-1219. Race conditions in BucketWriter and HDFSEventSink

          (Mike Percy via Arvind Prabhakar) (Revision 1341096)

          Result = SUCCESS
          arvind : http://svn.apache.org/viewvc/?view=rev&rev=1341096
          Files :

          • /incubator/flume/trunk/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java
          • /incubator/flume/trunk/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java
          • /incubator/flume/trunk/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestBucketWriter.java
          • /incubator/flume/trunk/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSink.java
          Arvind Prabhakar added a comment -

          Patch committed. Thanks Mike!

          Mike Percy added a comment -

          The reviewboard doesn't seem to be posting to JIRA so here are my comments regarding the patch on RB:

          BucketWriter refactoring: append() does all the work of open/close/roll. open() is a private method that takes no arguments. No abort() call. Only one constructor. Far fewer entry points and code paths. I believe I've closed all or many of the race conditions and clarified the API responsibilities/semantics. Also, renaming of the .tmp files works on shutdown.
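The shape described in this comment, with append() as the single entry point and a private no-argument open(), might look roughly like the sketch below. This is an illustration under assumptions and not the committed code: SingleEntryWriter, the count-based roll policy, and the in-memory lists standing in for files are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the refactored shape; not the committed BucketWriter.
class SingleEntryWriter {
    private final int rollCount; // roll after this many events (assumed policy)
    private final List<List<String>> closedFiles = new ArrayList<>();
    private List<String> current; // null while no file is open

    // A single constructor, matching the "only one constructor" note.
    SingleEntryWriter(int rollCount) {
        this.rollCount = rollCount;
    }

    // The only write entry point: opens lazily, appends, and rolls when full.
    synchronized void append(String event) {
        if (current == null) {
            open();
        }
        current.add(event);
        if (current.size() >= rollCount) {
            close();
        }
    }

    // Idempotent close; the next append() reopens automatically.
    synchronized void close() {
        if (current != null) {
            closedFiles.add(current);
            current = null;
        }
    }

    // Private and argument-free: callers cannot open a file out of sequence.
    // Only called from append(), which already holds the lock.
    private void open() {
        current = new ArrayList<>();
    }

    synchronized int closedFileCount() {
        return closedFiles.size();
    }
}
```

Because every state change goes through synchronized methods on one object, there is no separate abort() or external open() path that could interleave with a roll.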


            People

            • Assignee: Mike Percy
            • Reporter: Mike Percy
            • Votes: 0
            • Watchers: 5
