  Flume / FLUME-1610

HDFSEventSink and bucket writer have a race condition

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.2.0
    • Fix Version/s: 1.3.0
    • Component/s: Sinks+Sources
    • Labels: None

    Description

      I have seen a scenario where an exception was thrown during HDFSEventSink.process: flush was called on the bucket writer, but the BucketWriter was already closed.

      Assumptions:
      1) When HDFSEventSink.process finishes a batch (the channel returns null or the batch size is reached), it flushes every bucket writer it wrote to.
      2) The BucketWriter.flush method does not check the isOpen flag.
      3) The time-based roll code assumes the next call to the bucket writer will be append, so that the isOpen flag is checked and the underlying writer is re-opened.
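
To make assumptions 2 and 3 concrete, here is a minimal sketch in Java. It is a simplified model, not the real Flume BucketWriter: the class names UnguardedFlushSketch, SketchWriter and SketchStream are hypothetical, and the only behavior modeled is that append checks isOpen while flush does not.

```java
import java.io.IOException;

// Hypothetical, simplified model of the assumptions above.
// None of these classes exist in Flume; they only illustrate the flag handling.
class UnguardedFlushSketch {

    // Stands in for the underlying HDFS writer: syncing after close fails.
    static class SketchStream {
        private boolean closed = false;

        void sync() throws IOException {
            if (closed) {
                throw new IOException("stream is closed");
            }
        }

        void close() {
            closed = true;
        }
    }

    static class SketchWriter {
        private boolean isOpen = false;
        private SketchStream stream;

        private void open() {
            stream = new SketchStream();
            isOpen = true;
        }

        // Stands in for a roll: the underlying stream is closed.
        void close() {
            if (isOpen) {
                stream.close();
                isOpen = false;
            }
        }

        // Assumption 3: append checks isOpen and re-opens the writer if needed.
        void append(String event) {
            if (!isOpen) {
                open();
            }
            // Writing the event is omitted in this sketch.
        }

        // Assumption 2: flush goes straight to the stream without checking isOpen.
        void flush() throws IOException {
            stream.sync();
        }

        boolean isOpen() {
            return isOpen;
        }
    }

    public static void main(String[] args) throws IOException {
        SketchWriter writer = new SketchWriter();
        writer.append("event-1");
        writer.close();                 // a roll closes the writer
        try {
            writer.flush();             // no isOpen check, so this fails
        } catch (IOException e) {
            System.out.println("flush failed: " + e.getMessage());
        }
        writer.append("event-2");       // append re-opens the writer
        writer.flush();                 // now succeeds
        System.out.println("flush after append succeeded");
    }
}
```

In this model the writer is only safe if every close is followed by an append before the next flush, which is the ordering the roll code relies on.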

      As such, I think what is happening is this:
      1) In HDFSEventSink.process, the bucket writer is written to.
      2) In BucketWriter, the time-based roll trips and the underlying writer is closed.
      3) In HDFSEventSink.process, the channel returns null or the batch size is reached.
      4) In HDFSEventSink.process, flush is called on the bucket writer, which throws the exception described above.
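
The four steps can be reproduced deterministically with a small Java sketch. Everything here is an assumption made for illustration: ProcessRaceSketch, Writer, timeRoll and the guardFlush switch are hypothetical names, the roll is forced at a fixed point rather than triggered by a timer, and the guard shown is just one possible mitigation, not necessarily what the attached patches do.

```java
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Queue;
import java.util.Set;

// Hypothetical model of the sequence above; not the real HDFSEventSink.
class ProcessRaceSketch {

    static class Writer {
        private boolean open = false;
        private final boolean guardFlush;
        int flushes = 0;

        Writer(boolean guardFlush) {
            this.guardFlush = guardFlush;
        }

        // append re-opens the writer on demand.
        void append(String event) {
            if (!open) {
                open = true;
            }
        }

        // Stands in for the time-based roll closing the file (step 2).
        void timeRoll() {
            open = false;
        }

        void flush() throws IOException {
            if (guardFlush && !open) {
                return; // hypothetical guard: skip flush when already closed
            }
            if (!open) {
                throw new IOException("flush called on closed writer");
            }
            flushes++;
        }
    }

    // Mirrors steps 1-4: drain the channel, let the roll trip, then flush.
    static void process(Queue<String> channel, Writer writer, int batchSize)
            throws IOException {
        Set<Writer> written = new LinkedHashSet<>();
        for (int i = 0; i < batchSize; i++) {
            String event = channel.poll();
            if (event == null) {
                break;                  // step 3: channel returned null
            }
            writer.append(event);       // step 1: bucket writer is written to
            written.add(writer);
        }
        writer.timeRoll();              // step 2: forced here to fix the ordering
        for (Writer w : written) {
            w.flush();                  // step 4: flush after the roll
        }
    }

    public static void main(String[] args) throws IOException {
        Queue<String> channel = new ArrayDeque<>(Arrays.asList("e1", "e2"));
        try {
            process(channel, new Writer(false), 10);
        } catch (IOException e) {
            System.out.println("unguarded: " + e.getMessage());
        }
        Writer guarded = new Writer(true);
        process(new ArrayDeque<>(Arrays.asList("e1")), guarded, 10);
        System.out.println("guarded flush count: " + guarded.flushes);
    }
}
```

Skipping the flush is only safe if closing the writer already persisted the buffered data. The sketch assumes that and does not model it.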

      Attachments

        1. FLUME-1610-0.patch
          3 kB
          Brock Noland
        2. FLUME-1610-3.patch
          5 kB
          Mike Percy

            People

              mpercy Mike Percy
              brocknoland Brock Noland
              Votes: 0
              Watchers: 3
