Flume / FLUME-1779

Backoff returned from HDFSEventSink after IOException

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      Status.BACKOFF is returned from HDFSEventSink.process() when an IOException occurs. This behavior prevents FailoverSinkProcessor from passing the event to the next sink in its queue.
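
The effect can be sketched with a small model of a failover loop. This is a minimal illustration, assuming the processor only moves to the next sink when process() throws; FailoverSketch, its Sink interface, brokenSink and healthySink are hypothetical stand-ins, not the real Flume classes.

```java
import java.io.IOException;
import java.util.List;

// Simplified stand-ins for illustration only; these are not the real Flume classes.
class FailoverSketch {
    enum Status { READY, BACKOFF }

    interface Sink {
        String name();
        Status process() throws Exception;
    }

    // Tries sinks in priority order and returns the name of the sink that handled the call.
    static String deliver(List<Sink> sinksByPriority) {
        for (Sink sink : sinksByPriority) {
            try {
                sink.process(); // READY and BACKOFF both count as "sink is alive"
                return sink.name();
            } catch (Exception e) {
                // Only an exception marks the sink as failed and moves on to the next one.
            }
        }
        throw new IllegalStateException("all sinks failed");
    }

    // A sink whose storage is down: it either reports BACKOFF or propagates the failure.
    static Sink brokenSink(String name, boolean throwOnFailure) {
        return new Sink() {
            @Override public String name() { return name; }
            @Override public Status process() throws IOException {
                if (throwOnFailure) {
                    throw new IOException("HDFS unavailable");
                }
                return Status.BACKOFF;
            }
        };
    }

    static Sink healthySink(String name) {
        return new Sink() {
            @Override public String name() { return name; }
            @Override public Status process() { return Status.READY; }
        };
    }
}
```

In this model, deliver(List.of(brokenSink("hdfs", false), healthySink("backup"))) stays on "hdfs", while passing true for throwOnFailure lets the call reach "backup".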

      In my test case, the IOException is caused by a serious HDFS failure, for example when all DataNodes in the cluster are dead. After such a failure, BucketWriter throws an IOException and becomes unusable; it should probably be removed from the sfWriters map.
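
One way to picture the suggested cleanup is a writer cache that drops an entry as soon as it fails. This is only a sketch under assumptions: WriterCacheSketch, its Writer interface and the openWriter factory are hypothetical, and the map merely plays the role that sfWriters has in the sink.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical model of a per-path writer cache; not the actual HDFSEventSink code.
class WriterCacheSketch {
    interface Writer {
        void append(String event) throws IOException;
    }

    // Plays the role of the sfWriters map mentioned in the description.
    private final Map<String, Writer> writers = new HashMap<>();
    private final Function<String, Writer> openWriter;

    WriterCacheSketch(Function<String, Writer> openWriter) {
        this.openWriter = openWriter;
    }

    void write(String path, String event) throws IOException {
        Writer writer = writers.computeIfAbsent(path, openWriter);
        try {
            writer.append(event);
        } catch (IOException e) {
            // Drop the broken writer so the next attempt opens a fresh one.
            writers.remove(path);
            throw e;
        }
    }

    boolean isCached(String path) {
        return writers.containsKey(path);
    }
}
```

With this shape, a writer that failed once is never reused; the next write for the same path goes through the factory again.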

      1. st (10 kB, Jaroslaw Grabowski)
      2. FLUME-1799.patch (0.8 kB, Connor Woodson)

        Activity

        Jaroslaw Grabowski added a comment -

        Stack trace captured after all DataNodes (a single one in this setup) died.

        Connor Woodson added a comment (edited) -

        The simplest fix that lets the sink work under the failover processor is to replace the return statement with a thrown exception. However, when the sink is not used under a failover processor, BACKOFF is the more desirable return value, since the events may eventually become writable. A config setting such as 'retryBadConnection', defaulting to true, could keep the return statement and so preserve backward compatibility, while a value of false would throw the exception. Is there a better way?

        Also, I haven't fully tested what happens when an IOException occurs; does the bucket writer need to be removed as well?
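
The proposed toggle might look roughly like the sketch below. It is an illustration under assumptions, not the attached patch: RetryToggleSketch, HdfsWrite and DeliveryException are hypothetical names, and 'retryBadConnection' is only the setting suggested in this comment, not an existing Flume option.

```java
import java.io.IOException;

// Hypothetical model of the proposed setting; not the actual HDFSEventSink code.
class RetryToggleSketch {
    enum Status { READY, BACKOFF }

    // Stand-in for the exception a sink would throw to signal failed delivery.
    static class DeliveryException extends Exception {
        DeliveryException(Throwable cause) {
            super(cause);
        }
    }

    // Stand-in for the HDFS write performed inside process().
    interface HdfsWrite {
        void run() throws IOException;
    }

    private final boolean retryBadConnection;

    RetryToggleSketch(boolean retryBadConnection) {
        this.retryBadConnection = retryBadConnection;
    }

    Status process(HdfsWrite write) throws DeliveryException {
        try {
            write.run();
            return Status.READY;
        } catch (IOException e) {
            if (retryBadConnection) {
                // Default: keep today's behavior and let the caller retry later.
                return Status.BACKOFF;
            }
            // Opt-in: surface the failure so a failover processor can react to it.
            throw new DeliveryException(e);
        }
    }
}
```

Defaulting the flag to true keeps existing deployments unchanged; only setups that run the sink under a failover processor would set it to false.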

        Connor Woodson added a comment -

        Here's a one-line fix that throws an exception instead of returning BACKOFF; is that all that needs to be done, or is there more to this?


          People

          • Assignee: Unassigned
          • Reporter: Jaroslaw Grabowski
          • Votes: 1
          • Watchers: 4
