Flume / FLUME-1864

Allow hdfs idle callback to clean up closed bucket writers

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.4.0
    • Fix Version/s: 1.4.0
    • Component/s: None
    • Labels: None

    Description

      In the original implementation of the idle file closing behaviour, the idle callback was cancelled on close(). This makes sense as long as every other code path that closes a file also cleans up after itself.

      On the other hand, rollInterval closes a file but leaves the bucketWriter in the writer map. This allows incrementally named files to be created when the same path is reopened. However, in some situations (primarily with time-bucketed data), this leaves behind many abandoned bucket writers: once rollInterval has closed them, their idle callback has been cancelled, so idle never removes them.
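
      To make the leak concrete, here is a toy model of the behaviour described above. It is only a sketch: RollLeakSketch, Writer, and the hour=NN paths are hypothetical names chosen for illustration and are not Flume's actual classes. It assumes a roll only closes the file and that reopening the same path produces the next numbered file.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model only; none of these names come from Flume itself. */
class RollLeakSketch {
    /** Stand-in for a bucket writer: tracks whether a file is open and a file counter. */
    static final class Writer {
        boolean open = false;
        int fileIndex = 0;

        void append() {
            if (!open) {      // reopening the same path yields the next numbered file
                fileIndex++;
                open = true;
            }
        }

        void roll() {         // in this model a roll only closes the file
            open = false;
        }
    }

    /** Writes one event per hourly bucket, rolls it, and returns how many writers stay in the map. */
    static int writersLeftAfter(int hours) {
        Map<String, Writer> writers = new LinkedHashMap<>();
        for (int h = 0; h < hours; h++) {
            String path = String.format("/flume/events/hour=%02d", h);
            Writer w = writers.computeIfAbsent(path, p -> new Writer());
            w.append();
            w.roll();         // the file is closed, but the map entry is never removed
        }
        return writers.size();
    }

    /** Appends and rolls the same writer repeatedly; returns the index of the last file opened. */
    static int lastFileIndex(int rolls) {
        Writer w = new Writer();
        for (int i = 0; i < rolls; i++) {
            w.append();
            w.roll();
        }
        return w.fileIndex;
    }

    public static void main(String[] args) {
        System.out.println(writersLeftAfter(24)); // one closed writer per hourly bucket
        System.out.println(lastFileIndex(3));     // the same writer numbers its files in sequence
    }
}
```

      Keeping the writer in the map is what lets lastFileIndex keep counting, and it is also why writersLeftAfter grows by one for every time bucket.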

      In FLUME-1850 a couple of approaches were suggested, and I originally intended to fix this by using the callback from rollInterval to remove the writer from the map. However, this would break incremental naming.

      Until we change the rolling logic (if we ever do), the more viable option is to let the idle timer persist after a close. That way it can release resources that are no longer needed because rollInterval has already closed the file. It also removes one hard-to-understand interaction between configuration variables.
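
      The difference between cancelling the idle timer on close and letting it persist can be sketched with a small simulation. This is a minimal model under stated assumptions, not the actual patch: IdleCleanupSketch, its cancelIdleOnClose flag, and the logical clock passed to tick are all hypothetical, and a real sink would use scheduled callbacks rather than a manual tick.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

/** Toy model only; none of these names come from Flume itself. */
class IdleCleanupSketch {
    static final class Writer {
        boolean open;
        long lastAppend;

        void append(long now) { open = true; lastAppend = now; }
        void roll() { open = false; }   // a roll closes the file and nothing else
    }

    final Map<String, Writer> writers = new HashMap<>();
    final long idleTimeout;
    final boolean cancelIdleOnClose;    // true models the old behaviour, false the proposed one

    IdleCleanupSketch(long idleTimeout, boolean cancelIdleOnClose) {
        this.idleTimeout = idleTimeout;
        this.cancelIdleOnClose = cancelIdleOnClose;
    }

    void append(String path, long now) {
        writers.computeIfAbsent(path, p -> new Writer()).append(now);
    }

    void roll(String path) {
        Writer w = writers.get(path);
        if (w != null) {
            w.roll();
        }
    }

    /** Runs the idle check at logical time now and removes writers whose timer has expired. */
    void tick(long now) {
        Iterator<Map.Entry<String, Writer>> it = writers.entrySet().iterator();
        while (it.hasNext()) {
            Writer w = it.next().getValue();
            // If the timer is cancelled on close, a closed writer is never checked again.
            boolean timerAlive = w.open || !cancelIdleOnClose;
            if (timerAlive && now - w.lastAppend >= idleTimeout) {
                w.open = false;
                it.remove();            // release the writer and drop it from the map
            }
        }
    }

    /** Writes to three time buckets, rolls each, then runs the idle check much later. */
    static int remaining(boolean cancelIdleOnClose) {
        IdleCleanupSketch sink = new IdleCleanupSketch(60, cancelIdleOnClose);
        for (int h = 0; h < 3; h++) {
            String path = "/flume/events/hour=" + h;
            sink.append(path, h * 100L);
            sink.roll(path);
        }
        sink.tick(1000);
        return sink.writers.size();
    }

    public static void main(String[] args) {
        System.out.println(remaining(true));  // timer cancelled on close: writers stay in the map
        System.out.println(remaining(false)); // timer persists: writers are cleaned up
    }
}
```

      In this model incremental naming is untouched, because a writer is only removed once it has been idle for the full timeout, not at the moment it is rolled.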

          People

            Assignee: Juhani Connolly (juhanic)
            Reporter: Juhani Connolly (juhanic)
            Votes: 0
            Watchers: 3
