Flume: FLUME-1428

File Channel should not consider a file as inactive until all takes are committed.

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: v1.2.0
    • Fix Version/s: v1.3.0
    • Component/s: Channel
    • Labels:
      None

      Description

      FlumeEventQueue removes a fileID from fileIDCounts as soon as its events are taken, before the takes are committed. See FLUME-1417 for background. The following sequence of events would be problematic:
      File i is the active file
      n puts
      commit
      n takes (file i is no longer in the FlumeEventQueue fileIDCounts)
      file roll -> the current active file in the directory is now file i+1
      background worker kicks in -> removes file i

      Note that the commit for the n takes has not yet taken place, so a rollback would restore pointers into a file that has already been deleted.

      We do keep 2 files per directory, but if the commit or rollback does not arrive before 2 files are rolled, file i can still be deleted while its takes are pending.
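
The sequence above can be modelled in a few lines. This is a minimal sketch for illustration only: the class and method names (EarlyDecrementQueue, committedPut, take, looksInactive) are hypothetical and are not taken from the Flume source. It assumes the counter is decremented at take time, which is the behaviour this issue describes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the problem; not Flume's actual FlumeEventQueue code.
class EarlyDecrementQueue {
    // fileID -> number of events in that file still referenced by the queue
    private final Map<Integer, Integer> fileIDCounts = new HashMap<>();

    // A committed put adds one reference to the file holding the event.
    void committedPut(int fileID) {
        fileIDCounts.merge(fileID, 1, Integer::sum);
    }

    // A take drops the reference immediately, before any commit (the bug).
    // Returning null from the lambda removes the entry when the count hits zero.
    void take(int fileID) {
        fileIDCounts.computeIfPresent(fileID, (id, n) -> n == 1 ? null : n - 1);
    }

    // The background worker treats any unreferenced, non-current file as removable.
    boolean looksInactive(int fileID, int currentFileID) {
        return fileID != currentFileID && !fileIDCounts.containsKey(fileID);
    }
}
```

With n committed puts to file 1 followed by n uncommitted takes and a roll to file 2, looksInactive(1, 2) returns true in this model even though the take transaction is still open.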

      1. FLUME-1428-1.patch
        4 kB
        Hari Shreedharan
      2. FLUME-1428.patch
        4 kB
        Hari Shreedharan

          Activity

          Brock Noland added a comment - edited

          I think we should move the reference counting logic up to the Log class and then do the reference decrement/increment on commitTake and commitPut since the FileBackedTransaction could pass in the list of FlumeEventPointers.
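
A rough sketch of what commit-time reference counting could look like follows. It is not the committed patch and not the real Log class: Pointer is a hypothetical stand-in for FlumeEventPointer, and CommitTimeRefCounts, its method signatures, and isReferenced are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a FlumeEventPointer: which file, and where in it.
final class Pointer {
    final int fileID;
    final int offset;

    Pointer(int fileID, int offset) {
        this.fileID = fileID;
        this.offset = offset;
    }
}

// Hypothetical sketch of commit-time reference counting; not the actual Log class.
class CommitTimeRefCounts {
    // fileID -> number of committed events in that file not yet committed as taken
    private final Map<Integer, Integer> refCounts = new HashMap<>();

    // Called when a put transaction commits: each event now references its file.
    void commitPut(List<Pointer> pointers) {
        for (Pointer p : pointers) {
            refCounts.merge(p.fileID, 1, Integer::sum);
        }
    }

    // Called when a take transaction commits: only now are references released.
    void commitTake(List<Pointer> pointers) {
        for (Pointer p : pointers) {
            refCounts.computeIfPresent(p.fileID, (id, n) -> n == 1 ? null : n - 1);
        }
    }

    // A file with any remaining reference must not be treated as inactive.
    boolean isReferenced(int fileID) {
        return refCounts.containsKey(fileID);
    }
}
```

Because nothing is decremented until commitTake runs, a file stays referenced for the whole lifetime of an open take transaction, and a rollback needs no bookkeeping at all in this model.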

          Hari Shreedharan added a comment -

          Assigning to myself for now. I am trying a couple of approaches:
          1. Move the reference counting to the Log class, pass the counts to the queue while checkpointing, and retrieve them from the queue before replaying. The replay handler provides the refcount changes from the replay, and the Log class applies this delta to the original values in the checkpoint.

          2. If 1 is too complex and causes issues, I will check in something that works reasonably well: add a configuration variable that sets a cleanUpInterval, and delete the "inactive" files only after that interval has passed. The downside is that there might be extra "old" data on disk, but this will work around the current problem. I am not yet sure how this will behave across restarts, e.g. a shutdown followed by a restart several hours later.
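
For approach 1, applying the replay delta to the checkpointed counts might look roughly like this. The helper name RefCountReplay.applyDelta and the map-based representation (fileID -> count) are assumptions for illustration; the comment does not say how the counts would actually be stored.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; the names and the map-based representation are assumptions.
class RefCountReplay {
    // Returns the checkpointed counts adjusted by the replayed delta. Files whose
    // count ends at zero or below are dropped. Neither input map is modified.
    static Map<Integer, Integer> applyDelta(Map<Integer, Integer> checkpoint,
                                            Map<Integer, Integer> delta) {
        Map<Integer, Integer> result = new HashMap<>(checkpoint);
        for (Map.Entry<Integer, Integer> e : delta.entrySet()) {
            result.merge(e.getKey(), e.getValue(), Integer::sum);
        }
        result.values().removeIf(n -> n <= 0);
        return result;
    }
}
```

For example, a checkpoint of {1: 3, 2: 1} combined with a replay delta of {1: -3, 2: +2, 3: +4} leaves file 1 unreferenced and files 2 and 3 with counts of 3 and 4.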

          Brock Noland added a comment -

          Committed here https://git-wip-us.apache.org/repos/asf?p=flume.git;a=commit;h=5289ccc566a02b7ca59e0f5ae39dd0a4369f48cb
          Thank you for the contribution, Hari!

          Hudson added a comment -

          Integrated in flume-trunk #289 (See https://builds.apache.org/job/flume-trunk/289/)
          FLUME-1428: File Channel should not consider a file as inactive until all takes are committed (Revision 5289ccc566a02b7ca59e0f5ae39dd0a4369f48cb)

          Result = SUCCESS
          brock : http://git-wip-us.apache.org/repos/asf/flume/repo?p=flume.git;a=summary&a=commit&h=5289ccc566a02b7ca59e0f5ae39dd0a4369f48cb
          Files :

          • flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestFileChannel.java
          • flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/FlumeEventQueue.java

            People

            • Assignee:
              Hari Shreedharan
              Reporter:
              Hari Shreedharan
            • Votes:
              0
              Watchers:
              3
