FLUME-1173: HDFSEventSink can leave orphaned .tmp files

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 1.1.0
    • Fix Version/s: None
    • Component/s: Sinks+Sources
    • Labels: None

    Description

      Currently, HDFSEventSink renames a .tmp file only under the following conditions:

      1) An event is written to the file and that write hits one of the three roll criteria
      2) The HDFSEventSink is stopped, which closes all writers and thus renames all currently open .tmp files
      3) The maximum number of open files is hit, so older writers are closed and their .tmp files get renamed

      The problem I see is that if events are routed to a path by timestamp, say by day or hour, no more events are written to that path once that time period has passed. If the last event arrives at an inopportune time, say 5 minutes after the last roll when you're rolling once an hour, you could be left with an orphaned .tmp file that won't get renamed until (2) or (3) happens. Unless you set the maximum number of open files low, that could be quite a long time.
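
To make the scenario concrete, here is a rough toy model of the three rename triggers listed above. It is only a sketch for reasoning about the timeline, not Flume's actual code: `ToySink`, `simulate`, the `day1`/`day2` paths, and the limit of 100 open files are all hypothetical, and only the time-based roll criterion is modeled.

```python
from collections import OrderedDict
from dataclasses import dataclass, field


@dataclass
class ToySink:
    """Toy stand-in for the sink; all names here are hypothetical."""
    roll_interval: int   # seconds; the only roll criterion modeled here
    max_open_files: int  # limit on simultaneously open writers
    # path -> time (seconds) at which the current .tmp file was opened
    open_writers: OrderedDict = field(default_factory=OrderedDict)
    # paths whose .tmp file has been closed and renamed, in order
    renamed: list = field(default_factory=list)

    def _close(self, path):
        del self.open_writers[path]
        self.renamed.append(path)

    def append(self, path, now):
        opened = self.open_writers.get(path)
        # (1) The roll check only runs when an event is written to this path.
        if opened is not None and now - opened >= self.roll_interval:
            self._close(path)
            opened = None
        if opened is None:
            # (3) Hitting the open-file limit closes the oldest writers.
            while len(self.open_writers) >= self.max_open_files:
                self._close(next(iter(self.open_writers)))
            self.open_writers[path] = now

    def stop(self):
        # (2) Stopping the sink closes every writer.
        for path in list(self.open_writers):
            self._close(path)


def simulate():
    """Daily paths, hourly roll, last day1 event 5 minutes after a roll."""
    sink = ToySink(roll_interval=3600, max_open_files=100)
    sink.append("day1", 0)       # opens the first day1 .tmp file
    sink.append("day1", 3600)    # hits the roll: rename, open a new .tmp
    sink.append("day1", 3900)    # last event for day1, 5 minutes later
    sink.append("day2", 86400)   # from here on, events go to day2 only
    sink.append("day2", 90000)   # day2 rolls as usual
    return sink


if __name__ == "__main__":
    sink = simulate()
    # day1 still has an open writer, i.e. an orphaned .tmp file
    print(list(sink.open_writers))
```

In this model the second day1 file stays in `open_writers` no matter how many day2 events arrive. Only `stop()` or enough distinct paths to exceed `max_open_files` would get it renamed.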

            People

              Assignee: Mike Percy (mpercy)
              Reporter: Joey Echeverria (fwiffo)