Flume / FLUME-1702

HDFSEventSink should write to a hidden file as opposed to a .tmp file


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.4.0
    • Component/s: None
    • Labels: None

    Description

      Currently we write to a .tmp file. The problem is that if MR jobs are run on the directory we are writing to, it is common for an MR job to list the directory, pick up a .tmp file, and then have that file renamed in the meantime, causing the job to fail when it runs.

      With Java MapReduce you can use a PathFilter to avoid this; Pig, Hive, etc. each require a custom solution.
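
      As a rough illustration of the PathFilter workaround, the sketch below shows the kind of name check such a filter might apply. Hadoop's `org.apache.hadoop.fs.PathFilter` interface has a single `accept(Path)` method; to stay runnable without Hadoop on the classpath, this sketch works on plain file-name strings instead. The class name `InFlightFileFilter` and the file names are hypothetical, and the `.tmp` suffix is taken from the description above.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Name check that a PathFilter-style filter could apply to skip in-flight files. */
final class InFlightFileFilter {

    // Assumption: in-flight files end in ".tmp" (per the issue description);
    // names starting with "_" or "." are also skipped as hidden.
    static boolean accept(String fileName) {
        if (fileName.isEmpty()) {
            return false;
        }
        boolean hidden = fileName.startsWith("_") || fileName.startsWith(".");
        boolean inFlight = fileName.endsWith(".tmp");
        return !hidden && !inFlight;
    }

    /** Keeps only the names from a directory listing that are safe to process. */
    static List<String> visible(List<String> listing) {
        return listing.stream()
                .filter(InFlightFileFilter::accept)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical listing: one finished file, one in-flight, one hidden.
        List<String> listing =
                List.of("FlumeData.1.log", "FlumeData.2.log.tmp", "_FlumeData.3.log");
        System.out.println(visible(listing)); // prints [FlumeData.1.log]
    }
}
```

      In a real job the same check would sit inside a class implementing `PathFilter` and be registered on the job's input format. That only helps jobs written against the Java API, which is the limitation noted above.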

      Perhaps we should write to a hidden file so that MR never tries to process data in flight.
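
      A minimal sketch of the hidden-file idea, not the attached patch: data is written under a hidden name and renamed to its final name only once complete. Hadoop's FileInputFormat skips names beginning with `_` or `.` by default, so a prefix like this would keep in-flight data out of job input. The local filesystem (`java.nio.file`) stands in for HDFS here, and the class name, the `IN_USE_PREFIX` constant, and the file names are all assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Writes to a hidden in-flight file, then renames it to its final name. */
final class HiddenInFlightWriter {

    // Hypothetical prefix marking a file as in flight (hidden from MR input).
    static final String IN_USE_PREFIX = "_";

    static Path inFlightPath(Path dir, String finalName) {
        return dir.resolve(IN_USE_PREFIX + finalName);
    }

    /** Writes data under the hidden name, then publishes it with a rename. */
    static Path writeThenPublish(Path dir, String finalName, String data)
            throws IOException {
        Path inFlight = inFlightPath(dir, finalName);
        Files.write(inFlight, data.getBytes(StandardCharsets.UTF_8));
        Path published = dir.resolve(finalName);
        // The rename is the only point at which readers can see the file.
        return Files.move(inFlight, published, StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("flume-sketch");
        Path out = writeThenPublish(dir, "FlumeData.1.log", "event-1\n");
        System.out.println(out.getFileName()); // prints FlumeData.1.log
    }
}
```

      Because the hiding relies on the file name alone, it would also apply to Pig, Hive, and other tools that read through the default input handling, without per-tool filters.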

      Attachments

        1. bugFLUME-1702.patch
          25 kB
          Jarek Jarcec Cecho
        2. bugFLUME-1702.patch
          25 kB
          Jarek Jarcec Cecho


          People

            Assignee: Jarek Jarcec Cecho (jarcec)
            Reporter: Brock Noland (brocknoland)
            Votes: 0
            Watchers: 7

