Flume / FLUME-2525

flume should handle a zero byte .flumespool-main.meta file for the spooldir source

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.5.0.1
    • Fix Version/s: 1.6.0
    • Component/s: Sinks+Sources
    • Labels:
      None

      Description

      When a zero-byte .flumespool-main.meta file exists in the trackerDir (usually due to the partition filling up), Flume throws the following ambiguous error message when trying to read new spool files:
      2014-10-19 18:28:31,333 ERROR org.apache.flume.client.avro.ReliableSpoolingFileEventReader: Exception opening file: /home/spooldir/input.log
      java.io.IOException: Not a data file.
      at org.apache.avro.file.DataFileStream.initialize(DataFileStream.java:102)
      at org.apache.avro.file.DataFileReader.<init>(DataFileReader.java:97)
      at org.apache.avro.file.DataFileWriter.appendTo(DataFileWriter.java:160)
      at org.apache.avro.file.DataFileWriter.appendTo(DataFileWriter.java:149)
      at org.apache.flume.serialization.DurablePositionTracker.<init>(DurablePositionTracker.java:141)
      at org.apache.flume.serialization.DurablePositionTracker.getInstance(DurablePositionTracker.java:76)
      at org.apache.flume.client.avro.ReliableSpoolingFileEventReader.getNextFile(ReliableSpoolingFileEventReader.java:420)
      at org.apache.flume.client.avro.ReliableSpoolingFileEventReader.readEvents(ReliableSpoolingFileEventReader.java:215)
      at org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:182)
      at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
      at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
      at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
      at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
      at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
      at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
      at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
      at java.lang.Thread.run(Thread.java:662)

      Restarting the Flume agent does not resolve the issue. Flume resumes processing files from the spooldir only after the zero-byte file is removed.
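
      The stack trace shows the failure inside Avro while DurablePositionTracker opens the tracker file, not while reading the spool file named in the log line. A minimal sketch of why an empty file is rejected, assuming (per the Avro specification) that an Avro container file begins with the four magic bytes 'O', 'b', 'j', 1. The class and method names are hypothetical; this is not Flume's or Avro's code.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.Arrays;

// Hypothetical helper for illustration only.
public class TrackerFileCheck {
    // Magic header of an Avro object container file: 'O', 'b', 'j', 1.
    private static final byte[] AVRO_MAGIC = {(byte) 'O', (byte) 'b', (byte) 'j', (byte) 1};

    /** Returns true if the file starts with the Avro magic bytes. */
    public static boolean looksLikeAvroDataFile(File file) throws IOException {
        try (DataInputStream in = new DataInputStream(new FileInputStream(file))) {
            byte[] header = new byte[AVRO_MAGIC.length];
            try {
                in.readFully(header);
            } catch (EOFException e) {
                // File is shorter than the header, e.g. a zero-byte file.
                return false;
            }
            return Arrays.equals(header, AVRO_MAGIC);
        }
    }
}
```

      A zero-byte .flumespool-main.meta file fails this check at the first read, which is consistent with the "Not a data file." message above.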

      1. FLUME-2525-1.patch
        2 kB
        Johny Rufus
      2. FLUME-2525.patch
        4 kB
        Johny Rufus

        Activity

        hudson Hudson added a comment -

        UNSTABLE: Integrated in Flume-trunk-hbase-98 #45 (See https://builds.apache.org/job/Flume-trunk-hbase-98/45/)
        FLUME-2525. Handle a zero byte .flumespool-main.meta file for the spooldir source. (hshreedharan: http://git-wip-us.apache.org/repos/asf/flume/repo?p=flume.git&a=commit&h=efbf87fb6ddc0bbc736446a5a91cf6a83d34d2d4)

        • flume-ng-core/src/main/java/org/apache/flume/client/avro/ReliableSpoolingFileEventReader.java
        • flume-ng-core/src/test/java/org/apache/flume/client/avro/TestReliableSpoolingFileEventReader.java
        hudson Hudson added a comment -

        SUCCESS: Integrated in flume-trunk #688 (See https://builds.apache.org/job/flume-trunk/688/)
        FLUME-2525. Handle a zero byte .flumespool-main.meta file for the spooldir source. (hshreedharan: http://git-wip-us.apache.org/repos/asf/flume/repo?p=flume.git&a=commit&h=efbf87fb6ddc0bbc736446a5a91cf6a83d34d2d4)

        • flume-ng-core/src/test/java/org/apache/flume/client/avro/TestReliableSpoolingFileEventReader.java
        • flume-ng-core/src/main/java/org/apache/flume/client/avro/ReliableSpoolingFileEventReader.java
        hshreedharan Hari Shreedharan added a comment -

        Committed! Thanks Johny!

        jira-bot ASF subversion and git services added a comment -

        Commit c2b953449d033acaa02e3ced1d9a95e9cdcb5e02 in flume's branch refs/heads/flume-1.6 from Hari Shreedharan
        [ https://git-wip-us.apache.org/repos/asf?p=flume.git;h=c2b9534 ]

        FLUME-2525. Handle a zero byte .flumespool-main.meta file for the spooldir source.

        (Johny Rufus via Hari)

        jira-bot ASF subversion and git services added a comment -

        Commit efbf87fb6ddc0bbc736446a5a91cf6a83d34d2d4 in flume's branch refs/heads/trunk from Hari Shreedharan
        [ https://git-wip-us.apache.org/repos/asf?p=flume.git;h=efbf87f ]

        FLUME-2525. Handle a zero byte .flumespool-main.meta file for the spooldir source.

        (Johny Rufus via Hari)

        hshreedharan Hari Shreedharan added a comment -

        +1. I made a couple of formatting changes (added {} for single statement if). Running tests now, if they pass I will commit this.

        jrufus Johny Rufus added a comment -

        Thanks Hari. The initial idea was to decode the IOException to find the root cause, but since we can't do that accurately, I am following your suggestion. Attaching a new patch.

        hshreedharan Hari Shreedharan added a comment -

        Couple of comments:

        • Do you really need to wait for the IOException before deleting the zero-byte file? Why not check beforehand, and delete the file?
        • In the unit tests, please use junit Assert.* methods, not the standard java asserts.
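
        The first suggestion above, checking for the zero-byte file before opening it instead of waiting for the IOException, could be sketched as follows. This is only an illustration under assumptions: the class name, method name, and error message are hypothetical, and the committed patch may differ in detail.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical helper for illustration only; not the code from the patch.
public class ZeroByteTrackerCleanup {
    /**
     * Deletes metaFile if it is an existing, empty regular file.
     * Returns true if a stale zero-byte file was removed, false otherwise.
     */
    public static boolean deleteIfEmpty(File metaFile) throws IOException {
        if (metaFile.isFile() && metaFile.length() == 0) {
            if (!metaFile.delete()) {
                throw new IOException("Unable to delete zero-byte tracker file: "
                        + metaFile.getAbsolutePath());
            }
            return true;
        }
        // Missing or non-empty files are left untouched.
        return false;
    }
}
```

        Calling such a check just before the position tracker is instantiated lets a fresh meta file be created, so a non-empty tracker file is never deleted.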
        jrufus Johny Rufus added a comment -

        Attached a patch that deletes zero-byte tracker files when trying to instantiate the position tracker and logs detailed error messages.

        pdvorak Patrick Dvorak added a comment -

        When a partition containing the trackerDir (or the spoolDir, if trackerDir is not specified) fills up, a new meta file can be created, but it will be zero bytes in size. This prevents Flume from processing any files even after the full partition has been cleared, and even across restarts. The error thrown does not indicate the root cause. Ideally, Flume should remove any zero-byte spooldir meta files when it starts up, as these are invalid.


          People

          • Assignee:
            jrufus Johny Rufus
            Reporter:
            pdvorak Patrick Dvorak
          • Votes:
            0
            Watchers:
            5
