  1. Flume
  2. FLUME-734

escapedFormatDfs goes into a file creation frenzy

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.9.4
    • Fix Version/s: 0.9.5
    • Component/s: Sinks+Sources
    • Labels: None
    • Environment: CentOS 5.6

    Description

      Using this configuration:
      collectorSource(54001) | collector(600000) { escapedFormatDfs("hdfs://hadoop1-m1:8020/raw-events/%Y-%m-%d/", "events-%{rolltag}-col1.snappy", seqfile("SnappyCodec")) }
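To make the intended output concrete, here is a minimal sketch of how the path and file name templates in this configuration would be expected to expand. It assumes `%Y`, `%m`, and `%d` are date escapes, that `%{rolltag}` is replaced by a per-roll tag, and that `600000` is the roll interval in milliseconds (consistent with the 10-minute expectation stated in this report). `EscapeSketch`, `expand`, the date, and the tag value `tag-0001` are hypothetical and are not part of Flume.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not Flume code: expands only the escapes used in this report.
final class EscapeSketch {
    static String expand(String template, LocalDate date, String rollTag) {
        return template
            .replace("%{rolltag}", rollTag) // assumed: one tag per rolled file
            .replace("%Y", date.format(DateTimeFormatter.ofPattern("yyyy")))
            .replace("%m", date.format(DateTimeFormatter.ofPattern("MM")))
            .replace("%d", date.format(DateTimeFormatter.ofPattern("dd")));
    }

    public static void main(String[] args) {
        LocalDate date = LocalDate.of(2011, 8, 1); // arbitrary example date
        String dir = expand("hdfs://hadoop1-m1:8020/raw-events/%Y-%m-%d/", date, "tag-0001");
        String file = expand("events-%{rolltag}-col1.snappy", date, "tag-0001");
        // prints hdfs://hadoop1-m1:8020/raw-events/2011-08-01/events-tag-0001-col1.snappy
        System.out.println(dir + file);
        // 600000 ms / 60000 ms per minute = 10, i.e. one new file per 10 minutes
        System.out.println("roll interval: " + 600000 / 60000 + " minutes");
    }
}
```

Under these assumptions, each roll should produce exactly one new file name, so a new file every second means files are being opened outside the normal roll schedule.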

      The expected behavior is a new file every 10 minutes. However, once in a while the collector goes into a file creation frenzy, creating a new file every second.
      The log shows that each write fails with the error "OutputFormat instance can only write to the same OutputStream", which causes the file to be closed and a new one to be opened, only for that one to be closed again.

      Looking at the code, I'm not even sure how the output stream could change, but the behavior I'm seeing looks like some sort of race condition. It happens much more often under heavy load than under light load.
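The close-and-reopen loop described above can be illustrated with a small simulation. This is only a sketch under a stated assumption: that a single format instance binds to the first stream it writes to and is then reused after the file changes. It is not a diagnosis of the actual bug and does not use Flume's classes. `FrenzySketch`, `StickyFormat`, and `filesOpened` are hypothetical names; only the error message is taken from the report.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical stand-ins, not Flume's actual classes.
final class FrenzySketch {
    /** A format that binds itself to the first stream it writes to. */
    static final class StickyFormat {
        private OutputStream bound;

        void write(OutputStream out, byte[] event) throws IOException {
            if (bound == null) {
                bound = out;
            } else if (bound != out) {
                throw new IOException(
                    "OutputFormat instance can only write to the same OutputStream");
            }
            out.write(event);
        }
    }

    /** Returns how many "files" are opened while writing the given number of events. */
    static int filesOpened(int events, boolean freshFormatPerFile) {
        StickyFormat format = new StickyFormat();
        OutputStream current = new ByteArrayOutputStream(); // first "file"
        int opened = 1;
        byte[] event = {42};
        for (int i = 0; i < events; i++) {
            if (i == 1) { // simulate one scheduled roll after the first event
                current = new ByteArrayOutputStream();
                opened++;
                if (freshFormatPerFile) {
                    format = new StickyFormat();
                }
            }
            try {
                format.write(current, event);
            } catch (IOException e) {
                // Failure handling as described in the report: close and reopen.
                // The stale format still points at the old stream, so the next
                // write fails again and yet another file is opened.
                current = new ByteArrayOutputStream();
                opened++;
            }
        }
        return opened;
    }

    public static void main(String[] args) {
        System.out.println(filesOpened(5, false)); // prints 6: a new file per failed write
        System.out.println(filesOpened(5, true));  // prints 2: only the scheduled roll
    }
}
```

In this sketch, once the format and the stream disagree, every failed write opens another file. That matches the reported symptom, but how the two come to disagree in the collector (for example, through a race under load) is not something this illustration establishes.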

      See attached log excerpt.

      Attachments

        1. FLUME-734-draft.patch
          7 kB
          Jonathan Hsieh
        2. flume.log
          30 kB
          Eran Kutner
        3. 0001-FLUME-734-escapedFormatDfs-goes-into-a-file-creation.patch
          9 kB
          Jonathan Hsieh

            People

              Assignee: Jonathan Hsieh (jmhsieh)
              Reporter: Eran Kutner (erank)
              Votes: 1
              Watchers: 1
