Flume / FLUME-2052

Spooling directory source should be able to replace or ignore malformed characters

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.4.0
    • Fix Version/s: 1.5.0
    • None
    • Environment: CentOS 6.3, Flume 1.3.0

    Description

      When parsing a file with a broken encoding, Flume throws this error:

      23 May 2013 22:06:29,446 ERROR [pool-12-thread-1] (org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run:164) - Uncaught exception in Runnable
      java.nio.charset.MalformedInputException: Input length = 1
      at java.nio.charset.CoderResult.throwException(CoderResult.java:277)
      at org.apache.flume.serialization.ResettableFileInputStream.readChar(ResettableFileInputStream.java:162)
      at org.apache.flume.serialization.LineDeserializer.readLine(LineDeserializer.java:134)
      at org.apache.flume.serialization.LineDeserializer.readEvent(LineDeserializer.java:72)
      at org.apache.flume.serialization.LineDeserializer.readEvents(LineDeserializer.java:91)
      at org.apache.flume.client.avro.ReliableSpoolingFileEventReader.readEvents(ReliableSpoolingFileEventReader.java:221)
      at org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:154)
      at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
      at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
      at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
      at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
      at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      at java.lang.Thread.run(Thread.java:722)

      It would be good to be able to skip, ignore, or delete such characters. The corrupt characters come from spam engines, and Flume cannot handle them at all.
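
      To illustrate the requested behaviour, here is a minimal sketch using only the standard java.nio.charset API. It is not taken from the attached patch. The class name DecodePolicyDemo and the helper decode are hypothetical, and UTF-8 is assumed as the input charset. The sketch decodes the same malformed bytes three ways: replacing the bad byte, dropping it, and reporting it, which reproduces the exception in the trace above.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.MalformedInputException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Flume.
public class DecodePolicyDemo {

    // Decodes bytes as UTF-8 (an assumption), applying the given action
    // to malformed or unmappable input.
    public static String decode(byte[] bytes, CodingErrorAction action)
            throws CharacterCodingException {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(action)
                .onUnmappableCharacter(action);
        return decoder.decode(ByteBuffer.wrap(bytes)).toString();
    }

    public static void main(String[] args) throws Exception {
        // 'a', then a lone 0xFF byte (never valid in UTF-8), then 'b'.
        byte[] bytes = {'a', (byte) 0xFF, 'b'};

        // REPLACE substitutes U+FFFD for the bad byte: "a\uFFFDb".
        System.out.println(decode(bytes, CodingErrorAction.REPLACE));

        // IGNORE silently drops the bad byte: "ab".
        System.out.println(decode(bytes, CodingErrorAction.IGNORE));

        // REPORT fails on the bad byte, as in the stack trace.
        try {
            decode(bytes, CodingErrorAction.REPORT);
        } catch (MalformedInputException e) {
            System.out.println(e.getMessage()); // Input length = 1
        }
    }
}
```

      Replacing keeps a visible marker where data was lost, while ignoring yields cleaner text but hides the corruption. Which one suits a given pipeline depends on whether downstream consumers need to know the input was damaged.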

      Attachments

        1. FLUME-2052.patch
          19 kB
          Mike Percy

            People

              Assignee: mpercy Mike Percy
              Reporter: greg.glazewski@cp.net greg glazeweas
              Votes: 3
              Watchers: 9
