Flume / FLUME-2052

Spooling directory source should be able to replace or ignore malformed characters


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.4.0
    • Fix Version/s: 1.5.0
    • None
    • Environment: CentOS 6.3, Flume 1.3.0

    Description

      When parsing a file that contains malformed character encoding, Flume fails with this error:

      23 May 2013 22:06:29,446 ERROR [pool-12-thread-1] (org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run:164) - Uncaught exception in Runnable
      java.nio.charset.MalformedInputException: Input length = 1
      at java.nio.charset.CoderResult.throwException(CoderResult.java:277)
      at org.apache.flume.serialization.ResettableFileInputStream.readChar(ResettableFileInputStream.java:162)
      at org.apache.flume.serialization.LineDeserializer.readLine(LineDeserializer.java:134)
      at org.apache.flume.serialization.LineDeserializer.readEvent(LineDeserializer.java:72)
      at org.apache.flume.serialization.LineDeserializer.readEvents(LineDeserializer.java:91)
      at org.apache.flume.client.avro.ReliableSpoolingFileEventReader.readEvents(ReliableSpoolingFileEventReader.java:221)
      at org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:154)
      at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
      at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
      at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
      at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
      at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      at java.lang.Thread.run(Thread.java:722)

      It would be good to be able to skip, ignore, or delete such characters. The corrupt characters come from spamming engines, and Flume can't handle them at all.
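To illustrate the requested behavior, here is a minimal sketch using only the JDK's `CharsetDecoder`, not Flume's own code. The class name `DecodePolicyDemo`, the `decode` helper, and the sample bytes are hypothetical. It assumes UTF-8 input containing a stray `0xFF` byte and shows how the three standard `CodingErrorAction` values treat it: `REPORT` raises the same `MalformedInputException` seen in the stack trace, while `REPLACE` and `IGNORE` let decoding continue.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.MalformedInputException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Flume.
class DecodePolicyDemo {

    // Decode bytes as UTF-8, applying the given action to malformed
    // or unmappable input.
    static String decode(byte[] bytes, CodingErrorAction action)
            throws CharacterCodingException {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(action)
                .onUnmappableCharacter(action);
        return decoder.decode(ByteBuffer.wrap(bytes)).toString();
    }

    public static void main(String[] args) throws Exception {
        // "ab", then a 0xFF byte (never valid in UTF-8), then "cd".
        byte[] corrupt = {'a', 'b', (byte) 0xFF, 'c', 'd'};

        try {
            decode(corrupt, CodingErrorAction.REPORT);
        } catch (MalformedInputException e) {
            // Strict decoding fails on the bad byte.
            System.out.println("REPORT failed: " + e.getMessage());
        }

        // REPLACE substitutes U+FFFD for the bad byte.
        System.out.println("REPLACE: " + decode(corrupt, CodingErrorAction.REPLACE));

        // IGNORE drops the bad byte entirely.
        System.out.println("IGNORE: " + decode(corrupt, CodingErrorAction.IGNORE));
    }
}
```

With `REPLACE` the position of the corrupt byte stays visible in the output, whereas `IGNORE` silently drops it; which one is preferable depends on whether downstream consumers need to know that data was damaged.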

          People

            Assignee: Mike Percy (mpercy)
            Reporter: greg glazeweas (greg.glazewski@cp.net)
            Votes: 3
            Watchers: 9
