Flume / FLUME-2147

Missing headers cause events to become stuck in channel

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Sinks+Sources
    • Labels: None

      Description

      If a sink expects a header that an event does not carry, the event becomes stuck in the channel and Flume logs NullPointerException and EventDeliveryException errors. With a memory channel, restarting the agent clears the problem. With a file channel, the events survive a restart and remain stuck.

      05 Aug 2013 12:21:09,424 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.SinkRunner$PollingRunner.run:160) - Unable to deliver event. Exception follows.
      org.apache.flume.EventDeliveryException: java.lang.NullPointerException: Expected timestamp in the Flume event headers, but it was null
      at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:426)
      at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
      at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
      at java.lang.Thread.run(Thread.java:662)
      Caused by: java.lang.NullPointerException: Expected timestamp in the Flume event headers, but it was null
      at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:204)
      at org.apache.flume.formatter.output.BucketPath.replaceShorthand(BucketPath.java:200)
      at org.apache.flume.formatter.output.BucketPath.escapeString(BucketPath.java:396)
      at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:356)
      ... 3 more
      05 Aug 2013 12:21:09,424 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.HDFSEventSink.process:422) - process failed
      java.lang.NullPointerException: Expected timestamp in the Flume event headers, but it was null
      at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:204)
      at org.apache.flume.formatter.output.BucketPath.replaceShorthand(BucketPath.java:200)
      at org.apache.flume.formatter.output.BucketPath.escapeString(BucketPath.java:396)
      at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:356)
      at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
      at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
      at java.lang.Thread.run(Thread.java:662)

      I was using RegexExtractorInterceptor to extract the timestamp used for partitioning in the HDFS sink.
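
      The failure mode above (a regex miss leaves the header unset, and the sink then throws) can be sketched without any Flume classes. The following is a minimal illustration, not Flume's API and not a proposed fix: the class `TimestampHeaderSketch`, the method `extractHeaders`, and the 13-digit epoch-millisecond pattern are all hypothetical, and the header name `timestamp` is inferred from the exception message. It shows one possible guard, where a fallback value is written whenever the pattern does not match, so that a downstream consumer always finds the header.

      ```java
      import java.util.HashMap;
      import java.util.Map;
      import java.util.regex.Matcher;
      import java.util.regex.Pattern;

      // Hypothetical stand-in for an interceptor step; uses a plain Map for headers.
      final class TimestampHeaderSketch {
          // Header name assumed from the exception message in the report.
          static final String TIMESTAMP_HEADER = "timestamp";

          // Assumed pattern: a 13-digit epoch-millisecond value at the start of the body.
          static final Pattern EPOCH_MILLIS = Pattern.compile("^(\\d{13})\\b");

          // Returns headers that always contain TIMESTAMP_HEADER: the extracted
          // value when the pattern matches, otherwise the supplied fallback.
          static Map<String, String> extractHeaders(String body, long fallbackMillis) {
              Map<String, String> headers = new HashMap<>();
              Matcher m = EPOCH_MILLIS.matcher(body);
              if (m.find()) {
                  headers.put(TIMESTAMP_HEADER, m.group(1));
              } else {
                  // Without this branch the header would be absent, which is the
                  // condition that triggers the NullPointerException in the report.
                  headers.put(TIMESTAMP_HEADER, Long.toString(fallbackMillis));
              }
              return headers;
          }

          public static void main(String[] args) {
              long now = System.currentTimeMillis();
              System.out.println(extractHeaders("1375705269424 some log line", now));
              System.out.println(extractHeaders("line without a timestamp", now));
          }
      }
      ```

      Whether a fallback timestamp is acceptable depends on the pipeline; routing such events elsewhere or dropping them are equally plausible policies.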

        Activity

        Jonathan Cooper-Ellis added a comment -

        This is actually a bigger issue than I initially thought, because when the sink fails to process the event missing the header, it retries on the whole batch and duplicates in HDFS whatever data was ahead of the bad event in the batch.
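
        The duplication described in this comment can be reproduced with a small stand-alone simulation. This is a sketch under stated assumptions rather than Flume's actual sink code: the class `BatchRetryDemo`, the method `processBatch`, and the convention that an event named "BAD" lacks the required header are all hypothetical. The channel is modeled as a deque, a failed batch is rolled back by returning its events to the head of the channel, and writes already made to the output are not undone.

        ```java
        import java.util.ArrayDeque;
        import java.util.ArrayList;
        import java.util.Arrays;
        import java.util.Deque;
        import java.util.List;

        // Hypothetical simulation of a sink that retries a whole batch after a failure.
        final class BatchRetryDemo {

            // Takes up to batchSize events from the channel and writes them to output.
            // Returns true if the batch committed, false if it was rolled back.
            static boolean processBatch(Deque<String> channel, List<String> output, int batchSize) {
                List<String> taken = new ArrayList<>();
                try {
                    for (int i = 0; i < batchSize && !channel.isEmpty(); i++) {
                        String event = channel.pollFirst();
                        taken.add(event);
                        if (event.startsWith("BAD")) {
                            // Stands in for the missing-header failure.
                            throw new IllegalStateException("missing header: " + event);
                        }
                        output.add(event); // this write is NOT undone by the rollback below
                    }
                    return true;
                } catch (IllegalStateException e) {
                    // Rollback: put the taken events back at the head, in original order.
                    for (int i = taken.size() - 1; i >= 0; i--) {
                        channel.addFirst(taken.get(i));
                    }
                    return false;
                }
            }

            public static void main(String[] args) {
                Deque<String> channel = new ArrayDeque<>(Arrays.asList("e1", "e2", "BAD", "e3"));
                List<String> output = new ArrayList<>();
                for (int attempt = 0; attempt < 3; attempt++) {
                    processBatch(channel, output, 4);
                }
                System.out.println(output);  // prints [e1, e2, e1, e2, e1, e2]
                System.out.println(channel); // prints [e1, e2, BAD, e3]
            }
        }
        ```

        In this model each retry rewrites the events ahead of the bad one, while the channel contents never change, which matches both symptoms reported here: stuck events and duplicated output.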

        Jonathan Cooper-Ellis added a comment - https://gist.github.com/anonymous/9f88209d8ab9443aebe8
        Roshan Naik added a comment -

        Could you provide a sample config file you were using?

        Jonathan Cooper-Ellis made changes -
        Field: Summary
        Original Value: RegexExtractorInterceptor miss causes irrecoverable IllegalStateException
        New Value: Missing headers cause events to become stuck in channel
        Jonathan Cooper-Ellis created issue -

          People

          • Assignee: Unassigned
          • Reporter: Jonathan Cooper-Ellis
          • Votes: 0
          • Watchers: 5
