Flume / FLUME-2089

ElasticsearchSink blocks and raises exceptions when event body has unexpected encoding

    Details

    • Type: Bug
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: v1.4.0, v1.3.1
    • Fix Version/s: None
    • Component/s: Sinks+Sources
    • Labels:
      None
    • Release Note:
      ElasticsearchSink now handles event bodies with unexpected encodings or parse failures by storing them as simple fields
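The behaviour in the release note can be sketched as a try/fallback around the parse step. This is only an illustration in plain Java, not the attached patch: `FallbackFieldSketch`, `BodyParser`, and `appendField` here are hypothetical stand-ins, a `Map` takes the place of the real content builder, and the lossy string decoding in the fallback is an assumption about what "simple field" means.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

class FallbackFieldSketch {
    /** Hypothetical stand-in for a structured-content parser (JSON, YAML, ...). */
    interface BodyParser {
        Object parse(byte[] body) throws Exception;
    }

    /**
     * Tries to add the body as a parsed (complex) field. If parsing fails for
     * any reason, stores the body as a plain string (simple) field instead.
     */
    static void appendField(Map<String, Object> doc, String name,
                            byte[] body, BodyParser parser) {
        try {
            doc.put(name, parser.parse(body));
        } catch (Exception e) {
            // Assumption: lossy decoding is acceptable for the fallback, so
            // malformed byte sequences become U+FFFD instead of failing the event.
            doc.put(name, new String(body, StandardCharsets.UTF_8));
        }
    }

    public static void main(String[] args) {
        // A parser that always rejects its input, to exercise the fallback path.
        BodyParser rejecting = body -> {
            throw new IllegalArgumentException("unparseable");
        };

        Map<String, Object> doc = new LinkedHashMap<>();
        appendField(doc, "body", new byte[] {'o', 'k', (byte) 0x8D}, rejecting);
        System.out.println(doc.get("body") instanceof String);
    }
}
```

The point of the pattern is that a bad body no longer propagates an exception out of the serializer, so the sink can move on to the next event rather than retrying the same one.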

      Description

      Detected by Allan Feid and documented on the user list: http://mail-archives.apache.org/mod_mbox/flume-user/201306.mbox/%3CCAN94UWe6UvcOKT1S%2BXANC-sy0qFsxet3RJY9PVkj-eSfO5fk6Q%40mail.gmail.com%3E

      Steps:
      Send an event with the body as follows:
      foo¤data¤1371126476.436¤0.005¤555¤10.1.1.1¤HTTP/1.1¤GET¤http¤vhost¤/path/url¤¤-¤200¤
      referrer.com/search/?query=\x8D\x91\x89\xEF\x8Bc\x8E\x96\x93\xB0¤¤¤-

      Expected Results:
      The event is stored in elasticsearch.

      Actual Results:
      10 Jun 2013 09:52:34,360 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.SinkRunner$PollingRunner.run:160) - Unable to deliver event. Exception follows.
      org.elasticsearch.common.jackson.dataformat.yaml.snakeyaml.error.YAMLException: java.io.CharConversionException: Invalid UTF-8 start byte 0xfc (at char #81, byte #-1)
          at org.elasticsearch.common.jackson.dataformat.yaml.snakeyaml.reader.StreamReader.update(StreamReader.java:198)
          at org.elasticsearch.common.jackson.dataformat.yaml.snakeyaml.reader.StreamReader.<init>(StreamReader.java:62)
          at org.elasticsearch.common.jackson.dataformat.yaml.YAMLParser.<init>(YAMLParser.java:147)
          at org.elasticsearch.common.jackson.dataformat.yaml.YAMLFactory._createParser(YAMLFactory.java:530)
          at org.elasticsearch.common.jackson.dataformat.yaml.YAMLFactory.createJsonParser(YAMLFactory.java:420)
          at org.elasticsearch.common.xcontent.yaml.YamlXContent.createParser(YamlXContent.java:83)
          at org.apache.flume.sink.elasticsearch.ContentBuilderUtil.addComplexField(ContentBuilderUtil.java:61)
          at org.apache.flume.sink.elasticsearch.ContentBuilderUtil.appendField(ContentBuilderUtil.java:47)
          at org.apache.flume.sink.elasticsearch.ElasticSearchLogStashEventSerializer.appendBody(ElasticSearchLogStashEventSerializer.java:87)
          at org.apache.flume.sink.elasticsearch.ElasticSearchLogStashEventSerializer.getContentBuilder(ElasticSearchLogStashEventSerializer.java:79)
          at org.apache.flume.sink.elasticsearch.ElasticSearchSink.process(ElasticSearchSink.java:178)
          at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
          at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
          at java.lang.Thread.run(Thread.java:662)
      Caused by: java.io.CharConversionException: Invalid UTF-8 start byte 0xfc (at char #81, byte #-1)
          at org.elasticsearch.common.jackson.dataformat.yaml.UTF8Reader.reportInvalidInitial(UTF8Reader.java:395)
          at org.elasticsearch.common.jackson.dataformat.yaml.UTF8Reader.read(UTF8Reader.java:247)
          at org.elasticsearch.common.jackson.dataformat.yaml.UTF8Reader.read(UTF8Reader.java:157)
          at org.elasticsearch.common.jackson.dataformat.yaml.snakeyaml.reader.StreamReader.update(StreamReader.java:182)
          ... 13 more
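The CharConversionException in the trace is the kind of error a strict UTF-8 decoder raises when a byte cannot start a character. As a rough illustration only (the class and method names below are hypothetical and not part of Flume or Elasticsearch), the JDK's own decoder can be configured to report malformed input, which makes it possible to check a body like the one in the steps before handing it to a parser:

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class Utf8Check {
    /** Returns true only if the bytes form well-formed UTF-8. */
    static boolean isValidUtf8(byte[] body) {
        // REPORT makes the decoder throw instead of silently substituting U+FFFD.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(body));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The escaped query-string bytes as written in the reproduction steps.
        byte[] query = {
            (byte) 0x8D, (byte) 0x91, (byte) 0x89, (byte) 0xEF, (byte) 0x8B,
            'c', (byte) 0x8E, (byte) 0x96, (byte) 0x93, (byte) 0xB0
        };
        System.out.println(isValidUtf8(query)); // false: 0x8D cannot start a UTF-8 character
        System.out.println(isValidUtf8("GET /path/url".getBytes(StandardCharsets.UTF_8))); // true
    }
}
```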

      1. flume-2089.diff
        3 kB
        Edward Sargisson

        Issue Links

          Activity

          Ashish Paliwal made changes -
          Assignee Ashish Paliwal [ paliwalashish ]
          Edward Sargisson made changes -
          Release Note: "catch YAML parsing errors when adding data to elasticsearch" → "ElasticsearchSink now handles event bodies with unexpected encodings or parse failures by storing them as simple fields"
          Edward Sargisson made changes -
          Remote Link This issue links to "Review Board (Web Link)" [ 12514 ]
          Edward Sargisson made changes -
          Summary: "ElasticsearchSink blocks raises exceptions when event body has unexpected encoding" → "ElasticsearchSink blocks and raises exceptions when event body has unexpected encoding"
          Edward Sargisson made changes -
          Attachment flume-2089.diff [ 12597576 ]
          Edward Sargisson made changes -
          Description Detected by Allan Feid and documented on the user list http://mail-archives.apache.org/mod_mbox/flume-user/201306.mbox/%3CCAN94UWe6UvcOKT1S%2BXANC-sy0qFsxet3RJY9PVkj-eSfO5fk6Q%40mail.gmail.com%3E

          Steps:
          Send an event with the body as follows:
          foo¤data¤1371126476.436¤0.005¤555¤10.1.1.1¤HTTP/1.1¤GET¤http¤vhost¤/path/url¤¤-¤200¤
          referrer.com/search/?query=\x8D\x91\x89\xEF\x8Bc\x8E\x96\x93\xB0¤-¤-¤-

          Expected Results:
          The event is stored in elasticsearch.

          Actual Results:
          10 Jun 2013 09:52:34,360 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.SinkRunner$PollingRunner.run:160) - Unable to deliver event. Exception follows.
          org.elasticsearch.common.jackson.dataformat.yaml.snakeyaml.error.YAMLException: java.io.CharConversionException: Invalid UTF-8 start byte 0xfc (at char #81, byte #-1)
              at org.elasticsearch.common.jackson.dataformat.yaml.snakeyaml.reader.StreamReader.update(StreamReader.java:198)
              at org.elasticsearch.common.jackson.dataformat.yaml.snakeyaml.reader.StreamReader.<init>(StreamReader.java:62)
              at org.elasticsearch.common.jackson.dataformat.yaml.YAMLParser.<init>(YAMLParser.java:147)
              at org.elasticsearch.common.jackson.dataformat.yaml.YAMLFactory._createParser(YAMLFactory.java:530)
              at org.elasticsearch.common.jackson.dataformat.yaml.YAMLFactory.createJsonParser(YAMLFactory.java:420)
              at org.elasticsearch.common.xcontent.yaml.YamlXContent.createParser(YamlXContent.java:83)
              at org.apache.flume.sink.elasticsearch.ContentBuilderUtil.addComplexField(ContentBuilderUtil.java:61)
              at org.apache.flume.sink.elasticsearch.ContentBuilderUtil.appendField(ContentBuilderUtil.java:47)
              at org.apache.flume.sink.elasticsearch.ElasticSearchLogStashEventSerializer.appendBody(ElasticSearchLogStashEventSerializer.java:87)
              at org.apache.flume.sink.elasticsearch.ElasticSearchLogStashEventSerializer.getContentBuilder(ElasticSearchLogStashEventSerializer.java:79)
              at org.apache.flume.sink.elasticsearch.ElasticSearchSink.process(ElasticSearchSink.java:178)
              at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
              at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
              at java.lang.Thread.run(Thread.java:662)
          Caused by: java.io.CharConversionException: Invalid UTF-8 start byte 0xfc (at char #81, byte #-1)
              at org.elasticsearch.common.jackson.dataformat.yaml.UTF8Reader.reportInvalidInitial(UTF8Reader.java:395)
              at org.elasticsearch.common.jackson.dataformat.yaml.UTF8Reader.read(UTF8Reader.java:247)
              at org.elasticsearch.common.jackson.dataformat.yaml.UTF8Reader.read(UTF8Reader.java:157)
              at org.elasticsearch.common.jackson.dataformat.yaml.snakeyaml.reader.StreamReader.update(StreamReader.java:182)
              ... 13 more
          Edward Sargisson made changes -
          Summary: "ElasticSearchSink raises YAMLException when event body has unexpected encoding." → "ElasticsearchSink blocks raises exceptions when event body has unexpected encoding"
          Mike Percy made changes -
          Fix Version/s v1.4.0 [ 12323372 ]
          Mike Percy made changes -
          Fix Version/s v1.4.0 [ 12323372 ]
          Allan Feid made changes -
          Field Original Value New Value
          Status: Open [ 1 ] → Patch Available [ 10002 ]
          Release Note catch YAML parsing errors when adding data to elasticsearch
          Affects Version/s v1.4.0 [ 12323372 ]
          Edward Sargisson created issue -

            People

            • Assignee:
              Ashish Paliwal
              Reporter:
              Edward Sargisson
            • Votes:
              2
              Watchers:
              4
