Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: v1.1.0
    • Fix Version/s: v1.4.0
    • Component/s: Sinks+Sources
    • Labels:
      None
    • Release Note:
      Bug Fix: Default batch size now set to 100 for Elastic Search sink.

      Description

      AvroSink default batch size is 100
      HDFSEventSink default batch size is 1
      RollingFileSink has no configurable batch size

      1. FLUME-1076.patch
        1.0 kB
        Roshan Naik

        Activity

        Roshan Naik added a comment -

        What exactly needs to happen here? Should they all be 100?

        Brock Noland added a comment -

        Yeah I think so. HDFSEventSink might have increased and a few other sources/sinks have had batch sizes added. I just think that we should standardize this across all sinks/sources. 100 seems good.

        Mike Percy added a comment -

        Maybe back in March it would have been a good time to change all the default batch size values. But now?

        Roshan Naik added a comment -

        I see that the following sinks have their default batch sizes at 100:

        • Rolling File, HDFS, HBase, Async HBase, Null, Avro

        The Avro sink seems to pick up its default in a relatively complicated manner, from RpcClientConfigurationConstants.DEFAULT_BATCH_SIZE inside NettyAvroRpcClient.configure().

        The following sinks do not use a batch size, and it's probably OK that way (they operate as if batch size = 1):

        • IRC, Logger
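To make the cost of "batch size = 1" concrete, here is a minimal sketch in plain Java. It is not Flume code: the `Queue<String>` stands in for a channel, each `takeBatch` call stands in for one transaction, and the names `BatchDrainSketch`, `takeBatch`, and `transactionsToDrain` are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Illustrative only: a queue stands in for a channel, and each call to
// takeBatch() stands in for one sink transaction.
final class BatchDrainSketch {

    /** Removes up to batchSize events from the queue and returns them. */
    static List<String> takeBatch(Queue<String> channel, int batchSize) {
        List<String> batch = new ArrayList<>();
        while (batch.size() < batchSize) {
            String event = channel.poll();
            if (event == null) {
                break; // channel is empty, so the batch is short
            }
            batch.add(event);
        }
        return batch;
    }

    /** Counts how many batches (transactions) it takes to empty the queue. */
    static int transactionsToDrain(Queue<String> channel, int batchSize) {
        if (batchSize < 1) {
            // A batch size of 0 would never make progress in this loop.
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        int transactions = 0;
        while (!channel.isEmpty()) {
            takeBatch(channel, batchSize);
            transactions++;
        }
        return transactions;
    }

    /** Builds a queue holding n dummy events. */
    static Queue<String> channelOf(int n) {
        Queue<String> channel = new ArrayDeque<>();
        for (int i = 0; i < n; i++) {
            channel.add("event-" + i);
        }
        return channel;
    }

    public static void main(String[] args) {
        System.out.println(transactionsToDrain(channelOf(250), 1));   // 250
        System.out.println(transactionsToDrain(channelOf(250), 100)); // 3
    }
}
```

In this toy model, 250 events need 250 transactions at batch size 1 but only 3 at batch size 100. How much that matters for a real sink depends on the per-transaction overhead of the channel in use.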

        The following sink does not have any default, although the documentation claims otherwise:

        • ElasticSearch

        In ElasticSearch it defaults to 0. This appears to be a bug.

        I don't see any other variation in the default batch sizes.
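As an illustration of how a default of 0 can arise, the sketch below contrasts two ways of reading a `batchSize` setting. It is an assumption about the general pattern, not the actual ElasticSearchSink code or the attached patch: `SimpleContext` is a hypothetical stand-in for a configuration context, and `resolveWithoutDefault` and `resolveWithDefault` are made-up names.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: shows why an int that is assigned only when the key is
// present ends up as 0, and how an explicit fallback avoids that.
final class BatchSizeDefaults {

    static final int DEFAULT_BATCH_SIZE = 100;

    /** Hypothetical stand-in for a sink configuration context. */
    static final class SimpleContext {
        private final Map<String, String> params = new HashMap<>();

        SimpleContext put(String key, String value) {
            params.put(key, value);
            return this;
        }

        String getString(String key) {
            return params.get(key);
        }

        /** Returns the parsed value, or defaultValue when the key is absent. */
        Integer getInteger(String key, Integer defaultValue) {
            String raw = params.get(key);
            return raw == null ? defaultValue : Integer.valueOf(raw.trim());
        }
    }

    /** No fallback: the value stays at Java's int default of 0 when unset. */
    static int resolveWithoutDefault(SimpleContext ctx) {
        int batchSize = 0; // mirrors an int field that was never initialised
        String raw = ctx.getString("batchSize");
        if (raw != null && !raw.trim().isEmpty()) {
            batchSize = Integer.parseInt(raw.trim());
        }
        return batchSize;
    }

    /** Explicit fallback: an unset key resolves to DEFAULT_BATCH_SIZE. */
    static int resolveWithDefault(SimpleContext ctx) {
        return ctx.getInteger("batchSize", DEFAULT_BATCH_SIZE);
    }

    public static void main(String[] args) {
        SimpleContext unset = new SimpleContext();
        SimpleContext explicit = new SimpleContext().put("batchSize", "500");

        System.out.println(resolveWithoutDefault(unset));  // 0
        System.out.println(resolveWithDefault(unset));     // 100
        System.out.println(resolveWithDefault(explicit));  // 500
    }
}
```

With the fallback in place, a configuration that never mentions `batchSize` behaves the same as one that sets it to 100, which is what the documentation is said to promise.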

        Roshan Naik added a comment -

        Set default batchSize to 100 in ElasticSearch sink

        Roshan Naik added a comment -

        Setting default batch size to 100 for Elastic Search sink

        Brock Noland added a comment -

        Mike, didn't FLUME-1631 just increase the HDFSEventSink batch size to 100 from 1? The good news is that, thanks to Roshan's detective work, it looks like we only have to change the ElasticSearchSink.

        Roshan Naik added a comment -

        Could somebody take a look at committing this patch? It's a trivial one.

        Mike Percy added a comment -

        You guys are right, nice detective work Roshan.

        +1

        Mike Percy added a comment -

        Pushed to trunk & flume-1.4 branches. Thanks for the patch Roshan!

        Hudson added a comment -

        Integrated in flume-trunk #347 (See https://builds.apache.org/job/flume-trunk/347/)
        FLUME-1076. Sink default batch sizes vary wildly. (Revision 58173b8983027124a61783b4326dee3347ab7552)

        Result = SUCCESS
        mpercy : http://git-wip-us.apache.org/repos/asf/flume/repo?p=flume.git&a=commit&h=58173b8983027124a61783b4326dee3347ab7552
        Files :

        • flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/ElasticSearchSink.java

          People

          • Assignee:
            Roshan Naik
            Reporter:
            Brock Noland
          • Votes:
            0
            Watchers:
            4
