Flume / FLUME-2003

It would be nice if we could control the HDFS block size and replication for specific HDFS sink instances

    Details

    • Type: Improvement
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Sinks+Sources
    • Labels: None

      Description

      The forthcoming patch provides that functionality.

      Attachments

      1. FLUME-2003-2.patch
        14 kB
        Thiruvalluvan M. G.
      2. FLUME-2003.patch
        4 kB
        Thiruvalluvan M. G.


          Activity

          Thiruvalluvan M. G. added a comment -

          The fix is to define three new optional configuration parameters:

          • hdfs.bufferSize
          • hdfs.blockSize
          • hdfs.replication

          For each of these, if the value is not specified in the configuration file, the default value (as defined by the file system) is used.

          It is unfortunate that there is no clean way to get the default buffer size from FileSystem; one has to read the HDFS configuration parameter "io.file.buffer.size" and fall back on yet another (hard-coded) default value. There doesn't seem to be a way to ask the file system for its default buffer size, or to let it use that default implicitly. If either block size or replication is specified, one must specify a valid value for buffer size as well.

          I tested that these work by specifying the values and checking that they are honored.
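
          As a rough illustration (the agent, sink, and channel names below are hypothetical; the parameter names are the ones introduced by this patch, before the renaming requested later in this thread), an HDFS sink stanza using the new optional parameters might look like this:

            # Hypothetical agent/sink/channel names; sources and channels omitted.
            agent1.sinks = k1
            agent1.sinks.k1.type = hdfs
            agent1.sinks.k1.channel = c1
            agent1.sinks.k1.hdfs.path = /flume/events/%y-%m-%d
            # New optional parameters; any of them may be omitted, in which
            # case the file-system default is used instead.
            agent1.sinks.k1.hdfs.bufferSize = 131072
            agent1.sinks.k1.hdfs.blockSize = 33554432
            agent1.sinks.k1.hdfs.replication = 2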

          Mike Percy added a comment -

          Hi Thiruvalluvan,
          Thanks for the patch. Offhand, I'd recommend setting these values via your local hdfs-site.xml file, which should affect what Flume sees as its defaults. Any reason that approach would not work for you?

          Thiruvalluvan M. G. added a comment -

          Yes, it would. But such a configuration will affect all the files created by Flume; one does not have control over individual files. We have observed that the throughput of HDFS write operations is a function of the replication factor and the block size. They have a huge impact on flush(), and hence on small "transactions", because the HDFS sink flushes at the end of each transaction. With this patch one can configure a smaller block size and lower replication for certain types of events (without affecting others) and then convert to a larger configuration as the data gets processed.
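
          For example (sink names and paths here are hypothetical; sources, channels, and channel bindings are omitted), a single agent could write raw events with a small block size and a single replica while another sink on the same agent keeps the cluster-wide defaults:

            # Hypothetical sink names and paths.
            agent1.sinks = rawSink archiveSink
            # Raw events: small blocks, one replica, so flush() stays cheap.
            agent1.sinks.rawSink.type = hdfs
            agent1.sinks.rawSink.hdfs.path = /flume/raw/%y-%m-%d
            agent1.sinks.rawSink.hdfs.blockSize = 16777216
            agent1.sinks.rawSink.hdfs.replication = 1
            # Processed data: no overrides, so the file-system defaults apply.
            agent1.sinks.archiveSink.type = hdfs
            agent1.sinks.archiveSink.hdfs.path = /flume/archive/%y-%m-%d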

          Mike Percy added a comment -

          I can see that... playing devil's advocate here... why would you want lots of small flushes to HDFS? If you use large batch sizes, your throughput will be quite high. If you reduce the replication factor, your durability also goes down.

          Thiruvalluvan M. G. added a comment -

          In our case, we use Flume to collect events whose size varies from very small to very large. At present, there is a pre-processing stage which trims the events in order to reduce the load on HDFS. The pre-processing stage is neither reliable nor scalable. We'd like to insert the raw data into HDFS and implement the pre-processing logic as a Hadoop job, so the files created by Flume will get consumed very quickly. This way we exploit the scalability offered by Hadoop for the pre-processing stage.

          This patch essentially exposes a feature offered by the HDFS file system API to the Flume user. It is backward compatible, so current usage can continue. It merely allows the Flume user to make a trade-off between reliability and performance, if he so wishes. The user need not only reduce replication or block size; if desired, he can choose a larger block size or more replication (at the cost of performance). I don't see a downside. Usually new flexibility would mean lower performance, a more complex design, or hard-to-maintain code. I don't think any of those is true in this case. In other words, this patch gives some benefits to some users with practically no additional cost, either to developers or to other users.

          Mike Percy added a comment -

          Hi Thiruvalluvan, fair points. OK let's do this:

          1. For these new properties that are literally HDFS properties overriding the behavior in the hdfs-site.xml config file, let's come up with a convention to name them according to their HDFS config property names. For example, instead of "hdfs.bufferSize" let's call it "hdfs.hdfsIoFileBufferSize", because the property in hdfs-site.xml is "io.file.buffer.size". Please do the same for the other two also.
          2. Please add an implementation of this feature for the CompressedDataStream class as well.
          3. Please document these new parameters in the user guide under the HDFS Sink section. The source file is at flume-ng-doc/sphinx/FlumeUserGuide.rst

          Finally, please also post your patch on http://reviews.apache.org/groups/Flume for easier review and link to it here. The source repository to use is flume-git. Details here: https://cwiki.apache.org/confluence/display/FLUME/How+to+Contribute#HowtoContribute-ProvidingPatches

          Thiruvalluvan M. G. added a comment -

          Done as suggested.

          It appears that io.file.buffer.size was the name of the parameter in older versions of HDFS. With 2.x, it seems to be dfs.stream-buffer-size, as documented at http://hadoop.apache.org/docs/r2.0.3-alpha/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml. If I missed something, I'll be happy to resubmit.

          Mike Percy added a comment -

          Thiruvalluvan M. G.: Sorry to take so long to get back to this. Please mark https://reviews.apache.org/r/10606/ as a public review and also add Group = Flume to the list of reviewers. I get the following message when I try to review that URL:

          You don't have access to this review request.
          
          This review request is private. You must be a requested reviewer, either directly or on a requested group, and have permission to access the repository in order to view this review request.
          

          Thanks,
          Mike

          Thiruvalluvan M. G. added a comment -

          Please mark https://reviews.apache.org/r/10606/ as a public review and also add Group = Flume to the list of reviewers

          I'm sorry, I had failed to publish the review request. I did so and added Flume to the groups just now. I hope you'll be able to access it.

          Thiruvalluvan M. G. added a comment -

          Is it possible for someone to review this? Thanks.

          Thiru

          Mike Percy added a comment -

          Thiru, I added some comments on the review board, not sure if you saw them.

          Mike Percy added a comment -

          This will probably not make it into v1.4.0, clearing fixVersion.


            People

            • Assignee:
              Thiruvalluvan M. G.
            • Reporter:
              Thiruvalluvan M. G.
            • Votes:
              1
            • Watchers:
              3
