  1. Hadoop HDFS
  2. HDFS-4412

Support HDFS IO throttling

Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved

    Description

      When applications upload files to or download files from HDFS clusters, it would be useful to throttle their IO so that it does not exceed a specified maximum bandwidth.

      There are two options for implementing this IO throttling:

      #1. IO Throttling happens at the FSDataInputStream and FSDataOutputStream level.

      Add an IO throttler to FSDataInputStream/FSDataOutputStream. Whenever a read or write happens, throttle it first (if a throttler is set), then do the actual read or write.

      We may need to add new FileSystem APIs that take an IO throttler as an input parameter.
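
A minimal sketch of option #1, under stated assumptions: it wraps a plain java.io.InputStream as a stand-in for FSDataInputStream so that it runs without Hadoop on the classpath. The IOThrottler interface follows the throttle(long) method proposed in this issue; the ThrottledInputStream class name is hypothetical and not part of any existing Hadoop API.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical interface, following the throttle(long) method proposed in this issue.
interface IOThrottler {
    void throttle(long numOfBytes);
}

// Stand-in for a throttled FSDataInputStream: throttle first, then do the actual read.
class ThrottledInputStream extends FilterInputStream {
    private final IOThrottler throttler; // null means "no throttler set"

    ThrottledInputStream(InputStream in, IOThrottler throttler) {
        super(in);
        this.throttler = throttler;
    }

    @Override
    public int read() throws IOException {
        if (throttler != null) {
            throttler.throttle(1);
        }
        return super.read();
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        if (throttler != null && len > 0) {
            // Charge the requested size before reading, as the description suggests.
            throttler.throttle(len);
        }
        return super.read(buf, off, len);
    }
}
```

Charging the requested length up front is the simplest reading of "throttle it first"; a variant could instead charge the number of bytes actually returned, at the cost of throttling after the read. An output-side wrapper would follow the same pattern around write.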

      #2. IO Throttling happens at the application level.

      Instead of changing the FSDataInputStream/FSDataOutputStream, all IO throttling is done at the application level.

      In this approach, the FileSystem API remains unchanged.
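
A minimal sketch of option #2, assuming the application owns the copy loop. It uses plain java.io streams so that it runs standalone; in a real client the streams would presumably come from FileSystem.open and FileSystem.create. The ThrottledCopy helper is hypothetical, and IOThrottler follows the throttle(long) method proposed in this issue.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical interface, following the throttle(long) method proposed in this issue.
interface IOThrottler {
    void throttle(long numOfBytes);
}

// Hypothetical application-level helper: the streams themselves are left untouched.
final class ThrottledCopy {
    private ThrottledCopy() {}

    // Copies in to out, calling the throttler once per chunk. Returns the bytes copied.
    static long copy(InputStream in, OutputStream out,
                     IOThrottler throttler, int bufferSize) throws IOException {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        byte[] buf = new byte[bufferSize];
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            throttler.throttle(n);   // application decides when and how much to throttle
            out.write(buf, 0, n);
            total += n;
        }
        return total;
    }
}
```

The trade-off is that every application has to remember to call the throttler itself, whereas option #1 enforces the limit for all callers of the stream.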

      In either case, an IO throttler interface is needed, with a single method:
      public void throttle(long numOfBytes);

      The current DataTransferThrottler could be an implementation of this IO throttler interface.
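
A minimal sketch of what the interface and one implementation might look like. The names IOThrottler and SimpleBandwidthThrottler are hypothetical, and the average-rate logic below is a simplified illustration, not the algorithm used by DataTransferThrottler.

```java
// Hypothetical interface, following the throttle(long) method proposed in this issue.
interface IOThrottler {
    void throttle(long numOfBytes);
}

// Simplified throttler: blocks so the average rate since construction stays within the limit.
class SimpleBandwidthThrottler implements IOThrottler {
    private final long bytesPerSecond;
    private final long startNanos = System.nanoTime();
    private long totalBytes = 0;

    SimpleBandwidthThrottler(long bytesPerSecond) {
        if (bytesPerSecond <= 0) {
            throw new IllegalArgumentException("bytesPerSecond must be positive");
        }
        this.bytesPerSecond = bytesPerSecond;
    }

    // Milliseconds still to wait so that totalBytes fits within the limit.
    // Note: totalBytes * 1000 can overflow for extremely large totals.
    static long requiredDelayMillis(long totalBytes, long bytesPerSecond, long elapsedMillis) {
        long earliestMillis = totalBytes * 1000 / bytesPerSecond;
        return Math.max(0, earliestMillis - elapsedMillis);
    }

    @Override
    public synchronized void throttle(long numOfBytes) {
        if (numOfBytes <= 0) {
            return;
        }
        totalBytes += numOfBytes;
        long elapsedMillis = (System.nanoTime() - startNanos) / 1_000_000;
        long delay = requiredDelayMillis(totalBytes, bytesPerSecond, elapsedMillis);
        if (delay > 0) {
            try {
                Thread.sleep(delay);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            }
        }
    }
}
```

Because the limit is enforced as an average since construction, idle time accumulates credit and can be followed by a burst; a production implementation would likely bound that credit.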

            People

              Assignee: Unassigned
              Reporter: Zhenxiao Luo (zhenxiao)
              Votes: 1
              Watchers: 31
