  1. Hadoop HDFS
  2. HDFS-14790

Support Client Write Fan-Out


Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • 3.3.0
    • None
    • None

    Description

      By default, an HDFS write sets up a pipeline: the file is broken into packets, and each packet is forwarded from one DataNode to the next. Pipelining provides good throughput, but latency suffers because every packet must pass through each DataNode in turn.

      Allowing a client to specify a fan-out strategy would let it send each packet to all of the DataNodes concurrently instead of passing the packet through the pipeline serially.

      # Pipeline
      C |-------> DN -------> DN -------> DN
      
      # Fan Out
      
        |-------> DN
      C |-------> DN
        |-------> DN
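
A minimal Java sketch of the fan-out idea in the diagram above, for illustration only. It does not use any HDFS client API: `DataNodeSink`, `sink`, and `writeFanOut` are hypothetical names, and each "DataNode" is just a function that returns an ACK string. The client submits the same packet to every node at once and then collects the ACKs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class FanOutSketch {
    /** Hypothetical stand-in for a DataNode connection; not an HDFS API. */
    interface DataNodeSink {
        String send(byte[] packet); // returns an ACK identifier
    }

    /** Builds a fake DataNode that acknowledges every packet immediately. */
    static DataNodeSink sink(String name) {
        return packet -> "ack-" + name;
    }

    /** Sends one packet to all nodes concurrently and waits for every ACK. */
    static List<String> writeFanOut(List<DataNodeSink> nodes, byte[] packet,
                                    ExecutorService pool) {
        List<CompletableFuture<String>> pending = new ArrayList<>();
        for (DataNodeSink node : nodes) {
            // Each send starts right away; no node waits on another node.
            pending.add(CompletableFuture.supplyAsync(() -> node.send(packet), pool));
        }
        List<String> acks = new ArrayList<>();
        for (CompletableFuture<String> future : pending) {
            acks.add(future.join()); // results are collected in node order
        }
        return acks;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(3);
        try {
            List<DataNodeSink> nodes =
                Arrays.asList(sink("dn1"), sink("dn2"), sink("dn3"));
            System.out.println(writeFanOut(nodes, new byte[64], pool));
        } finally {
            pool.shutdown();
        }
    }
}
```

In this sketch the client waits for all ACKs before returning; a real implementation would also have to handle slow or failed DataNodes, which is out of scope here.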
      

      Also, if there is a 'min replication' of, for example, 2, the client only needs to wait for the first 2 ACKs before writing the next packet, as long as those 2 ACKs come from different racks. The block placement rules may need to support this.
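
The ACK rule described above could be sketched roughly as follows. This is one possible reading, not a proposed implementation: `Ack`, `acksNeeded`, and the `minRacks` parameter are hypothetical, and the sketch assumes that when the first ACKs all come from the same rack, the client keeps waiting until a different rack has acknowledged.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class MinReplicationAckSketch {
    /** Hypothetical ACK record: which DataNode acknowledged and its rack. */
    static final class Ack {
        final String dataNode;
        final String rack;

        Ack(String dataNode, String rack) {
            this.dataNode = dataNode;
            this.rack = rack;
        }
    }

    /**
     * Returns how many ACKs, in arrival order, the client must receive before
     * it may send the next packet, or -1 if the condition is never satisfied.
     */
    static int acksNeeded(List<Ack> arrivals, int minReplication, int minRacks) {
        Set<String> racksSeen = new HashSet<>();
        int count = 0;
        for (Ack ack : arrivals) {
            count++;
            racksSeen.add(ack.rack);
            if (count >= minReplication && racksSeen.size() >= minRacks) {
                return count;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // First two ACKs share a rack, so the client waits for the third.
        List<Ack> sameRackFirst = Arrays.asList(
            new Ack("dn1", "rackA"), new Ack("dn2", "rackA"), new Ack("dn3", "rackB"));
        // First two ACKs are on different racks, so two are enough.
        List<Ack> mixedRacksFirst = Arrays.asList(
            new Ack("dn1", "rackA"), new Ack("dn3", "rackB"), new Ack("dn2", "rackA"));

        System.out.println(acksNeeded(sameRackFirst, 2, 2));   // prints 3
        System.out.println(acksNeeded(mixedRacksFirst, 2, 2)); // prints 2
    }
}
```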

      HBase requires this improved latency.

          People

            Assignee: Unassigned
            Reporter: David Mollitor (belugabehr)
            Votes: 0
            Watchers: 10