  1. Hadoop HDFS
  2. HDFS-14790

Support Client Write Fan-Out

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.3.0
    • Fix Version/s: None
    • Labels:
      None

      Description

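      The default behavior of an HDFS write is to set up a pipeline of DataNodes. A file is broken into blocks, each block is broken into packets, and every packet is sent through the pipeline one DataNode after another. Pipelining provides good throughput, but latency suffers because each packet must cross every hop before it is fully acknowledged.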

      Letting a client specify a fan-out strategy would allow it to send each packet to all of the DataNodes concurrently instead of passing the packet through the pipeline serially.

      # Pipeline
      C |-------> DN -------> DN -------> DN
      
      # Fan Out
      
        |-------> DN
      C |-------> DN
        |-------> DN
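
The latency difference between the two layouts can be sketched with a toy model. This is only an illustration under stated assumptions: the class and method names are hypothetical (not HDFS APIs), only network hop delay is counted, and disk time, checksumming, and bandwidth limits are ignored.

```java
import java.util.List;

/** Toy latency model. Hypothetical names; not part of HDFS. */
final class WriteLatencyModel {
    private WriteLatencyModel() {}

    /**
     * Pipeline: the packet crosses every hop in order and the ACK returns
     * along the same path, so the per-hop delays add up.
     * hopMillis.get(0) is client -> DN1, hopMillis.get(1) is DN1 -> DN2, etc.
     */
    static long pipelineRoundTripMillis(List<Long> hopMillis) {
        long oneWay = 0;
        for (long hop : hopMillis) {
            oneWay += hop;
        }
        return 2 * oneWay;
    }

    /**
     * Fan-out: the client sends to every DataNode at once, so waiting for
     * all ACKs is bounded by the slowest single client -> DN link.
     */
    static long fanOutRoundTripMillis(List<Long> clientToDnMillis) {
        long slowest = 0;
        for (long link : clientToDnMillis) {
            slowest = Math.max(slowest, link);
        }
        return 2 * slowest;
    }
}
```

With three links of 1 ms each, `pipelineRoundTripMillis(List.of(1L, 1L, 1L))` gives 6 and `fanOutRoundTripMillis(List.of(1L, 1L, 1L))` gives 2. The model leaves out the cost side of fan-out: the client has to send every packet once per replica, so its outbound traffic grows with the replication factor.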
      

      Also, if there is a 'min replication' of, for example, 2, the client only needs to wait for the first 2 ACKs before writing the next packet, as long as those 2 ACKs come from different racks. The block placement rules may need to support this.
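
One possible reading of the ACK rule above is sketched below. It assumes each DataNode ACKs a packet once and that, if the first ACKs all come from the same rack, the client keeps waiting until a second rack has answered. `RackAwareAckQuorum` and its methods are hypothetical names for illustration, not part of the HDFS client.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper that tracks ACKs for a single packet. */
final class RackAwareAckQuorum {
    private final int minReplication;
    private final int minRacks;
    private final Set<String> ackedRacks = new HashSet<>();
    private int ackCount = 0;

    RackAwareAckQuorum(int minReplication, int minRacks) {
        if (minReplication < 1 || minRacks < 1 || minRacks > minReplication) {
            throw new IllegalArgumentException("need 1 <= minRacks <= minReplication");
        }
        this.minReplication = minReplication;
        this.minRacks = minRacks;
    }

    /** Records one ACK and reports whether the client may send the next packet. */
    boolean onAck(String rackId) {
        ackCount++;
        ackedRacks.add(rackId);
        return isSatisfied();
    }

    /** True once enough ACKs have arrived and they span enough distinct racks. */
    boolean isSatisfied() {
        return ackCount >= minReplication && ackedRacks.size() >= minRacks;
    }

    /**
     * Replays ACKs in arrival order and returns how many were needed
     * (1-based), or -1 if the rule is never satisfied.
     */
    static int acksNeeded(int minReplication, int minRacks, List<String> rackOfEachAck) {
        RackAwareAckQuorum quorum = new RackAwareAckQuorum(minReplication, minRacks);
        for (int i = 0; i < rackOfEachAck.size(); i++) {
            if (quorum.onAck(rackOfEachAck.get(i))) {
                return i + 1;
            }
        }
        return -1;
    }
}
```

For a min replication of 2 across 2 racks, ACKs arriving from racks `r1, r2, r1` release the client after the second ACK, while `r1, r1, r2` forces it to wait for the third. This is why the placement rules matter: if all replicas sit on one rack, the condition can never be met.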

      HBase requires this improved latency.

            People

            • Assignee:
              Unassigned
            • Reporter:
              David Mollitor (belugabehr)
            • Votes:
              0
            • Watchers:
              11
