Hadoop Common / HADOOP-3328

DFS write pipeline : only the last datanode needs to verify checksum

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.16.0
    • Fix Version/s: 0.19.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Currently, every datanode in the DFS write pipeline verifies the checksum. Since the current protocol includes acks from the datanodes, an ack from the last node can also serve as confirmation that the checksum is OK. In that sense, only the last datanode needs to verify the checksum. Based on a comment in HADOOP-1702, CPU consumption might go down by another 25-30% (4/14) once HADOOP-1702 is in.
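The idea above can be sketched as a toy model. This is only an illustration under stated assumptions, not Hadoop's DataNode code: the class `PipelineChecksumSketch`, its `Packet` and `Result` types, and the `verifyAtEveryNode` flag are all hypothetical names, and CRC32 stands in for whatever checksum the client attaches. It counts how many checksum computations happen before the ack comes back.

```java
import java.util.zip.CRC32;

/** Hypothetical model of a write pipeline; not Hadoop's actual DataNode code. */
class PipelineChecksumSketch {

    /** A packet as sent by the client: payload plus the client-computed CRC32. */
    static final class Packet {
        final byte[] data;
        final long checksum;

        Packet(byte[] data, long checksum) {
            this.data = data;
            this.checksum = checksum;
        }
    }

    /** Outcome of one write: the ack value and how many nodes computed a checksum. */
    static final class Result {
        final boolean ack;
        final int verifications;

        Result(boolean ack, int verifications) {
            this.ack = ack;
            this.verifications = verifications;
        }
    }

    static long crc32(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    /**
     * Sends the packet through pipelineLength datanodes in order.
     * If verifyAtEveryNode is false, only the last node checks the checksum,
     * and its ack doubles as the verification result for the whole pipeline.
     */
    static Result write(Packet packet, int pipelineLength, boolean verifyAtEveryNode) {
        if (pipelineLength < 1) {
            throw new IllegalArgumentException("pipeline needs at least one datanode");
        }
        int verifications = 0;
        for (int node = 0; node < pipelineLength; node++) {
            boolean isLast = (node == pipelineLength - 1);
            if (verifyAtEveryNode || isLast) {
                verifications++;
                if (crc32(packet.data) != packet.checksum) {
                    return new Result(false, verifications); // negative ack
                }
            }
            // Otherwise the node forwards the bytes without inspecting them.
        }
        return new Result(true, verifications);
    }

    public static void main(String[] args) {
        byte[] data = "block data".getBytes(java.nio.charset.StandardCharsets.UTF_8);
        Packet packet = new Packet(data, crc32(data));
        System.out.println("verify everywhere: " + write(packet, 3, true).verifications);
        System.out.println("verify at tail only: " + write(packet, 3, false).verifications);
    }
}
```

In this model a three-node pipeline performs one checksum computation instead of three, while a corrupted packet still produces a negative ack from the last node.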

      This would also make it easier to use transferTo() and transferFrom() on intermediate datanodes, since they would no longer need to look at the data.
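As a rough illustration of why not inspecting the data helps, the sketch below forwards a locally stored file to a downstream channel with `java.nio.channels.FileChannel.transferTo()`, so the bytes never pass through an application-level buffer. The class `TransferToSketch` and the method `forward` are hypothetical names, and a real datanode would be receiving from and sending to sockets rather than reading a finished file; this is a minimal sketch, not the patch attached to this issue.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper; illustrates FileChannel.transferTo() only. */
class TransferToSketch {

    /**
     * Copies the whole file at source to downstream without reading the
     * bytes into a user-space buffer. Returns the number of bytes sent.
     */
    static long forward(Path source, WritableByteChannel downstream) throws IOException {
        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ)) {
            long size = in.size();
            long position = 0;
            while (position < size) {
                // transferTo may move fewer bytes than requested, so loop.
                long sent = in.transferTo(position, size - position, downstream);
                if (sent <= 0) {
                    break; // sketch only: a real sender would wait and retry
                }
                position += sent;
            }
            return position;
        }
    }
}
```

Because the intermediate node never touches the payload here, it has no opportunity to compute a checksum over it, which is why this approach only fits once verification is left to the last datanode.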

        Attachments

        1. HADOOP-3328.patch
          1.0 kB
          Raghu Angadi
        2. HADOOP-3328.patch
          1 kB
          Raghu Angadi

            People

            • Assignee: Raghu Angadi (rangadi)
            • Reporter: Raghu Angadi (rangadi)
            • Votes: 0
            • Watchers: 0
