Hadoop HDFS / HDFS-10178

Permanent write failures can happen if pipeline recoveries occur for the first packet

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.8.0, 2.7.3, 2.6.5, 3.0.0-alpha1
    • Component/s: None
    • Labels: None
    • Target Version/s:
    • Hadoop Flags: Reviewed

      Description

      We have observed that a write fails permanently if the first packet does not go through properly and pipeline recovery happens. If the write op creates a pipeline, but the actual data packet does not reach one or more datanodes in time, the pipeline recovery is done against the 0-byte partial block.

      If additional datanodes are added, the block is transferred to the new nodes. After the transfer, each node has a meta file containing only the header and a 0-length block data file. The pipeline recovery appears to work correctly up to this point, but the write fails when the actual data packet is resent.
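
      The sketch below is not taken from the attached patches; it only illustrates the client-side write pattern in which this bug can surface. The pipeline is set up by create(), and the first data packet travels down the pipeline once data is written and flushed. The file path and payload are illustrative.

{code:java}
// Minimal sketch, not the regression test from the attached patches.
// Assumes an HDFS cluster reachable through the default Configuration;
// the path and payload below are illustrative only.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class FirstPacketWriteSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    try (FileSystem fs = FileSystem.get(conf);
         FSDataOutputStream out = fs.create(new Path("/tmp/hdfs-10178-demo"), true)) {
      // create() sets up the write pipeline; no data packet has been sent yet,
      // so the datanodes hold a 0-byte partial block at this point.
      out.write(new byte[]{1, 2, 3, 4});
      // hflush() pushes the first packet down the pipeline. If that packet is
      // delayed and the pipeline is recovered (possibly with new datanodes
      // receiving a transferred 0-length block), resending this first packet
      // is where the reported permanent failure occurred.
      out.hflush();
      out.write(new byte[]{5, 6, 7, 8});
    }
  }
}
{code}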

        Attachments

        1. HDFS-10178.patch (8 kB, Kihwal Lee)
        2. HDFS-10178.v2.patch (8 kB, Kihwal Lee)
        3. HDFS-10178.v3.patch (9 kB, Kihwal Lee)
        4. HDFS-10178.v4.patch (6 kB, Kihwal Lee)
        5. HDFS-10178.v5.patch (5 kB, Kihwal Lee)


              People

              • Assignee: Kihwal Lee (kihwal)
              • Reporter: Kihwal Lee (kihwal)
              • Votes: 0
              • Watchers: 17

                Dates

                • Created:
                • Updated:
                • Resolved: