Hadoop HDFS / HDFS-10178

Permanent write failures can happen if pipeline recoveries occur for the first packet


Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.8.0, 2.7.3, 2.6.5, 3.0.0-alpha1
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      We have observed that a write fails permanently if the first packet doesn't go through properly and pipeline recovery happens. If the write op creates a pipeline, but the actual data packet does not reach one or more datanodes in time, the pipeline recovery will be done against the 0-byte partial block.
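
      Below is a minimal client-side sketch (not part of this issue's patch) of the kind of write that exercises this path; the NameNode URI and file path are placeholders. The pipeline is set up for the new block, and if the first data packet is not acknowledged by every datanode in time, recovery runs while the partial block is still 0 bytes long.

      {code:java}
      import org.apache.hadoop.conf.Configuration;
      import org.apache.hadoop.fs.FSDataOutputStream;
      import org.apache.hadoop.fs.FileSystem;
      import org.apache.hadoop.fs.Path;
      import java.nio.charset.StandardCharsets;

      public class FirstPacketWrite {
        public static void main(String[] args) throws Exception {
          Configuration conf = new Configuration();
          conf.set("fs.defaultFS", "hdfs://namenode:8020"); // placeholder NameNode URI

          try (FileSystem fs = FileSystem.get(conf);
               FSDataOutputStream out = fs.create(new Path("/tmp/first-packet-test"))) {
            // The first write/hflush pushes the very first packet into the pipeline.
            // If a datanode drops out before acknowledging it, pipeline recovery is
            // performed against a 0-byte partial block, which is the scenario above.
            out.write("first packet payload".getBytes(StandardCharsets.UTF_8));
            out.hflush();
          }
        }
      }
      {code}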

      If additional datanodes are added, the block is transferred to the new nodes. After the transfer, each node will have a meta file containing the header and a 0-length data block file. The pipeline recovery seems to work correctly up to this point, but the write fails when the actual data packet is resent.
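
      A quick way to see the state described above is to inspect the transferred replica on one of the newly added datanodes. The sketch below is a hypothetical spot check; the paths are assumptions about a typical data directory layout, not output from this issue.

      {code:java}
      import java.io.File;

      // Hypothetical spot check of a transferred replica in a datanode data directory.
      // Pass the block file and meta file paths, e.g. something like
      //   .../current/<bpid>/current/rbw/blk_<id> and blk_<id>_<genstamp>.meta
      public class InspectReplica {
        public static void main(String[] args) {
          File blockFile = new File(args[0]);
          File metaFile = new File(args[1]);
          // In the failure scenario, the block file is empty while the meta file
          // holds only its small header, yet the resent data packet is rejected.
          System.out.println("block file bytes: " + blockFile.length());
          System.out.println("meta file bytes:  " + metaFile.length());
        }
      }
      {code}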

      Attachments

        1. HDFS-10178.patch (8 kB, Kihwal Lee)
        2. HDFS-10178.v2.patch (8 kB, Kihwal Lee)
        3. HDFS-10178.v3.patch (9 kB, Kihwal Lee)
        4. HDFS-10178.v4.patch (6 kB, Kihwal Lee)
        5. HDFS-10178.v5.patch (5 kB, Kihwal Lee)


            People

              Assignee: Kihwal Lee
              Reporter: Kihwal Lee
              Votes: 0
              Watchers: 17
