  1. Hadoop HDFS
  2. HDFS-7587

Edit log corruption can happen if append fails with a quota violation

    Details

    • Hadoop Flags:
      Reviewed

      Description

      We have seen a standby NameNode crash due to edit log corruption. It complained that OP_CLOSE could not be applied because the file was not under construction.

      When a client tried to append to the file, the remaining space quota was very small. This caused prepareFileForWrite() to fail, but only after the inode had already been converted for writing and a lease had been added. Since these changes were not undone when the quota violation was detected, the file was left under construction with an active lease, and no OP_ADD was written to the edit log.

      A subsequent append() eventually triggered lease recovery after the soft limit period expired. This led to commitBlockSynchronization(), which closed the file and logged OP_CLOSE. Since there was no corresponding OP_ADD, the OP_CLOSE could not be applied when the edits were replayed.
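
      The sequence above can be reproduced in a toy model. The sketch below is not the NameNode code and not necessarily what the attached patches do: the class AppendQuotaSketch and the names Namespace, appendBuggy, appendChecked, recoverLease, replay and scenario are hypothetical, and the quota check is reduced to a single comparison. It only illustrates why mutating state before the quota check leaves an OP_CLOSE without a matching OP_ADD, and how checking the quota first would avoid that.

      ```java
      import java.util.ArrayList;
      import java.util.HashMap;
      import java.util.List;
      import java.util.Map;

      /** Toy model of the failure mode described above; all names are hypothetical. */
      public class AppendQuotaSketch {
          /** Stand-in for a quota violation reported to the client. */
          static class QuotaExceededException extends Exception {
              QuotaExceededException(String msg) { super(msg); }
          }

          /** Minimal active-side state: under-construction flag per path, plus an edit log. */
          static class Namespace {
              final Map<String, Boolean> underConstruction = new HashMap<>();
              final List<String> editLog = new ArrayList<>();
              final long remainingSpaceQuota;

              Namespace(long remainingSpaceQuota) { this.remainingSpaceQuota = remainingSpaceQuota; }

              /** Registers an existing, closed file. */
              void create(String path) { underConstruction.put(path, false); }

              /** Buggy ordering: state changes first, quota check second, no rollback. */
              void appendBuggy(String path, long needed) throws QuotaExceededException {
                  underConstruction.put(path, true); // inode converted, lease added
                  if (needed > remainingSpaceQuota) throw new QuotaExceededException(path);
                  editLog.add("OP_ADD " + path);
              }

              /** Assumed remedy: verify the quota before touching any state. */
              void appendChecked(String path, long needed) throws QuotaExceededException {
                  if (needed > remainingSpaceQuota) throw new QuotaExceededException(path);
                  underConstruction.put(path, true);
                  editLog.add("OP_ADD " + path);
              }

              /** Lease recovery closes a file that is still under construction and logs OP_CLOSE. */
              void recoverLease(String path) {
                  if (underConstruction.getOrDefault(path, false)) {
                      underConstruction.put(path, false);
                      editLog.add("OP_CLOSE " + path);
                  }
              }
          }

          /** Replays the log as a standby would; returns false if an op cannot be applied. */
          static boolean replay(List<String> editLog, Map<String, Boolean> state) {
              for (String op : editLog) {
                  String[] parts = op.split(" ", 2);
                  boolean uc = state.getOrDefault(parts[1], false);
                  if (parts[0].equals("OP_ADD")) {
                      state.put(parts[1], true);
                  } else if (parts[0].equals("OP_CLOSE")) {
                      if (!uc) return false; // file is not under construction
                      state.put(parts[1], false);
                  }
              }
              return true;
          }

          /** Runs a failed append followed by lease recovery, then replays the resulting log. */
          static boolean scenario(boolean buggy) {
              Namespace ns = new Namespace(1); // almost no quota left
              ns.create("/f");
              try {
                  if (buggy) ns.appendBuggy("/f", 100); else ns.appendChecked("/f", 100);
              } catch (QuotaExceededException e) {
                  // The client sees the quota failure; the namespace state is what matters here.
              }
              ns.recoverLease("/f");
              Map<String, Boolean> standby = new HashMap<>();
              standby.put("/f", false); // standby knows the file as closed
              return replay(ns.editLog, standby);
          }

          public static void main(String[] args) {
              System.out.println("buggy ordering replays cleanly:   " + scenario(true));
              System.out.println("checked ordering replays cleanly: " + scenario(false));
          }
      }
      ```

      In this model the buggy ordering produces a log containing only OP_CLOSE, which the replay rejects, while the checked ordering produces an empty log because the failed append never changed any state. Rolling back the inode conversion and the lease on failure would be another way to reach the same consistent state.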

        Attachments

        1. HDFS-7587.001.patch
          6 kB
          Jing Zhao
        2. HDFS-7587.002.patch
          9 kB
          Jing Zhao
        3. HDFS-7587.003.patch
          10 kB
          Jing Zhao
        4. HDFS-7587.patch
          5 kB
          Kihwal Lee
        5. HDFS-7587-branch-2.6.patch
          13 kB
          Ming Ma

              People

              • Assignee:
                jingzhao Jing Zhao
                Reporter:
                kihwal Kihwal Lee
              • Votes:
                0
                Watchers:
                21