Hadoop HDFS / HDFS-5080

BootstrapStandby not working with QJM when the existing NN is active


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0-alpha1
    • Fix Version/s: 2.1.1-beta
    • Component/s: ha, qjm
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      Currently, when QJM is used, running BootstrapStandby while the existing NN is active can fail with the following exception:

      FATAL ha.BootstrapStandby: Unable to read transaction ids 6175397-6175405 from the configured shared edits storage. Please copy these logs into the shared edits storage or call saveNamespace on the active node.
      Error: Gap in transactions. Expected to be able to read up until at least txid 6175405 but unable to find any edit logs containing txid 6175405
      java.io.IOException: Gap in transactions. Expected to be able to read up until at least txid 6175405 but unable to find any edit logs containing txid 6175405
      	at org.apache.hadoop.hdfs.server.namenode.FSEditLog.checkForGaps(FSEditLog.java:1300)
      	at org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1258)
      	at org.apache.hadoop.hdfs.server.namenode.ha.BootstrapStandby.checkLogsAvailableForRead(BootstrapStandby.java:229)
      

      The cause of the exception appears to be that, when the active NN is queried by BootstrapStandby for the last written transaction ID, the in-progress edit log segment is included. However, when the journal nodes are asked for the last written transaction ID, the in-progress edit log segment is excluded. This mismatch causes BootstrapStandby#checkLogsAvailableForRead to report a gap in transactions.

      To fix this, we can either let the journal nodes take the in-progress edit log segment into account, or let the active NN exclude the in-progress edit log segment.
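      The mismatch described above can be illustrated with a small, self-contained Java sketch. This is not Hadoop code: `EditLogGapSketch`, `Segment`, `lastReadableTxId`, and `canReadUpTo` are hypothetical names used only for illustration, and the start of the finalized segment (6175000) is an assumed value. Only the range 6175397-6175405 comes from the log above. The sketch models a simplified gap check and shows that it fails when only one side counts the in-progress segment, and passes when both sides agree.

```java
import java.util.ArrayList;
import java.util.List;

public class EditLogGapSketch {

    /** Hypothetical stand-in for an edit log segment; not a Hadoop class. */
    static final class Segment {
        final long firstTxId;
        final long lastTxId;
        final boolean inProgress;

        Segment(long firstTxId, long lastTxId, boolean inProgress) {
            this.firstTxId = firstTxId;
            this.lastTxId = lastTxId;
            this.inProgress = inProgress;
        }
    }

    /** Highest txid visible in the log, optionally counting in-progress segments. */
    static long lastReadableTxId(List<Segment> segments, boolean includeInProgress) {
        long last = 0L;
        for (Segment s : segments) {
            if (s.inProgress && !includeInProgress) {
                continue; // this view of the log ignores unfinalized segments
            }
            last = Math.max(last, s.lastTxId);
        }
        return last;
    }

    /** Simplified gap check: can a reader reach expectedLastTxId? */
    static boolean canReadUpTo(List<Segment> segments, boolean includeInProgress,
                               long expectedLastTxId) {
        return lastReadableTxId(segments, includeInProgress) >= expectedLastTxId;
    }

    public static void main(String[] args) {
        List<Segment> log = new ArrayList<>();
        log.add(new Segment(6175000L, 6175396L, false)); // finalized (start txid assumed)
        log.add(new Segment(6175397L, 6175405L, true));  // in progress (range from the log)

        // Reported behavior: the active NN counts the in-progress segment,
        // while the journal nodes do not, so the check sees a gap.
        long expectedByActive = lastReadableTxId(log, true);
        System.out.println("mismatched views readable: "
                + canReadUpTo(log, false, expectedByActive));

        // Option 1: journal nodes also take the in-progress segment into account.
        System.out.println("JNs include in-progress: "
                + canReadUpTo(log, true, expectedByActive));

        // Option 2: the active NN excludes the in-progress segment.
        long expectedExcluding = lastReadableTxId(log, false);
        System.out.println("NN excludes in-progress: "
                + canReadUpTo(log, false, expectedExcluding));
    }
}
```

      Either option makes the two views of the log consistent. Which one the attached patches implement is not stated in this description.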

      Attachments

        1. HDFS-5080.000.patch
          22 kB
          Jing Zhao
        2. HDFS-5080.001.patch
          22 kB
          Jing Zhao
        3. HDFS-5080.002.patch
          32 kB
          Jing Zhao


          People

            Assignee: Jing Zhao
            Reporter: Jing Zhao
            Votes: 0
            Watchers: 7
