Uploaded image for project: 'Hadoop HDFS'
  1. Hadoop HDFS
  2. HDFS-12943 Consistent Reads from Standby Node
  3. HDFS-14674

[SBN read] Got an unexpected txid when tail editlog

    XMLWordPrintableJSON

Details

    • Sub-task
    • Status: Resolved
    • Blocker
    • Resolution: Fixed
    • None
    • 3.3.0, 3.2.1, 3.1.3
    • None
    • None

    Description

      Add the following configuration

      error:

      //
      [2019-07-17T11:50:21.048+08:00] [INFO] [Edit log tailer] : replaying edit log: 1/20512836 transactions completed. (0%) [2019-07-17T11:50:21.059+08:00] [INFO] [Edit log tailer] : Edits file http://ip/getJournal?jid=ns1003&segmentTxId=232056426162&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH, http://ip/getJournal?ipjid=ns1003&segmentTxId=232056426162&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH, http://ip/getJournal?ipjid=ns1003&segmentTxId=232056426162&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH of size 3126782311 edits # 500 loaded in 3 seconds [2019-07-17T11:50:21.059+08:00] [INFO] [Edit log tailer] : Reading org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@51ceb7bc expecting start txid #232056752162 [2019-07-17T11:50:21.059+08:00] [INFO] [Edit log tailer] : Start loading edits file http://ip/getJournal?ipjid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH, http://ip/getJournal?ipjid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH, http://ip/getJournal?ipjid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH maxTxnipsToRead = 500 [2019-07-17T11:50:21.059+08:00] [INFO] [Edit log tailer] : Fast-forwarding stream 'http://ip/getJournal?jid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH, http://ip/getJournal?ipjid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH, http://ip/getJournal?ipjid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH' to transaction ID 232056751662 [2019-07-17T11:50:21.059+08:00] [INFO] [Edit log tailer] ip: Fast-forwarding stream 'http://ip/getJournal?jid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH' to transaction ID 232056751662 [2019-07-17T11:50:21.061+08:00] [ERROR] [Edit log tailer] : Unknown error encountered while tailing edits. Shutting down standby NN. java.io.IOException: There appears to be a gap in the edit log. We expected txid 232056752162, but got txid 232077264498. at org.apache.hadoop.hdfs.server.namenode.MetaRecoveryContext.editLogLoaderPrompt(MetaRecoveryContext.java:94) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:239) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:161) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:895) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:321) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:460) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$400(EditLogTailer.java:410) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:427) at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:414) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:423) [2019-07-17T11:50:21.064+08:00] [INFO] [Edit log tailer] : Exiting with status 1 [2019-07-17T11:50:21.066+08:00] [INFO] [Thread-1] : SHUTDOWN_MSG: /************************************************************ SHUTDOWN_MSG: Shutting down NameNode at ip ************************************************************/
      
      

       

      if dfs.ha.tail-edits.max-txns-per-lock value is 500,when the namenode load the editlog util 500,the current namenode will load the next editlog,but editlog more than 500.So,namenode got an unexpected txid when tail editlog.

       

       

      //
      [2019-07-17T11:50:21.059+08:00] [INFO] [Edit log tailer] : Edits file http://ip/getJournal?jid=ns1003&segmentTxId=232056426162&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH, http://ip/getJournal?jid=ns1003&segmentTxId=232056426162&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH, http://ip/getJournal?jid=ns1003&segmentTxId=232056426162&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH of size 3126782311 edits # 500 loaded in 3 seconds
      [2019-07-17T11:50:21.059+08:00] [INFO] [Edit log tailer] : Reading org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@51ceb7bc expecting start txid #232056752162
      [2019-07-17T11:50:21.059+08:00] [INFO] [Edit log tailer] : Start loading edits file http://ip/getJournal?jid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH, http://ip/getJournal?jid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH, http://ip/getJournal?jid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH maxTxnsToRead = 500
      [2019-07-17T11:50:21.059+08:00] [INFO] [Edit log tailer] : Fast-forwarding stream 'http://ip/getJournal?jid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH, http://ip/getJournal?jid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH, http://ip/getJournal?jid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH' to transaction ID 232056751662
      [2019-07-17T11:50:21.059+08:00] [INFO] [Edit log tailer] : Fast-forwarding stream 'http://ip/getJournal?jid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH' to transaction ID 232056751662
      [2019-07-17T11:50:21.061+08:00] [ERROR] [Edit log tailer] : Unknown error encountered while tailing edits. Shutting down standby NN.
      java.io.IOException: There appears to be a gap in the edit log.  We expected txid 232056752162, but got txid 232077264498.
      

      Read data from JN twice in the same second,changed segmentTxId,finally quit because of txid mismatch.

      Attachments

        1. HDFS-14674-001.patch
          2 kB
          Zhaohui Wang
        2. HDFS-14674-003.patch
          5 kB
          Zhaohui Wang
        3. HDFS-14674-004.patch
          4 kB
          Zhaohui Wang
        4. HDFS-14674-005.patch
          4 kB
          Zhaohui Wang
        5. HDFS-14674-006.patch
          4 kB
          Zhaohui Wang
        6. HDFS-14674-007.patch
          7 kB
          Zhaohui Wang
        7. HDFS-14674-008.patch
          6 kB
          Zhaohui Wang
        8. HDFS-14674-009.patch
          6 kB
          Zhaohui Wang
        9. HDFS-14674-010.patch
          7 kB
          Zhaohui Wang
        10. HDFS-14674-011.patch
          6 kB
          Zhaohui Wang
        11. image.png
          13 kB
          ZanderXu
        12. image-2019-08-22-16-24-06-518.png
          35 kB
          Zhaohui Wang

        Issue Links

          Activity

            People

              wangzhaohui Zhaohui Wang
              wangzhaohui Zhaohui Wang
              Votes:
              0 Vote for this issue
              Watchers:
              15 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: