Uploaded image for project: 'Hadoop HDFS'
  1. Hadoop HDFS
  2. HDFS-12943 Consistent Reads from Standby Node
  3. HDFS-14674

[SBN read] Got an unexpected txid when tail editlog

    XMLWordPrintableJSON

    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.3.0, 3.2.1, 3.1.3
    • Component/s: None
    • Labels:
      None

      Description

      Add the following configuration

      error:

      //
      [2019-07-17T11:50:21.048+08:00] [INFO] [Edit log tailer] : replaying edit log: 1/20512836 transactions completed. (0%) [2019-07-17T11:50:21.059+08:00] [INFO] [Edit log tailer] : Edits file http://ip/getJournal?jid=ns1003&segmentTxId=232056426162&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH, http://ip/getJournal?ipjid=ns1003&segmentTxId=232056426162&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH, http://ip/getJournal?ipjid=ns1003&segmentTxId=232056426162&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH of size 3126782311 edits # 500 loaded in 3 seconds [2019-07-17T11:50:21.059+08:00] [INFO] [Edit log tailer] : Reading org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@51ceb7bc expecting start txid #232056752162 [2019-07-17T11:50:21.059+08:00] [INFO] [Edit log tailer] : Start loading edits file http://ip/getJournal?ipjid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH, http://ip/getJournal?ipjid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH, http://ip/getJournal?ipjid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH maxTxnipsToRead = 500 [2019-07-17T11:50:21.059+08:00] [INFO] [Edit log tailer] : Fast-forwarding stream 'http://ip/getJournal?jid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH, http://ip/getJournal?ipjid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH, http://ip/getJournal?ipjid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH' to transaction ID 232056751662 [2019-07-17T11:50:21.059+08:00] [INFO] [Edit log tailer] ip: Fast-forwarding stream 'http://ip/getJournal?jid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH' to transaction ID 232056751662 [2019-07-17T11:50:21.061+08:00] [ERROR] [Edit log tailer] : Unknown error encountered while tailing edits. Shutting down standby NN. java.io.IOException: There appears to be a gap in the edit log. We expected txid 232056752162, but got txid 232077264498. at org.apache.hadoop.hdfs.server.namenode.MetaRecoveryContext.editLogLoaderPrompt(MetaRecoveryContext.java:94) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:239) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:161) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:895) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:321) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:460) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$400(EditLogTailer.java:410) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:427) at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:414) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:423) [2019-07-17T11:50:21.064+08:00] [INFO] [Edit log tailer] : Exiting with status 1 [2019-07-17T11:50:21.066+08:00] [INFO] [Thread-1] : SHUTDOWN_MSG: /************************************************************ SHUTDOWN_MSG: Shutting down NameNode at ip ************************************************************/
      
      

       

      if dfs.ha.tail-edits.max-txns-per-lock value is 500,when the namenode load the editlog util 500,the current namenode will load the next editlog,but editlog more than 500.So,namenode got an unexpected txid when tail editlog.

       

       

      //
      [2019-07-17T11:50:21.059+08:00] [INFO] [Edit log tailer] : Edits file http://ip/getJournal?jid=ns1003&segmentTxId=232056426162&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH, http://ip/getJournal?jid=ns1003&segmentTxId=232056426162&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH, http://ip/getJournal?jid=ns1003&segmentTxId=232056426162&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH of size 3126782311 edits # 500 loaded in 3 seconds
      [2019-07-17T11:50:21.059+08:00] [INFO] [Edit log tailer] : Reading org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@51ceb7bc expecting start txid #232056752162
      [2019-07-17T11:50:21.059+08:00] [INFO] [Edit log tailer] : Start loading edits file http://ip/getJournal?jid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH, http://ip/getJournal?jid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH, http://ip/getJournal?jid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH maxTxnsToRead = 500
      [2019-07-17T11:50:21.059+08:00] [INFO] [Edit log tailer] : Fast-forwarding stream 'http://ip/getJournal?jid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH, http://ip/getJournal?jid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH, http://ip/getJournal?jid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH' to transaction ID 232056751662
      [2019-07-17T11:50:21.059+08:00] [INFO] [Edit log tailer] : Fast-forwarding stream 'http://ip/getJournal?jid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH' to transaction ID 232056751662
      [2019-07-17T11:50:21.061+08:00] [ERROR] [Edit log tailer] : Unknown error encountered while tailing edits. Shutting down standby NN.
      java.io.IOException: There appears to be a gap in the edit log.  We expected txid 232056752162, but got txid 232077264498.
      

      Read data from JN twice in the same second,changed segmentTxId,finally quit because of txid mismatch.

        Attachments

        1. image-2019-08-22-16-24-06-518.png
          35 kB
          wangzhaohui
        2. image.png
          13 kB
          xuzq
        3. HDFS-14674-011.patch
          6 kB
          wangzhaohui
        4. HDFS-14674-010.patch
          7 kB
          wangzhaohui
        5. HDFS-14674-009.patch
          6 kB
          wangzhaohui
        6. HDFS-14674-008.patch
          6 kB
          wangzhaohui
        7. HDFS-14674-007.patch
          7 kB
          wangzhaohui
        8. HDFS-14674-006.patch
          4 kB
          wangzhaohui
        9. HDFS-14674-005.patch
          4 kB
          wangzhaohui
        10. HDFS-14674-004.patch
          4 kB
          wangzhaohui
        11. HDFS-14674-003.patch
          5 kB
          wangzhaohui
        12. HDFS-14674-001.patch
          2 kB
          wangzhaohui

          Issue Links

            Activity

              People

              • Assignee:
                wangzhaohui wangzhaohui
                Reporter:
                wangzhaohui wangzhaohui
              • Votes:
                0 Vote for this issue
                Watchers:
                15 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: