HBASE-25053

WAL replay should ignore 0-length files

Details

    • Reviewed

    Description

      I overdrove a small testing cluster, filling HDFS. After cleaning up data to bring HBase back up, I noticed that all masters refused to start and aborted. The logs complain of seeking past EOF. Indeed, the last WAL file name logged is a 0-length file. WAL replay should gracefully skip and clean up such an empty file.

      2020-09-16 19:51:30,297 ERROR org.apache.hadoop.hbase.master.HMaster: Failed to become active master
      java.io.EOFException: Cannot seek after EOF
              at org.apache.hadoop.hdfs.DFSInputStream.seek(DFSInputStream.java:1448)
              at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:66)
              at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.initInternal(ProtobufLogReader.java:211)
              at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.initReader(ProtobufLogReader.java:173)
              at org.apache.hadoop.hbase.regionserver.wal.ReaderBase.init(ReaderBase.java:64)
              at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.init(ProtobufLogReader.java:168)
              at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:323)
              at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:305)
              at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:293)
              at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:429)
              at org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEdits(HRegion.java:4859)
              at org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEditsIfAny(HRegion.java:4765)
              at org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:1014)
              at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:956)
              at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7496)
              at org.apache.hadoop.hbase.regionserver.HRegion.openHRegionFromTableDir(HRegion.java:7454)
              at org.apache.hadoop.hbase.master.region.MasterRegion.open(MasterRegion.java:269)
              at org.apache.hadoop.hbase.master.region.MasterRegion.create(MasterRegion.java:309)
              at org.apache.hadoop.hbase.master.region.MasterRegionFactory.create(MasterRegionFactory.java:104)
              at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:949)
              at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2240)
              at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:622)
              at java.base/java.lang.Thread.run(Thread.java:834)
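
      The requested behavior (check the file length before opening a reader, then skip and remove empty files) could be sketched roughly as below. This is a minimal illustration and not HBase code: `EmptyWalSkipSketch`, `replayAll`, and `replayOne` are hypothetical names, and `java.nio.file` stands in for the Hadoop `FileSystem` API so that the example runs on its own. Whether deleting an empty file is always safe in a live cluster is outside the scope of this sketch.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not HBase code. java.nio.file stands in for the
// Hadoop FileSystem API, and replayOne() stands in for real edit replay.
public class EmptyWalSkipSketch {

  /** Replays each file in order; 0-length files are deleted and skipped. */
  static List<Path> replayAll(List<Path> walFiles) throws IOException {
    List<Path> replayed = new ArrayList<>();
    for (Path wal : walFiles) {
      if (Files.size(wal) == 0) {
        // An empty file has no header or edits to read, so do not open a
        // reader on it at all; clean it up and move on to the next file.
        Files.delete(wal);
        continue;
      }
      replayOne(wal);
      replayed.add(wal);
    }
    return replayed;
  }

  /** Placeholder for real replay logic: here it only reads the bytes. */
  static void replayOne(Path wal) throws IOException {
    Files.readAllBytes(wal);
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("wal-sketch");
    Path empty = Files.createFile(dir.resolve("0000000000000000001"));
    Path full = Files.write(dir.resolve("0000000000000000002"), new byte[] {1, 2, 3});

    List<Path> replayed = replayAll(List.of(empty, full));
    System.out.println("replayed=" + replayed.size()
        + " emptyStillExists=" + Files.exists(empty));
  }
}
```

      The length check runs before any reader is created, which is the point at which the stack trace above fails (`ProtobufLogReader.initInternal` calling `seek`).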
      

            People

              Assignee: niuyulin Yulin Niu
              Reporter: ndimiduk Nick Dimiduk