  HBase / HBASE-251

[hbase] Stuck replaying the edits of crashed machine


Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.1.0
    • Fix Version/s: 0.1.0, 0.2.0
    • Component/s: master
    • Labels: None

    Description

      Rapleaf master got stuck trying to replay the logs of the server holding the .META. region. Here are pertinent log excerpts:

      2008-01-12 02:17:42,621 DEBUG org.apache.hadoop.hbase.HLog: Creating new log file writer for path /data/hbase1/hregion_1679905157/oldlogfile.log; map content {spider_pages,25_530417241,1200073704087=org.apache.hadoop.io.SequenceFile$RecordCompressWriter@24336556, spider_pages,6_74488371,1200029312876=org.apache.hadoop.io.SequenceFile$RecordCompressWriter@2a4203ab, spider_pages,2_561473281,1200054637086=org.apache.hadoop.io.SequenceFile$RecordCompressWriter@b972625, .META.,,1=org.apache.hadoop.io.SequenceFile$RecordCompressWriter@67a044a7, spider_pages,5_544278041,1199025825074=org.apache.hadoop.io.SequenceFile$RecordCompressWriter@42be0008, spider_pages,49_567090611,1200028378117=org.apache.hadoop.io.SequenceFile$RecordCompressWriter@7bf4cbaa, spider_pages,5_566039401,1200058871594=org.apache.hadoop.io.SequenceFile$RecordCompressWriter@16479e88, spider_pages,59_360738971,1200073647952=org.apache.hadoop.io.SequenceFile$RecordCompressWriter@70494d14, spider_pages,59_302628011,1200073647951=org.apache.hadoop.io.SequenceFile$RecordCompressWriter@654670a8}
      2008-01-12 02:17:44,124 DEBUG org.apache.hadoop.hbase.HLog: Applied 20000 edits
      2008-01-12 02:17:49,076 DEBUG org.apache.hadoop.hbase.HLog: Applied 30000 edits
      2008-01-12 02:17:49,078 DEBUG org.apache.hadoop.hbase.HLog: Applied 30003 total edits
      2008-01-12 02:17:49,078 DEBUG org.apache.hadoop.hbase.HLog: Splitting 1 of 2: hdfs://tf1:7276/data/hbase1/log_XX.XX.XX.32_1200011947645_60020/hlog.dat.003
      2008-01-12 02:17:52,574 DEBUG org.apache.hadoop.hbase.HLog: Applied 10000 edits
      2008-01-12 02:17:59,822 WARN org.apache.hadoop.hbase.HMaster: Processing pending operations: ProcessServerShutdown of XX.XX.XX.32:60020
      java.io.EOFException
              at java.io.DataInputStream.readFully(DataInputStream.java:180)
              at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:56)
              at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:90)
              at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1763)
              at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1663)
              at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1709)
              at org.apache.hadoop.hbase.HLog.splitLog(HLog.java:168)
              at org.apache.hadoop.hbase.HMaster$ProcessServerShutdown.process(HMaster.java:2144)
              at org.apache.hadoop.hbase.HMaster.run(HMaster.java:1056)
      2008-01-12 02:17:59,822 DEBUG org.apache.hadoop.hbase.HMaster: Main processing loop: ProcessServerShutdown of XX.XX.XX.32:60020
      

      It keeps doing the above over and over again.

      I suppose we could skip bad logs (sketched below)... or just shut down the master with a reason why.

      What's odd is that we seem to be well into the file; we've run over 10000 edits before we trip over the EOF.

      I've asked for an fsck to see what that says.
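
      A minimal sketch of the skip-bad-logs idea, assuming a hypothetical LogSplitSketch helper rather than the actual HLog.splitLog() code: catch the EOFException raised by the SequenceFile reader, log a warning, and move on to the next log file so ProcessServerShutdown can complete instead of requeueing forever.

      import java.io.EOFException;
      import java.io.IOException;

      import org.apache.commons.logging.Log;
      import org.apache.commons.logging.LogFactory;
      import org.apache.hadoop.conf.Configuration;
      import org.apache.hadoop.fs.FileSystem;
      import org.apache.hadoop.fs.Path;
      import org.apache.hadoop.io.SequenceFile;
      import org.apache.hadoop.io.Writable;
      import org.apache.hadoop.util.ReflectionUtils;

      /**
       * Hypothetical helper, not the actual HLog.splitLog() code: read each log
       * with a SequenceFile.Reader and treat an EOFException as "end of a
       * truncated log" instead of a fatal error, so shutdown processing can
       * make progress past a corrupt log.
       */
      public class LogSplitSketch {
        private static final Log LOG = LogFactory.getLog(LogSplitSketch.class);

        public static void splitLogs(FileSystem fs, Path[] logfiles, Configuration conf)
            throws IOException {
          for (int i = 0; i < logfiles.length; i++) {
            LOG.debug("Splitting " + (i + 1) + " of " + logfiles.length + ": " + logfiles[i]);
            SequenceFile.Reader in = new SequenceFile.Reader(fs, logfiles[i], conf);
            try {
              // Instantiate the key/value types recorded in the SequenceFile header.
              Writable key = (Writable) ReflectionUtils.newInstance(in.getKeyClass(), conf);
              Writable val = (Writable) ReflectionUtils.newInstance(in.getValueClass(), conf);
              int edits = 0;
              while (in.next(key, val)) {
                // ... hand the edit to the per-region oldlogfile.log writer ...
                edits++;
              }
              LOG.debug("Applied " + edits + " total edits from " + logfiles[i]);
            } catch (EOFException e) {
              // The crashed regionserver likely died mid-write, leaving a partial
              // last record. Skip the remainder of this log and continue with the
              // next one instead of failing the whole shutdown processing.
              LOG.warn("EOF while splitting " + logfiles[i] + "; skipping remainder", e);
            } finally {
              in.close();
            }
          }
        }
      }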


            People

              Assignee: jimk Jim Kellerman
              Reporter: stack Michael Stack
