HBASE-1314

master sees HRS znode expire and splits log while the HRS is still running and accepting edits

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 0.20.0
    • Fix Version/s: 0.20.0
    • Component/s: None
    • Labels: None

    Description

      A ZooKeeper (ZK) session expiration problem. The HRS loses its ephemeral node while it is still up, running, and accepting edits. The master sees the znode go away and starts splitting the HRS's logs while edits are still being written to them. Afterwards, all reconstruction logs have to be removed manually from the region directories, or the regions will never deploy (CRC errors). I think that on HDFS the edits would be lost rather than corrupted. (I am using an HBase root on the local file system.)

      HRS ZK session expires, causing its znode to go away:

      2009-04-07 03:50:39,953 INFO org.apache.hadoop.hbase.master.ServerManager: localhost.localdomain_1239068648333_60020 znode expired
      2009-04-07 03:50:40,565 DEBUG org.apache.hadoop.hbase.master.HMaster: Processing todo: ProcessServerShutdown of localhost.localdomain_1239068648333_60020
      2009-04-07 03:50:40,637 INFO org.apache.hadoop.hbase.master.RegionServerOperation: process shutdown of server localhost.localdomain_1239068648333_60020: logSplit: false, rootRescanned: false, numberOfMetaRegions: 1, onlineMetaRegions.size(): 1

      But here we have the HRS still reporting in, triggering a LeaseStillHeldException:

      2009-04-07 03:50:40,826 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 2 on 60000, call regionServerReport(address: 127.0.0.1:60020, startcode: 1239068648333, load: (requests=14, regions=7, usedHeap=582, maxHeap=888), [Lorg.apache.hadoop.hbase.HMsg;@6da21389, [Lorg.apache.hadoop.hbase.HRegionInfo;@2bb0bf9a) from 127.0.0.1:39238: error: org.apache.hadoop.hbase.Leases$LeaseStillHeldException
      org.apache.hadoop.hbase.Leases$LeaseStillHeldException
      at org.apache.hadoop.hbase.master.ServerManager.regionServerReport(ServerManager.java:198)
      at org.apache.hadoop.hbase.master.HMaster.regionServerReport(HMaster.java:601)
      at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      at java.lang.reflect.Method.invoke(Method.java:597)
      at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
      at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:909)

      And log splitting starts anyway:

      2009-04-07 03:50:41,139 INFO org.apache.hadoop.hbase.regionserver.HLog: Splitting 3 log(s) in file:/data/hbase/log_localhost.localdomain_1239068648333_60020
      2009-04-07 03:50:41,139 DEBUG org.apache.hadoop.hbase.regionserver.HLog: Splitting 1 of 3: file:/data/hbase/log_localhost.localdomain_1239068648333_60020/hlog.dat.1239075060711
      [...]
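
The race above comes down to the HRS continuing to append edits after the master has already decided the server is dead. As a rough illustration only, the following sketch shows a write path that refuses edits once it has been told its session expired. It is not HBase code: the class `FencedEditLog` and its methods are hypothetical names, and the call to `onSessionExpired()` stands in for whatever session-expiration callback the server would receive from ZooKeeper.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical sketch (not HBase code): an edit log that stops
 * accepting writes once its coordination session is reported expired.
 */
public class FencedEditLog {
    // True until the (assumed) session-expiration callback fires.
    private boolean sessionAlive = true;
    private final List<String> edits = new ArrayList<>();

    /** Assumed to be invoked by the session watcher on expiration. */
    public synchronized void onSessionExpired() {
        sessionAlive = false;
    }

    /** Appends an edit only while the session is alive; returns false otherwise. */
    public synchronized boolean append(String edit) {
        if (!sessionAlive) {
            return false; // reject: the master may already be splitting this log
        }
        edits.add(edit);
        return true;
    }

    /** Number of edits accepted so far. */
    public synchronized int size() {
        return edits.size();
    }

    public static void main(String[] args) {
        FencedEditLog log = new FencedEditLog();
        System.out.println(log.append("row1")); // accepted while the session is alive
        log.onSessionExpired();
        System.out.println(log.append("row2")); // rejected after expiration
        System.out.println(log.size());
    }
}
```

A guard like this only helps once the server actually learns of the expiration. If the notification arrives late (for example after a long pause), edits can still land in a log that is being split, so it would narrow the window rather than close it.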

            People

              Assignee: Unassigned
              Reporter: Andrew Kyle Purtell (apurtell)