HBASE-846: HBase loses its mind when HDFS fills


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Invalid
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

    Description

      Looking in the log, I see:

      2008-08-26 18:57:23,602 INFO org.apache.hadoop.dfs.DFSClient: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /hbase/aa0-000-8.u.powerset.com/log_208.76.45.95_1218666613846_60020/hlog.dat.1219776799293 could only be replicated to 0 nodes, instead of 1
              at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1145)
              at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:300)
              at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
              at java.lang.reflect.Method.invoke(Unknown Source)
              at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:446)
              at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896)
      
              at org.apache.hadoop.ipc.Client.call(Client.java:557)
              at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:212)
              at org.apache.hadoop.dfs.$Proxy1.addBlock(Unknown Source)
              at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
              at java.lang.reflect.Method.invoke(Unknown Source)
              at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
              at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
              at org.apache.hadoop.dfs.$Proxy1.addBlock(Unknown Source)
              at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2335)
              at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2220)
              at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1700(DFSClient.java:1702)
              at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1842)
      

      ... and then:

      2008-08-26 18:57:28,423 WARN org.apache.hadoop.dfs.DFSClient: Error Recovery for block null bad datanode[0]
      2008-08-26 18:57:28,424 FATAL org.apache.hadoop.hbase.regionserver.HLog: Could not append. Requesting close of log
      java.io.IOException: Could not get block locations. Aborting... 
              at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2081)
              at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1300(DFSClient.java:1702)
              at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1818)
      2008-08-26 18:57:28,424 INFO org.apache.hadoop.hbase.regionserver.LogRoller: Rolling hlog. Number of entries: 127
      2008-08-26 18:57:28,424 ERROR org.apache.hadoop.hbase.regionserver.LogRoller: Log rolling failed
      java.io.IOException: Could not get block locations. Aborting...
              at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2081)
              at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1300(DFSClient.java:1702)
              at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1818)
      ...
      

      ... and so on.

      Meanwhile, clients trying to do updates are getting the error below:

      2008-08-26 22:49:42,834 INFO org.apache.hadoop.ipc.Server: IPC Server handler 9 on 60020, call batchUpdate([B@40e830f1, row => IKwQLMJ3rKRvtAv_ZkQlAk==, {column => page:url, value => '...', column => page:contents, value => '...'}) from 208.76.45.3:51164: error: java.io.IOException: Could not get block locations. Aborting... 
      java.io.IOException: Could not get block locations. Aborting...
              at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2081)
              at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1300(DFSClient.java:1702)
              at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1818)
      ..
      

      DFSClient seems horked; once this happens it never recovers on its own.

      HBase needs to be able to ride out this kind of event.

      As it stands, a restart is needed to recover.
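
      One way to ride it out (a rough sketch only, not what the code does today) would be for the region server to poll remaining DFS capacity and refuse updates before the DFSClient wedges. The class, threshold, and method names below are made up for illustration, and the sketch uses the newer org.apache.hadoop.fs API rather than the org.apache.hadoop.dfs classes shown in the logs:

      import java.io.IOException;
      import org.apache.hadoop.conf.Configuration;
      import org.apache.hadoop.fs.FileSystem;
      import org.apache.hadoop.fs.FsStatus;

      /** Hypothetical low-space guard; not part of HBase today. */
      public class DfsSpaceGuard {
        // Made-up floor: stop taking writes when less than 1GB remains cluster-wide.
        private static final long MIN_FREE_BYTES = 1L * 1024 * 1024 * 1024;
        private volatile boolean readOnly = false;

        /** Poll the filesystem and flip into read-only mode when DFS is nearly full. */
        void check(FileSystem fs) throws IOException {
          FsStatus status = fs.getStatus(); // cluster-wide capacity/used/remaining
          readOnly = status.getRemaining() < MIN_FREE_BYTES;
        }

        /** Call before servicing a batchUpdate; fail cleanly instead of wedging the HLog. */
        void ensureWritable() throws IOException {
          if (readOnly) {
            throw new IOException("HDFS nearly full; rejecting updates until space frees up");
          }
        }

        public static void main(String[] args) throws IOException {
          DfsSpaceGuard guard = new DfsSpaceGuard();
          guard.check(FileSystem.get(new Configuration()));
          guard.ensureWritable();
        }
      }

      With something like this in the update path, clients would get a clear "HDFS nearly full" error instead of the wedged "Could not get block locations" state above.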

      Test this by filling HDFS.
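
      A rough way to drive that test on a throwaway cluster (paths and sizes below are illustrative) is to keep writing junk files until a write fails with the replication error shown at the top:

      import java.io.IOException;
      import org.apache.hadoop.conf.Configuration;
      import org.apache.hadoop.fs.FSDataOutputStream;
      import org.apache.hadoop.fs.FileSystem;
      import org.apache.hadoop.fs.Path;

      /** Fills HDFS with throwaway files until a write fails; test clusters only. */
      public class FillHdfs {
        public static void main(String[] args) throws IOException {
          FileSystem fs = FileSystem.get(new Configuration());
          byte[] chunk = new byte[64 * 1024 * 1024]; // 64MB of zeros per write
          for (int i = 0; ; i++) {
            Path p = new Path("/tmp/fill/junk-" + i);
            try (FSDataOutputStream out = fs.create(p)) {
              for (int j = 0; j < 16; j++) { // roughly 1GB per file
                out.write(chunk);
              }
            } catch (IOException e) {
              System.out.println("HDFS full after " + i + " files: " + e.getMessage());
              return; // now run client updates against HBase and watch the HLog behaviour
            }
          }
        }
      }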

      Attachments

        Issue Links

        Activity


          People

            Assignee: Unassigned
            Reporter: Michael Stack (stack)
            Votes: 0
            Watchers: 2

            Dates

              Created:
              Updated:
              Resolved:
