HBase
  1. HBase
  2. HBASE-4202

Check filesystem permissions on startup

    Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Trivial Trivial
    • Resolution: Won't Fix
    • Affects Version/s: 0.20.4
    • Fix Version/s: None
    • Component/s: regionserver
    • Labels:
    • Environment:

      debian squeeze

      Description

      We added a new node to a 44 node cluster starting the datanode, mapred and regionserver processes on it. The Unix filesystem was configured incorrectly, i.e. /tmp was not writable to processes. All three processes had issues with this. Datanode and mapred shutdown on exception.
      Regionserver did not stop, in fact reported to master that its up without regions. So master assigned regions to it. Regionserver would not accept them, resulting in a constant assign, reject, reassign cycle, that put many regions into a state of not being available. There are no logs about this, but we could observer the regioncount fluctuate by hundredths of regions and the application throwing many NotServingRegion exceptions.

      In fact to the master process the regionserver looked fine, so it was trying to send regions its way. Regionserver rejected them. So the master/balancer was going into a assign/reassign cycle destabilizing the cluster. Many puts and gets simply failed with NotServingRegionExceptions and took a long time to complete.

      Exception from regionserver:
      2011-08-06 23:57:13,953 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Got ZooKeeper event, state: SyncConnected, type: NodeCreated, path: /hbase/master
      2011-08-06 23:57:13,957 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Telling master at 17.1.0.1:60000 that we are up
      2011-08-06 23:57:13,957 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Telling master at 17.1.0.1:60000 that we are up
      2011-08-07 00:07:39.648::INFO: Logging to STDERR via org.mortbay.log.StdErrLog
      2011-08-07 00:07:39.712::INFO: jetty-6.1.14
      2011-08-07 00:07:39.742::WARN: tmpdir
      java.io.IOException: Permission denied
      at java.io.UnixFileSystem.createFileExclusively(Native Method)
      at java.io.File.checkAndCreate(File.java:1704)
      at java.io.File.createTempFile(File.java:1792)
      at java.io.File.createTempFile(File.java:1828)
      at org.mortbay.jetty.webapp.WebAppContext.getTempDirectory(WebAppContext.java:745)
      at org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:458)
      at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
      at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152)
      at org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:156)
      at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
      at org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:130)
      at org.mortbay.jetty.Server.doStart(Server.java:222)
      at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
      at org.apache.hadoop.http.HttpServer.start(HttpServer.java:461)
      at org.apache.hadoop.hbase.regionserver.HRegionServer.startServiceThreads(HRegionServer.java:1168)
      at org.apache.hadoop.hbase.regionserver.HRegionServer.init(HRegionServer.java:792)
      at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:430)
      at java.lang.Thread.run(Thread.java:619)

      Exception from datanode:
      2011-08-06 23:37:20,444 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50075
      2011-08-06 23:37:20,444 INFO org.mortbay.log: jetty-6.1.14
      2011-08-06 23:37:20,469 WARN org.mortbay.log: tmpdir
      java.io.IOException: Permission denied
      at java.io.UnixFileSystem.createFileExclusively(Native Method)
      at java.io.File.checkAndCreate(File.java:1704)
      at java.io.File.createTempFile(File.java:1792)
      at java.io.File.createTempFile(File.java:1828)
      at org.mortbay.jetty.webapp.WebAppContext.getTempDirectory(WebAppContext.java:745)
      at org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:458)
      at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
      at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152)
      at org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:156)
      at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
      at org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:130)
      at org.mortbay.jetty.Server.doStart(Server.java:222)
      at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
      at org.apache.hadoop.http.HttpServer.start(HttpServer.java:463)
      at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:384)
      at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:225)
      at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1309)
      at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1264)
      at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1272)
      at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1394)
      2011-08-06 23:37:20,471 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
      /************************************************************
      SHUTDOWN_MSG: Shutting down DataNode at hdp1122/17.1.0.22
      ************************************************************/

      Exception from tasktracker:
      2011-08-06 23:33:50,380 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50060
      2011-08-06 23:33:50,380 INFO org.mortbay.log: jetty-6.1.14
      2011-08-06 23:33:50,415 WARN org.mortbay.log: tmpdir
      java.io.IOException: Permission denied
      at java.io.UnixFileSystem.createFileExclusively(Native Method)
      at java.io.File.checkAndCreate(File.java:1704)
      at java.io.File.createTempFile(File.java:1792)
      at java.io.File.createTempFile(File.java:1828)
      at org.mortbay.jetty.webapp.WebAppContext.getTempDirectory(WebAppContext.java:745)
      at org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:458)
      at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
      at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152)
      at org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:156)
      at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
      at org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:130)
      at org.mortbay.jetty.Server.doStart(Server.java:222)
      at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
      at org.apache.hadoop.http.HttpServer.start(HttpServer.java:463)
      at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:935)
      at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:2837)
      2011-08-06 23:33:50,416 INFO org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:
      /************************************************************
      SHUTDOWN_MSG: Shutting down TaskTracker at hdp1122/17.1.0.22
      ************************************************************/

        Activity

        Hide
        stack added a comment -

        Marking as noob. Would be sweet if someone tried this on 0.90.x hbase and if still an issue, make a change so we exit rather than do the pathological dance Matthias describes above.

        Show
        stack added a comment - Marking as noob. Would be sweet if someone tried this on 0.90.x hbase and if still an issue, make a change so we exit rather than do the pathological dance Matthias describes above.
        Hide
        ramkrishna.s.vasudevan added a comment -

        @Stack,
        I think 0.90.x behaves correctly. I checked the behaviour. Below are the logs

        2011-08-18 10:57:49,345 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
        2011-08-18 10:57:49,475 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Failed initialization
        2011-08-18 10:57:49,479 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: Failed init
        java.lang.IllegalArgumentException: Bad temp directory: /tmp/hadoop-test666/Jetty/regionserver
        	at org.mortbay.jetty.webapp.WebAppContext.setTempDirectory(WebAppContext.java:1201)
        	at org.apache.hadoop.http.HttpServer.<init>(HttpServer.java:128)
        	at org.apache.hadoop.hbase.util.InfoServer.<init>(InfoServer.java:54)
        	at org.apache.hadoop.hbase.regionserver.HRegionServer.startServiceThreads(HRegionServer.java:1262)
        	at org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:880)
        	at org.apache.hadoop.hbase.regionserver.HRegionServer.tryReportForDuty(HRegionServer.java:1481)
        	at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:571)
        	at java.lang.Thread.run(Thread.java:619)
        2011-08-18 10:57:49,488 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server serverName=linux-kxjl,60020,1313645267001, load=(requests=0, regions=0, usedHeap=22, maxHeap=995): Unhandled exception: Region server startup failed
        java.io.IOException: Region server startup failed
        	at org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:987)
        	at org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:889)
        	at org.apache.hadoop.hbase.regionserver.HRegionServer.tryReportForDuty(HRegionServer.java:1481)
        	at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:571)
        	at java.lang.Thread.run(Thread.java:619)
        Caused by: java.lang.IllegalArgumentException: Bad temp directory: /tmp/hadoop-test666/Jetty/regionserver
        	at org.mortbay.jetty.webapp.WebAppContext.setTempDirectory(WebAppContext.java:1201)
        	at org.apache.hadoop.http.HttpServer.<init>(HttpServer.java:128)
        	at org.apache.hadoop.hbase.util.InfoServer.<init>(InfoServer.java:54)
        	at org.apache.hadoop.hbase.regionserver.HRegionServer.startServiceThreads(HRegionServer.java:1262)
        	at org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:880)
        	... 3 more
        -----------
        2011-08-18 10:57:49,770 INFO org.apache.hadoop.hbase.regionserver.ShutdownHook: Shutdown hook starting; hbase.shutdown.hook=true; fsShutdownHook=Thread[Thread-14,5,main]
        2011-08-18 10:57:49,771 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Shutdown hook
        2011-08-18 10:57:49,771 INFO org.apache.hadoop.hbase.regionserver.ShutdownHook: Starting fs shutdown hook thread.
        2011-08-18 10:57:49,874 INFO org.apache.hadoop.hbase.regionserver.ShutdownHook: Shutdown hook finished.
        

        This defect may not be valid in 0.90.x version.

        Show
        ramkrishna.s.vasudevan added a comment - @Stack, I think 0.90.x behaves correctly. I checked the behaviour. Below are the logs 2011-08-18 10:57:49,345 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog 2011-08-18 10:57:49,475 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Failed initialization 2011-08-18 10:57:49,479 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: Failed init java.lang.IllegalArgumentException: Bad temp directory: /tmp/hadoop-test666/Jetty/regionserver at org.mortbay.jetty.webapp.WebAppContext.setTempDirectory(WebAppContext.java:1201) at org.apache.hadoop.http.HttpServer.<init>(HttpServer.java:128) at org.apache.hadoop.hbase.util.InfoServer.<init>(InfoServer.java:54) at org.apache.hadoop.hbase.regionserver.HRegionServer.startServiceThreads(HRegionServer.java:1262) at org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:880) at org.apache.hadoop.hbase.regionserver.HRegionServer.tryReportForDuty(HRegionServer.java:1481) at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:571) at java.lang.Thread.run(Thread.java:619) 2011-08-18 10:57:49,488 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server serverName=linux-kxjl,60020,1313645267001, load=(requests=0, regions=0, usedHeap=22, maxHeap=995): Unhandled exception: Region server startup failed java.io.IOException: Region server startup failed at org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:987) at org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:889) at org.apache.hadoop.hbase.regionserver.HRegionServer.tryReportForDuty(HRegionServer.java:1481) at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:571) at java.lang.Thread.run(Thread.java:619) Caused by: java.lang.IllegalArgumentException: Bad temp directory: /tmp/hadoop-test666/Jetty/regionserver at org.mortbay.jetty.webapp.WebAppContext.setTempDirectory(WebAppContext.java:1201) at org.apache.hadoop.http.HttpServer.<init>(HttpServer.java:128) at org.apache.hadoop.hbase.util.InfoServer.<init>(InfoServer.java:54) at org.apache.hadoop.hbase.regionserver.HRegionServer.startServiceThreads(HRegionServer.java:1262) at org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:880) ... 3 more ----------- 2011-08-18 10:57:49,770 INFO org.apache.hadoop.hbase.regionserver.ShutdownHook: Shutdown hook starting; hbase.shutdown.hook=true; fsShutdownHook=Thread[Thread-14,5,main] 2011-08-18 10:57:49,771 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Shutdown hook 2011-08-18 10:57:49,771 INFO org.apache.hadoop.hbase.regionserver.ShutdownHook: Starting fs shutdown hook thread. 2011-08-18 10:57:49,874 INFO org.apache.hadoop.hbase.regionserver.ShutdownHook: Shutdown hook finished. This defect may not be valid in 0.90.x version.
        Hide
        stack added a comment -

        Thanks for doing the check Ram.

        I'm marking this a 'trivial' bug since it seems it only an issue in old 0.20.x hbase fixed (it looks like) in 0.90.x

        Show
        stack added a comment - Thanks for doing the check Ram. I'm marking this a 'trivial' bug since it seems it only an issue in old 0.20.x hbase fixed (it looks like) in 0.90.x
        Hide
        Lars Hofhansl added a comment -

        @Ram: are you still doing this, or should we close?

        Show
        Lars Hofhansl added a comment - @Ram: are you still doing this, or should we close?
        Hide
        ramkrishna.s.vasudevan added a comment -

        @Lars
        Am not doing this and may not take up any time sooner i feel.. We can close this Lars.

        Show
        ramkrishna.s.vasudevan added a comment - @Lars Am not doing this and may not take up any time sooner i feel.. We can close this Lars.
        Hide
        Lars Hofhansl added a comment -

        Alright. Can always reopen.
        (We're doing some jira house cleaning at Salesforce)

        Show
        Lars Hofhansl added a comment - Alright. Can always reopen. (We're doing some jira house cleaning at Salesforce)

          People

          • Assignee:
            ramkrishna.s.vasudevan
            Reporter:
            Matthias Hofschen
          • Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development