Accumulo / ACCUMULO-181

Monitor status page says that "Name Node is Down"

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Trivial
    • Resolution: Won't Fix
    • Affects Version/s: 1.3.5, 1.4.0
    • Fix Version/s: None
    • Component/s: monitor
    • Labels:
      None
    • Environment:

      CentOS release 5.6
      single node Accumulo setup

      Description

      The Accumulo monitor/status page (http://localhost:50095) says that the "Name Node is Down" (on a red background), but it displays the correct capacity, % used, and # of corrupt blocks. The NameNode is up, as confirmed by http://localhost:50070/dfshealth.jsp

        Activity

        Minh Duc Nguyen created issue -
        Eric Newton added a comment -

        Are you seeing any errors in the monitor log files? Can you attach a little screenshot of the overview page?

        Minh Duc Nguyen added a comment -

        I've attached a screenshot.

        In the monitor.log file:

        java.lang.RuntimeException:
        org.apache.accumulo.core.client.impl.ThriftScanner$ScanTimedOutException
        at
        org.apache.accumulo.core.client.impl.ScannerIterator.hasNext(ScannerIterator.java:161)
        at
        org.apache.accumulo.server.problems.ProblemReports$3.hasNext(ProblemReports.java:239)
        at
        org.apache.accumulo.server.problems.ProblemReports.summarize(ProblemReports.java:297)
        at
        org.apache.accumulo.server.monitor.Monitor.fetchData(Monitor.java:323)
        at
        org.apache.accumulo.server.monitor.Monitor$3.run(Monitor.java:452)
        at
        org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
        at java.lang.Thread.run(Thread.java:662)
        Caused by:
        org.apache.accumulo.core.client.impl.ThriftScanner$ScanTimedOutException
        at
        org.apache.accumulo.core.client.impl.ThriftScanner.scan(ThriftScanner.java:238)
        at
        org.apache.accumulo.core.client.impl.ScannerIterator$Reader.run(ScannerIterator.java:73)
        at
        org.apache.accumulo.core.client.impl.ScannerIterator.hasNext(ScannerIterator.java:151)
        ... 6 more
        23 16:00:44,401 [net.SocketNode] INFO : Caught java.io.EOFException
        closing conneciton.

        In the monitor.debug.log file:

        23 15:59:42,203 [servlets.DefaultServlet] DEBUG:
        org.apache.hadoop.security.AccessControlException:
        org.apache.hadoop.security.AccessControlException: Permission denied:
        user=pegasus, access=READ_EXECUTE,
        inode="/var/tmp/hdfs/.staging":hdfs:supergroup:drwx------
        org.apache.hadoop.security.AccessControlException:
        org.apache.hadoop.security.AccessControlException: Permission denied:
        user=pegasus, access=READ_EXECUTE,
        inode="/var/tmp/hdfs/.staging":hdfs:supergroup:drwx------
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at
        sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
        at
        sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
        at
        org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:95)
        at
        org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:57)
        at
        org.apache.hadoop.hdfs.DFSClient.getContentSummary(DFSClient.java:1014)
        at
        org.apache.hadoop.hdfs.DistributedFileSystem.getContentSummary(DistributedFileSystem.java:261)
        at
        org.apache.accumulo.server.monitor.servlets.DefaultServlet.doAccumuloTable(DefaultServlet.java:253)
        at
        org.apache.accumulo.server.monitor.servlets.DefaultServlet.pageBody(DefaultServlet.java:193)
        at
        org.apache.accumulo.server.monitor.servlets.BasicServlet.doGet(BasicServlet.java:50)
        at
        org.apache.accumulo.server.monitor.servlets.DefaultServlet.doGet(DefaultServlet.java:136)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
        at
        org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
        at
        org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:401)
        at
        org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
        at org.mortbay.jetty.Server.handle(Server.java:326)
        at
        org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
        at
        org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
        at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
        at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
        at
        org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
        at
        org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
        Caused by: org.apache.hadoop.ipc.RemoteException:
        org.apache.hadoop.security.AccessControlException: Permission denied:
        user=pegasus, access=READ_EXECUTE,
        inode="/var/tmp/hdfs/.staging":hdfs:supergroup:drwx------
        at
        org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:203)
        at
        org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkSubAccess(FSPermissionChecker.java:172)
        at
        org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:141)
        at
        org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5073)
        at
        org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getContentSummary(FSNamesystem.java:2002)
        at
        org.apache.hadoop.hdfs.server.namenode.NameNode.getContentSummary(NameNode.java:893)
        at sun.reflect.GeneratedMethodAccessor32.invoke(Unknown Source)
        at
        sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:557)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1434)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1430)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at
        org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1428)

        at org.apache.hadoop.ipc.Client.call(Client.java:1107)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:226)
        at $Proxy0.getContentSummary(Unknown Source)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at
        sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at
        sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at
        org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
        at
        org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
        at $Proxy0.getContentSummary(Unknown Source)
        at
        org.apache.hadoop.hdfs.DFSClient.getContentSummary(DFSClient.java:1012)
        ... 18 more

        23 15:59:42,510 [impl.ScannerIterator] DEBUG:
        org.apache.accumulo.core.client.impl.ThriftScanner$ScanTimedOutException
        org.apache.accumulo.core.client.impl.ThriftScanner$ScanTimedOutException
        at
        org.apache.accumulo.core.client.impl.ThriftScanner.scan(ThriftScanner.java:238)
        at
        org.apache.accumulo.core.client.impl.ScannerIterator$Reader.run(ScannerIterator.java:73)
        at
        org.apache.accumulo.core.client.impl.ScannerIterator.hasNext(ScannerIterator.java:151)
        at
        org.apache.accumulo.server.problems.ProblemReports$3.hasNext(ProblemReports.java:239)
        at
        org.apache.accumulo.server.problems.ProblemReports.summarize(ProblemReports.java:297)
        at
        org.apache.accumulo.server.monitor.Monitor.fetchData(Monitor.java:323)
        at
        org.apache.accumulo.server.monitor.Monitor$3.run(Monitor.java:452)
        at
        org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
        at java.lang.Thread.run(Thread.java:662)
        23 15:59:42,511 [monitor.Monitor] INFO : Failed to obtain problem reports
        java.lang.RuntimeException:
        org.apache.accumulo.core.client.impl.ThriftScanner$ScanTimedOutException
        at
        org.apache.accumulo.core.client.impl.ScannerIterator.hasNext(ScannerIterator.java:161)
        at
        org.apache.accumulo.server.problems.ProblemReports$3.hasNext(ProblemReports.java:239)
        at
        org.apache.accumulo.server.problems.ProblemReports.summarize(ProblemReports.java:297)
        at
        org.apache.accumulo.server.monitor.Monitor.fetchData(Monitor.java:323)
        at
        org.apache.accumulo.server.monitor.Monitor$3.run(Monitor.java:452)
        at
        org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
        at java.lang.Thread.run(Thread.java:662)
        Caused by:
        org.apache.accumulo.core.client.impl.ThriftScanner$ScanTimedOutException
        at
        org.apache.accumulo.core.client.impl.ThriftScanner.scan(ThriftScanner.java:238)
        at
        org.apache.accumulo.core.client.impl.ScannerIterator$Reader.run(ScannerIterator.java:73)
        at
        org.apache.accumulo.core.client.impl.ScannerIterator.hasNext(ScannerIterator.java:151)
        ... 6 more
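The traces above show a RuntimeException wrapping a ScanTimedOutException ("Caused by:"). When reading logs like these, the innermost cause is usually the interesting part. A minimal sketch of walking the cause chain follows; the class name `RootCause` and the stand-in exceptions are hypothetical and are not part of Accumulo.

```java
public class RootCause {
    /** Follows getCause() links until the innermost throwable is reached. */
    public static Throwable rootCause(Throwable t) {
        Throwable current = t;
        // Stop when there is no further cause (or a self-referencing cause).
        while (current.getCause() != null && current.getCause() != current) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        // Stand-ins for the wrapped exception pair seen in monitor.log.
        Exception inner = new IllegalStateException("scan timed out");
        Exception outer = new RuntimeException(inner);
        System.out.println(rootCause(outer).getMessage());
    }
}
```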

        On Wed, Nov 23, 2011 at 3:57 PM, Eric Newton (Commented) (JIRA) <

        Minh Duc Nguyen made changes -
        Field Original Value New Value
        Attachment NameNodeIsDown.png [ 12504936 ]
        Minh Duc Nguyen added a comment -

        I fixed the permissions problems that were causing the AccessControlException, and I'm no longer seeing the ScanTimedOutException. In fact, there are no errors or exceptions in the logs. The monitor page displays the correct NameNode and JobTracker info, yet it still shows the "Name Node is Down" message.
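For readers wondering why the AccessControlException occurred: the log reports user=pegasus requesting READ_EXECUTE on a directory owned by hdfs:supergroup with mode drwx------. Under POSIX-style permission checking, which HDFS follows, only the owner bits grant access here. The sketch below illustrates that rule in plain Java; it is not HDFS code, the class and method names are hypothetical, it assumes pegasus is not a member of supergroup, and it ignores the HDFS superuser bypass.

```java
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Collections;
import java.util.Set;

public class StagingPermissionCheck {
    /**
     * Returns true if the user may read and traverse (r-x) a directory.
     * The mode is the nine-character string without the leading 'd'.
     */
    public static boolean canReadExecute(String user, Set<String> userGroups,
                                         String owner, String group, String mode) {
        Set<PosixFilePermission> perms = PosixFilePermissions.fromString(mode);
        if (user.equals(owner)) {
            // Owner class: only the owner bits apply.
            return perms.contains(PosixFilePermission.OWNER_READ)
                && perms.contains(PosixFilePermission.OWNER_EXECUTE);
        }
        if (userGroups.contains(group)) {
            // Group class: only the group bits apply.
            return perms.contains(PosixFilePermission.GROUP_READ)
                && perms.contains(PosixFilePermission.GROUP_EXECUTE);
        }
        // Everyone else: only the "other" bits apply.
        return perms.contains(PosixFilePermission.OTHERS_READ)
            && perms.contains(PosixFilePermission.OTHERS_EXECUTE);
    }

    public static void main(String[] args) {
        Set<String> groups = Collections.singleton("pegasus"); // assumed group membership
        // Mode from the log: drwx------ owned by hdfs:supergroup.
        System.out.println(canReadExecute("pegasus", groups, "hdfs", "supergroup", "rwx------")); // false
        // A more permissive mode, shown only for contrast.
        System.out.println(canReadExecute("pegasus", groups, "hdfs", "supergroup", "rwxr-xr-x")); // true
    }
}
```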

        Eric Newton added a comment -

        You can add the appropriate permissions to the grant section of conf/monitor.security.policy, or just give up and add:

        permission java.security.AllPermission;
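How the monitor actually decides that the NameNode is down is not shown in this issue. As a purely hypothetical illustration of why a restrictive security policy could produce a false "down" report, the sketch below probes a host and port and keeps "connection refused or timed out" separate from "connection denied by the security policy". The class `NameNodeProbe`, its `Status` enum, and the choice of port 50070 (the NameNode web UI from the description) are assumptions, not Accumulo code.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class NameNodeProbe {
    public enum Status { UP, DOWN, DENIED }

    /** Tries a TCP connection and reports why it failed, if it did. */
    public static Status probe(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return Status.UP;
        } catch (SecurityException e) {
            // Thrown when a security manager's policy forbids the connection.
            // A probe that lumped this in with IOException would report "down".
            return Status.DENIED;
        } catch (IOException e) {
            // Connection refused, timeout, unresolved host, and similar.
            return Status.DOWN;
        }
    }

    public static void main(String[] args) {
        System.out.println(probe("localhost", 50070, 1000));
    }
}
```

A check written this way would make a policy problem visible as DENIED rather than as a service outage.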
        
        Eric Newton made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Resolution Won't Fix [ 2 ]
        Gavin made changes -
        Workflow no-reopen-closed, patch-avail [ 12643247 ] patch-available, re-open possible [ 12671613 ]

          People

          • Assignee:
            Eric Newton
            Reporter:
            Minh Duc Nguyen
          • Votes:
            0
            Watchers:
            1
