IMPALA-8972

Impala daemon crashing frequently

    Details

    • Type: Question
    • Status: Resolved
    • Priority: Major
    • Resolution: Information Provided
    • Affects Version/s: Impala 2.8.0
    • Fix Version/s: Not Applicable
    • Component/s: Infrastructure
    • Labels:
      None
    • Environment:
      Impala version 2.8.0-cdh5-INTERNAL RELEASE (build )
    • Flags:
      Important

      Description

      Hi Team,

      The Impala daemon is crashing frequently and has to be restarted each time.

      Please help us troubleshoot this.

      I see the error messages below in the daemon logs:

      1.

       

      Java exception follows:
      org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): No lease on /user/hive/warehouse/steelwedge_psnokiadmt_prod.db/ppd_ro_im_dpart_bkp/_impala_insert_staging/e847d4231bb8c531_c166c98d00000000/.e847d4231bb8c531-c166c98d00000002_664317806_dir/e847d4231bb8c531-c166c98d00000002_844965293_data.0.parq (inode 17854099): File does not exist. Holder DFSClient_NONMAPREDUCE_-924590406_1 does not have any open files.
      at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:3635)
      at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.analyzeFileState(FSNamesystem.java:3438)
      at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3294)
      at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:679)
      at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:214)
      at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:489)
      at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
      at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
      at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
      at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
      at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
      at java.security.AccessController.doPrivileged(Native Method)
      at javax.security.auth.Subject.doAs(Subject.java:415)
      at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
      at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)

      at org.apache.hadoop.ipc.Client.call(Client.java:1472)
      at org.apache.hadoop.ipc.Client.call(Client.java:1409)
      at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
      at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
      at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:409)
      at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      at java.lang.reflect.Method.invoke(Method.java:606)
      at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
      at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
      at com.sun.proxy.$Proxy10.addBlock(Unknown Source)
      at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1739)
      at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1535)
      at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:689)
      Wrote minidump to /var/log/impala/minidumps/impalad/6352d57e-7493-b4db-27e7f36f-518eec8e.dmp
      #
      # A fatal error has been detected by the Java Runtime Environment:
      #
      # SIGSEGV (0xb) at pc=0x00007f42744b72fc, pid=4881, tid=139899197028096
      #
      # JRE version: Java(TM) SE Runtime Environment (7.0_80-b15) (build 1.7.0_80-b15)
      # Java VM: Java HotSpot(TM) 64-Bit Server VM (24.80-b11 mixed mode linux-amd64 compressed oops)
      # Problematic frame:
      # C [libkudu_client.so.0+0x27d2fc] void std::_Rb_tree<std::string, std::pair<std::string const, std::string>, std::_Select1st<std::pair<std::string const, std::string> >, std::less<std::string>, std::allocator<std::pair<std::string const, std::string> > >::_M_insert_unique<std::_Rb_tree_iterator<std::pair<std::string const, std::string> > >(std::_Rb_tree_iterator<std::pair<std::string const, std::string> >, std::_Rb_tree_iterator<std::pair<std::string const, std::string> >)+0x2381c

       

      2.

       

      W0917 22:35:49.505252  1265 BlockReaderFactory.java:778] I/O error constructing remote block reader.
      Java exception follows:
      java.io.IOException: Got error for OP_READ_BLOCK, status=ERROR, self=/10.111.92.61:46531, remote=/10.111.92.61:50010, for file /user/hive/warehouse/steelwedge_psnokiadmt_prod.db/graph_ppt_in_list/job_id=149042/ea46e454a6357b08-2622f21800000002_731828525_data.0., for pool BP-1380753826-10.128.50.16-1462783635263 block 1081077775_7337065
      at org.apache.hadoop.hdfs.RemoteBlockReader2.checkSuccess(RemoteBlockReader2.java:467)
      at org.apache.hadoop.hdfs.RemoteBlockReader2.newBlockReader(RemoteBlockReader2.java:432)
      at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReader(BlockReaderFactory.java:881)
      at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:759)
      at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:376)
      at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:662)
      at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:889)
      at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:965)
      at org.apache.hadoop.fs.FSDataInputStream.read(FSDataInputStream.java:147)
      W0917 22:35:49.506021  1265 DFSInputStream.java:699] Failed to connect to /10.111.92.61:50010 for block, add to deadNodes and continue.
      java.io.IOException: Got error for OP_READ_BLOCK, status=ERROR, self=/10.111.92.61:46531, remote=/10.111.92.61:50010, for file /user/hive/warehouse/steelwedge_psnokiadmt_prod.db/graph_ppt_in_list/job_id=149042/ea46e454a6357b08-2622f21800000002_731828525_data.0., for pool BP-1380753826-10.128.50.16-1462783635263 block 1081077775_7337065
      Java exception follows:
      java.io.IOException: Got Aborting Impala for OP_READ_BLOCK, status=ERROR, self=/10.111.92.61:46531, remote=/10.111.92.61:50010, for file /user/hive/warehouse/steelwedge_psnokiadmt_prod.db/graph_ppt_in_list/job_id=149042/ea46e454a6357b08-2622f21800000002_731828525_data.0., for pool BP-1380753826-10.128.50.16-1462783635263 block 1081077775_7337065
      at org.apache.hadoop.hdfs.RemoteBlockReader2.checkSuccess(RemoteBlockReader2.java:467)
      at org.apache.hadoop.hdfs.RemoteBlockReader2.newBlockReader(RemoteBlockReader2.java:432)
      at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReader(BlockReaderFactory.java:881)
      at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:759)
      at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:376)
      at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:662)
      at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:889)
      at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:965)
      at org.apache.hadoop.fs.FSDataInputStream.read(FSDataInputStream.java:147)

            People

            • Assignee:
              Unassigned
              Reporter:
              ashok@524 Ashok