ZooKeeper / ZOOKEEPER-4247

NPE while processing message from restarted quorum member

Details

    Description

      Problem:

      During a Kubernetes cluster upgrade, the containers running ZooKeeper are restarted one by one (a rolling restart) while they are serving clients.
      During this rolling restart, the NullPointerException shown below was observed.
      We still see the problem after updating to ZooKeeper 3.6.2, the latest version at the time.
      It also happens on a fresh install, and has occurred every time.

       

      Stack-trace:

      <from zk-pod-0-log>

      2021-02-08T12:42:08.229+0000 [myid:] - ERROR [nioEventLoopGroup-4-1:NettyServerCnxnFactory$CnxnChannelHandler@329] - Unexpected exception in receive
       java.lang.NullPointerException: null
               at org.apache.zookeeper.server.NettyServerCnxn.receiveMessage(NettyServerCnxn.java:518) ~[zookeeper-3.6.2.jar:3.6.2]
               at org.apache.zookeeper.server.NettyServerCnxn.processMessage(NettyServerCnxn.java:368) ~[zookeeper-3.6.2.jar:3.6.2]
               at org.apache.zookeeper.server.NettyServerCnxnFactory$CnxnChannelHandler.channelRead(NettyServerCnxnFactory.java:326) [zookeeper-3.6.2.jar:3.6.2]
               at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.50.Final.jar:4.1.50.Final]
               at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.50.Final.jar:4.1.50.Final]
               at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.50.Final.jar:4.1.50.Final]
               at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) [netty-transport-4.1.50.Final.jar:4.1.50.Final]
               at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.50.Final.jar:4.1.50.Final]
               at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.50.Final.jar:4.1.50.Final]
               at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) [netty-transport-4.1.50.Final.jar:4.1.50.Final]
               at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163) [netty-transport-4.1.50.Final.jar:4.1.50.Final]
               at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:714) [netty-transport-4.1.50.Final.jar:4.1.50.Final]
               at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650) [netty-transport-4.1.50.Final.jar:4.1.50.Final]
               at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576) [netty-transport-4.1.50.Final.jar:4.1.50.Final]
               at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493) [netty-transport-4.1.50.Final.jar:4.1.50.Final]
               at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) [netty-common-4.1.50.Final.jar:4.1.50.Final]
               at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.50.Final.jar:4.1.50.Final]
               at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-common-4.1.50.Final.jar:4.1.50.Final]
               at java.lang.Thread.run(Thread.java:834) [?:?]
      

       

       

      Expectation:

      ZooKeeper should handle this scenario and should not log a NullPointerException when a peer member goes down as part of the upgrade procedure.

      We kindly ask the Apache ZooKeeper team to fix this issue.
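
      To illustrate the expectation above, here is a minimal sketch of one common defensive pattern: read the shared server reference into a local variable once, and reject the message with a descriptive IOException when no running server is available. This is a hypothetical, self-contained example and not ZooKeeper's actual code. The names `GuardedReceive`, `Server`, `Connection` and `receiveMessage` are invented for illustration, and it assumes (without confirmation from the trace) that the NPE comes from a server reference that is null while a member restarts.

```java
import java.io.IOException;

public class GuardedReceive {

    /** Hypothetical stand-in for a server that may be absent during a restart. */
    interface Server {
        boolean isRunning();
        String process(String message);
    }

    /** Hypothetical connection whose server reference another thread may clear. */
    static class Connection {
        private volatile Server server;

        Connection(Server server) {
            this.server = server;
        }

        void setServer(Server server) {
            this.server = server;
        }

        String receiveMessage(String message) throws IOException {
            // Read the volatile field once so the check and the use see the same value.
            Server current = this.server;
            if (current == null || !current.isRunning()) {
                // Fail with an expected, descriptive exception instead of an NPE.
                throw new IOException("server is not running, dropping message");
            }
            return current.process(message);
        }
    }
}
```

      With this shape, a caller that catches IOException can close the connection and log a single line, rather than printing a full NullPointerException stack trace for what is a normal event during a rolling restart.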

            People

              symat Mate Szalay-Beko
              TheDevarshiShah Devarshi Shah

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 1h 50m