Uploaded image for project: 'ZooKeeper'
  1. ZooKeeper
  2. ZOOKEEPER-4247

NPE while processing message from restarted quorum member

VotersWatch issueWatchersCreate sub-taskLinkCloneUpdate Comment AuthorReplace String in CommentUpdate Comment VisibilityDelete Comments
    XMLWordPrintableJSON

    Details

      Description

      Problem:

      While upgrading K8S cluster, container running Zookeeper (during serving it's client) will rollover one by one.
      During this rollover, Null Pointer Exception was observed as below.
      After updating to the latest Zookeeper 3.6.2 we still see the problem.
      This is happening on a fresh install (and has all the time).

       

      Stack-trace:

      <from zk-pod-0-log>

      2021-02-08T12:42:08.229+0000 [myid:] - ERROR [nioEventLoopGroup-4-1:NettyServerCnxnFactory$CnxnChannelHandler@329] - Unexpected exception in receive
       java.lang.NullPointerException: null
               at org.apache.zookeeper.server.NettyServerCnxn.receiveMessage(NettyServerCnxn.java:518) ~[zookeeper-3.6.2.jar:3.6.2]
               at org.apache.zookeeper.server.NettyServerCnxn.processMessage(NettyServerCnxn.java:368) ~[zookeeper-3.6.2.jar:3.6.2]
               at org.apache.zookeeper.server.NettyServerCnxnFactory$CnxnChannelHandler.channelRead(NettyServerCnxnFactory.java:326) [zookeeper-3.6.2.jar:3.6.2]
               at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.50.Final.jar:4.1.50.Final]
               at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.50.Final.jar:4.1.50.Final]
               at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.50.Final.jar:4.1.50.Final]
               at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) [netty-transport-4.1.50.Final.jar:4.1.50.Final]
               at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.50.Final.jar:4.1.50.Final]
               at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.50.Final.jar:4.1.50.Final]
               at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) [netty-transport-4.1.50.Final.jar:4.1.50.Final]
               at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163) [netty-transport-4.1.50.Final.jar:4.1.50.Final]
               at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:714) [netty-transport-4.1.50.Final.jar:4.1.50.Final]
               at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650) [netty-transport-4.1.50.Final.jar:4.1.50.Final]
               at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576) [netty-transport-4.1.50.Final.jar:4.1.50.Final]
               at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493) [netty-transport-4.1.50.Final.jar:4.1.50.Final]
               at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) [netty-common-4.1.50.Final.jar:4.1.50.Final]
               at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.50.Final.jar:4.1.50.Final]
               at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-common-4.1.50.Final.jar:4.1.50.Final]
               at java.lang.Thread.run(Thread.java:834) [?:?]
      

       

       

      Expectation:

      This scenario should be handled and Zookeeper should not print Null Pointer Exception in logs when peer member goes down as a part of the upgrade procedure. 

      We are kindly requesting Apache Zookeeper team to fix this issue.

        Attachments

          Activity

            People

            • Assignee:
              symat Mate Szalay-Beko
              Reporter:
              TheDevarshiShah Devarshi Shah

              Dates

              • Created:
                Updated:
                Resolved:

                Time Tracking

                Estimated:
                Original Estimate - Not Specified
                Not Specified
                Remaining:
                Remaining Estimate - 0h
                0h
                Logged:
                Time Spent - 1h 50m
                1h 50m

                  Issue deployment