  1. Apache RocketMQ
  2. ROCKETMQ-101

Possible NullPointerException when retrying an asynchronous send

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.1.0-incubating
    • Component/s: rocketmq-client
    • Labels: None

    Description

      When an async send is retried, an NPE can occur:

      java.lang.NullPointerException: null
      at com.alibaba.rocketmq.client.latency.MQFaultStrategy.selectOneMessageQueue(MQFaultStrategy.java:91) ~[classes/:na]
      at com.alibaba.rocketmq.client.impl.producer.DefaultMQProducerImpl.selectOneMessageQueue(DefaultMQProducerImpl.java:404) ~[classes/:na]
      at com.alibaba.rocketmq.client.impl.MQClientAPIImpl.onExceptionImpl(MQClientAPIImpl.java:385) ~[classes/:na]
      at com.alibaba.rocketmq.client.impl.MQClientAPIImpl.access$100(MQClientAPIImpl.java:72) ~[classes/:na]
      at com.alibaba.rocketmq.client.impl.MQClientAPIImpl$1.operationComplete(MQClientAPIImpl.java:356) ~[classes/:na]
      at com.alibaba.rocketmq.remoting.netty.ResponseFuture.executeInvokeCallback(ResponseFuture.java:58) ~[classes/:na]
      at com.alibaba.rocketmq.remoting.netty.NettyRemotingAbstract.scanResponseTable(NettyRemotingAbstract.java:255) ~[classes/:na]
      at com.alibaba.rocketmq.remoting.netty.NettyRemotingClient$5.run(NettyRemotingClient.java:165) [classes/:na]
      at java.util.TimerThread.mainLoop(Timer.java:555) [na:1.7.0_80]
      at java.util.TimerThread.run(Timer.java:505) [na:1.7.0_80]

      The problem is that when selectOneMessageQueue in MQFaultStrategy is called, the topicPublishInfo it receives, which was passed in through sendKernelImpl, may be null, and dereferencing it causes the NPE.
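
      One way to guard against this is to check for the missing route info before selecting a queue for the retry. The following is a minimal sketch of that idea, not the project's code: RetryBrokerSketch, RouteInfo and chooseRetryBroker are hypothetical stand-ins (RouteInfo plays the role of TopicPublishInfo), and falling back to the broker that was just used is an assumption about what an acceptable retry target is.

```java
import java.util.Arrays;
import java.util.List;

// Stand-in types for illustration only; these are NOT the RocketMQ classes.
final class RetryBrokerSketch {

    /** Hypothetical, stripped-down stand-in for TopicPublishInfo. */
    static final class RouteInfo {
        final List<String> brokerNames;

        RouteInfo(List<String> brokerNames) {
            this.brokerNames = brokerNames;
        }
    }

    /**
     * Picks the broker for a retry. When routeInfo is null (the case reported
     * in this issue), fall back to the broker that was just used instead of
     * dereferencing null.
     */
    static String chooseRetryBroker(RouteInfo routeInfo, String lastBrokerName) {
        if (routeInfo == null || routeInfo.brokerNames.isEmpty()) {
            // Assumption: retrying on the same broker is acceptable here.
            return lastBrokerName;
        }
        // Prefer a broker other than the one that just failed.
        for (String name : routeInfo.brokerNames) {
            if (!name.equals(lastBrokerName)) {
                return name;
            }
        }
        return lastBrokerName;
    }

    public static void main(String[] args) {
        RouteInfo route = new RouteInfo(Arrays.asList("broker-a", "broker-b"));
        System.out.println(chooseRetryBroker(route, "broker-a"));
        System.out.println(chooseRetryBroker(null, "broker-a"));
    }
}
```

      With route info present the sketch moves the retry to a different broker; with null route info it stays on the same broker rather than throwing.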

      There are some places where sendKernelImpl is called with a null TopicPublishInfo, for example:

      private SendResult sendSelectImpl(//
          Message msg, //
          MessageQueueSelector selector, //
          Object arg, //
          final CommunicationMode communicationMode, //
          final SendCallback sendCallback, final long timeout//
      ) throws MQClientException, RemotingException, MQBrokerException, InterruptedException {
          this.makeSureStateOK();
          Validators.checkMessage(msg, this.defaultMQProducer);

          TopicPublishInfo topicPublishInfo = this.tryToFindTopicPublishInfo(msg.getTopic());
          if (topicPublishInfo != null && topicPublishInfo.ok()) {
              MessageQueue mq = null;
              try {
                  mq = selector.select(topicPublishInfo.getMessageQueueList(), msg, arg);
              } catch (Throwable e) {
                  throw new MQClientException("select message queue throws exception.", e);
              }

              if (mq != null) {
                  // Here the TopicPublishInfo argument is passed as null, which risks an NPE on retry.
                  return this.sendKernelImpl(msg, mq, communicationMode, sendCallback, null, timeout);
              } else {
                  throw new MQClientException("select message queue return null.", null);
              }
          }

          throw new MQClientException("No route info for this topic, " + msg.getTopic(), null);
      }
      

      Although I found the bug in 3.5.8, the same issue exists in 4.0, since the relevant code is the same.

      This NPE makes the retry fail, and it even prevents the onException callback from being called.
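
      To illustrate why the callback is lost and how it could be preserved, here is a minimal sketch that wraps the retry step so that any exception raised while preparing the retry is still delivered to the caller. CallbackGuardSketch, FailureCallback, RetryAction and retryOrNotify are hypothetical names for illustration; they are not part of the RocketMQ client, and this is not necessarily how the issue was fixed.

```java
// Stand-in types for illustration only; these are NOT the RocketMQ classes.
final class CallbackGuardSketch {

    /** Hypothetical callback, loosely modelled on the onException half of SendCallback. */
    interface FailureCallback {
        void onException(Throwable e);
    }

    /** Hypothetical retry step; may throw, e.g. an NPE while selecting a queue. */
    interface RetryAction {
        void run() throws Exception;
    }

    /**
     * Attempts a retry after sendFailure. If preparing the retry itself throws,
     * the callback is still notified, so the caller does not lose the failure.
     */
    static void retryOrNotify(RetryAction retry, Throwable sendFailure, FailureCallback callback) {
        try {
            retry.run();
        } catch (Throwable retryFailure) {
            // Keep the original send failure visible alongside the retry failure.
            retryFailure.addSuppressed(sendFailure);
            callback.onException(retryFailure);
        }
    }

    public static void main(String[] args) {
        retryOrNotify(
            () -> { throw new NullPointerException("topicPublishInfo is null"); },
            new RuntimeException("send timed out"),
            e -> System.out.println("onException: " + e));
    }
}
```

      Without the try/catch, the NPE would propagate out of the retry path (here, up into the timer thread shown in the stack trace) and the user's callback would never run.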

            People

              Jaskey Jaskey Lam