Apache RocketMQ / ROCKETMQ-272

The config `syncFlushTimeout` doesn't work for SYNC_MASTER

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 4.1.0-incubating
    • Fix Version/s: None
    • Component/s: rocketmq-broker
    • Labels:
      None

      Description

      Sends frequently return `sendStatus=FLUSH_SLAVE_TIMEOUT` when sending big messages (>500k) in a SYNC_MASTER/SLAVE setup.
      As far as I can tell, the timeout used by the sync process is the `syncFlushTimeout` config, whose default value is 5000 milliseconds.
      However, the producer receives `FLUSH_SLAVE_TIMEOUT` in less than 1 second, well before that timeout could have elapsed.
      So why does the config not work as expected?

      Relevant code:

      // CommitLog.java
      public void handleHA(AppendMessageResult result, PutMessageResult putMessageResult, MessageExt messageExt) {
          if (BrokerRole.SYNC_MASTER == this.defaultMessageStore.getMessageStoreConfig().getBrokerRole()) {
              HAService service = this.defaultMessageStore.getHaService();
              if (messageExt.isWaitStoreMsgOK()) {
                  // Determine whether to wait
                  if (service.isSlaveOK(result.getWroteOffset() + result.getWroteBytes())) {
                      GroupCommitRequest request = new GroupCommitRequest(result.getWroteOffset() + result.getWroteBytes());
                      service.putRequest(request);
                      service.getWaitNotifyObject().wakeupAll();
                      boolean flushOK =
                          request.waitForFlush(this.defaultMessageStore.getMessageStoreConfig().getSyncFlushTimeout());
                      if (!flushOK) {
                          log.error("do sync transfer other node, wait return, but failed, topic: " + messageExt.getTopic() + " tags: "
                              + messageExt.getTags() + " client address: " + messageExt.getBornHostNameString());
                          putMessageResult.setPutMessageStatus(PutMessageStatus.FLUSH_SLAVE_TIMEOUT);
                      }
                  }
                  // Slave problem
                  else {
                      // Tell the producer, slave not available
                      putMessageResult.setPutMessageStatus(PutMessageStatus.SLAVE_NOT_AVAILABLE);
                  }
              }
          }
      }
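
      To help narrow down the question, here is a minimal sketch of one way a timed wait can return `false` long before its timeout: the waiter is woken early by another thread that reports failure. Everything in it is an assumption for illustration. `FlushRequest`, `EarlyWakeupDemo`, `measure` and the 200 ms give-up delay are hypothetical names and values, not the RocketMQ source, and this is not a confirmed diagnosis of the behavior reported above.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class EarlyWakeupDemo {

    // Hypothetical stand-in for a replication wait request (not the RocketMQ class).
    static class FlushRequest {
        private final CountDownLatch latch = new CountDownLatch(1);
        private volatile boolean flushOK = false;

        // Called by the transfer thread to release the waiter with a result.
        void wakeupCustomer(boolean ok) {
            this.flushOK = ok;
            this.latch.countDown();
        }

        // Blocks for at most timeoutMillis, but returns as soon as
        // wakeupCustomer() is called, whatever result it carries.
        boolean waitForFlush(long timeoutMillis) throws InterruptedException {
            this.latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
            return this.flushOK;
        }
    }

    // Waits with the given timeout while a second thread gives up after
    // giveUpAfterMillis and reports failure. Returns the elapsed wait in ms.
    static long measure(long timeoutMillis, long giveUpAfterMillis) throws InterruptedException {
        FlushRequest request = new FlushRequest();
        Thread waker = new Thread(() -> {
            try {
                Thread.sleep(giveUpAfterMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            request.wakeupCustomer(false); // failure reported before the timeout
        });
        waker.start();

        long start = System.nanoTime();
        boolean ok = request.waitForFlush(timeoutMillis);
        long elapsed = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        waker.join();

        System.out.println("flushOK=" + ok + ", waited ~" + elapsed
            + " ms, configured timeout " + timeoutMillis + " ms");
        return elapsed;
    }

    public static void main(String[] args) throws InterruptedException {
        // Timeout is 5000 ms, yet the wait ends once the waker gives up.
        measure(5000, 200);
    }
}
```

      In this sketch the configured timeout is only an upper bound on the wait. If the component that wakes the waiter has its own, shorter give-up condition, the caller sees a failed flush sooner than the timeout suggests. Whether that is what happens behind `service.putRequest(request)` in the broker would need to be checked in the HA service code.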
      

              People

              • Assignee: yukon Xinyu Zhou
              • Reporter: evthoriz Yu Kaiyuan
              • Votes: 0
              • Watchers: 4
