Details
-
Bug
-
Status: Open
-
Major
-
Resolution: Unresolved
-
4.1.0-incubating
-
None
-
None
Description
It's quite frequent to get result as `sendStatus=FLUSH_SLAVE_TIMEOUT` when sending big messages(>500k) in SYNC_MASTER/SLAVE scenario.
The timeout value used by the sync process currently as I found, is the config `syncFlushTimeout`. And its default value is 5000 milliseconds.
But it shows that producer get the result as `FLUSH_SLAVE_TIMEOUT` less than 1 second.
So why does the config not work as expected?
Relevant code:
// CommitLog.java public void handleHA(AppendMessageResult result, PutMessageResult putMessageResult, MessageExt messageExt) { if (BrokerRole.SYNC_MASTER == this.defaultMessageStore.getMessageStoreConfig().getBrokerRole()) { HAService service = this.defaultMessageStore.getHaService(); if (messageExt.isWaitStoreMsgOK()) { // Determine whether to wait if (service.isSlaveOK(result.getWroteOffset() + result.getWroteBytes())) { GroupCommitRequest request = new GroupCommitRequest(result.getWroteOffset() + result.getWroteBytes()); service.putRequest(request); service.getWaitNotifyObject().wakeupAll(); boolean flushOK = request.waitForFlush(this.defaultMessageStore.getMessageStoreConfig().getSyncFlushTimeout()); if (!flushOK) { log.error("do sync transfer other node, wait return, but failed, topic: " + messageExt.getTopic() + " tags: " + messageExt.getTags() + " client address: " + messageExt.getBornHostNameString()); putMessageResult.setPutMessageStatus(PutMessageStatus.FLUSH_SLAVE_TIMEOUT); } } // Slave problem else { // Tell the producer, slave not available putMessageResult.setPutMessageStatus(PutMessageStatus.SLAVE_NOT_AVAILABLE); } } } }
Attachments
Issue Links
- links to