Details
- Bug
- Status: Resolved
- Major
- Resolution: Fixed
- 1.4.0
Description
From the comment at https://issues.apache.org/jira/browse/HDDS-10750?focusedCommentId=17847435&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17847435 in HDDS-10750, it was found that GroupMismatchException is thrown while ClosePipelineCommandHandler runs.
This happens because XceiverRatisServer#removeGroup is called before XceiverRatisServer#getRaftPeersInPipeline. Since the group has already been removed, XceiverRatisServer#getRaftPeersInPipeline throws GroupMismatchException when it calls RaftServerProxy#getDivision.
Therefore, we need to call XceiverRatisServer#getRaftPeersInPipeline first, and only then call XceiverRatisServer#removeGroup. In addition, we can catch GroupMismatchException to cover the case where the group has already been removed by an earlier ClosePipelineCommandHandler run on another datanode for the same pipeline.
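The ordering described above can be illustrated with a minimal, self-contained sketch. It does not use the real Ozone or Ratis classes: FakeRatisServer, the local GroupMismatchException, closePipeline, and the String-based pipeline and peer ids are all hypothetical stand-ins, assumed here only to show the "read peers first, then remove the group, and tolerate an already-removed group" pattern.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ClosePipelineSketch {

  /** Hypothetical stand-in for the Ratis GroupMismatchException. */
  static class GroupMismatchException extends Exception {
    GroupMismatchException(String message) {
      super(message);
    }
  }

  /** Hypothetical stand-in for the Ratis server wrapper: pipeline id -> peer ids. */
  static class FakeRatisServer {
    private final Map<String, List<String>> groups = new HashMap<>();

    void addGroup(String pipelineId, List<String> peers) {
      groups.put(pipelineId, new ArrayList<>(peers));
    }

    /** Fails once the group is gone, mirroring the behavior in the description. */
    List<String> getRaftPeersInPipeline(String pipelineId)
        throws GroupMismatchException {
      List<String> peers = groups.get(pipelineId);
      if (peers == null) {
        throw new GroupMismatchException("Group not found: " + pipelineId);
      }
      return new ArrayList<>(peers);
    }

    void removeGroup(String pipelineId) {
      groups.remove(pipelineId);
    }

    boolean hasGroup(String pipelineId) {
      return groups.containsKey(pipelineId);
    }
  }

  /**
   * Reads the peers BEFORE removing the group. If the group was already
   * removed (for example by an earlier close command), the exception is
   * caught and an empty peer list is returned instead of failing.
   */
  static List<String> closePipeline(FakeRatisServer server, String pipelineId) {
    List<String> peers;
    try {
      peers = server.getRaftPeersInPipeline(pipelineId);
    } catch (GroupMismatchException e) {
      // Group already removed: nothing left to do for this pipeline.
      return Collections.emptyList();
    }
    server.removeGroup(pipelineId);
    return peers;
  }

  public static void main(String[] args) {
    FakeRatisServer server = new FakeRatisServer();
    server.addGroup("pipeline-1", Arrays.asList("dn1", "dn2", "dn3"));
    System.out.println(closePipeline(server, "pipeline-1")); // first close
    System.out.println(closePipeline(server, "pipeline-1")); // repeated close
  }
}
```

In this sketch a repeated close of the same pipeline is a no-op rather than an error, which is the effect the proposed catch is meant to achieve.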
Issue Links
- is caused by: HDDS-9959 Propagate group remove to other datanodes during pipeline close (Resolved)
- is related to: HDDS-10750 Intermittent fork timeout while stopping Ratis server (Resolved)
- links to