[HDFS-10902] QJM should not consider stale/failed txn available in any one of JNs. - ASF JIRA

Details

Type: Bug
Status: Open
Priority: Critical
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Component/s: qjm
Labels: None

Description

In one of our clusters we hit an issue where a NameNode restart failed because a stale/failed txn was available in one JN but not in the others.

The scenario:
1. Full cluster restart.
2. The startLogSegment txn (195222) was synced to only one JN and failed on the others because they were shutting down. On those JNs only the editlog file was created and the txn was not synced, so after restart they were marked as empty.
3. The cluster restarted. During failover, this new log segment missed recovery because the JN holding it was slow to respond to the recovery call.
4. Recovery on the other JNs was successful, as they had no in-progress files.
5. editlog.openForWrite() detected that txn (195222) was already available and failed the failover.

The same steps repeated until the stale editlog in that JN was manually deleted.
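
The sequence above can be replayed with a small model. This is only a sketch under stated assumptions: `JournalNode`, `recover_segments` and `open_for_write` are hypothetical names invented for illustration, not Hadoop classes or methods, and the model keeps just two behaviours from the report: recovery proceeds with whichever majority of JNs answers, while the write-side check fails if any single JN already holds the txn.

```python
from dataclasses import dataclass, field


@dataclass
class JournalNode:
    """Hypothetical, simplified stand-in for a JN; not the Hadoop class."""
    name: str
    txids: set = field(default_factory=set)  # txns this node has on disk
    answers_recovery: bool = True            # False models a slow JN


def recover_segments(jns):
    """Recovery proceeds with whichever majority answers in time.

    Returns the txids that recovery learns about from the responders.
    """
    majority = len(jns) // 2 + 1
    responders = [jn for jn in jns if jn.answers_recovery]
    if len(responders) < majority:
        raise RuntimeError("no quorum of JNs answered the recovery call")
    seen = set()
    for jn in responders:
        seen |= jn.txids
    return seen


def open_for_write(jns, first_txid):
    """Refuse to start a segment at first_txid if ANY node already holds it."""
    holders = [jn.name for jn in jns if first_txid in jn.txids]
    if holders:
        raise RuntimeError(
            f"txn {first_txid} already available on {holders}; cannot write")
    return first_txid


jns = [
    JournalNode("jn1", {195222}, answers_recovery=False),  # stale txn, slow
    JournalNode("jn2"),  # marked empty after restart
    JournalNode("jn3"),  # marked empty after restart
]

recovered = recover_segments(jns)  # jn2 and jn3 form a quorum without jn1
try:
    open_for_write(jns, 195222)
    failover_ok = True
except RuntimeError:
    failover_ok = False
```

In this model recovery never sees txn 195222, yet the write-side check does, so the failover fails on every attempt until the stale entry is removed from `jn1`.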

Since QJM is a quorum of JNs, a txn is considered successful only if it is written to a minimum quorum of JNs; otherwise it fails.
So the same rule should apply when selecting streams for reading.
Stale/failed txns available on fewer than a quorum of JNs should not be considered for reading.
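
One possible reading of this proposal, as a minimal sketch rather than a patch: count how many JNs hold each txn and treat it as readable only if that count reaches a majority of all configured JNs. The function name `quorum_readable_txids` and the set-per-JN representation are assumptions made for illustration; the real QJM works with edit log segments and input streams.

```python
from collections import Counter


def quorum_readable_txids(txids_per_jn):
    """Return the txids held by at least a majority of the JNs.

    txids_per_jn: one collection of txids per configured JN (hypothetical
    representation; an unreachable JN would contribute an empty collection).
    """
    majority = len(txids_per_jn) // 2 + 1
    counts = Counter()
    for txids in txids_per_jn:
        counts.update(set(txids))  # each JN counts at most once per txid
    return {txid for txid, n in counts.items() if n >= majority}


# Three JNs: only the first one holds the stale txn 195222.
readable = quorum_readable_txids([
    {195220, 195221, 195222},
    {195220, 195221},
    {195220, 195221},
])
```

With this rule txn 195222 is excluded from reading because it exists on one JN out of three, while 195220 and 195221 remain readable.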

~~HDFS-10519~~ does similar work to consider 'durable' txns based on 'committedTxnId'. But updating 'committedTxnId' on every flush with one more RPC seems to be problematic for performance.

People

Assignee: Vinayakumar B

Reporter: Vinayakumar B

Votes: 0

Watchers: 14

Dates

Created: 26/Sep/16 06:21

Updated: 26/Sep/16 14:26