
    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: QuorumJournalManager (HDFS-3077)
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      While running stress tests, I ran into a failover problem that occurs when the edit log segment being written by the old active NameNode is large. With a 327 MB log segment containing 6.4M transactions, the JournalNode (JN) took ~11 seconds to read and validate it during the recovery step. This exceeded the 10-second timeout for createNewEpoch, which caused the recovery to fail.
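To make the numbers above concrete, here is a rough back-of-the-envelope sketch. It assumes validation time scales linearly with segment size, which the report does not state. The function and constant names are hypothetical and are not part of HDFS.

```python
# Figures taken from the description; everything else is an illustrative assumption.
SEGMENT_MB = 327            # size of the in-progress edit log segment
TRANSACTIONS = 6_400_000    # transactions in that segment
VALIDATE_SECONDS = 11       # approximate observed read-and-validate time
TIMEOUT_SECONDS = 10        # createNewEpoch timeout from the description


def validation_rate_mb_per_s(segment_mb: float, seconds: float) -> float:
    """Average validation throughput, assuming time is linear in segment size."""
    return segment_mb / seconds


def max_segment_mb_within(timeout_s: float, rate_mb_per_s: float) -> float:
    """Largest segment that could be validated before the timeout at that rate."""
    return timeout_s * rate_mb_per_s


rate = validation_rate_mb_per_s(SEGMENT_MB, VALIDATE_SECONDS)
limit = max_segment_mb_within(TIMEOUT_SECONDS, rate)

print(f"validation rate: ~{rate:.1f} MB/s")
print(f"transactions per second: ~{TRANSACTIONS / VALIDATE_SECONDS:,.0f}")
print(f"largest segment validated within the timeout: ~{limit:.0f} MB")
print(f"recovery would time out: {VALIDATE_SECONDS > TIMEOUT_SECONDS}")
```

Under that linear assumption, the JN validates roughly 30 MB/s, so any in-progress segment much beyond about 300 MB would exceed the 10-second timeout on comparable hardware.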

        Attachments

        1. hdfs-3906.txt
          24 kB
          Todd Lipcon

            People

            • Assignee:
              tlipcon Todd Lipcon
              Reporter:
              tlipcon Todd Lipcon
            • Votes:
              0
              Watchers:
              3

              Dates

              • Created:
                Updated:
                Resolved: