Hadoop HDFS / HDFS-3077 Quorum-based protocol for reading and writing edit logs / HDFS-3901

QJM: send 'heartbeat' messages to JNs even when they are out-of-sync


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • QuorumJournalManager (HDFS-3077)
    • None
    • None

    Description

      Currently, if one of the JNs has fallen out of sync with the writer (e.g., because it went down), it is marked as out of sync until the next log roll, and the writer stops sending it any RPCs. As a result, the JN's metrics no longer reflect up-to-date information on how far behind it is.

      This patch will introduce a heartbeat() RPC that has no effect except to update the JN's view of the latest committed txid. When the writer is talking to an out-of-sync logger, it will send these heartbeat messages once a second.
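
      The behavior described above can be sketched as follows. This is a minimal illustration and not the code from the attached patch: the names JournalRpc, LoggerChannel, markOutOfSync, onLogRoll and sendEdits are hypothetical, the clock is injected only to make the throttling easy to test, and the sketch assumes heartbeats are triggered from the edit-sending path rather than from a separate timer thread.

```java
import java.util.function.LongSupplier;

/** Hypothetical stand-in for the RPC interface a JournalNode exposes. */
interface JournalRpc {
    /** Normal edit-log write; also carries the latest committed txid. */
    void journal(long firstTxId, int numTxns, long committedTxId);

    /** No effect other than updating the JN's view of the committed txid. */
    void heartbeat(long committedTxId);
}

/** Writer-side view of a single JournalNode (illustrative only). */
class LoggerChannel {
    /** "Once a second", as stated in the description. */
    static final long HEARTBEAT_INTERVAL_MS = 1000;

    private final JournalRpc rpc;
    private final LongSupplier clockMs;   // injected clock, in milliseconds
    private boolean outOfSync = false;
    private boolean heartbeatSent = false;
    private long lastHeartbeatMs = 0;

    LoggerChannel(JournalRpc rpc, LongSupplier clockMs) {
        this.rpc = rpc;
        this.clockMs = clockMs;
    }

    /** Called when a write to this JN fails or the JN falls behind. */
    void markOutOfSync() {
        outOfSync = true;
    }

    /** The next log roll gives the JN a chance to rejoin the quorum. */
    void onLogRoll() {
        outOfSync = false;
        heartbeatSent = false;
    }

    /** Sends edits to an in-sync JN, or a throttled heartbeat otherwise. */
    void sendEdits(long firstTxId, int numTxns, long committedTxId) {
        if (!outOfSync) {
            rpc.journal(firstTxId, numTxns, committedTxId);
            return;
        }
        // Out of sync: drop the edits, but keep the JN's lag metrics fresh
        // by sending at most one heartbeat per interval.
        long now = clockMs.getAsLong();
        if (!heartbeatSent || now - lastHeartbeatMs >= HEARTBEAT_INTERVAL_MS) {
            rpc.heartbeat(committedTxId);
            heartbeatSent = true;
            lastHeartbeatMs = now;
        }
    }
}
```

      With this shape, an out-of-sync JN still learns the latest committed txid roughly once a second while edits are flowing, so its lag metrics stay meaningful until the next log roll brings it back in sync.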

      In a future patch we can extend the heartbeat functionality so that NNs periodically check their connections to JNs if no edits arrive, such that a fenced NN won't accidentally continue to serve reads indefinitely.

      Attachments

        1. hdfs-3901.txt
          22 kB
          Todd Lipcon
        2. hdfs-3901.txt
          21 kB
          Todd Lipcon


          People

            Assignee: Todd Lipcon (tlipcon)
            Reporter: Todd Lipcon (tlipcon)