  1. Hadoop HDFS
  2. HDFS-16100

HA: Improve performance of Standby node transition to Active


Details

    • Type: Improvement
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.3.1
    • Fix Version/s: None
    • Component/s: namenode
    • Labels: None

    Description

      The Standby node uses pendingDNMessages to hold postponed block reports. Block reports in pendingDNMessages are handled as follows:

      1. If the generation stamp (GS) of the replica is in the future, the Standby node processes the report when the corresponding edit log entry (e.g. add_block) is loaded.
      2. If the replica is corrupted, the Standby node processes the report while it transitions to Active.
      3. If the DataNode is removed, its block reports are removed from pendingDNMessages.
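To make the three cases concrete, here is a minimal, self-contained sketch of a postponed-report queue. It is only an illustration: the class `PendingReportsSketch`, its `Report` type, and all method names are hypothetical and do not reproduce the actual HDFS implementation.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

/** Hypothetical, simplified model of a postponed block report queue (not the HDFS class). */
public class PendingReportsSketch {

    /** A postponed block report; field names are assumptions made for this sketch. */
    public static final class Report {
        public final String dataNode;
        public final long blockId;
        public final long genStamp;

        public Report(String dataNode, long blockId, long genStamp) {
            this.dataNode = dataNode;
            this.blockId = blockId;
            this.genStamp = genStamp;
        }
    }

    // Postponed reports grouped by block id.
    private final Map<Long, Queue<Report>> byBlock = new HashMap<>();

    /** Postpone a report until it can be processed. */
    public void enqueue(Report r) {
        byBlock.computeIfAbsent(r.blockId, k -> new ArrayDeque<>()).add(r);
    }

    /** Case 1: an edit for this block was loaded, so its postponed reports can be processed. */
    public List<Report> takeForBlock(long blockId) {
        Queue<Report> q = byBlock.remove(blockId);
        return q == null ? new ArrayList<>() : new ArrayList<>(q);
    }

    /** Case 2: on transition to Active, everything still pending is drained and processed. */
    public List<Report> takeAll() {
        List<Report> all = new ArrayList<>();
        for (Queue<Report> q : byBlock.values()) {
            all.addAll(q);
        }
        byBlock.clear();
        return all;
    }

    /** Case 3: the DataNode was removed, so its postponed reports are dropped. */
    public void removeAllFor(String dataNode) {
        byBlock.values().forEach(q -> q.removeIf(r -> r.dataNode.equals(dataNode)));
        byBlock.values().removeIf(Queue::isEmpty);
    }

    /** Number of reports still pending. */
    public int count() {
        int n = 0;
        for (Queue<Report> q : byBlock.values()) {
            n += q.size();
        }
        return n;
    }
}
```

In this model, the cost of case 2 grows linearly with the number of reports left in the queue at transition time, which is the cost the issue is concerned with.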

      As the number of corrupted replicas grows, the transition takes longer. In our case, there were 60 million block reports in pendingDNMessages before the transition. Processing them took almost 7 minutes, and the node was killed by zkfc. Most of these block reports had replica state RBW with a wrong GS (lower than that of the stored block on the Standby node).

      In my opinion, the Standby node could ignore block reports whose replica state is RBW with a wrong GS, because the Active node or the DataNode will remove those replicas later.
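The proposed rule can be sketched as a simple predicate. This is an illustration under stated assumptions, not the attached patch: `StaleRbwFilterSketch` and `canIgnore` are hypothetical names, and the local `ReplicaState` enum only mirrors the replica state names used in HDFS.

```java
/** Hypothetical sketch of the proposed filter; not the code in the attached patch. */
public class StaleRbwFilterSketch {

    /** Local stand-in for the HDFS replica states (assumption: only the names matter here). */
    public enum ReplicaState { FINALIZED, RBW, RWR, RUR, TEMPORARY }

    /**
     * Returns true when a reported replica could be skipped instead of being postponed:
     * it is still being written (RBW) and its generation stamp is lower than the
     * generation stamp of the block stored on the Standby node.
     */
    public static boolean canIgnore(ReplicaState reported, long reportedGs, long storedGs) {
        return reported == ReplicaState.RBW && reportedGs < storedGs;
    }
}
```

Under this rule, a report with an equal or future GS, or with any state other than RBW, would still follow the existing handling.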

       

      Attachments

        1. HDFS-16100.001.patch
          2 kB
          wudeyu
        2. HDFS-16100.patch
          2 kB
          wudeyu

          People

            Assignee: wudeyu (g20141821)
            Reporter: wudeyu (g20141821)
            Votes: 0
            Watchers: 4
