  1. Hadoop HDFS
  2. HDFS-1052 HDFS scalability with multiple namenodes
  3. HDFS-1718

HDFS Federation: MiniDFSCluster#waitActive() bug causes some tests to fail

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Federation Branch
    • Fix Version/s: Federation Branch
    • Component/s: test
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      MiniDFSCluster#shouldWait() is used to wait for all the datanodes to come up and register with the namenode.

      Due to threading issues, some of the tests fail for two reasons:

      1. Datanode#isDatanodeUp() can report the datanode as up even after all the BPOfferService threads have exited,
        because Thread.isAlive() can still return true for a thread that has exited. Additionally checking
        BPOfferService#shouldServiceRun fixes this issue.
      2. shouldWait(), which calls isBPServiceAlive(), does not work when a BPOfferService thread fails before the
        datanode has learned the BPID from its handshake with the namenode. This can be fixed by identifying the
        BPOfferService by its InetSocketAddress, instead of by BPID, when checking the thread state.
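
The two fixes described above can be illustrated with a minimal, self-contained sketch. This is not the code from HDFS-1718.patch: ServiceSketch and DataNodeSketch are hypothetical stand-ins for BPOfferService and the datanode, and the handshake is simulated. The sketch assumes that the service clears its own run flag when it exits, and that the namenode address is known before the block pool ID is.

```java
import java.net.InetSocketAddress;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for a per-namenode service thread (not the real BPOfferService). */
class ServiceSketch implements Runnable {
    final InetSocketAddress nnAddr;      // known as soon as the service is created
    volatile String blockPoolId;         // stays null until the simulated handshake succeeds
    private volatile boolean shouldServiceRun = true;
    private final boolean failHandshake; // test hook: simulate a failure before the BPID is known
    private final Thread thread;

    ServiceSketch(InetSocketAddress nnAddr, boolean failHandshake) {
        this.nnAddr = nnAddr;
        this.failHandshake = failHandshake;
        this.thread = new Thread(this, "service-" + nnAddr);
    }

    void start() { thread.start(); }

    void stop() {
        shouldServiceRun = false;
        thread.interrupt();
    }

    /** Waits for the thread to exit without exposing a checked exception. */
    void awaitExit() {
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    @Override
    public void run() {
        try {
            if (failHandshake) {
                throw new IllegalStateException("simulated handshake failure");
            }
            blockPoolId = "BP-" + nnAddr.getPort(); // made-up ID, for illustration only
            while (shouldServiceRun) {
                Thread.sleep(10);                   // placeholder for the real service loop
            }
        } catch (IllegalStateException | InterruptedException e) {
            // fall through to the finally block; the service is done either way
        } finally {
            shouldServiceRun = false;               // fix 1: the flag records the exit explicitly
        }
    }

    /** Fix 1: require both the run flag and the thread state. */
    boolean isAlive() {
        return shouldServiceRun && thread.isAlive();
    }
}

/** Hypothetical stand-in for the datanode's bookkeeping of its services. */
class DataNodeSketch {
    // Fix 2: key by namenode address, which exists even if the handshake never completes.
    private final Map<InetSocketAddress, ServiceSketch> services = new ConcurrentHashMap<>();

    ServiceSketch addService(InetSocketAddress nnAddr, boolean failHandshake) {
        ServiceSketch s = new ServiceSketch(nnAddr, failHandshake);
        services.put(nnAddr, s);
        s.start();
        return s;
    }

    boolean isServiceAlive(InetSocketAddress nnAddr) {
        ServiceSketch s = services.get(nnAddr);
        return s != null && s.isAlive();
    }

    boolean isUp() {
        for (ServiceSketch s : services.values()) {
            if (s.isAlive()) {
                return true;
            }
        }
        return false;
    }
}
```

In this sketch, a service whose handshake fails never gets a block pool ID, so a lookup by BPID could not find it, while a lookup by address still finds it and reports it as not alive.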

      Attachments

        1. HDFS-1718.patch
          9 kB
          Suresh Srinivas

          People

            Assignee: Suresh Srinivas (sureshms)
            Reporter: Suresh Srinivas (sureshms)