Uploaded image for project: 'Hadoop HDFS'
  1. Hadoop HDFS
  2. HDFS-4816

transitionToActive blocks if the SBN is doing checkpoint image transfer

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • 2.0.4-alpha, 3.0.0-alpha1
    • 2.3.0
    • namenode
    • None

    Description

      The NN and SBN do this dance during checkpoint image transfer with nested HTTP GETs via HttpURLConnection. When an admin does a -transitionToActive during this transfer, part of that is interrupting an ongoing checkpoint so we can transition immediately.

      However, the thread.interrupt() in StandbyCheckpointer#stop gets swallowed by connection.getResponseCode() in TransferFsImage#doGetUrl. None of the methods in HttpURLConnection throw InterruptedException, so we need to do something else (perhaps HttpClient [1]):

      [1]: http://hc.apache.org/httpclient-3.x/

      Attachments

        1. hdfs-4816-4.patch
          5 kB
          Andrew Wang
        2. stacks.out
          770 kB
          Andrew Wang
        3. hdfs-4816-3.patch
          5 kB
          Andrew Wang
        4. hdfs-4816-slow-shutdown.txt
          103 kB
          Andrew Wang
        5. hdfs-4816-2.patch
          4 kB
          Andrew Wang
        6. hdfs-4816-1.patch
          2 kB
          Andrew Wang

        Issue Links

          Activity

            People

              andrew.wang Andrew Wang
              andrew.wang Andrew Wang
              Votes:
              0 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: