  1. Hadoop HDFS
  2. HDFS-4816

transitionToActive blocks if the SBN is doing checkpoint image transfer

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.4-alpha, 3.0.0-alpha1
    • Fix Version/s: 2.3.0
    • Component/s: namenode
    • Labels:
      None

      Description

      During a checkpoint image transfer, the NameNode (NN) and standby NameNode (SBN) exchange nested HTTP GETs via HttpURLConnection. When an admin runs -transitionToActive during this transfer, part of the transition is interrupting the ongoing checkpoint so that we can transition immediately.

      However, the thread.interrupt() in StandbyCheckpointer#stop is swallowed by connection.getResponseCode() in TransferFsImage#doGetUrl, so transitionToActive blocks until the transfer finishes. None of the methods in HttpURLConnection throws InterruptedException, so we need a different way to abort the transfer (perhaps HttpClient [1]).

      [1]: http://hc.apache.org/httpclient-3.x/
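
      One possible workaround, sketched below under stated assumptions, is to run the blocking call on a helper thread and have the checkpointer thread wait on a Future, since Future.get() does throw InterruptedException. This is only an illustration and is not taken from the attached patches. The class InterruptibleCall and the methods callInterruptibly, uninterruptibleWork, and demo are hypothetical names, and uninterruptibleWork is a stand-in for a call such as getResponseCode() that ignores interrupts, so no real network I/O is involved.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration; none of these names come from the HDFS code base.
class InterruptibleCall {

    /**
     * Runs the task on a helper thread and waits for it with Future.get(),
     * which throws InterruptedException when the waiting thread is interrupted.
     */
    static <T> T callInterruptibly(Callable<T> task)
            throws InterruptedException, ExecutionException {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "blocking-call");
            t.setDaemon(true); // a stuck call must not keep the JVM alive
            return t;
        });
        Future<T> future = executor.submit(task);
        try {
            return future.get();
        } catch (InterruptedException e) {
            future.cancel(true); // best effort: the task may ignore the interrupt
            throw e;
        } finally {
            executor.shutdownNow();
        }
    }

    /** Stand-in for a blocking call that swallows interrupts. */
    static String uninterruptibleWork(long millis) {
        long deadline = System.nanoTime() + millis * 1_000_000L;
        while (System.nanoTime() < deadline) {
            try {
                Thread.sleep(10);
            } catch (InterruptedException ignored) {
                // Deliberately swallowed to mimic the behavior in this report.
            }
        }
        return "done";
    }

    /** Returns how long (ms) the caller waited before the interrupt released it, or -1. */
    static long demo() throws InterruptedException {
        final long[] elapsed = {-1};
        Thread caller = new Thread(() -> {
            long start = System.nanoTime();
            try {
                callInterruptibly(() -> uninterruptibleWork(3000));
            } catch (InterruptedException e) {
                elapsed[0] = (System.nanoTime() - start) / 1_000_000L;
            } catch (ExecutionException e) {
                throw new RuntimeException(e);
            }
        });
        caller.start();
        Thread.sleep(200);  // let the caller block in Future.get()
        caller.interrupt(); // analogous to the interrupt sent on stop
        caller.join();      // join() makes elapsed[0] visible to this thread
        return elapsed[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println("caller unblocked after ~" + demo() + " ms");
    }
}
```

      In this sketch the waiting thread returns promptly, but the helper thread keeps running until the blocking call completes on its own. A real fix would likely also need to release the underlying connection, for example by setting timeouts or closing it from another thread.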

        Attachments

        1. hdfs-4816-1.patch
          2 kB
          Andrew Wang
        2. hdfs-4816-2.patch
          4 kB
          Andrew Wang
        3. hdfs-4816-3.patch
          5 kB
          Andrew Wang
        4. hdfs-4816-4.patch
          5 kB
          Andrew Wang
        5. hdfs-4816-slow-shutdown.txt
          103 kB
          Andrew Wang
        6. stacks.out
          770 kB
          Andrew Wang

              People

              • Assignee: Andrew Wang (andrew.wang)
              • Reporter: Andrew Wang (andrew.wang)
              • Votes: 0
              • Watchers: 7
