Hadoop Common / HADOOP-10588

Workaround for jetty6 acceptor startup issue

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.23.11, 2.5.0
    • Component/s: None
    • Labels:
      None
    • Target Version/s:
    • Hadoop Flags:
      Reviewed

      Description

      When a cluster is restarted, jetty fails to come up properly on a small percentage of datanodes, and those datanodes have to be restarted. This is caused by JETTY-1316.

      We've tried overriding isRunning() and retrying on super.isRunning() returning false, as the reporter of JETTY-1316 mentioned in the description. It looks like the code was actually exercised (i.e. the issue was caused by this jetty bug) and the acceptor was working fine after retry.
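
The retry described above can be sketched as follows. This is a minimal, self-contained illustration and not the attached patch: `RacyConnector` is a hypothetical stand-in for the jetty 6 connector that simulates the JETTY-1316 symptom (the first state check reports false, later ones report true), and the 10 ms recheck delay is an assumption. Only the pattern of overriding isRunning() and calling super.isRunning() a second time comes from the description.

```java
public class SafeStartupSketch {

    /** Hypothetical stand-in for the jetty 6 connector; not the real class. */
    static class RacyConnector {
        private int calls = 0;

        // Simulates the race: the first check lands in the middle of the
        // STARTING -> STARTED transition and reports false; later checks
        // report true.
        public boolean isRunning() {
            return ++calls > 1;
        }
    }

    /** Connector that rechecks once before concluding it is not running. */
    static class ConnectorWithSafeStartup extends RacyConnector {
        // Assumed delay before the recheck; not taken from the patch.
        private static final long RECHECK_DELAY_MS = 10;

        @Override
        public boolean isRunning() {
            if (super.isRunning()) {
                return true;
            }
            // The first answer may be a false negative, so wait briefly
            // and ask one more time.
            System.err.println("Acceptor: isRunning is false. Rechecking.");
            try {
                Thread.sleep(RECHECK_DELAY_MS);
            } catch (InterruptedException e) {
                // Preserve the interrupt for callers further up the stack.
                Thread.currentThread().interrupt();
            }
            boolean running = super.isRunning();
            System.err.println("Acceptor: isRunning is " + running);
            return running;
        }
    }

    public static void main(String[] args) {
        RacyConnector connector = new ConnectorWithSafeStartup();
        System.out.println("running = " + connector.isRunning());
    }
}
```

A single recheck keeps the acceptor thread from giving up on a transient false while still returning false promptly if the connector really is stopped.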

      Since we will probably move to a later version of jetty after branch-3 is cut, we can put this workaround in branch-2 only.

      1. selector.patch
        4 kB
        Kihwal Lee
      2. selector23.patch
        2 kB
        Kihwal Lee

        Activity

        Kihwal Lee added a comment -

        Thanks for the reviews, Sangjin and Jon. I've committed this to branch-0.23 and branch-2.

        Jonathan Eagles added a comment -

        +1. Checking this into branch-2 and branch-0.23 (not trunk)

        Sangjin Lee added a comment -

        LGTM. We also run into this issue sporadically.

        Kihwal Lee added a comment -

        I am not going to submit the patch because it is not meant for trunk and trunk only has HttpServer2.

        Kihwal Lee added a comment -

        The following is from a datanode that hit the race. With the workaround, the acceptor came up successfully on this node.

        2014-05-07 19:21:14,123 [9276504@qtp-30275147-1 - Acceptor0 HttpServer$SelectChannelConnectorWithSafeStartup@0.0.0.0:1006] WARN org.apache.hadoop.http.HttpServer: HttpServer Acceptor: isRunning is false. Rechecking.
        2014-05-07 19:21:14,133 [9276504@qtp-30275147-1 - Acceptor0 HttpServer$SelectChannelConnectorWithSafeStartup@0.0.0.0:1006] WARN org.apache.hadoop.http.HttpServer: HttpServer Acceptor: isRunning is true
        
        Kihwal Lee added a comment -

        Attaching two patches.

        • selector23.patch is for branch-0.23, which does not have HttpServer2.
        • selector.patch is for branch-2.

          People

          • Assignee:
            Kihwal Lee
            Reporter:
            Kihwal Lee
          • Votes:
            0
            Watchers:
            3
