Hadoop Map/Reduce: MAPREDUCE-958

TT should bail out early when mapred.job.tracker is bound to 0:0:0:0

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: 0.21.0
    • Fix Version/s: None
    • Component/s: tasktracker
    • Labels:
      None

      Description

      It's OK for your job tracker's config to tell the JobTracker to come up on the wildcard address 0:0:0:0, but it's not OK for the TaskTrackers to get the same mapred.job.tracker configuration value, as it stops the TT from being able to report its heartbeat.

      This misconfiguration surfaces in the TT's offerService() routine catching and logging a ConnectionRefused exception every time it tries to heartbeat. Although the error message in this situation has been improved, that is still rather late in the process to hit a problem which should be obvious the moment the TT looks at its configuration.

      Better to have the TT refuse to start up if jobTrackAddr.getAddress().isAnyLocalAddress().
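
      A minimal, standalone sketch of such a fail-fast check, using only java.net and no Hadoop classes. The class name JobTrackerAddressCheck, the validate() helper, and port 8021 are hypothetical and purely illustrative; the real TaskTracker startup code may look quite different. It also shows that a loopback address passes the check while the wildcard address does not.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;

// Hypothetical standalone class; not part of Hadoop.
class JobTrackerAddressCheck {

    /**
     * Rejects a JobTracker address that is the wildcard ("any local") address.
     * Loopback and ordinary addresses are accepted.
     */
    static void validate(InetSocketAddress jobTrackAddr) throws IOException {
        InetAddress addr = jobTrackAddr.getAddress();
        if (addr == null) {
            // getAddress() returns null when the hostname could not be resolved.
            throw new IOException("Cannot resolve job tracker address " + jobTrackAddr);
        }
        if (addr.isAnyLocalAddress()) {
            throw new IOException("mapred.job.tracker is set to the wildcard address "
                + jobTrackAddr + "; refusing to start");
        }
    }

    public static void main(String[] args) throws Exception {
        // IP literals are parsed directly, so no DNS lookup happens here.
        InetAddress wildcard = InetAddress.getByName("0.0.0.0");
        InetAddress loopback = InetAddress.getByName("127.0.0.1");

        System.out.println("wildcard isAnyLocalAddress: " + wildcard.isAnyLocalAddress());
        System.out.println("loopback isAnyLocalAddress: " + loopback.isAnyLocalAddress());
        System.out.println("loopback isLoopbackAddress: " + loopback.isLoopbackAddress());

        // Port 8021 is an arbitrary example value.
        validate(new InetSocketAddress(loopback, 8021)); // accepted
        try {
            validate(new InetSocketAddress(wildcard, 8021));
        } catch (IOException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

      Because the check keys on isAnyLocalAddress() rather than isLoopbackAddress(), a single-node setup that points mapred.job.tracker at localhost would not be affected by it.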

        Activity

        Allen Wittenauer added a comment -

        This is probably stale, especially with HARM.

        steve_l added a comment -

        That's correct, loopback still works happily. All we need to check for is the isAnyLocalAddress() value, and not isLoopbackAddress(), which is different. Something like this (from a TT subclass that has this test and did find my error):

                // Fail fast: a wildcard (any-local) JobTracker address is not one the TT can heartbeat to.
                if (jobTrackAddr.getAddress().isAnyLocalAddress()) {
                    throw new IOException("Cannot start the Task Tracker as it has been started with "
                        + "mapred.job.tracker set to icp://" + getJobTrackerAddress() + "/");
                }
        

        Allen, if you fly with colleagues you could set up a WLAN and run work across everyone's machines, though someone needs to bring up a DNS server unless Hadoop works with Bonjour. Me, I stick to virtualized Linux images with hand-edited /etc/hosts files.

        Vinod Kumar Vavilapalli added a comment -

        I thought so too, but when I checked the Javadoc (http://java.sun.com/javase/6/docs/api/java/net/InetAddress.html#isAnyLocalAddress()), I realized that this method checks only whether the InetAddress is a wildcard address, not whether it is a loopback/localhost address. So I guess single-node clusters will still work fine. Steve?

        Allen Wittenauer added a comment -

        I'm likely misunderstanding, but won't this prevent Hadoop from working on machines with only a loopback interface? [I'm thinking primarily of single-node clusters on laptops on an airplane.] It would also be interesting to see how Solaris Zones would interpret this change.


          People

          • Assignee:
            Unassigned
            Reporter:
            Steve Loughran
          • Votes:
            0
            Watchers:
            1
