Hadoop Map/Reduce
MAPREDUCE-1505

Cluster class should create the rpc client only when needed

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.20.2
    • Fix Version/s: 0.22.0
    • Component/s: client
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Lazily construct a connection to the JobTracker from the job-submission client.

      Description

      It would be good to have org.apache.hadoop.mapreduce.Cluster create the RPC client object only when needed (that is, when a call to the JobTracker is actually required). org.apache.hadoop.mapreduce.Job constructs the Cluster object internally, and in many cases the application that created the Job object only wants to look at the configuration. Avoiding these connections to the JobTracker would help, especially when Job is used in the tasks (e.g., Pig calls mapreduce.FileInputFormat.setInputPath in the tasks, and that requires a Job object to be passed).

      In Hadoop 0.20, the Job object internally creates the JobClient object, and the same argument applies there too.
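
The request above amounts to lazy initialization of the RPC client. The following is a minimal sketch of that idea, not the code from any attached patch: LazyCluster, FakeRpcClient, and the address string are hypothetical stand-ins invented for illustration, and no Hadoop classes are used.

```java
import java.util.function.Function;

// Demo entry point. Every class here is a hypothetical stand-in, not a Hadoop class.
class LazyClusterDemo {
    public static void main(String[] args) {
        LazyCluster cluster = new LazyCluster("jobtracker.example:8021", FakeRpcClient::new);

        // Reading configuration does not open a connection.
        System.out.println("address = " + cluster.getAddress());
        System.out.println("clients opened so far = " + FakeRpcClient.opened);

        // The first remote call creates the client; later calls reuse it.
        System.out.println(cluster.getClusterStatus());
        System.out.println(cluster.getClusterStatus());
        System.out.println("clients opened so far = " + FakeRpcClient.opened);
    }
}

// Pretend RPC client that counts how many times it has been constructed.
class FakeRpcClient {
    static int opened = 0;
    private final String address;

    FakeRpcClient(String address) {
        this.address = address;
        opened++;
    }

    String call(String method) {
        return method + " answered by " + address;
    }
}

// Cluster-like object that defers creating its client until a remote call is made.
class LazyCluster {
    private final String address;
    private final Function<String, FakeRpcClient> clientFactory;
    private FakeRpcClient client; // stays null until a remote call needs it

    LazyCluster(String address, Function<String, FakeRpcClient> clientFactory) {
        this.address = address;
        this.clientFactory = clientFactory;
    }

    // Local accessor: never touches the network.
    String getAddress() {
        return address;
    }

    // Creates the client on first use; synchronized so two threads cannot both create one.
    private synchronized FakeRpcClient getOrCreateClient() {
        if (client == null) {
            client = clientFactory.apply(address);
        }
        return client;
    }

    // Remote accessor: the only path that triggers client creation.
    String getClusterStatus() {
        return getOrCreateClient().call("getClusterStatus");
    }
}
```

A caller that only reads configuration, as a task calling setInputPath would, never pays for a connection under this arrangement.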

      1. MAPREDUCE-1505_yhadoop20_9.patch
        3 kB
        Arun C Murthy
      2. MAPREDUCE-1505_yhadoop20.patch
        3 kB
        Arun C Murthy
      3. mapreduce-1505--2010-05-19.patch
        11 kB
        Dick King
      4. mapreduce-1505--2010-05-26.patch
        33 kB
        Dick King

          Activity

          Devaraj Das created issue -
          Devaraj Das made changes -
          Field Original Value New Value
          Summary Job class should create the rpc client only when needed Cluster class should create the rpc client only when needed
          Arun C Murthy added a comment -

          Patch for yahoo20, not to be committed.

          Arun C Murthy made changes -
          Attachment MAPREDUCE-1505_yhadoop20.patch [ 12436628 ]
          Amareshwari Sriramadasu made changes -
          Link This issue blocks MAPREDUCE-118 [ MAPREDUCE-118 ]
          Arun C Murthy added a comment -

          Patch for an older version of yahoo-hadoop-0.20, not for commit.

          Arun C Murthy made changes -
          Attachment MAPREDUCE-1505_yhadoop20_9.patch [ 12440995 ]
          Vinod Kumar Vavilapalli added a comment -

          I looked at the latest 20 patch. Looks good except that the changes in ensureState() are never reachable.

          Amareshwari Sriramadasu added a comment -

          The patch looks good to me also.

          Arun C Murthy made changes -
          Release Note Lazily construct a connection to the JobTracker from the job-submission client.
          Dick King made changes -
          Assignee Dick King [ dking ]
          Amareshwari Sriramadasu made changes -
          Link This issue blocks MAPREDUCE-118 [ MAPREDUCE-118 ]
          Amareshwari Sriramadasu made changes -
          Link This issue is depended upon by MAPREDUCE-118 [ MAPREDUCE-118 ]
          Dick King added a comment -

          Delays making a connection to the job tracker node until it's needed.

          Provides a new API so a user can tell whether this has been done, for a given job [although usually there would be no need to know].
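
A query of this kind can be sketched as follows. This is an illustration under stated assumptions: the thread does not name the new API, so TrackedJob, isConnected, and submit are hypothetical names, and a plain Object stands in for the real connection.

```java
// Demo entry point. All names below are hypothetical; the patch's real API is not shown in this thread.
class ConnectionStateDemo {
    public static void main(String[] args) {
        TrackedJob job = new TrackedJob();
        System.out.println("connected before submit: " + job.isConnected());
        job.submit();
        System.out.println("connected after submit: " + job.isConnected());
    }
}

// Job-like object that connects on demand and can report whether it has done so.
class TrackedJob {
    private Object connection; // stand-in for an RPC proxy; null means "not connected yet"

    // Reports the current state without triggering a connection.
    synchronized boolean isConnected() {
        return connection != null;
    }

    // Establishes the connection once, on first need.
    private synchronized void ensureConnected() {
        if (connection == null) {
            connection = new Object();
        }
    }

    // Any operation that needs the remote side connects first.
    String submit() {
        ensureConnected();
        return "submitted";
    }
}
```

The point of the design is that the query itself is side-effect free: asking whether the job is connected must not cause a connection.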

          Dick King made changes -
          Attachment mapreduce-1505--2010-05-19.patch [ 12444965 ]
          Dick King made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12444965/mapreduce-1505--2010-05-19.patch
          against trunk revision 944427.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/193/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/193/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/193/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/193/console

          This message is automatically generated.

          Arun C Murthy added a comment -

          I think a simpler fix is to have o.a.h.mapreduce.Job create o.a.h.mapreduce.Cluster only when needed...

          Arun C Murthy made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Dick King added a comment -

          All of the o.a.h.mapreduce.Job constructors that don't require the caller to have already created and supplied a Cluster are deprecated.

          Amareshwari Sriramadasu added a comment -

          "All of the o.a.h.mapreduce.Job constructors that don't require the caller to have already created and supplied a Cluster are deprecated."

          Dick, I did not understand your comment above. Job constructors are deprecated in favor of static getInstance methods wrt comment1 and comment2.

          If the user is passing a Cluster handle, it is fine to initialize it in the constructor. So, the current constructors and getInstance methods look fine. Only if the user does not pass a Cluster handle do we need to create it lazily.

          We can add the following method in Job.java, which creates the Cluster lazily:

          public static getInstance(Configuration conf)
          

          Also, we will have to change the deprecated constructors to create the Cluster handle lazily.

          Thoughts?

          Amareshwari Sriramadasu added a comment -

          public static getInstance(Configuration conf)

          This should be

          public static Job getInstance(Configuration conf);
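
The proposal in these two comments can be sketched as a pair of static factories: one that takes only a configuration and defers cluster creation, and one that accepts a caller-supplied handle. This is a sketch under assumptions, not the committed code: SketchJob and SketchCluster are hypothetical stand-ins, and a Map takes the place of Hadoop's Configuration.

```java
import java.util.HashMap;
import java.util.Map;

// Demo entry point. SketchJob and SketchCluster are hypothetical, not Hadoop classes.
class JobFactoryDemo {
    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        SketchJob job = SketchJob.getInstance(conf);

        // Configuration-only use: no cluster handle is created.
        job.getConfiguration().put("input.path", "/data/in");
        System.out.println("clusters created = " + SketchCluster.created);

        // First real need for the cluster creates it; the second call reuses it.
        job.getCluster();
        job.getCluster();
        System.out.println("clusters created = " + SketchCluster.created);
    }
}

// Stand-in for a cluster handle; counts constructions so laziness is observable.
class SketchCluster {
    static int created = 0;

    SketchCluster(Map<String, String> conf) {
        created++;
    }
}

class SketchJob {
    private final Map<String, String> conf;
    private SketchCluster cluster; // null when the caller did not supply one

    private SketchJob(SketchCluster cluster, Map<String, String> conf) {
        this.cluster = cluster;
        this.conf = conf;
    }

    // No cluster supplied: defer creating one until it is needed.
    static SketchJob getInstance(Map<String, String> conf) {
        return new SketchJob(null, conf);
    }

    // Cluster supplied by the caller: use it as is.
    static SketchJob getInstance(SketchCluster cluster, Map<String, String> conf) {
        return new SketchJob(cluster, conf);
    }

    Map<String, String> getConfiguration() {
        return conf;
    }

    // Creates the cluster handle on first use only.
    synchronized SketchCluster getCluster() {
        if (cluster == null) {
            cluster = new SketchCluster(conf);
        }
        return cluster;
    }
}
```

Routing both cases through one private constructor keeps the lazy path and the caller-supplied path from diverging.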
          
          Dick King added a comment -

          Restructured the patch to lazily create a Cluster, and replaced MR's uses of the deprecated constructors in Job with the new factory methods.

          Dick King made changes -
          Attachment mapreduce-1505--2010-05-26.patch [ 12445678 ]
          Dick King added a comment -

          The patch is substantially longer than its predecessors. The core of the patch is simpler, but this patch also converts the direct callers of Job constructors to call the corresponding factory methods.

          Dick King made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12445678/mapreduce-1505--2010-05-26.patch
          against trunk revision 947758.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 90 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/208/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/208/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/208/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/208/console

          This message is automatically generated.

          Chris Douglas added a comment -

          +1

          I committed this. Thanks, Dick!

          Chris Douglas made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Hadoop Flags [Reviewed]
          Resolution Fixed [ 1 ]
          Konstantin Shvachko made changes -
          Status Resolved [ 5 ] Closed [ 6 ]

            People

            • Assignee:
              Dick King
              Reporter:
              Devaraj Das
            • Votes:
              0
              Watchers:
              8