Hadoop Map/Reduce
MAPREDUCE-5342

Misleadingly named methods should be deprecated and replaced by new methods whose names match their purpose

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.1.0-beta, 2.0.5-alpha
    • Fix Version/s: None
    • Component/s: jobtracker
    • Labels:
      None
    • Environment:

      Does not matter

      Description

      The ClusterStatus class has the following methods that should be deprecated, with correctly named replacements added. For example, getMapTasks does not return map tasks; it returns the number of map tasks.

      • getBlacklistedTrackers -> getNumBlacklistedTrackers
      • getMapTasks -> getNumMapTasks
      • getReduceTasks -> getNumReduceTasks
      • getTaskTrackers -> getNumTaskTrackers
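
      The deprecate-and-delegate approach proposed above could look roughly like the following. This is only a sketch: ClusterStatusSketch is a hypothetical stand-in, not the real Hadoop ClusterStatus class, and its constructor and fields are assumptions made for illustration. The old method stays in place and forwards to the new one, so existing callers keep working while seeing a deprecation warning.

```java
// Hypothetical stand-in for ClusterStatus; NOT the real Hadoop class.
// The constructor and fields are assumptions made for this sketch.
final class ClusterStatusSketch {
    private final int numMapTasks;
    private final int numReduceTasks;

    ClusterStatusSketch(int numMapTasks, int numReduceTasks) {
        this.numMapTasks = numMapTasks;
        this.numReduceTasks = numReduceTasks;
    }

    /** Returns the number of map tasks (new, purpose-matching name). */
    public int getNumMapTasks() {
        return numMapTasks;
    }

    /** Returns the number of reduce tasks (new, purpose-matching name). */
    public int getNumReduceTasks() {
        return numReduceTasks;
    }

    /** @deprecated Use {@link #getNumMapTasks()} instead. */
    @Deprecated
    public int getMapTasks() {
        // Old name is kept and simply delegates, so behavior is identical.
        return getNumMapTasks();
    }

    /** @deprecated Use {@link #getNumReduceTasks()} instead. */
    @Deprecated
    public int getReduceTasks() {
        return getNumReduceTasks();
    }

    public static void main(String[] args) {
        ClusterStatusSketch status = new ClusterStatusSketch(12, 4);
        // Old and new names report the same counts.
        System.out.println(status.getMapTasks() == status.getNumMapTasks());
        System.out.println(status.getReduceTasks() == status.getNumReduceTasks());
    }
}
```

      Because the deprecated methods only forward to the new ones, code compiled against the old names would continue to run unchanged.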

      Cluster class needs the following change:
      There is a ClusterStatus class. When getClusterStatus is called, one would expect ClusterStatus to be returned. Instead, one gets ClusterMetrics.

      • getClusterStatus -> getClusterMetrics

      The Job class has the following methods that should be deprecated, with new methods added whose names match their purpose.

      The name mapProgress suggests that the method maps progress, which is misleading: the method actually reports the progress of the map tasks. It should be deprecated and replaced by a method whose name matches its purpose, such as getMapTasksProgress or getMapProgress.

      • mapProgress -> getMapProgress
      • reduceProgress -> getReduceProgress
      • cleanupProgress -> getCleanupProgress
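
      One way to confirm that such a rename keeps the old entry points available is to check, via reflection, which methods carry the @Deprecated annotation. The sketch below assumes a hypothetical JobSketch class (not the real Hadoop Job class); representing progress as a float and the helper deprecatedMethodNames are assumptions of the sketch.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for Job; NOT the real Hadoop class.
final class JobSketch {
    private final float map;
    private final float reduce;
    private final float cleanup;

    JobSketch(float map, float reduce, float cleanup) {
        this.map = map;
        this.reduce = reduce;
        this.cleanup = cleanup;
    }

    // New names that state what is returned.
    float getMapProgress() { return map; }
    float getReduceProgress() { return reduce; }
    float getCleanupProgress() { return cleanup; }

    // Old names kept as deprecated delegates.
    /** @deprecated Use {@link #getMapProgress()} instead. */
    @Deprecated float mapProgress() { return getMapProgress(); }

    /** @deprecated Use {@link #getReduceProgress()} instead. */
    @Deprecated float reduceProgress() { return getReduceProgress(); }

    /** @deprecated Use {@link #getCleanupProgress()} instead. */
    @Deprecated float cleanupProgress() { return getCleanupProgress(); }

    // Hypothetical helper: sorted names of methods still present but deprecated.
    static List<String> deprecatedMethodNames() {
        List<String> names = new ArrayList<>();
        for (Method m : JobSketch.class.getDeclaredMethods()) {
            if (m.isAnnotationPresent(Deprecated.class)) {
                names.add(m.getName());
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        System.out.println(deprecatedMethodNames());
    }
}
```

      The check works because @Deprecated is retained at runtime, so a build could in principle assert that every old name still exists alongside its replacement.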

      JobStatus:

      • getQueue -> getQueueName

      JobClient:

      • getAllJobs -> getJobStatuses

          Activity

          Karthik Kambatla (Inactive) added a comment -

          While it sounds nice to rename methods to better reflect their behavior, doing so would break several user apps that rely on them. Having multiple methods with different names for the same thing might be overkill. So it might be better to just leave them alone.

          Pranay Varma added a comment -

          I think the goal should be to refactor and correct mistakes as time goes on. While it is a noble goal not to break existing apps, it is also a noble goal to make the API easier and more intuitive for those who are going to use it afresh. Java itself has moved towards such a goal by deprecating methods that needed to be enhanced. Hadoop has deprecated quite a few methods as well. All Job constructors are deprecated.


            People

            • Assignee: Unassigned
            • Reporter: Pranay Varma
            • Votes: 0
            • Watchers: 3
