  1. Hadoop Map/Reduce
  2. MAPREDUCE-5348

There are inconsistencies and duplication in the design of Cluster and other classes.


    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.1.0-beta, 2.0.5-alpha
    • Fix Version/s: None
    • Component/s: None
    • Labels:


      Cluster seems to keep TaskTracker information as a TaskTrackerInfo[].
      getTaskTrackers returns TaskTrackerInfo[]. Ideally, I would like it either to return TaskTracker[] or to be called getTaskTrackerInfos.

      A user would expect JobTracker information as well. For consistency, I would expect a getJobTracker method that returns JobTrackerInfo or JobTracker. No such method is available; instead I see getJobTrackerStatus, which returns JobTrackerStatus.
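
      The naming rule asked for above (an accessor is named after the type it returns, and the JobTracker accessor mirrors the TaskTracker one) can be sketched as follows. This is not Hadoop code: ClusterView, TrackerInfo, and InMemoryClusterView are hypothetical names invented for this illustration, and only the method name getTaskTrackerInfos comes from the suggestion above.

```java
import java.util.Arrays;

// Hypothetical stand-in for an "Info" value type; not a Hadoop class.
final class TrackerInfo {
    private final String name;

    TrackerInfo(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }
}

// Hypothetical interface: each accessor is named after the type it returns.
interface ClusterView {
    // Returns Info objects, so the method name ends in "Infos".
    TrackerInfo[] getTaskTrackerInfos();

    // The JobTracker accessor follows the same pattern as the TaskTracker one.
    TrackerInfo getJobTrackerInfo();
}

// Minimal in-memory implementation, just enough to exercise the interface.
final class InMemoryClusterView implements ClusterView {
    private final TrackerInfo jobTracker;
    private final TrackerInfo[] taskTrackers;

    InMemoryClusterView(TrackerInfo jobTracker, TrackerInfo... taskTrackers) {
        this.jobTracker = jobTracker;
        // Defensive copy so later changes to the caller's array are not visible here.
        this.taskTrackers = Arrays.copyOf(taskTrackers, taskTrackers.length);
    }

    @Override
    public TrackerInfo[] getTaskTrackerInfos() {
        // Return a copy so callers cannot modify the internal array.
        return Arrays.copyOf(taskTrackers, taskTrackers.length);
    }

    @Override
    public TrackerInfo getJobTrackerInfo() {
        return jobTracker;
    }
}
```

      With this shape, a reader can tell from the method name alone whether a tracker object or a descriptive Info object comes back.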

      Although Cluster has TaskTracker and JobTracker, it uses QueueInfo instead of JobQueueInfo, which leaves one wondering what other kinds of queues there are. There is no documentation.

      It is not clear why the scheduling info is a totally unstructured string. Perhaps some of the information should be structured, with the class keeping a String member for the unstructured remainder, for extensibility.
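
      A minimal sketch of that idea, assuming a value class with a few typed fields plus one free-form string, might look like the following. The class name StructuredSchedulingInfo and the fields runningJobs, waitingJobs, and extraInfo are all hypothetical, chosen only for illustration; the report does not say which pieces of scheduling information should become structured.

```java
import java.util.Objects;

// Hypothetical value class; the field names are illustrative assumptions,
// not fields of any existing Hadoop class.
final class StructuredSchedulingInfo {
    private final int runningJobs;   // assumed structured field
    private final int waitingJobs;   // assumed structured field
    private final String extraInfo;  // free-form text kept for extensibility

    StructuredSchedulingInfo(int runningJobs, int waitingJobs, String extraInfo) {
        if (runningJobs < 0 || waitingJobs < 0) {
            throw new IllegalArgumentException("job counts must be non-negative");
        }
        this.runningJobs = runningJobs;
        this.waitingJobs = waitingJobs;
        this.extraInfo = Objects.requireNonNull(extraInfo, "extraInfo");
    }

    int getRunningJobs() {
        return runningJobs;
    }

    int getWaitingJobs() {
        return waitingJobs;
    }

    // Scheduler-specific details that have no typed field yet.
    String getExtraInfo() {
        return extraInfo;
    }

    // A single string form remains available for callers that only display it.
    @Override
    public String toString() {
        return "running=" + runningJobs + ", waiting=" + waitingJobs + ", extra=" + extraInfo;
    }
}
```

      Callers that need a number read the typed getter, while anything a particular scheduler wants to add still fits in the free-form member.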

      JobStatus has information that is duplicated by calls in Job. This will lead to a new wrapping function in Job every time a method is added to JobStatus. I think it should be clearly decided what belongs in Job and what belongs in JobStatus. Users can then get from JobStatus the information that is unique to JobStatus.
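
      One possible split, sketched under assumptions, is for the job object to own identity and hand out its status object whole instead of wrapping each status accessor. SketchJob and SketchJobStatus below are hypothetical classes written for this illustration; they are not Hadoop's Job and JobStatus, and their fields are assumed.

```java
// Hypothetical immutable snapshot of run-time state; not Hadoop's JobStatus.
final class SketchJobStatus {
    private final float mapProgress;
    private final float reduceProgress;

    SketchJobStatus(float mapProgress, float reduceProgress) {
        this.mapProgress = mapProgress;
        this.reduceProgress = reduceProgress;
    }

    float getMapProgress() {
        return mapProgress;
    }

    float getReduceProgress() {
        return reduceProgress;
    }
}

// Hypothetical job handle: it owns identity and exposes status as one object,
// so a new accessor on the status class needs no matching wrapper here.
final class SketchJob {
    private final String jobId;
    private SketchJobStatus status;

    SketchJob(String jobId, SketchJobStatus initialStatus) {
        this.jobId = jobId;
        this.status = initialStatus;
    }

    String getJobId() {
        return jobId;
    }

    // The only status-related accessor on the job itself.
    SketchJobStatus getStatus() {
        return status;
    }

    // Replaces the snapshot when fresh state arrives.
    void updateStatus(SketchJobStatus newStatus) {
        this.status = newStatus;
    }
}
```

      A caller would then write job.getStatus().getMapProgress() rather than relying on a duplicate accessor on the job.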

      In some cases the job history is obtained through getHistoryFile while, in other cases, it is obtained through getHistoryUrl. The naming should be consistent.




            • Assignee:
              varma.pranay Pranay Varma
            • Votes:
              0
            • Watchers:
              2


              • Created: