Hadoop Common
  1. Hadoop Common
  2. HADOOP-3930

Decide how to integrate scheduler info into CLI and job tracker web page

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.19.0
    • Fix Version/s: 0.19.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Changed TaskScheduler to expose API for Web UI and Command Line Tool.

      Description

      We need a way for job schedulers such as HADOOP-3445 and HADOOP-3476 to provide info to display on the JobTracker web interface and in the CLI. The main things needed seem to be:

      • A way for schedulers to provide info to show in a column on the web UI and in the CLI - something as simple as a single string, or a map<string, int> for multiple parameters.
      • Some sorting order for jobs - maybe a method to sort a list of jobs.

      Let's figure out what the best way to do this is and implement it in the existing schedulers.

      My first-order proposal at an API: Augment the TaskScheduler with

      • public Map<String, String> getSchedulingInfo(JobInProgress job) – returns key-value pairs which are displayed in columns on the web UI or the CLI for the list of jobs.
      • public Map<String, String> getSchedulingInfo(String queue) – returns key-value pairs which are displayed in columns on the web UI or the CLI for the list of queues.
      • public Collection<JobInProgress> getJobs(String queueName) – returns the list of jobs in a given queue, sorted by a scheduler-specific order (the order it wants to run them in / schedule the next task in / etc).
      • public List<String> getQueues();
      1. HADOOP-3930-11.patch
        61 kB
        Sreekanth Ramakrishnan
      2. HADOOP-3930-10.patch
        60 kB
        Sreekanth Ramakrishnan
      3. HADOOP-3930-9.patch
        63 kB
        Sreekanth Ramakrishnan
      4. HADOOP-3930-8.patch
        63 kB
        Sreekanth Ramakrishnan
      5. HADOOP-3930-7.patch
        57 kB
        Sreekanth Ramakrishnan
      6. HADOOP-3930-6.patch
        48 kB
        Sreekanth Ramakrishnan
      7. HADOOP-3930-5.patch
        48 kB
        Sreekanth Ramakrishnan
      8. HADOOP-3930-4.patch
        46 kB
        Sreekanth Ramakrishnan
      9. HADOOP-3930-3.patch
        46 kB
        Sreekanth Ramakrishnan
      10. HADOOP-3930-2.patch
        38 kB
        Sreekanth Ramakrishnan
      11. mockup.JPG
        175 kB
        Sreekanth Ramakrishnan
      12. 3930-1.patch
        12 kB
        Sreekanth Ramakrishnan

        Issue Links

          Activity

          Hide
          Vivek Ratan added a comment -

          I think we need to first decide whether queues are explicit in this API or not. The problem with making queues explicit in the API is that every scheduler will have to support one, or at least a default one. But that's not so bad, IMO.

          getSchedulingInfo() should really return key-value pairs for queues, not for jobs. In the HADOOP-3445 scheduler, for example, we need to display scheduling information associated with a queue - its capacity (both 'guaranteed' and 'allocated'), how many unique users have submitted jobs, how many tasks are running, how many are waiting. etc. This information is per queue, and doesn't make sense per job. I'd much rather have getSchedulingInfo() take in a queue name as a parameter, if we make queues explicit. In fact, I don't see what kind of scheduling information you'd associate with a job. Matei, do you have examples of what getSchedulingInfo would return for jobs?

          Similarly, getJobComparator() makes more sense when applied to a queue. In 3445, jobs are ordered per queue, and there is no global ordering. Furthermore, doesn't it make more sense to get a sorted collection of jobs, per queue, back from the scheduler, rather than a Comparator? Or are you imagining the UI and CLI to maintain a list of jobs all the time and then apply the comparator periodically?

          Show
          Vivek Ratan added a comment - I think we need to first decide whether queues are explicit in this API or not. The problem with making queues explicit in the API is that every scheduler will have to support one, or at least a default one. But that's not so bad, IMO. getSchedulingInfo() should really return key-value pairs for queues, not for jobs. In the HADOOP-3445 scheduler, for example, we need to display scheduling information associated with a queue - its capacity (both 'guaranteed' and 'allocated'), how many unique users have submitted jobs, how many tasks are running, how many are waiting. etc. This information is per queue, and doesn't make sense per job. I'd much rather have getSchedulingInfo() take in a queue name as a parameter, if we make queues explicit. In fact, I don't see what kind of scheduling information you'd associate with a job. Matei, do you have examples of what getSchedulingInfo would return for jobs? Similarly, getJobComparator() makes more sense when applied to a queue. In 3445, jobs are ordered per queue, and there is no global ordering. Furthermore, doesn't it make more sense to get a sorted collection of jobs, per queue, back from the scheduler, rather than a Comparator? Or are you imagining the UI and CLI to maintain a list of jobs all the time and then apply the comparator periodically?
          Hide
          Matei Zaharia added a comment -

          Making queues explicit makes sense for the purposes of getSchedulingInfo then. As for what it should do when applied to a job, in the fair scheduler at least we can have it show the job's fair share of map slots / reduce slots and its weight in the fair sharing calculations. This was useful both for debugging and for letting administrators understand the effects of putting jobs in a particular pool, changing their priority, etc.

          Regarding the comparator, I made it that because Owen/Sameer/Arun wanted to also be able to compare a subset of the jobs, for example to be able to filter jobs by user or something of that sort. With a comparator, you choose your subset as you wish and then sort it. (In all this I'm assuming that the JobTracker or JobQueueManager knows the full list of jobs and can therefore filter it.) However, it would also be possible to return the whole job list and filter it afterwards - which one is easier?

          Show
          Matei Zaharia added a comment - Making queues explicit makes sense for the purposes of getSchedulingInfo then. As for what it should do when applied to a job, in the fair scheduler at least we can have it show the job's fair share of map slots / reduce slots and its weight in the fair sharing calculations. This was useful both for debugging and for letting administrators understand the effects of putting jobs in a particular pool, changing their priority, etc. Regarding the comparator, I made it that because Owen/Sameer/Arun wanted to also be able to compare a subset of the jobs, for example to be able to filter jobs by user or something of that sort. With a comparator, you choose your subset as you wish and then sort it. (In all this I'm assuming that the JobTracker or JobQueueManager knows the full list of jobs and can therefore filter it.) However, it would also be possible to return the whole job list and filter it afterwards - which one is easier?
          Hide
          Matei Zaharia added a comment -

          Added getSchedulingInfo(queue).

          Show
          Matei Zaharia added a comment - Added getSchedulingInfo(queue).
          Hide
          Hemanth Yamijala added a comment -

          We probably also need a way to get all the configured queues. Something like:

          • public List<String> getQueues();

          By default, this would return the single default queue that is there today in the jobtracker. Makes sense ?

          Show
          Hemanth Yamijala added a comment - We probably also need a way to get all the configured queues. Something like: public List<String> getQueues(); By default, this would return the single default queue that is there today in the jobtracker. Makes sense ?
          Hide
          Matei Zaharia added a comment -

          Okay, I added that too.

          Show
          Matei Zaharia added a comment - Okay, I added that too.
          Hide
          Vivek Ratan added a comment -

          Regarding the comparator, I made it that ...it would also be possible to return the whole job list and filter it afterwards - which one is easier?

          I don't think a Comparator is the right abstraction here. There is a difference between filtering and reordering. A Comparator is probably needed for the latter, but not for filtering. The Scheduler imposes an ordering on the jobs. A caller may choose to see (filter) only some of those jobs, but the ordering is determined by the Scheduler. I think you need a method like:

          Collection<JobInProgress> getJobs(String queueName)
          

          Users can filter this collection as they seem fit.

          Show
          Vivek Ratan added a comment - Regarding the comparator, I made it that ...it would also be possible to return the whole job list and filter it afterwards - which one is easier? I don't think a Comparator is the right abstraction here. There is a difference between filtering and reordering. A Comparator is probably needed for the latter, but not for filtering. The Scheduler imposes an ordering on the jobs. A caller may choose to see (filter) only some of those jobs, but the ordering is determined by the Scheduler. I think you need a method like: Collection<JobInProgress> getJobs( String queueName) Users can filter this collection as they seem fit.
          Hide
          Matei Zaharia added a comment -

          Makes sense, I've changed it to that.

          Show
          Matei Zaharia added a comment - Makes sense, I've changed it to that.
          Hide
          Sreekanth Ramakrishnan added a comment - - edited

          Attaching patch with adding API's in the TaskScheduler to expose scheduling information related to it.

          Added following methods to TaskScheduler:

          Map<String,String> getSchedulingInfo(JobInProgress job) - Returns map containing scheduling information related to a particular job
          Map<String,String> getSchedulingInfo(String queueName ) - Returns map containing scheduling information related to a particular queue.
          Collection<JobInProgress> getJobs(String queue) - Returns a list of jobs for particular queue
          List<String> getQueues() - Returns all the queues which scheduler uses.

          List<String> getQueueSchedulingParameterList() - Returns ordered List of the scheduling parameters related to queues.
          List<String> getJobSchedulingParameterList() - Returns ordered List of the scheduling parameters related to a particular Job

          The above two methods were introduced, to determine the the order in which the columns in a table have to be generated by the web UI.

          A new method was introduced in JobTracker:
          TaskScheduler getTaskScheduler() - Returns the instance of task scheduler which is used by JobTracker.

          JobQueueTaskScheduler and LimitTasksPerJobTaskScheduler have been modified to implement the new API's to expose scheduling information.

          Have made changes in the jobtracker.jsp to do the following:

          Create a new section called Scheduler information and build a table dynamically for displaying the scheduler information pertaining to queues which scheduler holds. The order of the column is determined by value returned from getQueueSchedulingParameterList().
          Created sections in the Job Table generation for displaying scheduling information pertaining to the particular job. The order of the column is determined by value returned from getJobSchedulingParameterList ().

          If a particular scheduler returns null for getQueueSchedulingParameterList, then the new section called Scheduler information is not displayed in the jobtracker.jsp
          If a particular scheduler returns null for the getSchedulingInfo(JobInProgress job) then no new section is added on to the Job Table.

          Any thoughts on improving the above approach

          Show
          Sreekanth Ramakrishnan added a comment - - edited Attaching patch with adding API's in the TaskScheduler to expose scheduling information related to it. Added following methods to TaskScheduler: Map<String,String> getSchedulingInfo(JobInProgress job) - Returns map containing scheduling information related to a particular job Map<String,String> getSchedulingInfo(String queueName ) - Returns map containing scheduling information related to a particular queue. Collection<JobInProgress> getJobs(String queue) - Returns a list of jobs for particular queue List<String> getQueues() - Returns all the queues which scheduler uses. List<String> getQueueSchedulingParameterList() - Returns ordered List of the scheduling parameters related to queues. List<String> getJobSchedulingParameterList() - Returns ordered List of the scheduling parameters related to a particular Job The above two methods were introduced, to determine the the order in which the columns in a table have to be generated by the web UI. A new method was introduced in JobTracker: TaskScheduler getTaskScheduler() - Returns the instance of task scheduler which is used by JobTracker. JobQueueTaskScheduler and LimitTasksPerJobTaskScheduler have been modified to implement the new API's to expose scheduling information. Have made changes in the jobtracker.jsp to do the following: Create a new section called Scheduler information and build a table dynamically for displaying the scheduler information pertaining to queues which scheduler holds. The order of the column is determined by value returned from getQueueSchedulingParameterList(). Created sections in the Job Table generation for displaying scheduling information pertaining to the particular job. The order of the column is determined by value returned from getJobSchedulingParameterList (). If a particular scheduler returns null for getQueueSchedulingParameterList, then the new section called Scheduler information is not displayed in the jobtracker.jsp If a particular scheduler returns null for the getSchedulingInfo(JobInProgress job) then no new section is added on to the Job Table. Any thoughts on improving the above approach
          Hide
          Hemanth Yamijala added a comment -

          Sreekanth selected Submit Patch by mistake instead of attaching the patch. Canceling it on his behalf, as he's not added to the list of contributors yet.

          Show
          Hemanth Yamijala added a comment - Sreekanth selected Submit Patch by mistake instead of attaching the patch. Canceling it on his behalf, as he's not added to the list of contributors yet.
          Hide
          Sreekanth Ramakrishnan added a comment -

          UI Mockup with default JobQueueTaskScheduler

          Show
          Sreekanth Ramakrishnan added a comment - UI Mockup with default JobQueueTaskScheduler
          Hide
          dhruba borthakur added a comment -

          The jobs-page is actually useful to users who have submitted jobs. The scheduler information is typically important for cluster administrators. Does it make sense to make the scheduler information show up as a separate page rather than the same page that lists all user's jobs? Of course, power users might want to see scheduling information sometime.

          Show
          dhruba borthakur added a comment - The jobs-page is actually useful to users who have submitted jobs. The scheduler information is typically important for cluster administrators. Does it make sense to make the scheduler information show up as a separate page rather than the same page that lists all user's jobs? Of course, power users might want to see scheduling information sometime.
          Hide
          Hemanth Yamijala added a comment -

          Regarding where the scheduler information should be - there are two types of scheduler information:

          • Job specific - which should be with the jobs
          • System wide - which can be either with the jobs or on a different page, as Dhruba points. I am fine with either, but I am leaning more towards having it on the jobs page because:
            • Then all scheduler information is on one page
            • As Dhruba agrees, power users might still want to see scheduling information.

          Also, as we are discussing in HADOOP-3698, it may be that queues being acknowledged as part of the mapred system, might not be in the scheduler API. I think we should wait for a consensus on that before moving forward on this issue.

          Show
          Hemanth Yamijala added a comment - Regarding where the scheduler information should be - there are two types of scheduler information: Job specific - which should be with the jobs System wide - which can be either with the jobs or on a different page, as Dhruba points. I am fine with either, but I am leaning more towards having it on the jobs page because: Then all scheduler information is on one page As Dhruba agrees, power users might still want to see scheduling information. Also, as we are discussing in HADOOP-3698 , it may be that queues being acknowledged as part of the mapred system, might not be in the scheduler API. I think we should wait for a consensus on that before moving forward on this issue.
          Hide
          Hemanth Yamijala added a comment -

          Also, as we are discussing in HADOOP-3698, it may be that queues being acknowledged as part of the mapred system, might not be in the scheduler API. I think we should wait for a consensus on that before moving forward on this issue

          I must clarify that I was talking only about the List<String> getQueues() API, and not the rest of the methods proposed here. Sorry for any confusion caused.

          As mentioned here, we are now considering adding this to a new class called QueueManager.

          Show
          Hemanth Yamijala added a comment - Also, as we are discussing in HADOOP-3698 , it may be that queues being acknowledged as part of the mapred system, might not be in the scheduler API. I think we should wait for a consensus on that before moving forward on this issue I must clarify that I was talking only about the List<String> getQueues() API, and not the rest of the methods proposed here. Sorry for any confusion caused. As mentioned here , we are now considering adding this to a new class called QueueManager .
          Hide
          Vinod Kumar Vavilapalli added a comment -

          Some comments about the patch ( independent of where the scheduler information is displayed)

          • If job specific information is seen to be not very common across schedulers, we can give a default implementation for getSchedulingInfo(JobInProgress job) returning null in TaskScheduler.
          • Every scheduler may not be concerned about how clients of scheduling information ordered it. Either getQueueSchedulingParameterList() and getJobSchedulingParameterList() can have default implementations to return the keys of the corresponding SchedulingInfo maps, or we can altogether remove these methods and treat scheduler information similar to the way we treat job-list, let the scheduler give out information in the order that it imposes.
          Show
          Vinod Kumar Vavilapalli added a comment - Some comments about the patch ( independent of where the scheduler information is displayed) If job specific information is seen to be not very common across schedulers, we can give a default implementation for getSchedulingInfo(JobInProgress job) returning null in TaskScheduler. Every scheduler may not be concerned about how clients of scheduling information ordered it. Either getQueueSchedulingParameterList() and getJobSchedulingParameterList() can have default implementations to return the keys of the corresponding SchedulingInfo maps, or we can altogether remove these methods and treat scheduler information similar to the way we treat job-list, let the scheduler give out information in the order that it imposes.
          Hide
          Owen O'Malley added a comment -

          After thinking about this for a bit, I think a more natural interface for getSchedulingInfo would be:

          class JobInProgress {
            ...
            Object getSchedulerInfo() { ... }
            void setSchedulerInfo(Object info) {...}
          

          The scheduler can then add its own information directly into the JobInProgress. Clearly each scheduler would have its own type for scheduler info. The framework would use the scheduler info's toString() method to generate the string for the user. Thoughts?

          Show
          Owen O'Malley added a comment - After thinking about this for a bit, I think a more natural interface for getSchedulingInfo would be: class JobInProgress { ... Object getSchedulerInfo() { ... } void setSchedulerInfo( Object info) {...} The scheduler can then add its own information directly into the JobInProgress. Clearly each scheduler would have its own type for scheduler info. The framework would use the scheduler info's toString() method to generate the string for the user. Thoughts?
          Hide
          Owen O'Malley added a comment -

          The framework would also need to handle null values in the scheduler info. Also note that this will replace the map in all of the schedulers that looks like:

          Map<JobInProgress, JobInfo> infos = ...
          
          Show
          Owen O'Malley added a comment - The framework would also need to handle null values in the scheduler info. Also note that this will replace the map in all of the schedulers that looks like: Map<JobInProgress, JobInfo> infos = ...
          Hide
          Matei Zaharia added a comment -

          That's a good idea, and should make the schedulers more efficient as well.

          Show
          Matei Zaharia added a comment - That's a good idea, and should make the schedulers more efficient as well.
          Hide
          Hemanth Yamijala added a comment -

          A couple of points if we are moving towards Owen's proposal:

          • In Sreekanth's comment above, he's mentioned that the attached patch adds one column for each entry in the Map returned by get...SchedulingInfo() APIs. The other option, of course, is to display all scheduling info in a single column.
            • The advantage of the multi-column approach is purely usability and aesthetics (schedulers which have per queue scheduling info will show only a name, and the scheduling info as a string, which will look quite odd). Also, it will allow changes to the UI easier, IMHO.
            • The advantage of the single column approach is simplicity for the current implementation.
          • I personally prefer multi-column, but willing to go with consensus.
          • If we go with the multi-column approach though, building scheduling information out of a toString API becomes harder.
          • If we do go with Owen's approach, I think we might also need:
            class TaskScheduler {
              Object getSchedulerInfo(String queueName);
            }
            

            to handle cases where the scheduler has a per queue specific info.

          Please try to vote on your preferred UI approach, if any, so we can move this forward.

          Show
          Hemanth Yamijala added a comment - A couple of points if we are moving towards Owen's proposal: In Sreekanth's comment above, he's mentioned that the attached patch adds one column for each entry in the Map returned by get...SchedulingInfo() APIs. The other option, of course, is to display all scheduling info in a single column. The advantage of the multi-column approach is purely usability and aesthetics (schedulers which have per queue scheduling info will show only a name, and the scheduling info as a string, which will look quite odd). Also, it will allow changes to the UI easier, IMHO. The advantage of the single column approach is simplicity for the current implementation. I personally prefer multi-column, but willing to go with consensus. If we go with the multi-column approach though, building scheduling information out of a toString API becomes harder. If we do go with Owen's approach, I think we might also need: class TaskScheduler { Object getSchedulerInfo( String queueName); } to handle cases where the scheduler has a per queue specific info. Please try to vote on your preferred UI approach, if any, so we can move this forward.
          Hide
          Hemanth Yamijala added a comment -

          I had an offline conversation with Owen and we came to the following proposal:

          • While the usability of the UI is enhanced by displaying a one-column per queue / job scheduling attribute, in the interest of simplicity, we are proposing to display the information as a single string in a single column.
          • This information would be available via a toString() API on the SchedulerInfo object proposed by Owen above.
            • One of the most important reasons to do it this way is to keep in mind that this information needs to be consumed by the CLI too, which should be transferred on wire. Passing something like a map is going to be tricky for the framework.
            • Also, as seen from discussions above, requiring additional APIs to determine the column order etc become unnecessary if we assume the scheduler will take care of formatting the string in the scheduler info as it pleases. This makes the API simpler.
          • Regarding getting the scheduler info per queue, Owen proposed adding this to the QueueManager class being discussed in HADOOP-3698. Something like:
            class QueueManager {
              public void setSchedulingInfo(String queue, Object queueInfo);
              public Object getSchedulingInfo(String queue);
            }
            
          Show
          Hemanth Yamijala added a comment - I had an offline conversation with Owen and we came to the following proposal: While the usability of the UI is enhanced by displaying a one-column per queue / job scheduling attribute, in the interest of simplicity, we are proposing to display the information as a single string in a single column. This information would be available via a toString() API on the SchedulerInfo object proposed by Owen above. One of the most important reasons to do it this way is to keep in mind that this information needs to be consumed by the CLI too, which should be transferred on wire. Passing something like a map is going to be tricky for the framework. Also, as seen from discussions above, requiring additional APIs to determine the column order etc become unnecessary if we assume the scheduler will take care of formatting the string in the scheduler info as it pleases. This makes the API simpler. Regarding getting the scheduler info per queue, Owen proposed adding this to the QueueManager class being discussed in HADOOP-3698 . Something like: class QueueManager { public void setSchedulingInfo( String queue, Object queueInfo); public Object getSchedulingInfo( String queue); }
          Hide
          Hemanth Yamijala added a comment -

          One thing we haven't discussed this far is the changes to the framework to aid the CLI.

          Showing scheduling information related to a job seems easy. We can augment JobStatus to contain a String schedulerInfo.

          For showing the queue related information, one approach could be as follows:

          public class QueueInfo implements Writable {
            String queueName;
            String schedulerInfo;
            ...
          }
          
          public interface JobSubmissionProtocol {
            ...
            QueueInfo[] getQueues();
            QueueInfo getQueueInfo(String queue);
            JobStatus[] getJobs(String queue);  
          }
          

          These APIs are similar to the Job related APIs, like getAllJobs(), getJobStatus(JobID), and getMap/ReduceTaskReports(JobID). Still, I am a little worried adding these to JobSubmissionProtocol since getQueues() and getQueueInfo() don't per-se relate to jobs directly. The alternative though seems to be to define a new protocol that has this info. Open to comments on which is better.

          Show
          Hemanth Yamijala added a comment - One thing we haven't discussed this far is the changes to the framework to aid the CLI. Showing scheduling information related to a job seems easy. We can augment JobStatus to contain a String schedulerInfo . For showing the queue related information, one approach could be as follows: public class QueueInfo implements Writable { String queueName; String schedulerInfo; ... } public interface JobSubmissionProtocol { ... QueueInfo[] getQueues(); QueueInfo getQueueInfo( String queue); JobStatus[] getJobs( String queue); } These APIs are similar to the Job related APIs, like getAllJobs(), getJobStatus(JobID), and getMap/ReduceTaskReports(JobID) . Still, I am a little worried adding these to JobSubmissionProtocol since getQueues() and getQueueInfo() don't per-se relate to jobs directly. The alternative though seems to be to define a new protocol that has this info. Open to comments on which is better.
          Hide
          Owen O'Malley added a comment -

          QueueInfo should have private fields and getters.

          JobSubmissionProtocol is not public and therefore the JobClient needs identical methods.

          Show
          Owen O'Malley added a comment - QueueInfo should have private fields and getters. JobSubmissionProtocol is not public and therefore the JobClient needs identical methods.
          Hide
          Sreekanth Ramakrishnan added a comment -

          Attaching a patch with following changes according to Owens and Hemanth's comments:
          Added following method to JobSubmissionProtocol

          public JobQueueInfo getJobQueueInfo(String queue);
          public JobQueueInfo[] getJobQueueInfos();
          public JobStatus[] getAllJobs(String queue);
          

          Added a new method to TaskScheduler

          public abstract Collection<JobInProgress> getJobs(String queueName);
          

          Added a new class to encapsulate the Scheduling information related to Job Queues :: JobQueueInfo

          Added new jsp page to display queue details and list of jobs held by the queue along with the Queue Scheduling Information: jobqueue_details.jsp

          Refactored Job Table generation into a new class in org.apache.hadoop.mapred.JSPUtil

          Added new command line options in the JobClient.java

          Currently the patch has no test case attached alongwith it. Would be attaching them soon.

          Show
          Sreekanth Ramakrishnan added a comment - Attaching a patch with following changes according to Owens and Hemanth's comments: Added following method to JobSubmissionProtocol public JobQueueInfo getJobQueueInfo( String queue); public JobQueueInfo[] getJobQueueInfos(); public JobStatus[] getAllJobs( String queue); Added a new method to TaskScheduler public abstract Collection<JobInProgress> getJobs( String queueName); Added a new class to encapsulate the Scheduling information related to Job Queues :: JobQueueInfo Added new jsp page to display queue details and list of jobs held by the queue along with the Queue Scheduling Information: jobqueue_details.jsp Refactored Job Table generation into a new class in org.apache.hadoop.mapred.JSPUtil Added new command line options in the JobClient.java Currently the patch has no test case attached alongwith it. Would be attaching them soon.
          Hide
          Sreekanth Ramakrishnan added a comment -

          Attaching a new patch with following changes:

          Made modification to CapacityTaskScheduler to use the new API for the web UI and command line interface

          Added test case which makes use of the new methods introduced in the JobSubmissionProtocol

          Show
          Sreekanth Ramakrishnan added a comment - Attaching a new patch with following changes: Made modification to CapacityTaskScheduler to use the new API for the web UI and command line interface Added test case which makes use of the new methods introduced in the JobSubmissionProtocol
          Hide
          Sreekanth Ramakrishnan added a comment -

          Updated the JobQueueInfo class because of a NullPointerException being thrown for the default scheduler, pointed out by Hemanth.

          Fixed a findbugs warnings in LimitTasksPerJobTaskScheduler regarding incorrect synchronization.

          Show
          Sreekanth Ramakrishnan added a comment - Updated the JobQueueInfo class because of a NullPointerException being thrown for the default scheduler, pointed out by Hemanth. Fixed a findbugs warnings in LimitTasksPerJobTaskScheduler regarding incorrect synchronization.
          Hide
          Hemanth Yamijala added a comment -

          JobTracker:

          • getAllJobs: if the scheduler returns null, it should return an empty JobStatus array.
          • There's code being repeated in getAllJobs(), getAllJobs(String queue) and jobsToComplete. I think it should be factored out so changes to one of the methods (for e.g. to return a new field) need not be duplicated.

          JobQueueInfo:

          • schedulingInfo stored here is a stringified version. I think it should be declared a String and get/set should deal with strings. The caller should basically call with actualObject.toString(). This makes it similar to JobStatus.
          • In JobStatus, we are using Text.readString whereas in JobQueueInfo, we are using readUTF. I think in similar cases elsewhere we use the UTF versions. Similar comments for the write APIs.

          JspUtil:

          • This is including JspHelper which is a class from the NameNode package. I don't think it is a good idea for a MapRed class to depend on this, however I understand this has always been this way. Maybe we should file a new JIRA to fix it.

          JobSubmissionProtocol:

          • Include HADOOP JIRA number in the comment related to version field.

          JobClient:

          • Usage prints: [-queueinfo <job-queue-name> [-showJobs] - this is missing a closing ']'
          • Return code should be set to 0 when the command syntax is found to be correct.
          • Since scheduler information is set to empty, it can never be null. I think in any case, it should print something like:
            Queue Name: default
            Scheduling Information: N/A
            
          • The line "Job List for the queue ::" needs a newline. Also, I think it can just read "Job list:"

          jobqueue_details.jsp:

          • Needs a backlink to the main jobtracker page
          • Needs a link to Hadoop web page - like in other pages.

          jobtracker.jsp:

          • The scheduling info column is not being split into rows. The HTML code generated does look fine. But still it is not showing up. Can you please check ?

          CapacityTaskScheduler:

          • Does not need supportsPriority as a separate field in the SchedulingInfo class. You can pick it up from one of the QueueSchedulingInfo objects.
          • guaranteedCapacity actual must be split between reduce and map slots. Currently, only the value for the map is being displayed.
          • Number of reclaimed resources is an internal variable and does not need to be displayed.
          • Rename getQSI to getQueueSchedulingInfo

          TestJobQueueInformation:

          • I think you can use JobClient, instead of directly dealing with JobSubmissionProtocol and having to duplicate the methods for createRPCProxy etc.
          Show
          Hemanth Yamijala added a comment - JobTracker: getAllJobs: if the scheduler returns null, it should return an empty JobStatus array. There's code being repeated in getAllJobs(), getAllJobs(String queue) and jobsToComplete. I think it should be factored out so changes to one of the methods (for e.g. to return a new field) need not be duplicated. JobQueueInfo: schedulingInfo stored here is a stringified version. I think it should be declared a String and get/set should deal with strings. The caller should basically call with actualObject.toString(). This makes it similar to JobStatus. In JobStatus, we are using Text.readString whereas in JobQueueInfo, we are using readUTF. I think in similar cases elsewhere we use the UTF versions. Similar comments for the write APIs. JspUtil: This is including JspHelper which is a class from the NameNode package. I don't think it is a good idea for a MapRed class to depend on this, however I understand this has always been this way. Maybe we should file a new JIRA to fix it. JobSubmissionProtocol: Include HADOOP JIRA number in the comment related to version field. JobClient: Usage prints: [-queueinfo <job-queue-name> [-showJobs] - this is missing a closing ']' Return code should be set to 0 when the command syntax is found to be correct. Since scheduler information is set to empty, it can never be null. I think in any case, it should print something like: Queue Name: default Scheduling Information: N/A The line "Job List for the queue ::" needs a newline. Also, I think it can just read "Job list:" jobqueue_details.jsp: Needs a backlink to the main jobtracker page Needs a link to Hadoop web page - like in other pages. jobtracker.jsp: The scheduling info column is not being split into rows. The HTML code generated does look fine. But still it is not showing up. Can you please check ? CapacityTaskScheduler: Does not need supportsPriority as a separate field in the SchedulingInfo class. You can pick it up from one of the QueueSchedulingInfo objects. guaranteedCapacity actual must be split between reduce and map slots. Currently, only the value for the map is being displayed. Number of reclaimed resources is an internal variable and does not need to be displayed. Rename getQSI to getQueueSchedulingInfo TestJobQueueInformation: I think you can use JobClient, instead of directly dealing with JobSubmissionProtocol and having to duplicate the methods for createRPCProxy etc.
          Hide
          Sreekanth Ramakrishnan added a comment -

          Made modifications according to comments. I have also mentioned reasons why somethings are left as it is from previous version of the patch.

          JobTracker:

          • There's code being repeated in getAllJobs(), getAllJobs(String queue) and jobsToComplete. I think it should be factored out so changes to one of the methods (for e.g. to return a new field) need not be duplicated.

          Code repetition for converting collection JobInProgress to an array of JobStatus has been removed. Modified getAllJobs and getAllJobs(Queue). Left jobsToComplete as is.

          JobQueueInfo:

          • schedulingInfo stored here is a stringified version. I think it should be declared a String and get/set should deal with strings. The caller should basically call with actualObject.toString(). This makes it similar to JobStatus.

          The reason why we are using an object and passing only String over wire is because we are setting scheduling information only once. Then underlying reference of the scheduling information is updated by the respective TaskScheduler's and we do a toString() while passing over wire. This way we can avoid to constantly update the scheduling information in queue manager. For example check CapacityTaskScheduler.

          JspUtil:

          • This is including JspHelper which is a class from the NameNode package. I don't think it is a good idea for a MapRed class to depend on this, however I understand this has always been this way. Maybe we should file a new JIRA to fix it.

          It is using JSPHelper from the package to generate the percentage graph. Maybe that method should be moved into ServletUtil class in core util package.

          CapacityTaskScheduler:

          • Does not need supportsPriority as a separate field in the SchedulingInfo class. You can pick it up from one of the QueueSchedulingInfo objects.

          If a queue supports priority or not is stored by the JobQueueManager in capacity scheduler. The queue scheduling information object does not contain if a particular queue can support priority or not. So that is why there is a seperate field.

          TestJobQueueInformation:

          • I think you can use JobClient, instead of directly dealing with JobSubmissionProtocol and having to duplicate the methods for createRPCProxy etc.

          Reason why I am not using JobClient directly is because: by calling them we are going to call up display methods, if we call up display methods then we would have to parse the output of the job client and then do the test for equality. Moreover all the display method newly defined are private. If it is really required I can make them public then change test to parse the display string and test equality.

          Show
          Sreekanth Ramakrishnan added a comment - Made modifications according to comments. I have also mentioned reasons why somethings are left as it is from previous version of the patch. JobTracker: There's code being repeated in getAllJobs(), getAllJobs(String queue) and jobsToComplete. I think it should be factored out so changes to one of the methods (for e.g. to return a new field) need not be duplicated. Code repetition for converting collection JobInProgress to an array of JobStatus has been removed. Modified getAllJobs and getAllJobs(Queue). Left jobsToComplete as is. JobQueueInfo: schedulingInfo stored here is a stringified version. I think it should be declared a String and get/set should deal with strings. The caller should basically call with actualObject.toString(). This makes it similar to JobStatus. The reason why we are using an object and passing only String over wire is because we are setting scheduling information only once. Then underlying reference of the scheduling information is updated by the respective TaskScheduler's and we do a toString() while passing over wire. This way we can avoid to constantly update the scheduling information in queue manager. For example check CapacityTaskScheduler . JspUtil: This is including JspHelper which is a class from the NameNode package. I don't think it is a good idea for a MapRed class to depend on this, however I understand this has always been this way. Maybe we should file a new JIRA to fix it. It is using JSPHelper from the package to generate the percentage graph. Maybe that method should be moved into ServletUtil class in core util package. CapacityTaskScheduler: Does not need supportsPriority as a separate field in the SchedulingInfo class. You can pick it up from one of the QueueSchedulingInfo objects. If a queue supports priority or not is stored by the JobQueueManager in capacity scheduler. The queue scheduling information object does not contain if a particular queue can support priority or not. So that is why there is a seperate field. TestJobQueueInformation: I think you can use JobClient, instead of directly dealing with JobSubmissionProtocol and having to duplicate the methods for createRPCProxy etc. Reason why I am not using JobClient directly is because: by calling them we are going to call up display methods, if we call up display methods then we would have to parse the output of the job client and then do the test for equality. Moreover all the display method newly defined are private. If it is really required I can make them public then change test to parse the display string and test equality.
          Hide
          Hemanth Yamijala added a comment -

          Left jobsToComplete as is.

          I was thinking of something like:

          private JobStatus[] getJobStatus(Collection<JobInProgress> jips, boolean onlyRunning) {
            // ..
            if (onlyRunning) {
              // consider only jobs which are running or prep.
            }
          }
          

          Would that work ?

          Regarding tests, taking cue from APIs like getAllJobs, I think it is OK to provide wrapper APIs around the queue info related methods. These could be package private and the test case can directly access these. So, something like:

          JobQueueInfo[] getJobQueueInfos() { return jobSubmitClient.getJobQueueInfos(); }
          private void displayQueueList() {
            JobQueueInfo[] queues = getJobQueueInfos();
            //
          }
          

          Agree with rest of your explanations.

          Show
          Hemanth Yamijala added a comment - Left jobsToComplete as is. I was thinking of something like: private JobStatus[] getJobStatus(Collection<JobInProgress> jips, boolean onlyRunning) { // .. if (onlyRunning) { // consider only jobs which are running or prep. } } Would that work ? Regarding tests, taking cue from APIs like getAllJobs , I think it is OK to provide wrapper APIs around the queue info related methods. These could be package private and the test case can directly access these. So, something like: JobQueueInfo[] getJobQueueInfos() { return jobSubmitClient.getJobQueueInfos(); } private void displayQueueList() { JobQueueInfo[] queues = getJobQueueInfos(); // } Agree with rest of your explanations.
          Hide
          Hemanth Yamijala added a comment -

          Actually going over some of the comments above, I see this comment from Owen:

          JobSubmissionProtocol is not public and therefore the JobClient needs identical methods.

          So, this agrees with what I've proposed above. In fact, we should make the APIs public and not package private.

          Show
          Hemanth Yamijala added a comment - Actually going over some of the comments above, I see this comment from Owen: JobSubmissionProtocol is not public and therefore the JobClient needs identical methods. So, this agrees with what I've proposed above. In fact, we should make the APIs public and not package private.
          Hide
          Hemanth Yamijala added a comment -

          Looked at other changes. They seem fine to me.
          Except in jobqueue_details.jsp, there's a line coming at the end as follows: Hadoop, 2008. \ No newline at end of file

          Show
          Hemanth Yamijala added a comment - Looked at other changes. They seem fine to me. Except in jobqueue_details.jsp, there's a line coming at the end as follows: Hadoop, 2008. \ No newline at end of file
          Hide
          Doug Cutting added a comment -

          > we should make the APIs public and not package private.

          That's the subject of HADOOP-3822. Let's not introduce new public APIs in this issue.

          Show
          Doug Cutting added a comment - > we should make the APIs public and not package private. That's the subject of HADOOP-3822 . Let's not introduce new public APIs in this issue.
          Hide
          Sreekanth Ramakrishnan added a comment -

          Refactored getJobsToComplete, getAllJobs and _getAllJobs(queue) in JobTracker.

          Introduced two new package level methods for getting JobQueueInfo from JobClient and used the same in test case.

          Fixed No newline at end of file message

          Show
          Sreekanth Ramakrishnan added a comment - Refactored getJobsToComplete , getAllJobs and _getAllJobs(queue) in JobTracker. Introduced two new package level methods for getting JobQueueInfo from JobClient and used the same in test case. Fixed No newline at end of file message
          Hide
          Owen O'Malley added a comment -

          I think we need to have a separate class for the queue command. Otherwise, users can cross the commands like:

          hadoop queue -kill job_00001

          which would be confusing. With separate classes, we can only support the sub-commands that make sense.

          Show
          Owen O'Malley added a comment - I think we need to have a separate class for the queue command. Otherwise, users can cross the commands like: hadoop queue -kill job_00001 which would be confusing. With separate classes, we can only support the sub-commands that make sense.
          Hide
          Sreekanth Ramakrishnan added a comment -

          Attaching patch with following modification from the previous patch:

          • Added a new class called JobQueueClient which implements the Command Line Interface methods for JobQueue related operations.
          • Refactored percentageGraph from JspHelper in the namenode class to ServletUtil.
          • Removed dependency on JspHelper conf from mapred jsp pages.
          Show
          Sreekanth Ramakrishnan added a comment - Attaching patch with following modification from the previous patch: Added a new class called JobQueueClient which implements the Command Line Interface methods for JobQueue related operations. Refactored percentageGraph from JspHelper in the namenode class to ServletUtil. Removed dependency on JspHelper conf from mapred jsp pages.
          Hide
          Sreekanth Ramakrishnan added a comment -

          Incorporating comments from Hemanth:

          • Add javadoc for JobQueueClient
          • Refactor common code to a new class rather than using static methods in JobClient
          • Apache license header for JobQueueClient
          Show
          Sreekanth Ramakrishnan added a comment - Incorporating comments from Hemanth: Add javadoc for JobQueueClient Refactor common code to a new class rather than using static methods in JobClient Apache license header for JobQueueClient
          Hide
          Sreekanth Ramakrishnan added a comment -

          In addition to the above changes, following also has been modified according to Hemanth's comments:

          • Modified bin/hadoop usage to display information about new option(./hadoop queue) introduced.
          Show
          Sreekanth Ramakrishnan added a comment - In addition to the above changes, following also has been modified according to Hemanth's comments: Modified bin/hadoop usage to display information about new option(./hadoop queue) introduced.
          Hide
          Hemanth Yamijala added a comment -

          One nit. Previously, the JobTracker was setting the start time in the JobStatus for all the jobs. This is missing from the refactored code, and hence client is showing start time as 0.

          Other than that, it looks good to me.

          Show
          Hemanth Yamijala added a comment - One nit. Previously, the JobTracker was setting the start time in the JobStatus for all the jobs. This is missing from the refactored code, and hence client is showing start time as 0. Other than that, it looks good to me.
          Hide
          Sreekanth Ramakrishnan added a comment -

          Fixing the start time issue, which was missed out while refactoring the code.

          Show
          Sreekanth Ramakrishnan added a comment - Fixing the start time issue, which was missed out while refactoring the code.
          Hide
          Owen O'Malley added a comment -

          This is getting close, but I have a few suggestions.

          When I asked you to split the queue queries out of JobClient, I didn't think about the API. I think the API is better in JobClient and JobQueueClient is only about the main that supports the cli commands. JobQueueClient shouldn't be a public class, because otherwise it ends up in the public API. So the API access is still through JobClient and JobQueueClient implements Tool, etc.

          Let's make JobSubmissionProtocol.getJobQueueInfos to getQueues(). getJobQueueInfo should be getQueueInfo(queueName).

          The methods in JobQueueClient should be public, moved to JobClient, and renamed:

          getAllQueueSchedulingInfo -> JobClient.getQueues()
          getAllJobs -> JobClient.getJobsFromQueue(queueName)
          getQueueSchedulingInfo -> JobClient.getQueueInfo(queueName)

          mapred.JSPUtil should not be public.

          Several of the new public API classes and methods are missing javadoc.

          JobQueueInfo.schedulerInfo should be a string, rather than an object. Since the serialization forces it to be a string, it should just be typed/stored that way. The QueueManager should probably have a map like:

            Map<String, Object> schedulerInfo; // map from queue name to scheduler specific object
          

          and just create the JobQueueInfo when the JobSubmissionProtocol methods are called. The constructor should take the two strings and don't bother with the setSchedulerInfo.

          I'm not very happy with ClientUtil. It seems like a weak abstraction. Is it really necessary, especially if you fold back into JobClient?

          Show
          Owen O'Malley added a comment - This is getting close, but I have a few suggestions. When I asked you to split the queue queries out of JobClient, I didn't think about the API. I think the API is better in JobClient and JobQueueClient is only about the main that supports the cli commands. JobQueueClient shouldn't be a public class, because otherwise it ends up in the public API. So the API access is still through JobClient and JobQueueClient implements Tool, etc. Let's make JobSubmissionProtocol.getJobQueueInfos to getQueues(). getJobQueueInfo should be getQueueInfo(queueName). The methods in JobQueueClient should be public, moved to JobClient, and renamed: getAllQueueSchedulingInfo -> JobClient.getQueues() getAllJobs -> JobClient.getJobsFromQueue(queueName) getQueueSchedulingInfo -> JobClient.getQueueInfo(queueName) mapred.JSPUtil should not be public. Several of the new public API classes and methods are missing javadoc. JobQueueInfo.schedulerInfo should be a string, rather than an object. Since the serialization forces it to be a string, it should just be typed/stored that way. The QueueManager should probably have a map like: Map< String , Object > schedulerInfo; // map from queue name to scheduler specific object and just create the JobQueueInfo when the JobSubmissionProtocol methods are called. The constructor should take the two strings and don't bother with the setSchedulerInfo. I'm not very happy with ClientUtil. It seems like a weak abstraction. Is it really necessary, especially if you fold back into JobClient?
          Hide
          Sreekanth Ramakrishnan added a comment -

          Attaching patch with following changes:

          • Made JobQueueClient to default level access instead of public access.
          • Renamed methods in JobSubmissionProtocol to getQueues(),getQueueInfo(queueName) and getJobsFromQueue(queueName)_
          • Mirrored these methods in JobClient.
          • Made change in JobQueueClient to use public methods in JobClient to use these methods for Job Queue querying.
          • Added javadoc for all Public classes and public methods introduced by the patch.
          • Removed the map which stored JobQueueInfo in QueueManager.
          • Changed type of SchedulingInfo in JobQueueInfo to String.
          • Constructing JobQueueInfo on fly when requested in QueueManager
          Show
          Sreekanth Ramakrishnan added a comment - Attaching patch with following changes: Made JobQueueClient to default level access instead of public access. Renamed methods in JobSubmissionProtocol to getQueues() , getQueueInfo(queueName) and getJobsFromQueue(queueName)_ Mirrored these methods in JobClient . Made change in JobQueueClient to use public methods in JobClient to use these methods for Job Queue querying. Added javadoc for all Public classes and public methods introduced by the patch. Removed the map which stored JobQueueInfo in QueueManager . Changed type of SchedulingInfo in JobQueueInfo to String. Constructing JobQueueInfo on fly when requested in QueueManager
          Hide
          Sreekanth Ramakrishnan added a comment -

          Adding Apache license to TestJobQueueInformation and JSPUtil according to Hemanth's comment.

          Show
          Sreekanth Ramakrishnan added a comment - Adding Apache license to TestJobQueueInformation and JSPUtil according to Hemanth's comment.
          Hide
          Sreekanth Ramakrishnan added a comment -

          I am pasting inline the output of ant test-patch on my local machine.

               [exec] +1 overall.
               [exec]
               [exec]     +1 @author.  The patch does not contain any @author tags.
               [exec]
               [exec]     +1 tests included.  The patch appears to include 3 new or modified tests.
               [exec]
               [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
               [exec]
               [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
               [exec]
               [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
          
          

          ant test-core and ant test-patch did not generate any build failure, on today's trunk on my local machine

          Show
          Sreekanth Ramakrishnan added a comment - I am pasting inline the output of ant test-patch on my local machine. [exec] +1 overall. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 3 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings. ant test-core and ant test-patch did not generate any build failure, on today's trunk on my local machine
          Hide
          Owen O'Malley added a comment -

          I just committed this. Thanks, Sreekanth.

          Show
          Owen O'Malley added a comment - I just committed this. Thanks, Sreekanth.
          Hide
          Tsz Wo Nicholas Sze added a comment -

          Seems that there is a NPE in LimitTasksPerJobTaskScheduler. See HADOOP-4213.

          Show
          Tsz Wo Nicholas Sze added a comment - Seems that there is a NPE in LimitTasksPerJobTaskScheduler. See HADOOP-4213 .
          Hide
          Hudson added a comment -
          Show
          Hudson added a comment - Integrated in Hadoop-trunk #611 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/611/ )

            People

            • Assignee:
              Sreekanth Ramakrishnan
              Reporter:
              Matei Zaharia
            • Votes:
              0 Vote for this issue
              Watchers:
              9 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development