Hadoop Common / HADOOP-3930

Decide how to integrate scheduler info into CLI and job tracker web page

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.19.0
    • Fix Version/s: 0.19.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Changed TaskScheduler to expose API for Web UI and Command Line Tool.

      Description

      We need a way for job schedulers such as HADOOP-3445 and HADOOP-3476 to provide info to display on the JobTracker web interface and in the CLI. The main things needed seem to be:

      • A way for schedulers to provide info to show in a column on the web UI and in the CLI - something as simple as a single string, or a map<string, int> for multiple parameters.
      • Some sorting order for jobs - maybe a method to sort a list of jobs.
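As a rough illustration of the first point, the sketch below turns per-job key-value info into tab-separated lines that a CLI could print as columns. The class name `SchedulerColumns`, the `render` method, and the `"-"` placeholder for missing values are all hypothetical and not part of Hadoop; it only assumes each scheduler hands back a `Map<String, String>` per job.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not a Hadoop class: formats scheduler-provided
// key-value pairs as tab-separated columns for a job listing.
class SchedulerColumns {

    // Returns one header line followed by one line per job.
    // Columns are the union of all keys, in first-seen order.
    static List<String> render(Map<String, Map<String, String>> infoByJob) {
        Set<String> columns = new LinkedHashSet<>();
        for (Map<String, String> info : infoByJob.values()) {
            columns.addAll(info.keySet());
        }

        List<String> lines = new ArrayList<>();

        StringBuilder header = new StringBuilder("JobId");
        for (String column : columns) {
            header.append('\t').append(column);
        }
        lines.add(header.toString());

        for (Map.Entry<String, Map<String, String>> entry : infoByJob.entrySet()) {
            StringBuilder row = new StringBuilder(entry.getKey());
            for (String column : columns) {
                // A job whose scheduler info lacks this key gets a placeholder.
                row.append('\t').append(entry.getValue().getOrDefault(column, "-"));
            }
            lines.add(row.toString());
        }
        return lines;
    }
}
```

Using a map rather than a single string lets different schedulers contribute different columns without the display code knowing their names in advance.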

      Let's figure out what the best way to do this is and implement it in the existing schedulers.

      My first-order proposal for an API: augment the TaskScheduler with

      • public Map<String, String> getSchedulingInfo(JobInProgress job) – returns key-value pairs which are displayed in columns on the web UI or the CLI for the list of jobs.
      • public Map<String, String> getSchedulingInfo(String queue) – returns key-value pairs which are displayed in columns on the web UI or the CLI for the list of queues.
      • public Collection<JobInProgress> getJobs(String queueName) – returns the list of jobs in a given queue, sorted by a scheduler-specific order (the order it wants to run them in / schedule the next task in / etc).
      • public List<String> getQueues();
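The four proposed methods can be sketched as a small Java interface with a toy implementation. This is only an illustration under stated assumptions: `JobStub` is a hypothetical stand-in for `JobInProgress`, `SchedulerInfoProvider` and `ToyPriorityScheduler` are invented names, and the "lower priority value runs first" ordering is an assumption of the sketch, not the behaviour of any real Hadoop scheduler or the committed patch.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for JobInProgress with only the fields used here.
class JobStub {
    final String id;
    final String queue;
    final int priority;      // assumption: lower value is scheduled first
    final int runningTasks;

    JobStub(String id, String queue, int priority, int runningTasks) {
        this.id = id;
        this.queue = queue;
        this.priority = priority;
        this.runningTasks = runningTasks;
    }
}

// The four methods from the proposal, with JobStub in place of JobInProgress.
interface SchedulerInfoProvider {
    Map<String, String> getSchedulingInfo(JobStub job);
    Map<String, String> getSchedulingInfo(String queue);
    Collection<JobStub> getJobs(String queueName);
    List<String> getQueues();
}

// Toy scheduler that orders the jobs of each queue by priority.
class ToyPriorityScheduler implements SchedulerInfoProvider {
    private final Map<String, List<JobStub>> jobsByQueue = new LinkedHashMap<>();

    void submit(JobStub job) {
        jobsByQueue.computeIfAbsent(job.queue, q -> new ArrayList<>()).add(job);
    }

    @Override
    public Map<String, String> getSchedulingInfo(JobStub job) {
        Map<String, String> info = new LinkedHashMap<>();
        info.put("Priority", Integer.toString(job.priority));
        info.put("Running tasks", Integer.toString(job.runningTasks));
        return info;
    }

    @Override
    public Map<String, String> getSchedulingInfo(String queue) {
        Map<String, String> info = new LinkedHashMap<>();
        info.put("Jobs", Integer.toString(getJobs(queue).size()));
        return info;
    }

    @Override
    public Collection<JobStub> getJobs(String queueName) {
        // Copy before sorting so callers never see or mutate internal state.
        List<JobStub> sorted = new ArrayList<>(
                jobsByQueue.getOrDefault(queueName, Collections.emptyList()));
        sorted.sort(Comparator.comparingInt((JobStub j) -> j.priority));
        return sorted;
    }

    @Override
    public List<String> getQueues() {
        return new ArrayList<>(jobsByQueue.keySet());
    }
}
```

A web page or CLI would call `getQueues()`, then `getJobs(queue)` for each queue, and show `getSchedulingInfo(job)` as extra columns per row. Returning an already-sorted collection keeps the ordering logic inside the scheduler, so the display code does not need a comparator.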
      1. mockup.JPG
        175 kB
        Sreekanth Ramakrishnan
      2. HADOOP-3930-9.patch
        63 kB
        Sreekanth Ramakrishnan
      3. HADOOP-3930-8.patch
        63 kB
        Sreekanth Ramakrishnan
      4. HADOOP-3930-7.patch
        57 kB
        Sreekanth Ramakrishnan
      5. HADOOP-3930-6.patch
        48 kB
        Sreekanth Ramakrishnan
      6. HADOOP-3930-5.patch
        48 kB
        Sreekanth Ramakrishnan
      7. HADOOP-3930-4.patch
        46 kB
        Sreekanth Ramakrishnan
      8. HADOOP-3930-3.patch
        46 kB
        Sreekanth Ramakrishnan
      9. HADOOP-3930-2.patch
        38 kB
        Sreekanth Ramakrishnan
      10. HADOOP-3930-11.patch
        61 kB
        Sreekanth Ramakrishnan
      11. HADOOP-3930-10.patch
        60 kB
        Sreekanth Ramakrishnan
      12. 3930-1.patch
        12 kB
        Sreekanth Ramakrishnan

          Activity

          Matei Zaharia created issue -
          Matei Zaharia made changes -
          Field Original Value New Value
          Description Appended a first-order API proposal to the original description. Augment the TaskScheduler with:
          * public Map<String, String> getSchedulingInfo(JobInProgress job) -- returns key-value pairs which are displayed in columns on the web UI or the CLI.
          * public Comparator<JobInProgress> getJobComparator() -- returns a comparator that can be used to determine the order in which jobs will be run, for sorting the jobs in the CLI.
          Matei Zaharia made changes -
          Description Specified that getSchedulingInfo(JobInProgress job) supplies the columns for the list of jobs, and added a per-queue variant:
          * public Map<String, String> getSchedulingInfo(String queue) -- returns key-value pairs which are displayed in columns on the web UI or the CLI for the list of queues.
          Matei Zaharia made changes -
          Description Added a method for listing queues:
          * public List<String> getQueues();
          Matei Zaharia made changes -
          Description Replaced getJobComparator() with a method that returns the jobs already sorted:
          * public Collection<JobInProgress> getJobs(String queueName) -- returns the list of jobs in a given queue, sorted by a scheduler-specific order (the order it wants to run them in / schedule the next task in / etc).
          Sreekanth Ramakrishnan made changes -
          Affects Version/s 0.17.2 [ 12313296 ]
          Release Note Changes to TaskScheduler to expose API which Web UI and Command Line Tool can use
          Status Open [ 1 ] Patch Available [ 10002 ]
          Sreekanth Ramakrishnan made changes -
          Attachment 3930-1.patch [ 12388500 ]
          Hemanth Yamijala made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Sreekanth Ramakrishnan made changes -
          Attachment mockup.JPG [ 12388501 ]
          Hemanth Yamijala made changes -
          Assignee Sreekanth Ramakrishnan [ sreekanth ]
          Owen O'Malley made changes -
          Affects Version/s 0.17.2 [ 12313296 ]
          Affects Version/s 0.19.0 [ 12313211 ]
          Priority Minor [ 4 ] Major [ 3 ]
          Component/s mapred [ 12310690 ]
          Owen O'Malley made changes -
          Link This issue blocks HADOOP-3746 [ HADOOP-3746 ]
          Owen O'Malley made changes -
          Link This issue blocks HADOOP-3445 [ HADOOP-3445 ]
          Sameer Paranjpye made changes -
          Link This issue is part of HADOOP-3444 [ HADOOP-3444 ]
          Sreekanth Ramakrishnan made changes -
          Attachment HADOOP-3930-2.patch [ 12389820 ]
          Sreekanth Ramakrishnan made changes -
          Attachment HADOOP-3930-3.patch [ 12390104 ]
          Hemanth Yamijala made changes -
          Link This issue duplicates HADOOP-3699 [ HADOOP-3699 ]
          Sreekanth Ramakrishnan made changes -
          Attachment HADOOP-3930-4.patch [ 12390181 ]
          Sreekanth Ramakrishnan made changes -
          Attachment HADOOP-3930-5.patch [ 12390202 ]
          Sreekanth Ramakrishnan made changes -
          Attachment HADOOP-3930-6.patch [ 12390208 ]
          Hemanth Yamijala made changes -
          Fix Version/s 0.19.0 [ 12313211 ]
          Sreekanth Ramakrishnan made changes -
          Attachment HADOOP-3930-7.patch [ 12390247 ]
          Sreekanth Ramakrishnan made changes -
          Attachment HADOOP-3930-8.patch [ 12390254 ]
          Sreekanth Ramakrishnan made changes -
          Attachment HADOOP-3930-9.patch [ 12390258 ]
          Sreekanth Ramakrishnan made changes -
          Attachment HADOOP-3930-10.patch [ 12390344 ]
          Sreekanth Ramakrishnan made changes -
          Attachment HADOOP-3930-11.patch [ 12390356 ]
          Owen O'Malley made changes -
          Resolution Fixed [ 1 ]
          Hadoop Flags [Reviewed]
          Status Open [ 1 ] Resolved [ 5 ]
          Tsz Wo Nicholas Sze made changes -
          Link This issue is related to HADOOP-4213 [ HADOOP-4213 ]
          Robert Chansler made changes -
          Release Note Original: Changes to TaskScheduler to expose API which Web UI and Command Line Tool can use. New: Changed TaskScheduler to expose API for Web UI and Command Line Tool.
          Nigel Daley made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Owen O'Malley made changes -
          Component/s mapred [ 12310690 ]

            People

            • Assignee:
              Sreekanth Ramakrishnan
              Reporter:
              Matei Zaharia
            • Votes:
              0
              Watchers:
              9
