Hadoop Map/Reduce
MAPREDUCE-215

Improve facilities for job-control, job-queues etc.

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      Today, Map-Reduce has some support for job control: JobClient provides a facility to monitor jobs, one can set up a job-ending notification, and there is JobControl.

      Links:
      http://lucene.apache.org/hadoop/docs/r0.15.1/mapred_tutorial.html#Job+Control
      http://lucene.apache.org/hadoop/docs/r0.15.1/mapred_tutorial.html#JobControl

      It looks like users could do more with better facilities for job control, and perhaps with more advanced features such as job queues.

      Let's discuss...

          Activity

          Amar Kamat added a comment -

          There should be some way to pass a DAG-like structure. Treat each DAG node as a configuration, so each node represents a conf file. Users can generate conf files, arrange them in a DAG, and pass it to JobControl. Users should specify just the input folder(s) and the output folder(s), and JobControl (JC) should generate and manage the in-between output folders. Users can provide hints on whether the intermediate job outputs are wanted. Some level of fault tolerance should also be provided so that the meta job can still complete despite some failures.
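To make the proposal concrete, here is a minimal sketch in Python rather than Hadoop's Java API. It assumes each job is just a name, that dependencies are given as a plain dict, and that JobControl would place intermediate outputs under a scratch directory. The function `plan_jobs`, the `keep` hint, and the `/tmp/jobdag` path are hypothetical names introduced only for illustration.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def plan_jobs(deps, inputs, final_outputs, keep=(), scratch="/tmp/jobdag"):
    """Order jobs by dependency and fill in their input/output folders.

    deps:          {job: set of jobs it depends on}
    inputs:        {job: [user-supplied input folders]}
    final_outputs: {job: user-supplied output folder}
    keep:          jobs whose intermediate output the user wants retained
    """
    order = list(TopologicalSorter(deps).static_order())
    # Jobs without a user-supplied output get a generated scratch folder.
    out_dir = {job: final_outputs.get(job, f"{scratch}/{job}") for job in order}
    plan = []
    for job in order:
        upstream = sorted(deps.get(job, ()))
        plan.append({
            "job": job,
            # A job reads its own inputs plus whatever its upstream jobs wrote.
            "input": list(inputs.get(job, [])) + [out_dir[u] for u in upstream],
            "output": out_dir[job],
            "keep_output": job in final_outputs or job in keep,
        })
    return plan


deps = {"clean": set(), "count": {"clean"}, "report": {"count"}}
plan = plan_jobs(deps, {"clean": ["/data/raw"]}, {"report": "/data/report"})
for step in plan:
    print(step)
```

In this sketch the user names only `/data/raw` and `/data/report`; the folders that connect `clean` to `count` and `count` to `report` are generated.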

          Arun C Murthy added a comment -

          More features:

          • Ability to submit and monitor more than one job in parallel (independent sub-graphs in a DAG)
          • Ability to have custom notifications at various states
          • Features to re-submit jobs on failures
          • Ability to 'cron' jobs
          • Allow users to track 'progress' of the DAG
          • Allow users to submit templatized configs (same set of jobs run daily)
          • Heh, fancy UI
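Several of the items above (parallel independent sub-graphs, re-submission on failure, progress tracking) can be sketched together. The following is an illustrative Python sketch, not Hadoop code: `run_dag`, `run_job`, `max_attempts`, and `on_progress` are hypothetical names, and a job is assumed to be a callable that returns `True` on success.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter  # Python 3.9+


def run_dag(deps, run_job, max_attempts=2, on_progress=None):
    """Run jobs wave by wave; return {job: SUCCEEDED | FAILED | SKIPPED}.

    deps: {job: set of jobs it depends on}
    """
    jobs = set(deps).union(*deps.values())
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises CycleError if the graph is not a DAG
    status = {}

    def attempt(job):
        # Re-submit a failed job until it succeeds or attempts run out.
        return any(run_job(job) for _ in range(max_attempts))

    with ThreadPoolExecutor() as pool:
        while True:
            wave = sorter.get_ready()  # jobs whose upstream jobs all succeeded
            if not wave:
                break
            # Jobs in one wave are independent, so they run concurrently.
            for job, ok in zip(wave, pool.map(attempt, wave)):
                status[job] = "SUCCEEDED" if ok else "FAILED"
                if ok:
                    sorter.done(job)  # unblocks downstream jobs
                if on_progress:
                    on_progress(len(status), len(jobs))
    # Anything never reached sits downstream of a failed job.
    for job in jobs - set(status):
        status[job] = "SKIPPED"
    return status


calls = {}
progress = []


def flaky_job(job):
    calls[job] = calls.get(job, 0) + 1
    if job == "b":
        return False            # always fails
    if job == "a":
        return calls[job] >= 2  # fails once, then succeeds
    return True


deps = {"a": set(), "b": set(), "c": {"a"}, "d": {"b"}}
status = run_dag(deps, flaky_job, on_progress=lambda n, t: progress.append((n, t)))
print(status)
```

A failed job is never marked done, so its downstream jobs are reported as skipped instead of being run, while the unrelated branch still completes.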
          Pi Song added a comment -

          A newbie question.
          How do you chain up input output files between jobs? By manipulating input/output in JobConf manually?

          Amar Kamat added a comment -

          "How do you chain up input output files between jobs? By manipulating input/output in JobConf manually?"

          Yes.

          Amar Kamat added a comment -
          • Facility to re-execute a subgraph. This will speed up incorporating dynamic config changes, since only the affected nodes/jobs will be re-run with the changed environment.
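One way to read "re-execute a subgraph" is: when a job's configuration changes, re-run that job and everything downstream of it, and nothing else. A minimal Python sketch under that assumption follows; `jobs_to_rerun` and the job names are hypothetical, not part of any Hadoop API.

```python
from collections import deque


def jobs_to_rerun(deps, changed):
    """Return the changed jobs plus every job downstream of them.

    deps:    {job: set of jobs it depends on}
    changed: iterable of jobs whose configuration was modified
    """
    # Invert the dependency map so we can walk from a job to its consumers.
    downstream = {}
    for job, upstream_jobs in deps.items():
        for up in upstream_jobs:
            downstream.setdefault(up, set()).add(job)

    stale = set(changed)
    queue = deque(stale)
    while queue:  # breadth-first walk over consumers
        for child in downstream.get(queue.popleft(), ()):
            if child not in stale:
                stale.add(child)
                queue.append(child)
    return stale


deps = {"clean": set(), "count": {"clean"}, "join": {"clean"}, "report": {"count"}}
print(sorted(jobs_to_rerun(deps, ["count"])))
```

With these example dependencies, changing `count` leaves `clean` and `join` untouched, so their existing outputs can be reused.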
          Shravan Matthur Narayanamurthy added a comment -

          How do I query the progress of a JobControl object that has been submitted?

          Jeff Hammerbacher added a comment -

          With HOD, the Capacity Scheduler, and the Fair Scheduler, can we close this ticket now?

          Allen Wittenauer added a comment -

          I'm going to close as fixed for a variety of reasons:

          • some of these features are now native to Hadoop
          • some of these features are now part of Oozie, Azkaban, etc
          • some of these features are part of Tez

            People

            • Assignee: Unassigned
            • Reporter: Arun C Murthy
