Hive / HIVE-3574

Allow Hive to Submit MapReduce jobs via the MapReduce API (instead of using Hadoop BIN)


Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 0.3.0, 0.4.0, 0.4.1, 0.5.0, 0.6.0, 0.7.0, 0.7.1, 0.8.0, 0.8.1, 0.9.0, 0.9.1, 0.10.0
    • Fix Version/s: None
    • Component/s: Query Processor, SQL
    • Environment: All environments would be affected by this
    • Labels: mapreduce

    Description

      MapRedTask currently starts a child process that invokes the "hadoop jar" command, passing each additional jobconf property as a command-line argument to the Hadoop CLI.
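To make the current approach concrete, here is a minimal sketch of how such a command line might be assembled. It is not Hive's actual code: the class name, the jar path, the driver class, and the `-jobconf key=value` flag spelling are all assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not taken from the Hive source tree.
class HadoopJarCommandSketch {

    // Builds the argument vector for a child "hadoop jar" process,
    // appending one flag/value pair per jobconf property.
    static List<String> buildCommand(String hadoopBin, String jarPath,
                                     String mainClass, Map<String, String> jobConf) {
        List<String> cmd = new ArrayList<>();
        cmd.add(hadoopBin);
        cmd.add("jar");
        cmd.add(jarPath);
        cmd.add(mainClass);
        // Sort the keys so the generated command line is deterministic.
        for (Map.Entry<String, String> e : new TreeMap<>(jobConf).entrySet()) {
            cmd.add("-jobconf"); // assumed flag name
            cmd.add(e.getKey() + "=" + e.getValue());
        }
        return cmd;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new TreeMap<>();
        conf.put("mapred.reduce.tasks", "2");
        List<String> cmd =
            buildCommand("hadoop", "example-job.jar", "example.Driver", conf);
        System.out.println(String.join(" ", cmd));
        // A real caller would hand the list to a process launcher, e.g.
        // new ProcessBuilder(cmd).inheritIO().start();
    }
}
```

Every property has to survive a round trip through the shell and the Hadoop launcher script, which is one reason this style depends on the platform it runs on.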

      Having Hive submit its generated jobs to a MapReduce cluster via the MapReduce API could improve compatibility across platforms, and would make it easy to run these jobs against pseudo-clusters in tests (for example, MiniMRCluster).

      This kind of change could be as simple as using a Hadoop Configuration object with a generic ToolRunner, or something similar, to run jobs.

      Specifically, the change would most likely be made in the execute() method of org.apache.hadoop.hive.ql.exec.MapRedTask.
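The in-process alternative proposed here could look roughly like the sketch below, which uses the old `org.apache.hadoop.mapred` API with `ToolRunner`. It is a standalone illustration and not a patch to MapRedTask: the class name `InProcessSubmitSketch` and the job name are hypothetical, the two arguments are assumed to be an input and an output path, and the job relies on JobConf's default identity mapper and reducer. Hadoop must be on the classpath.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.RunningJob;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

// Hypothetical illustration only; not taken from the Hive source tree.
class InProcessSubmitSketch extends Configured implements Tool {

    // Copies the Tool's Configuration into a JobConf, so properties are
    // carried in memory instead of being serialized onto a command line.
    JobConf buildJobConf(String[] args) {
        JobConf job = new JobConf(getConf(), InProcessSubmitSketch.class);
        job.setJobName("in-process-submit-sketch");
        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        return job;
    }

    @Override
    public int run(String[] args) throws Exception {
        // Submits through the API and blocks until the job finishes.
        RunningJob running = JobClient.runJob(buildJobConf(args));
        return running.isSuccessful() ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("mapred.reduce.tasks", "2");
        System.exit(ToolRunner.run(conf, new InProcessSubmitSketch(), args));
    }
}
```

Because the Configuration object is passed directly, a test could in principle supply the configuration of a pseudo-cluster such as MiniMRCluster instead of a freshly constructed one.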


          People

            Assignee: Unassigned
            Reporter: Jeremy A. Lucas (jerluc)