Details
Improvement
Status: Open
Minor
Resolution: Unresolved
0.3.0, 0.4.0, 0.4.1, 0.5.0, 0.6.0, 0.7.0, 0.7.1, 0.8.0, 0.8.1, 0.9.0, 0.9.1, 0.10.0
None
All environments would be affected by this
mapreduce
Description
The current behavior of the MapRedTask is to start a process that invokes the "hadoop jar" command, passing each additional jobconf property as an argument to this Hadoop CLI.
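To make the current approach concrete, here is a minimal sketch of how such a command line might be assembled before being handed to a child process. It is not Hive's actual code: the class name HadoopJarCommand, the "-jobconf" flag, and the jar and main-class values are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of Hive): builds the argument list for a
// "hadoop jar" invocation, appending one argument pair per jobconf property.
final class HadoopJarCommand {

    static List<String> build(String hadoopExec, String jar, String mainClass,
                              Map<String, String> jobConf) {
        List<String> cmd = new ArrayList<>();
        cmd.add(hadoopExec);
        cmd.add("jar");
        cmd.add(jar);
        cmd.add(mainClass);
        for (Map.Entry<String, String> e : jobConf.entrySet()) {
            // The "-jobconf" flag name is an assumption for this sketch.
            cmd.add("-jobconf");
            cmd.add(e.getKey() + "=" + e.getValue());
        }
        return cmd;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("mapred.reduce.tasks", "4");
        // Placeholder jar and class names; a real task would pass the
        // resulting list to new ProcessBuilder(cmd).start().
        List<String> cmd = build("hadoop", "hive-exec.jar", "ExecDriver", conf);
        System.out.println(String.join(" ", cmd));
    }
}
```

Every property crosses the process boundary as a string argument, which is the coupling to the Hadoop CLI that this issue proposes to remove.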
Having Hive submit generated jobs to an M/R cluster via the MapReduce API would allow for potentially greater compatibility across platforms, and would also let these jobs run easily against pseudo-clusters in tests (think MiniMRCluster).
This change could be as simple as building a Hadoop Configuration object and running the job through a generic ToolRunner or a similar mechanism.
Specifically, this kind of change would most likely occur in the execute() method of org.apache.hadoop.hive.ql.exec.MapRedTask.
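One possible shape for that change is sketched below, using only the JDK so it stays self-contained. The JobLauncher interface, InProcessLauncher, and TaskSketch are hypothetical names, not Hive or Hadoop classes; in a real patch the in-process implementation would presumably populate a Hadoop Configuration and call something like ToolRunner.run(conf, tool, args) instead of merely recording the properties.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical seam: execute() depends on this interface instead of
// spawning "hadoop jar" directly.
interface JobLauncher {
    int launch(Map<String, String> jobConf) throws Exception;
}

// Stand-in for an API-based launcher. It only records the properties it
// was given; a real version would submit the job through the MapReduce API.
final class InProcessLauncher implements JobLauncher {
    final Map<String, String> lastConf = new LinkedHashMap<>();

    @Override
    public int launch(Map<String, String> jobConf) {
        lastConf.clear();
        lastConf.putAll(jobConf);
        return 0; // 0 means success, mirroring a process exit code
    }
}

// Hypothetical stand-in for the task whose execute() method would change.
final class TaskSketch {
    private final JobLauncher launcher;

    TaskSketch(JobLauncher launcher) {
        this.launcher = launcher;
    }

    int execute(Map<String, String> jobConf) {
        try {
            return launcher.launch(jobConf);
        } catch (Exception e) {
            return 1; // non-zero signals failure to the caller
        }
    }
}
```

With a seam like this, a test could supply a launcher backed by a pseudo-cluster while production code keeps whichever submission path is configured.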