Key: MAPREDUCE-454
Type: New Feature
Status: Open
Priority: Major
Assignee: Unassigned
Reporter: Jeff Hammerbacher
Votes: 2
Watchers: 14
Hadoop Map/Reduce

Create service interface for job submission

Created: 07/Apr/09 04:20 AM   Updated: 25/Jun/09 04:33 PM
Component/s: None
Affects Version/s: None
Fix Version/s: None

Time Tracking: Not Specified

Description
Porting a discussion from the LinkedIn Hadoop group to the Hadoop JIRA: http://www.linkedin.com/groupAnswers?viewQuestionAndAnswers=&gid=988957&discussionID=2156671&sik=1239077959330

Jeff Hammerbacher added a comment - 07/Apr/09 04:21 AM
A comment from Brian Starke on the LinkedIn group:

There's nothing that we found, so we ended up making our own service layer that implements ServiceTool and Configurable for job creation and submission.

It was a little bit of a pain in the ass, but now we have a full on service layer on top of our Hadoop clusters, complete with queuing, email alerting, scheduling, and even a dashboard. Well worth the effort, although I'm a little surprised no one has a project like this open sourced yet. Ours isn't generic enough to open source just yet, but now that I know at least one other person might find this useful - I'll spend some cycles seeing if we can get around to getting this out there.


Steve Loughran added a comment - 23/Apr/09 11:17 AM
+1 to a remote job submission interface. My requirements:
  1. RESTful, long-haul, stable interface
  2. Some form of notification on job completion. This could be polling the Atom feed of state changes.

HADOOP-4559 relates to this
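
To make the second requirement concrete, here is a minimal sketch of a client that polls an Atom feed of job state changes until the job finishes. Everything specific in it is an assumption for illustration, not something this issue defines: the feed layout (one entry per state change, with the state in the entry title), the terminal state names, and the helper names `latest_state` and `wait_for_completion`. The feed is fetched through an injected callable, so no particular URL or HTTP client is implied.

```python
import time
import xml.etree.ElementTree as ET

ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}
# Assumed terminal state names; a real service would define its own.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "KILLED"}


def latest_state(feed_xml):
    """Return the title of the most recently updated entry, or None if empty.

    Assumes every <updated> value is an RFC 3339 timestamp in the same
    format and time zone, so plain string comparison orders them correctly.
    """
    root = ET.fromstring(feed_xml)
    entries = root.findall("atom:entry", ATOM_NS)
    if not entries:
        return None
    newest = max(
        entries,
        key=lambda e: e.findtext("atom:updated", default="", namespaces=ATOM_NS),
    )
    return newest.findtext("atom:title", default=None, namespaces=ATOM_NS)


def wait_for_completion(fetch_feed, interval_s=5.0, max_polls=120, sleep=time.sleep):
    """Poll fetch_feed() until the newest entry reports a terminal state.

    fetch_feed is any zero-argument callable returning the feed as a string
    (hypothetical; in practice it would wrap an HTTP GET).
    """
    for _ in range(max_polls):
        state = latest_state(fetch_feed())
        if state in TERMINAL_STATES:
            return state
        sleep(interval_s)
    raise TimeoutError("job did not reach a terminal state")
```

Polling keeps the client simple and works across firewalls, at the cost of up to `interval_s` of delay before the client notices completion.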


Joydeep Sen Sarma added a comment - 19/Jun/09 07:26 PM - edited
+1. I would love to see an EMR-style API that abstracts away some of the exact details of the cluster being used, specifically which mapred JobTracker a job is submitted to. The idea is that if there are multiple mapred clusters, we can have some control over which jobs go where.

I don't think this is a dup of 5821. Would love to help out if a patch were available.
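
As a rough illustration of the "control over which jobs go where" idea, the sketch below routes a job to one of several clusters using ordered tag rules, then produces the configuration override that points the job at that cluster's JobTracker. The `Cluster` and `JobRouter` classes, the tag scheme, the host names, and `job_conf_for` are all hypothetical; only the `mapred.job.tracker` property name comes from Hadoop itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    name: str
    jobtracker: str  # "host:port" of that cluster's JobTracker


class JobRouter:
    """Pick a cluster for a job from ordered (tag, cluster name) rules."""

    def __init__(self, clusters, rules, default):
        self._clusters = {c.name: c for c in clusters}
        self._rules = list(rules)
        self._default = default
        # Fail fast on rules or a default naming an unknown cluster.
        for _, name in self._rules + [(None, default)]:
            if name not in self._clusters:
                raise ValueError(f"unknown cluster: {name}")

    def route(self, job_tags):
        """Return the cluster of the first rule whose tag the job carries."""
        tags = set(job_tags)
        for tag, name in self._rules:
            if tag in tags:
                return self._clusters[name]
        return self._clusters[self._default]


def job_conf_for(cluster):
    """Config override a submission service could apply before submitting."""
    return {"mapred.job.tracker": cluster.jobtracker}


# Example: jobs tagged "sla" go to the production cluster, the rest to ad hoc.
adhoc = Cluster("adhoc", "jt-adhoc.example.com:9001")
prod = Cluster("prod", "jt-prod.example.com:9001")
router = JobRouter([adhoc, prod], rules=[("sla", "prod")], default="adhoc")
print(job_conf_for(router.route(["sla", "etl"])))
```

Keeping the routing decision inside the service means clients never need to know a JobTracker address, which is the abstraction the comment asks for.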