Issue Details (XML | Word | Printable)

Key: HADOOP-3924
Type: New Feature New Feature
Status: Closed Closed
Resolution: Fixed
Priority: Critical Critical
Assignee: Subramaniam Krishnan
Reporter: Alejandro Abdelnur
Votes: 0
Watchers: 0
Operations

If you were logged in you would be able to see more operations.
Hadoop Common

Add a 'Killed' job status

Created: 08/Aug/08 05:01 AM   Updated: 08/Jul/09 04:52 PM
Return to search
Component/s: None
Affects Version/s: None
Fix Version/s: 0.19.0

Time Tracking:
Not Specified

File Attachments:
  Size
Text File Licensed for inclusion in ASF works patch-3924_v1.txt 2008-09-16 03:28 AM Subramaniam Krishnan 15 kB
Text File Licensed for inclusion in ASF works patch-3924_v2.txt 2008-09-16 12:22 PM Subramaniam Krishnan 19 kB
Text File Licensed for inclusion in ASF works patch-3924_v3.txt 2008-09-16 12:27 PM Subramaniam Krishnan 19 kB
Text File Licensed for inclusion in ASF works patch-3924_v4.txt 2008-09-18 06:14 AM Subramaniam Krishnan 19 kB
Text File Licensed for inclusion in ASF works patch-3924_v5.txt 2008-09-19 07:04 AM Subramaniam Krishnan 21 kB
Text File Licensed for inclusion in ASF works patch-3924_v6.txt 2008-09-19 07:36 AM Subramaniam Krishnan 22 kB
Text File Licensed for inclusion in ASF works patch-3924_v7.txt 2008-09-19 08:10 AM Subramaniam Krishnan 22 kB
Text File Licensed for inclusion in ASF works patch-3924_v8.txt 2008-09-19 10:22 AM Subramaniam Krishnan 21 kB
Text File Licensed for inclusion in ASF works patch-3924_v9.txt 2008-09-19 11:39 PM Arun C Murthy 20 kB
Environment: all

Hadoop Flags: Reviewed
Resolution Date: 19/Sep/08 11:42 PM


 Description  « Hide
It is not possible, via APIs, to know if a job has been killed, only that has not been successful.

 All   Comments   Work Log   Change History   Subversion Commits      Sort Order: Ascending order - Click to sort in descending order
Alejandro Abdelnur added a comment - 08/Aug/08 05:01 AM
JobStatus should have a KILLED status.

RunnningJob should expose either the JobStatus or a new method isKilled

Maybe making JobStatus and enum and exposing that via the RunningJob would be a more elegant solution.


Subramaniam Krishnan added a comment - 16/Sep/08 03:34 AM
Attached the patch with the following changes:

1) Added a new Job Status - KILLED.
1) Updated killJob() & kill() to terminateJob(boolean isFailed) & terminate(boolean isFailed).
2) Added fail() & kill() which call the above methods with the appropriate flag.
3) Added isKilled() & getJobStatu() which returns an int status in RunningJob
4) Of course, a TC that checks whether the Status of a Failed Job & Killed Job are set correctly.


Amareshwari Sriramadasu added a comment - 16/Sep/08 03:49 AM
Some comments:
1. There is some commented code in the testcase.
2. I think private methods killJob and failJob are not needed. terminateJob(boolean) can be called inplace of them directly, though it makes sense to have public methods kill() and fail().
3. Should this change JobSubmissionProtocol version number for the change in JobStatus?
4. Change in JobTracker is not needed in this patch.

Subramaniam Krishnan added a comment - 16/Sep/08 12:22 PM
Attaching the updated file based on Amareshwari's comments:

1) Removed commented code from TC
2) Bumped up JobSubmissionProtocol version
3) Add a logKilled function in Job History logger to log killed jobs
4) Updated JobDetails.jsp to show Killed Jobs also.
5) Cleaned-up the import mismatch in JT.java


Subramaniam Krishnan added a comment - 16/Sep/08 12:27 PM

Sync-ed with trunk

Hadoop QA added a comment - 17/Sep/08 07:17 AM
+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12390190/patch-3924_v3.txt
against trunk revision 696149.

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 6 new or modified tests.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac compiler warnings.

+1 findbugs. The patch does not introduce any new Findbugs warnings.

+1 core tests. The patch passed core unit tests.

+1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3281/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3281/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3281/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3281/console

This message is automatically generated.


Amareshwari Sriramadasu added a comment - 18/Sep/08 04:12 AM
Minor comments:
1. In terminateJob method, two if-else checks can be collapsed into one if-else.
2. There are lines exceeding 80 characters in JobTracker.java and JobHistory.java. And JobClient has unnecessary java.util.Date and DateFormat imports added.

Otherthan that, the patch looks good.


Subramaniam Krishnan added a comment - 18/Sep/08 06:14 AM
Will upload a updated patch that incorporates Amareshwari's comments.

Subramaniam Krishnan added a comment - 18/Sep/08 06:15 AM

Thanks Amareshwari for the comments.
Uploaded a updated patch that incorporates all the comments.


Amareshwari Sriramadasu added a comment - 18/Sep/08 06:20 AM
+1 Patch looks good

Amareshwari Sriramadasu added a comment - 18/Sep/08 09:29 AM
Ran test-patch on behalf of Subramaniam. Here is the result:
     [exec] +1 overall.
     [exec]
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec]
     [exec]     +1 tests included.  The patch appears to include 6 new or modified tests.
     [exec]
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec]
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec]
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.

Sharad Agarwal added a comment - 18/Sep/08 10:41 AM
I noticed that there is a contrib class org.apache.hadoop.eclipse.server.HadoopJob which uses JobStatus. Should we add Killed status to JobState enum as well, to be consistent ? Also getState() and update(JobStatus status) needs to be taken care of for the possible impact.

minor: TestJobKillAndFail doesn't need to have reducer defined, as it is never used. just set the no of reducers to 0. also there is some duplicate code in runJobFail and runJobKill which becomes the candidate to move to a single method, something like:
private RunningJob runJob(String jobName, Class<? extends Mapper> mapper)


Sharad Agarwal added a comment - 18/Sep/08 11:21 AM
Since this patch forks the FAILED status into FAILED and KILLED, there may be potential impact at places where FAILED is currently being used, so doing a grep on JobStatus.FAILED on trunk reveals some more places which needs to be aware of KILLED status:
FairScheduler.update()
TestFairScheduler.testNonRunningJobsAreIgnored()

Arun C Murthy added a comment - 18/Sep/08 10:47 PM
Some comments:

1. As Sharad pointed out, this still has to fix the FairScheduler.
2. If possible, can you please make a JobStatus.State enum? Currently we have a bunch of 'private static final int's.
3. JobInProgress.terminateJob should take the actual state rather than a 'boolean' i.e. KILLED/FAILED.
4. We should not introduce a public API in JobInProgress to 'fail' the job. The only one necessary is the one to 'kill' it. JobInProgress.failJob should be package private.


Subramaniam Krishnan added a comment - 19/Sep/08 08:13 AM

Updated with Arun's & Sharad's comments..thanks.

Not updating JobStatus to enum right now as it affects too many files, have raised a separate JIRA ([Hadoop-4214]) for the same.


Subramaniam Krishnan added a comment - 19/Sep/08 10:32 AM
Fixed ctrl-c, ctrl-v errors in the Test Case.

All the TCs passed and the Output of test-patch is:
[exec] +1 overall.
[exec]
[exec] +1 @author. The patch does not contain any @author tags.
[exec]
[exec] +1 tests included. The patch appears to include 9 new or modified tests.
[exec]
[exec] +1 javadoc. The javadoc tool did not generate any warning messages.
[exec]
[exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
[exec]
[exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
[exec]


Amareshwari Sriramadasu added a comment - 19/Sep/08 11:39 AM
+1 for the patch

Devaraj Das added a comment - 19/Sep/08 12:10 PM
I just committed this. Thanks Subramaniam!

Nigel Daley added a comment - 19/Sep/08 05:10 PM
This commit has broken trunk! I am going to revert it.

All the patches waiting to be tested have failed due to the eclipse plugin compilation failing.


Nigel Daley added a comment - 19/Sep/08 05:25 PM
Patch has been reverted.

When you make changes to a contrib component, please ensure you compile it! The eclipse-plugin requires eclipse.home to be defined otherwise it doesn't get compiled:

ant -Declipse.home=/path/to/eclipse tar


Arun C Murthy added a comment - 19/Sep/08 09:18 PM
The patch isn't there yet:

1. JobInProgress.terminateJob should only be a private method, and JIP.kill and JIP.fail (new method) should use it.
2. There are too many pieces of newly commented code.
3. We should remove the change to the eclipse plugin for now.


Arun C Murthy added a comment - 19/Sep/08 09:20 PM
4. It doesn't bump the JobSubmissionProtocol version, although there is a comment for that.

Arun C Murthy added a comment - 19/Sep/08 09:36 PM
Unfortunately this patch conflicts horribly with recent changes to trunk too... sigh.

Arun C Murthy added a comment - 19/Sep/08 11:27 PM
I'm sorry, I've been looking at the wrong version of this patch! I take back my comments here:
http://issues.apache.org/jira/browse/HADOOP-3924?focusedCommentId=12632825#action_12632825

But this patch doesn't apply to trunk...


Arun C Murthy added a comment - 19/Sep/08 11:39 PM
Updated to reflect recent changes to trunk, I believe Amareshwari has run 'ant test test-patch' on this.

Arun C Murthy added a comment - 19/Sep/08 11:40 PM
Subramaniam's patch without changes to eclipse plugin.

Arun C Murthy added a comment - 19/Sep/08 11:42 PM
I just committed this. Thanks, Subramaniam!

Hudson added a comment - 22/Sep/08 03:18 PM