[MAPREDUCE-2384] The job submitter should make sure to validate jobs before creation of necessary files - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 0.21.0
Fix Version/s: 3.0.0-alpha1
Component/s: job submission, test
Labels: None
Tags: test

Description

In 0.20.x/1.x, 0.21, and 0.22, the MapReduce job submitter writes some of the files a job needs to the JT FS before it checks the output specs or performs other job validation. Writing these files first appears unnecessary.

This has since been silently fixed in the rewrite of the MRApp (called MRv2) in the MAPREDUCE-279 dump that has now replaced the older MR (now called MRv1). However, we could still use a test case to prevent a regression.

Original description below:

When I read the source code of MapReduce in Hadoop 0.21.0, the error handling sometimes confused me. For example:
1. JobSubmitter checks the output of each job. MapReduce requires that a job's output must not already exist, to avoid overwriting it by mistake. In my opinion, MR should verify the output at the point where the client submits the job. Instead, it first copies the related files to the specified target and only then does the verification.
2. JobTracker. Once the job has been submitted to the JobTracker, the JT first creates the JIT object, which is very "huge". Only in the next step does the JT start to verify job queue authority and memory requirements.

Normally, one verifies the client's input first and responds immediately if anything is at fault. The regular logic runs only once all the inputs have passed.
This code does not make sense to me. Is that only my personal opinion? I hope someone can help explain the details. Thanks!
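
The ordering argued for above (validate first, write job files only afterwards) can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not Hadoop's actual JobSubmitter code: `FakeFs`, `checkOutputSpec`, and `submit` are hypothetical names, and the file system is simulated with an in-memory set of paths.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ValidateBeforeStaging {

    /** Hypothetical stand-in for a file system: a set of existing paths plus a log of writes. */
    public static final class FakeFs {
        public final Set<String> paths = new HashSet<>();
        public final List<String> writes = new ArrayList<>();

        public boolean exists(String path) {
            return paths.contains(path);
        }

        public void write(String path) {
            paths.add(path);
            writes.add(path);
        }
    }

    /** Rejects the job if its output directory already exists (assumed rule, per the description). */
    public static void checkOutputSpec(FakeFs fs, String outputDir) {
        if (fs.exists(outputDir)) {
            throw new IllegalStateException("Output directory " + outputDir + " already exists");
        }
    }

    /** Validates first; job files are written only if validation passes. */
    public static void submit(FakeFs fs, String stagingDir, String outputDir) {
        checkOutputSpec(fs, outputDir);     // fail fast, before touching the file system
        fs.write(stagingDir + "/job.jar");  // illustrative job files, names are assumptions
        fs.write(stagingDir + "/job.xml");
    }
}
```

With an output directory that already exists, `submit` throws before anything is recorded in `writes`, which is the property a regression test for this issue would want to assert.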

Attachments

- MAPREDUCE-2384.r4.diff, 12/May/12 12:43, 3 kB, Harsh J
- MAPREDUCE-2384.r3.diff, 03/Sep/11 09:23, 4 kB, Harsh J
- MAPREDUCE-2384.r2.diff, 22/Jul/11 23:35, 3 kB, Harsh J
- MAPREDUCE-2384.r1.diff, 21/May/11 18:39, 0.9 kB, Harsh J

Issue Links

is duplicated by

MAPREDUCE-432 JobClient should check input/output specifications before copying the job files on the DFS (Resolved)

relates to

MAPREDUCE-3154 Validate the Jobs Output Specification as the first statement in JobSubmitter.submitJobInternal(Job, Cluster) method (Closed)

People

Assignee: Harsh J

Reporter: Denny Ye

Votes: 1

Watchers: 4

Dates

Created: 15/Mar/11 04:42

Updated: 12/May/16 18:23

Resolved: 28/May/12 13:16