SPARK-33402: Jobs launched in same second have duplicate MapReduce JobIDs


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.4.8, 3.0.1, 3.1.0
    • Fix Version/s: 3.0.2, 3.1.0
    • Component/s: Spark Core
    • Labels: None

    Description

      Spark uses the current timestamp, at second granularity, to generate the
      MapReduce JobID for a write job. If more than one job (or job attempt) is
      launched in the same second, those jobs end up with identical JobIDs.
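
      A minimal Scala sketch of the failure mode, assuming the jobtracker ID is
      derived with SimpleDateFormat at second precision and wrapped in Hadoop's
      JobID; the helper names and exact format string here are illustrative, not
      necessarily the exact Spark code:

        import java.text.SimpleDateFormat
        import java.util.{Date, Locale}
        import org.apache.hadoop.mapreduce.JobID

        // The "jobtracker ID" is just the current time formatted to second
        // precision, so two jobs started within the same second produce
        // equal JobIDs.
        def createJobTrackerID(time: Date): String =
          new SimpleDateFormat("yyyyMMddHHmmss", Locale.US).format(time)

        def createJobID(time: Date, id: Int): JobID =
          new JobID(createJobTrackerID(time), id)

        val now = new Date()
        val a = createJobID(now, 0)   // e.g. job_20201110093015_0000
        val b = createJobID(now, 0)   // same second, same id => identical JobID
        assert(a == b)                // the clash this issue describes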

      Committers that expect the JobID to be unique can therefore conflict with
      other jobs running in the same second (see the sketch after this list):

      • S3A staging committer (cluster FS staging dir and local task output dir)
      • Any committer which supports parallel jobs writing to the same destination
        directory and requires unique names for the attempts
      • Code which uses the jobID as part of its algorithm to generate unique filenames
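
      To see why this matters for committers, here is a purely hypothetical
      sketch of a committer that derives its per-job staging directory from the
      jobtracker ID (the path layout is invented for illustration and is not the
      S3A staging committer's real code). Two jobs launched in the same second
      resolve to the same directory, so one can overwrite or clean up the
      other's pending output:

        import java.util.Date

        // Hypothetical staging-directory layout keyed off the jobtracker ID.
        def stagingDir(user: String, jobTrackerId: String, jobNumber: Int): String =
          s"/user/$user/.staging/job_${jobTrackerId}_${"%04d".format(jobNumber)}"

        val now = new Date()
        val id = createJobTrackerID(now)          // from the sketch above
        val dirA = stagingDir("alice", id, 0)     // job 1's pending-commit data
        val dirB = stagingDir("alice", id, 0)     // job 2, launched the same second
        assert(dirA == dirB)                      // job 2 can clobber or clean up job 1's output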

      Note: HadoopMapReduceCommitProtocol.getFilename() doesn't use this JobID for
      uniqueness; it uses the task attempt ID and stage ID. It probably deserves its
      own audit.

    People

      Assignee: Steve Loughran (stevel@apache.org)
      Reporter: Steve Loughran (stevel@apache.org)
      Votes: 0
      Watchers: 3
