Details
- Bug
- Status: Closed
- Major
- Resolution: Fixed
- 0.23.0
- None
- None
- Reviewed
Description
MultipleOutputs.write creates a new TaskAttemptContext, which we've seen take a significant amount of CPU. The TaskAttemptContext constructor creates a JobConf, gets the current UGI, etc. I don't see any reason it needs to do this instead of creating a single TaskAttemptContext when the InputFormat is created (or lazily, cached as a member).
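The "lazily but cached as a member" idea can be sketched roughly as follows. This is only an illustration, not the actual Hadoop code or the committed patch: `FakeContext`, `PerWriteOutputs`, `CachingOutputs`, and `countConstructions` are hypothetical stand-ins, and the counter merely simulates the constructor cost (JobConf creation, UGI lookup) mentioned above. It also assumes a single-threaded caller.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CachedContextSketch {

    /** Hypothetical stand-in for TaskAttemptContext; counts how often it is constructed. */
    static final class FakeContext {
        static final AtomicInteger CONSTRUCTIONS = new AtomicInteger();

        FakeContext() {
            // Stands in for the expensive work (JobConf creation, UGI lookup).
            CONSTRUCTIONS.incrementAndGet();
        }
    }

    /** Behavior described in the report: a fresh context for every write. */
    static final class PerWriteOutputs {
        void write(String key, String value) {
            FakeContext context = new FakeContext();
            // ... a real implementation would pass context to a record writer ...
        }
    }

    /** Proposed behavior: create the context lazily and cache it as a member. */
    static final class CachingOutputs {
        private FakeContext context; // assumption: accessed from one thread only

        private FakeContext getContext() {
            if (context == null) {
                context = new FakeContext();
            }
            return context;
        }

        void write(String key, String value) {
            FakeContext cached = getContext();
            // ... a real implementation would pass cached to a record writer ...
        }
    }

    /** Resets the counter, performs the writes, and returns how many contexts were built. */
    public static int countConstructions(boolean cached, int writes) {
        FakeContext.CONSTRUCTIONS.set(0);
        PerWriteOutputs perWrite = new PerWriteOutputs();
        CachingOutputs caching = new CachingOutputs();
        for (int i = 0; i < writes; i++) {
            if (cached) {
                caching.write("k" + i, "v" + i);
            } else {
                perWrite.write("k" + i, "v" + i);
            }
        }
        return FakeContext.CONSTRUCTIONS.get();
    }

    public static void main(String[] args) {
        System.out.println("per-write: " + countConstructions(false, 1000));
        System.out.println("cached:    " + countConstructions(true, 1000));
    }
}
```

With 1000 writes, the per-write variant constructs 1000 contexts and the caching variant constructs one. Whether a single cached context is safe in the real class depends on how MultipleOutputs uses it, which this sketch does not model.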
Attachments
Issue Links
- relates to: MAPREDUCE-1853 MultipleOutputs does not cache TaskAttemptContext (Closed)