Details
- Bug
- Status: Closed
- Major
- Resolution: Fixed
- 0.20.205.0, 0.22.0, 0.23.0
- None
- Reviewed
Description
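We use Cascading's MultiInputFormat. MultiInputFormat sometimes generates a very large job.split file (used internally by Hadoop), and that file can exceed 2 GB.
In JobSplitWriter.java, the functions that generate this file use 32-bit signed integers to compute offsets into job.split. A signed 32-bit int cannot represent values above 2^31 - 1 (just under 2 GB), so offsets past that point are computed incorrectly.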
writeNewSplits
  ...
  int prevCount = out.size();
  ...
  int currCount = out.size();
writeOldSplits
  ...
  long offset = out.size();
  ...
  int currLen = out.size();
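To make the failure mode concrete, here is a minimal, self-contained sketch of what happens when a byte offset near 2 GB is advanced in an int versus a long. It is not code from JobSplitWriter.java: the class name SplitOffsetDemo, the helper methods, and the 64 KB split size are all hypothetical and chosen only for illustration.

```java
public class SplitOffsetDemo {
    // Hypothetical helper: advance a 32-bit offset by len bytes.
    // Java int arithmetic wraps silently past Integer.MAX_VALUE.
    static int advanceInt(int offset, int len) {
        return offset + len;
    }

    // Hypothetical helper: advance a 64-bit offset by len bytes.
    // The int len is widened to long before the addition, so no wraparound occurs here.
    static long advanceLong(long offset, int len) {
        return offset + len;
    }

    public static void main(String[] args) {
        int splitLen = 64 * 1024;                  // assumed size of one serialized split, in bytes
        int intOffset = Integer.MAX_VALUE - 1000;  // a position just under 2 GB
        long longOffset = intOffset;               // same position, tracked in 64 bits

        intOffset = advanceInt(intOffset, splitLen);
        longOffset = advanceLong(longOffset, splitLen);

        // The int offset has wrapped to a negative value; the long offset is still valid.
        System.out.println("int offset:  " + intOffset);
        System.out.println("long offset: " + longOffset);
    }
}
```

Under these assumptions, the int-based offset becomes negative as soon as the position crosses 2 GB, while the long-based offset keeps counting correctly. Tracking positions and lengths in long values throughout is one way to avoid the wraparound.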
Attachments
Issue Links
- is cloned by MAPREDUCE-4434: Backport MR-2779 (JobSplitWriter.java can't handle large job.split file) to branch-1 (Closed)