Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Description
To generate the most efficient on-disk files, it is important to control file size. It would be nice if we could configure the OutputFormat to roll over to a new file once the current file reaches a target size.
There is currently no easy way to tune this directly; it requires indirect tuning (number of reduces, map input size).
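To make the request concrete, here is a rough sketch of the rollover behavior being asked for. It is not tied to any existing OutputFormat API: the class name `RollingFileWriter`, the `max_bytes` parameter, and the `part-NNNNN` file naming are all hypothetical, and it counts uncompressed bytes written, so a real implementation with compression or block buffering would have to measure size differently.

```python
import os


class RollingFileWriter:
    """Hypothetical writer that starts a new file once a size target is reached."""

    def __init__(self, directory, max_bytes, prefix="part"):
        if max_bytes <= 0:
            raise ValueError("max_bytes must be positive")
        self.directory = directory
        self.max_bytes = max_bytes  # assumed target size per file, in bytes
        self.prefix = prefix
        self.paths = []             # every file created so far, in order
        self._file = None
        self._written = 0           # bytes written to the current file

    def _roll(self):
        # Close the current file (if any) and open the next numbered one.
        if self._file is not None:
            self._file.close()
        name = f"{self.prefix}-{len(self.paths):05d}"
        path = os.path.join(self.directory, name)
        self._file = open(path, "wb")
        self.paths.append(path)
        self._written = 0

    def write(self, record: bytes):
        # Roll before writing if this record would push the current file past
        # the target. A record larger than the target still gets written,
        # alone, into its own file rather than being split.
        would_overflow = self._written + len(record) > self.max_bytes
        if self._file is None or (self._written > 0 and would_overflow):
            self._roll()
        self._file.write(record)
        self._written += len(record)

    def close(self):
        if self._file is not None:
            self._file.close()
            self._file = None
```

With something like this, the file size becomes a single setting on the writer instead of a side effect of the number of reduces or the map input size. Records are never split across files, so files can end up slightly under the target.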