Details
Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Description
As part of HADOOP-2027, I discovered that we create input splits for 0-byte files. (In theory this applies to both sequence files and text files, but in practice a sequence file can't be 0 bytes.) I think 0-byte files can and should be dropped, since they have no input to process.
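
A minimal sketch of the proposed behaviour, not the actual Hadoop code: before any splits are computed, files with a length of 0 are filtered out. `EmptyFileFilter`, `FileEntry`, and `dropEmptyFiles` are hypothetical names used only for illustration; a real change would presumably live in the input format's split-generation logic and use Hadoop's own file status objects.

```java
import java.util.ArrayList;
import java.util.List;

public class EmptyFileFilter {

    /** Hypothetical stand-in for a file's path and length; not a Hadoop class. */
    public static final class FileEntry {
        public final String path;
        public final long length;

        public FileEntry(String path, long length) {
            this.path = path;
            this.length = length;
        }
    }

    /** Returns only the entries that contain at least one byte of input. */
    public static List<FileEntry> dropEmptyFiles(List<FileEntry> files) {
        List<FileEntry> nonEmpty = new ArrayList<>();
        for (FileEntry file : files) {
            // A 0-byte file has nothing to process, so no split is needed for it.
            if (file.length > 0) {
                nonEmpty.add(file);
            }
        }
        return nonEmpty;
    }

    public static void main(String[] args) {
        List<FileEntry> files = new ArrayList<>();
        files.add(new FileEntry("part-00000", 1024L));
        files.add(new FileEntry("part-00001", 0L));
        files.add(new FileEntry("part-00002", 17L));

        // Only the files that survive the filter would go on to get splits.
        for (FileEntry file : dropEmptyFiles(files)) {
            System.out.println(file.path + " (" + file.length + " bytes)");
        }
    }
}
```

Filtering up front, rather than creating a split and skipping it later, avoids scheduling a task that has no records to read.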