Details
-
Improvement
-
Status: Closed
-
Critical
-
Resolution: Fixed
-
0.2.0
-
None
-
None
Description
Trying to change the DFS block size led to the realization that the value 32,000,000 was hard-coded into the source code. I propose:
1. Change the default block size to 64 * 1024 * 1024.
2. Add the config variable dfs.block.size that sets the default block size.
3. Add a parameter to the FileSystem, DFSClient, and ClientProtocol create methods that lets the user control the block size.
4. Rename the FileSystem.getBlockSize to getDefaultBlockSize.
5. Add a new method FileSystem.getBlockSize that takes a pathname.
6. Use long for the block size in the API, which is what was used before. However, the implementation will not work if the block size is set larger than 2**31.
7. Have InputFormatBase use the block size of each file to determine the split size.
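To make the proposal concrete, here is a rough, self-contained sketch of how items 1 through 7 might fit together. It does not use the real Hadoop classes: BlockSizeSketch, its in-memory maps, and numSplits are hypothetical stand-ins for FileSystem, the configuration, and the InputFormatBase split logic. The upper bound of Integer.MAX_VALUE is my assumption about where the implementation limit from item 6 would sit.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the proposed API; not the actual Hadoop classes. */
final class BlockSizeSketch {
    /** Item 1: proposed default block size in bytes. */
    static final long DEFAULT_BLOCK_SIZE = 64L * 1024 * 1024;

    /** Stand-in for the configuration; only the key dfs.block.size is read. */
    private final Map<String, String> conf;

    /** Block size recorded per file at create time. */
    private final Map<String, Long> fileBlockSizes = new HashMap<>();

    BlockSizeSketch(Map<String, String> conf) {
        this.conf = conf;
    }

    /** Items 2 and 4: the default comes from dfs.block.size when it is set. */
    long getDefaultBlockSize() {
        String value = conf.get("dfs.block.size");
        return value == null ? DEFAULT_BLOCK_SIZE : Long.parseLong(value);
    }

    /** Existing-style create: falls back to the default block size. */
    void create(String path) {
        create(path, getDefaultBlockSize());
    }

    /** Item 3: create with a caller-supplied block size. */
    void create(String path, long blockSize) {
        // Item 6: the API takes a long, but values past the int range are
        // rejected here (assumed limit for this sketch).
        if (blockSize <= 0 || blockSize > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("unsupported block size: " + blockSize);
        }
        fileBlockSizes.put(path, blockSize);
    }

    /** Item 5: per-file block size lookup by pathname. */
    long getBlockSize(String path) {
        Long size = fileBlockSizes.get(path);
        if (size == null) {
            throw new IllegalArgumentException("no such file: " + path);
        }
        return size;
    }

    /** Item 7: one split per block of the file, rounding up. */
    static long numSplits(long fileLength, long blockSize) {
        return (fileLength + blockSize - 1) / blockSize;
    }
}
```

In this sketch a file created without an explicit size picks up whatever dfs.block.size is at create time, and the split count is derived from the size stored for that file rather than from the current default.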
Thoughts?