Hadoop Common / HADOOP-212

allow changes to dfs block size


Details

    • Type: Improvement
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.2.0
    • Fix Version/s: 0.3.0
    • Component/s: None
    • Labels: None

    Description

      Trying to change the DFS block size led to the realization that the value 32,000,000 was hard-coded into the source. I propose:
      1. Change the default block size to 64 * 1024 * 1024.
      2. Add the config variable dfs.block.size that sets the default block size.
      3. Add a parameter to the FileSystem, DFSClient, and ClientProtocol create methods that lets the user control the block size.
      4. Rename the FileSystem.getBlockSize to getDefaultBlockSize.
      5. Add a new FileSystem.getBlockSize method that takes a pathname.
      6. Use long for the block size in the API, which is what was used before. However, the implementation will not work if the block size is set larger than 2**31.
      7. Have the InputFormatBase use the blocksize of each file to determine the split size.

      Thoughts?
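Items 1, 2, and 7 above can be sketched as follows. The `dfs.block.size` name, the 64 MB default, and the per-file split rule are from this proposal; the `Properties`-based lookup, the class name, and `computeSplitSize`/`minSplitSize` are illustrative stand-ins for Hadoop's Configuration and InputFormatBase, not the actual patch.

```java
import java.util.Properties;

public class BlockSizeSketch {
    /** Item 1: proposed default block size of 64 * 1024 * 1024 bytes. */
    static final long DEFAULT_BLOCK_SIZE = 64L * 1024 * 1024;

    /**
     * Item 2: read dfs.block.size from the configuration, falling back to
     * the default. Properties stands in for Hadoop's Configuration here.
     */
    static long getDefaultBlockSize(Properties conf) {
        String v = conf.getProperty("dfs.block.size");
        return v == null ? DEFAULT_BLOCK_SIZE : Long.parseLong(v);
    }

    /**
     * Item 7: derive the split size from each file's own block size rather
     * than a single global value. minSplitSize is a hypothetical lower bound.
     */
    static long computeSplitSize(long fileBlockSize, long minSplitSize) {
        return Math.max(minSplitSize, fileBlockSize);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(getDefaultBlockSize(conf));       // 67108864
        conf.setProperty("dfs.block.size", "134217728");     // 128 MB override
        System.out.println(getDefaultBlockSize(conf));       // 134217728
        System.out.println(computeSplitSize(67108864L, 1L)); // 67108864
    }
}
```

Keeping the constant a `long` (note the `64L`) matches item 6: the API uses long even though the current implementation cannot handle values above 2**31.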

      Attachments

        1. dfs-blocksize.patch
          35 kB
          Owen O'Malley
        2. dfs-blocksize-2.patch
          35 kB
          Owen O'Malley
        3. TEST-org.apache.hadoop.fs.TestCopyFiles.txt
          48 kB
          Doug Cutting


          People

            Assignee: omalley Owen O'Malley
            Reporter: omalley Owen O'Malley
            Votes: 1
            Watchers: 0
