Details

    • Type: Improvement
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.2.0
    • Fix Version/s: 0.3.0
    • Component/s: None
    • Labels: None

      Description

      Trying to change the DFS block size led to the realization that the value 32,000,000 was hard-coded into the source code. I propose:
      1. Change the default block size to 64 * 1024 * 1024.
      2. Add the config variable dfs.block.size that sets the default block size.
      3. Add a parameter to the FileSystem, DFSClient, and ClientProtocol create method that lets the user control the block size.
      4. Rename FileSystem.getBlockSize to getDefaultBlockSize.
      5. Add a new method, FileSystem.getBlockSize, that takes a pathname.
      6. Use long for the block size in the API, which is what was used before. However, the implementation will not work if the block size is set bigger than 2**31.
      7. Have the InputFormatBase use the block size of each file to determine the split size.

      Thoughts?
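
      As a sketch only (the placeholder types and exact signatures below are assumptions following the numbered proposal, not the committed code), the proposed API surface looks roughly like this:

          import java.io.IOException;
          import java.util.Properties;

          // Placeholder types so the sketch is self-contained; the real
          // classes live under org.apache.hadoop.fs.
          class Path { final String name; Path(String n) { name = n; } }
          class FSOutput extends java.io.ByteArrayOutputStream {}

          abstract class FileSystemSketch {
            // Item 4: the old getBlockSize() becomes getDefaultBlockSize(),
            // backed by the dfs.block.size variable (item 2) with the
            // proposed 64 MB default (item 1).
            long getDefaultBlockSize(Properties conf) {
              return Long.parseLong(conf.getProperty("dfs.block.size",
                  String.valueOf(64L * 1024 * 1024)));
            }

            // Item 5: per-file query for the size the file was created with.
            abstract long getBlockSize(Path f) throws IOException;

            // Items 3 and 6: create() takes a long block size chosen by
            // the caller.
            abstract FSOutput create(Path f, boolean overwrite,
                long blockSize) throws IOException;
          }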

      Attachments

      1. dfs-blocksize.patch (35 kB), Owen O'Malley
      2. dfs-blocksize-2.patch (35 kB), Owen O'Malley
      3. TEST-org.apache.hadoop.fs.TestCopyFiles.txt (48 kB), Doug Cutting

        Activity

        alan wootton added a comment -

        Are there two issues here?

        I can see a need to change the default block size for a DFS. In my case I'd like to write unit tests with small block sizes to check dfs code for bugs.

        I don't see the need for files to have their own sizes. Does this not introduce another 'moving part' to the dfs, and even more possibilities for bugs?

        Owen O'Malley added a comment -

        As you point out, it is possible to just make a configuration variable and use it everywhere. The problem is that you become very sensitive to differences in the configuration between nodes. It seemed less error prone to leave the client in charge of the block size and to use their setting consistently.

        Under the hood, dfs currently supports variable block sizes within a file, but I certainly do not want to expose that to the user-visible APIs.
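
        A sketch of what the client-driven approach looks like in practice (hypothetical names, not the actual DFSClient code): the client resolves the block size once, locally, and sends it with the create call, so configuration differences on other nodes never enter into it.

            import java.util.Properties;

            class ClientSideBlockSize {
              static final long DEFAULT = 64L * 1024 * 1024; // proposed default

              // The client's own config (or an explicit per-call value) is
              // the single source of truth for the file's block size.
              static long chooseBlockSize(Properties clientConf, Long explicit) {
                if (explicit != null) return explicit;
                String s = clientConf.getProperty("dfs.block.size");
                return s != null ? Long.parseLong(s) : DEFAULT;
              }

              public static void main(String[] args) {
                Properties conf = new Properties();
                conf.setProperty("dfs.block.size", String.valueOf(8L * 1024 * 1024));
                // Every block of the new file uses the client's 8 MB setting,
                // regardless of what any datanode's configuration says.
                System.out.println(chooseBlockSize(conf, null)); // 8388608
              }
            }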

        alan wootton added a comment -

        Variable block sizes within a file? I don't see that at all. I can see some checks being made to ensure that the size being sent matches what a datanode thinks is the correct size, but I don't see any way, at all, that all Blocks don't have a max size of 32 MB everywhere.

        Isn't adding a 'blocksize' param to the create method of DFSClient exposing variable block sizes to the client?

        Are we talking about the same thing?

        Owen O'Malley added a comment -

        Sure, look at DFSInputStream.blockSeekTo. You can't generate files that look like that, but the infrastructure supports them.

        I'm only trying to expose setting the block size for a given file when it is being created. I don't want to expose changing it within a file.
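
        For reference, the technique blockSeekTo relies on amounts to the following (a self-contained sketch of the idea, not the actual code): walk the file's block list, accumulating lengths, until the target offset falls inside a block. Nothing in it requires the blocks to share a size.

            class BlockSeekSketch {
              // Returns the index of the block containing targetOffset.
              static int blockFor(long[] blockLengths, long targetOffset) {
                long start = 0;
                for (int i = 0; i < blockLengths.length; i++) {
                  if (targetOffset < start + blockLengths[i]) return i;
                  start += blockLengths[i];
                }
                throw new IllegalArgumentException("offset past end of file");
              }

              public static void main(String[] args) {
                // Blocks of different lengths are handled the same way.
                long[] lengths = {32000000L, 16000000L, 5L};
                System.out.println(blockFor(lengths, 32000000L)); // 1: second block
              }
            }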

        alan wootton added a comment -

        Ok, I get it now. Even though it's currently impossible for any block, except the last block of a file, to be anything other than 32 MB, it looks like the system would support it.

        We need to remove all references to BLOCK_SIZE.

        I see some problems. FSDataset doesn't know which file it's working with; it always uses BLOCK_SIZE. DFSClient.DFSOutputStream.write() has the same problem.

        I'll vote yes.

        Owen O'Malley added a comment -

        Ok, here is the patch.

        Changes dfs block size from a compile-time constant to a parameter that is set when a file is created.

        1. FileSystem.getBlockSize becomes getDefaultBlockSize
        2. A new method FileSystem.getBlockSize(path) finds the block size of a file.
        3. Block size is added to FileSystem.create
        4. InputFormatBase uses the block size of each file rather than the global constant.
        5. I followed the convention of using DfsPath to cache meta-information values associated with the dfs file.
        6. FileUnderConstruction records the block size
        7. Removed the check that made sure the block size was shorter than the global value.
        8. Add a new value
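
        As a usage sketch of the patched API (the method shapes are inferred from the list above, not copied from the patch; FakeFs is a stand-in for the real FileSystem):

            import java.util.HashMap;
            import java.util.Map;

            class PerFileBlockSizeDemo {
              // Minimal stand-in that records the block size each file was
              // created with.
              static class FakeFs {
                final Map<String, Long> sizes = new HashMap<String, Long>();
                void create(String path, long blockSize) { sizes.put(path, blockSize); }
                long getBlockSize(String path) { return sizes.get(path); }
                long getDefaultBlockSize() { return 64L * 1024 * 1024; }
              }

              public static void main(String[] args) {
                FakeFs fs = new FakeFs();
                fs.create("/data/part-0", 8L * 1024 * 1024); // per-file choice

                // Change 2: the per-file query replaces the global constant.
                long bs = fs.getBlockSize("/data/part-0");

                // Change 4: splits sized from each file's own block size.
                long fileLen = 50L * 1024 * 1024;
                long numSplits = (fileLen + bs - 1) / bs;    // ceiling division
                System.out.println(numSplits + " splits of up to " + bs + " bytes");
              }
            }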

        Doug Cutting added a comment -

        Overall, this looks great and is much needed. Unfortunately I'm getting some null pointer exceptions running unit tests with this patch. I've not yet tried to debug these...

        Milind Bhandarkar added a comment -

        Doug, the problem is occurring much earlier, before the NPE:

        060515 123400 DIR* FSDirectory.mkdirs: failed to create directory /srcdat

        Looks like your DFS config has a root directory that is not writable for you, because the test is trying to create dfs://localhost:65314/srcdat. I am using org.apache.hadoop.dfs.MiniDFSCluster for testing copying across dfs.

        Doug Cutting added a comment -

        Milind, I don't think these 'failed to create directory' messages are the problem. That unit test succeeds w/o this patch and fails with it. In either case the unit test prints these messages. I think these messages appear because the directories already exist, so new attempts to create them fail, but I have not yet looked closely at that.

        Owen O'Malley added a comment -

        Ok, I found the problem. I'll create a new patch.

        Owen O'Malley added a comment -

        This adds an additional check for null on a file's block list.
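
        The patch file itself is attached above; as a sketch of the kind of guard described (hypothetical names, not the patch's code), the idea is to treat a file whose block list has not been populated yet as empty rather than dereferencing null:

            class NullSafeBlocks {
              static class Block { long length; Block(long l) { length = l; } }

              // A file under construction may not have a block list yet;
              // return 0 instead of throwing a NullPointerException.
              static long totalLength(Block[] blocks) {
                if (blocks == null) return 0;
                long len = 0;
                for (Block b : blocks) len += b.length;
                return len;
              }
            }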

        Doug Cutting added a comment -

        I just committed this. Thanks, Owen!


          People

          • Assignee: Owen O'Malley
          • Reporter: Owen O'Malley
          • Votes: 1
          • Watchers: 0
