Details

    • Type: Sub-task
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: namenode
    • Labels: None

    Description

      The current logic for choosing a storage for a block is flawed: it always picks the first valid storage of the given StorageType (see DatanodeDescriptor#chooseStorage4Block). As a result, blocks are always written to the same volume (the first one), and the other valid volumes are never chosen. This problem was raised in this comment ( https://issues.apache.org/jira/browse/HDFS-9807?focusedCommentId=15878382&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15878382 ).

      I propose the following solution:

      • First, collect all the valid storages on a node into a collection.
      • Then, shuffle these valid storages to get a new, randomly ordered collection.
      • Finally, pick the first storage from the shuffled collection.

      These steps would be executed in DatanodeDescriptor#chooseStorage4Block, replacing the current logic. I think this improvement can be done as a subtask under HDFS-11419. Further comments are welcome.
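
      The difference between the current behaviour and the proposed one can be sketched as follows. This is only an illustration, not the code from the attached patches: the Storage class, the isValid check (matching type plus enough remaining space), and the method names chooseFirstValid and chooseShuffled are all hypothetical stand-ins for whatever chooseStorage4Block actually uses.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

final class StorageChoiceSketch {
    enum StorageType { DISK, SSD, ARCHIVE }

    /** Hypothetical stand-in for one storage (volume) on a DataNode. */
    static final class Storage {
        final String id;
        final StorageType type;
        final long remaining; // free bytes, assumed to be the validity criterion

        Storage(String id, StorageType type, long remaining) {
            this.id = id;
            this.type = type;
            this.remaining = remaining;
        }
    }

    /** Assumed validity rule: matching type and enough room for the block. */
    static boolean isValid(Storage s, StorageType t, long blockSize) {
        return s.type == t && s.remaining >= blockSize;
    }

    /** Current behaviour as described: the first valid storage always wins. */
    static Storage chooseFirstValid(List<Storage> storages, StorageType t, long blockSize) {
        for (Storage s : storages) {
            if (isValid(s, t, blockSize)) {
                return s;
            }
        }
        return null;
    }

    /** Proposed behaviour: collect valid storages, shuffle, take the first. */
    static Storage chooseShuffled(List<Storage> storages, StorageType t,
                                  long blockSize, Random rnd) {
        // Step 1: extract all valid storages into a separate collection.
        List<Storage> valid = new ArrayList<>();
        for (Storage s : storages) {
            if (isValid(s, t, blockSize)) {
                valid.add(s);
            }
        }
        if (valid.isEmpty()) {
            return null;
        }
        // Step 2: shuffle the copy; the node's own storage list is left untouched.
        Collections.shuffle(valid, rnd);
        // Step 3: take the first storage of the shuffled collection.
        return valid.get(0);
    }
}
```

      With two valid DISK volumes on a node, chooseFirstValid returns the same volume on every call, while chooseShuffled spreads the choices across both. Picking a random index into the valid list would give the same distribution without the full shuffle; the sketch follows the three steps as written.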

      Attachments

        1. HDFS-11464.001.patch
          4 kB
          Yiqun Lin
        2. HDFS-11464.002.patch
          7 kB
          Yiqun Lin
        3. HDFS-11464.003.patch
          11 kB
          Yiqun Lin
        4. HDFS-11464.004.patch
          17 kB
          Yiqun Lin
        5. HDFS-11464.005.patch
          17 kB
          Yiqun Lin

            People

              Assignee: linyiqun Yiqun Lin
              Reporter: linyiqun Yiqun Lin
              Votes: 0
              Watchers: 15