• Sub-task
    • Status: Patch Available
    • Major
    • Resolution: Unresolved
    • None
    • None
    • namenode
    • None


      The current logic for choosing a storage for a block is poor: it always uses the first valid storage of the given StorageType (see DataNodeDescriptor#chooseStorage4Block). As a result, blocks are always written to the same volume (the first one), and the other valid volumes are never chosen. This problem was raised in this comment ( )

      I propose the following solution:

      • First, collect all the valid storages on the node into a collection.
      • Then, shuffle these valid storages to produce a randomly ordered collection.
      • Finally, pick the first storage from the shuffled collection.

      These steps would be executed in DataNodeDescriptor#chooseStorage4Block, replacing the current logic. I think this improvement can be done as a subtask under HDFS-11419. Any further comments are welcome.
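
      The three steps above can be sketched roughly as follows. This is only an illustration, not the attached patch: the Storage class, its fields (type, remaining, healthy) and the chooseStorage signature are hypothetical stand-ins, and the real validity checks in chooseStorage4Block may differ.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class StorageChooserSketch {

    /** Hypothetical stand-in for a storage on a DataNode (not the real HDFS class). */
    static final class Storage {
        final String id;
        final String type;     // assumed storage type name, e.g. "DISK" or "SSD"
        final long remaining;  // assumed free space in bytes
        final boolean healthy; // assumed health flag

        Storage(String id, String type, long remaining, boolean healthy) {
            this.id = id;
            this.type = type;
            this.remaining = remaining;
            this.healthy = healthy;
        }
    }

    /**
     * Picks a random valid storage of the requested type, or returns null
     * when no storage qualifies. The validity rule here is an assumption.
     */
    static Storage chooseStorage(List<Storage> storages, String type,
                                 long blockSize, Random rng) {
        // Step 1: extract all valid storages into a new collection.
        List<Storage> valid = new ArrayList<>();
        for (Storage s : storages) {
            if (s.healthy && s.type.equals(type) && s.remaining >= blockSize) {
                valid.add(s);
            }
        }
        if (valid.isEmpty()) {
            return null;
        }
        // Step 2: shuffle the valid storages.
        Collections.shuffle(valid, rng);
        // Step 3: take the first storage of the shuffled collection.
        return valid.get(0);
    }
}
```

      Shuffling the whole collection only to take its first element is equivalent to picking one random index, so valid.get(rng.nextInt(valid.size())) would give the same distribution with less work; the sketch keeps the shuffle to mirror the steps as described.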


        1. HDFS-11464.001.patch
          4 kB
          Yiqun Lin
        2. HDFS-11464.002.patch
          7 kB
          Yiqun Lin
        3. HDFS-11464.003.patch
          11 kB
          Yiqun Lin
        4. HDFS-11464.004.patch
          17 kB
          Yiqun Lin
        5. HDFS-11464.005.patch
          17 kB
          Yiqun Lin




              linyiqun Yiqun Lin
              linyiqun Yiqun Lin