  1. Hadoop HDFS
  2. HDFS-5682 Heterogeneous Storage phase 2 - APIs to expose Storage Types
  3. HDFS-5183

Combine ReplicaPlacementPolicy with VolumeChoosingPolicy to get a global view when choosing DN storage for a replica


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Implemented
    • Heterogeneous Storage (HDFS-2832)
    • None
    • Component/s: datanode, namenode, performance
    • None

    Description

      Per the discussion in HDFS-5157, there are two different ways to handle BlockPlacementPolicy and ReplicaChoosingPolicy when there are multiple storage types:
      1. The client specifies the required storage type when calling addBlock(..) on the NN. BlockPlacementPolicy in the NN chooses a set of datanodes, taking the storage type into account. The client then passes the required storage type to the chosen datanodes, and each datanode picks a particular storage using a VolumeChoosingPolicy.
      2. As before, the client specifies the required storage type when calling addBlock(..) on the NN. This time, BlockPlacementPolicy in the NN chooses a set of storages (instead of datanodes), and the client writes to those storages. VolumeChoosingPolicy is no longer needed and should be removed.
      We think option 2 is more powerful: it brings a global view to volume choosing and lets replica placement take storage status into account. We therefore propose combining the two policies.
      One concern is that this may increase the load on the NameNode, since volume choosing was previously done by the DN. We can verify this later (that is why I listed performance as a component).
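To make option 2 concrete, here is a minimal sketch of an NN-side chooser that returns storages rather than datanodes. It is only an illustration under stated assumptions: `PlacementSketch`, `Storage`, `StorageType` and `chooseStorages` are hypothetical names, not the actual HDFS classes, and the "most remaining space, at most one storage per datanode" rule is a stand-in for whatever the combined policy would really do.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of option 2; none of these types are real HDFS classes.
public class PlacementSketch {
    enum StorageType { DISK, SSD }

    /** One volume on a datanode, as the NN might know it from storage reports (assumed). */
    static final class Storage {
        final String datanode;
        final String storageId;
        final StorageType type;
        final long remainingBytes;

        Storage(String datanode, String storageId, StorageType type, long remainingBytes) {
            this.datanode = datanode;
            this.storageId = storageId;
            this.type = type;
            this.remainingBytes = remainingBytes;
        }
    }

    /**
     * Picks up to `replicas` storages of the required type, at most one per
     * datanode, preferring the storages with the most remaining space.
     * Because the NN sees every storage, no per-DN volume choice is needed.
     */
    static List<Storage> chooseStorages(List<Storage> all, StorageType required, int replicas) {
        List<Storage> candidates = new ArrayList<>();
        for (Storage s : all) {
            if (s.type == required) {
                candidates.add(s);
            }
        }
        // Sort by remaining space, largest first.
        candidates.sort(Comparator.comparingLong((Storage s) -> s.remainingBytes).reversed());

        List<Storage> chosen = new ArrayList<>();
        List<String> usedNodes = new ArrayList<>();
        for (Storage s : candidates) {
            if (chosen.size() == replicas) {
                break;
            }
            if (usedNodes.contains(s.datanode)) {
                continue; // keep replicas on distinct datanodes
            }
            usedNodes.add(s.datanode);
            chosen.add(s);
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<Storage> cluster = Arrays.asList(
            new Storage("dn1", "dn1-ssd0", StorageType.SSD, 10L),
            new Storage("dn1", "dn1-ssd1", StorageType.SSD, 80L),
            new Storage("dn2", "dn2-ssd0", StorageType.SSD, 50L),
            new Storage("dn2", "dn2-disk0", StorageType.DISK, 900L),
            new Storage("dn3", "dn3-disk0", StorageType.DISK, 700L));
        for (Storage s : chooseStorages(cluster, StorageType.SSD, 2)) {
            System.out.println(s.datanode + " -> " + s.storageId);
        }
    }
}
```

In this sketch the NN iterates over every storage instead of every datanode, which is where the extra NameNode load mentioned above would come from.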

          People

            Assignee: Unassigned
            Reporter: Junping Du (junping_du)