Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.6.0
    • Component/s: balancer & mover, namenode
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      In most Hadoop clusters, as more data is stored for longer periods, the demand for storage is outstripping the demand for compute. Hadoop needs a cost-effective and easy-to-manage solution to meet this demand for storage. The current options are:

      • Delete old, unused data. This carries the operational cost of identifying unnecessary data and deleting it manually.
      • Add more nodes to the cluster. This adds unnecessary compute capacity along with the storage capacity.

      Hadoop needs a way to decouple growing storage capacity from compute capacity. Nodes with higher-density, less expensive storage and low compute power are becoming available and can be used as cold storage in a cluster. Based on policy, data can be moved from hot storage to cold storage. Adding more nodes to the cold storage grows the storage capacity independently of the compute capacity in the cluster.
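
      The policy-driven move from hot to cold storage described above can be illustrated with a minimal sketch. This is not the HDFS implementation: the tier names, the `FileInfo` record, the `plan_migration` helper, and the 90-day threshold are all hypothetical, chosen only to show how an age-based policy might select files for migration.

```python
from dataclasses import dataclass

# Hypothetical tier names, used only for this illustration.
HOT = "HOT"
COLD = "COLD"


@dataclass
class FileInfo:
    """Hypothetical record describing a stored file."""
    path: str
    days_since_access: int
    tier: str = HOT


def plan_migration(files, cold_after_days=90):
    """Return the paths of files on hot storage that the policy would move to cold storage.

    The policy here is an assumption: a file qualifies once it has not been
    accessed for at least `cold_after_days` days.
    """
    return [
        f.path
        for f in files
        if f.tier == HOT and f.days_since_access >= cold_after_days
    ]


files = [
    FileInfo("/logs/2013/part-0", 400),
    FileInfo("/logs/2014/part-0", 10),
    FileInfo("/archive/2012/part-0", 700, tier=COLD),
]

# Only the old file that is still on hot storage is selected.
print(plan_migration(files))
```

      In this sketch, files already on cold storage are skipped, so running the plan repeatedly selects only data that still needs to move.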

        Attachments

        1. HDFSArchivalStorageDesign20140623.pdf
          152 kB
          Tsz Wo Nicholas Sze
        2. HDFSArchivalStorageDesign20140715.pdf
          166 kB
          Tsz Wo Nicholas Sze
        3. HDFS-6584.000.patch
          344 kB
          Jing Zhao
        4. h6584_20140907.patch
          372 kB
          Tsz Wo Nicholas Sze
        5. h6584_20140908.patch
          373 kB
          Tsz Wo Nicholas Sze
        6. archival-storage-testplan.pdf
          45 kB
          Jing Zhao
        7. h6584_20140908b.patch
          388 kB
          Jing Zhao
        8. h6584_20140911.patch
          395 kB
          Tsz Wo Nicholas Sze
        9. h6584_20140911b.patch
          394 kB
          Jing Zhao
        10. h6584_20140915.patch
          397 kB
          Jing Zhao
        11. h6584_20140916.patch
          402 kB
          Jing Zhao
        12. h6584_20140916.patch
          400 kB
          Jing Zhao
        13. h6584_20140917.patch
          420 kB
          Jing Zhao
        14. h6584_20140917b.patch
          420 kB
          Jing Zhao
        15. h6584_20140918.patch
          425 kB
          Tsz Wo Nicholas Sze
        16. h6584_20140918b.patch
          423 kB
          Tsz Wo Nicholas Sze

          Issue Links

          1. Archival Storage: Add block storage policy Sub-task Resolved Tsz Wo Nicholas Sze
          2. Archival Storage: Consider block storage policy in replication Sub-task Resolved Tsz Wo Nicholas Sze
          3. Archival Storage: Change INodeFile and FSImage to support storage policy Sub-task Resolved Tsz Wo Nicholas Sze
          4. Archival Storage: Bump NameNodeLayoutVersion and update editsStored test files Sub-task Resolved Vinayakumar B
          5. Archival Storage: Use fallback storage types Sub-task Resolved Tsz Wo Nicholas Sze
          6. Archival Storage: Consider block storage policy in replica deletion Sub-task Resolved Tsz Wo Nicholas Sze
          7. Archival Storage: Add a new data migration tool Sub-task Resolved Tsz Wo Nicholas Sze
          8. Archival Storage: Add a new API to set storage policy Sub-task Resolved Jing Zhao
          9. Archival Storage: Extend HdfsFileStatus to get storage policy Sub-task Resolved Jing Zhao
          10. Archival Storage: Support storage policy on directories Sub-task Resolved Jing Zhao
          11. Archival Storage: Support migration for snapshot paths Sub-task Resolved Jing Zhao
          12. Archival Storage: add user documentation Sub-task Resolved Tsz Wo Nicholas Sze
          13. Archival Storage: support migration for a list of specified paths Sub-task Resolved Jing Zhao
          14. Archival Storage: support set/get storage policy in DFSAdmin Sub-task Resolved Jing Zhao
          15. Archival Storage: Add more tests for BlockStoragePolicy Sub-task Resolved Tsz Wo Nicholas Sze
          16. Archival Storage: check if a block is already scheduled in Mover Sub-task Resolved Tsz Wo Nicholas Sze
          17. Archival Storage: check the storage type of delNodeHintStorage when deleting a replica Sub-task Resolved Tsz Wo Nicholas Sze
          18. Archival Storage: add retry and termination logic for Mover Sub-task Resolved Jing Zhao
          19. Archival Storage: BlockPlacementPolicy#chooseTarget should check each valid storage type in each choosing round Sub-task Resolved Jing Zhao
          20. Archival Storage: INode#getStoragePolicyID should always return the latest storage policy Sub-task Resolved Jing Zhao
          21. Archival Storage: add more tests for data migration and replication Sub-task Resolved Tsz Wo Nicholas Sze
          22. Archival Storage: Mover does not terminate when some storage type is out of space Sub-task Resolved Tsz Wo Nicholas Sze
          23. Archival Storage: FSDirectory should not get storage policy id from symlinks Sub-task Resolved Tsz Wo Nicholas Sze
          24. Archival Storage: fix TestDFSInotifyEventInputStream and TestDistributedFileSystem Sub-task Resolved Tsz Wo Nicholas Sze
          25. Archival Storage: Fix TestBlockPlacement and TestStorageMover Sub-task Resolved Jing Zhao
          26. Archival Storage: fix Balancer tests Sub-task Resolved Tsz Wo Nicholas Sze
          27. Archival Storage: Add Mover into hdfs script Sub-task Resolved Jing Zhao
          28. Archival Storage: skip under construction block for migration Sub-task Resolved Jing Zhao
          29. Fix TestBlockManager#testUseDelHint Sub-task Resolved Jing Zhao
          30. Add new DistributedFileSystem API for getting all the existing storage policies Sub-task Closed Jing Zhao
          31. Archival Storage: fix TestBalancer and TestBalancerWithMultipleNameNodes Sub-task Resolved Tsz Wo Nicholas Sze
          32. TestStorageMover often fails in Jenkins Sub-task Closed Jing Zhao
          33. Add a tool to list all the existing block storage policies Sub-task Closed Jing Zhao
          34. Archival Storage: getStoragePolicy should not need superuser privilege Sub-task Resolved Brahma Reddy Battula

              People

              • Assignee:
                szetszwo Tsz Wo Nicholas Sze
                Reporter:
                szetszwo Tsz Wo Nicholas Sze
              • Votes:
                0
                Watchers:
                46