Hadoop HDFS / HDFS-6584

Support Archival Storage

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.6.0
    • Component/s: balancer & mover, namenode
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      In most Hadoop clusters, as more and more data is stored for longer periods, the demand for storage is outstripping the demand for compute. Hadoop needs a cost-effective and easy-to-manage solution to meet this demand for storage. The current options are:

      • Delete old, unused data. This carries the operational cost of identifying unnecessary data and deleting it manually.
      • Add more nodes to the cluster. This adds unnecessary compute capacity along with the storage capacity.

      Hadoop needs a solution that decouples growing storage capacity from compute capacity. Nodes with denser, less expensive storage and low compute power are becoming available and can be used as cold storage in a cluster. Based on policy, data can be moved from hot storage to cold storage. Adding more nodes to the cold storage then grows storage capacity independently of the cluster's compute capacity.

      Attachments

        1. HDFSArchivalStorageDesign20140715.pdf
          166 kB
          Tsz-wo Sze
        2. HDFSArchivalStorageDesign20140623.pdf
          152 kB
          Tsz-wo Sze
        3. HDFS-6584.000.patch
          344 kB
          Jing Zhao
        4. h6584_20140918b.patch
          423 kB
          Tsz-wo Sze
        5. h6584_20140918.patch
          425 kB
          Tsz-wo Sze
        6. h6584_20140917b.patch
          420 kB
          Jing Zhao
        7. h6584_20140917.patch
          420 kB
          Jing Zhao
        8. h6584_20140916.patch
          402 kB
          Jing Zhao
        9. h6584_20140916.patch
          400 kB
          Jing Zhao
        10. h6584_20140915.patch
          397 kB
          Jing Zhao
        11. h6584_20140911b.patch
          394 kB
          Jing Zhao
        12. h6584_20140911.patch
          395 kB
          Tsz-wo Sze
        13. h6584_20140908b.patch
          388 kB
          Jing Zhao
        14. h6584_20140908.patch
          373 kB
          Tsz-wo Sze
        15. h6584_20140907.patch
          372 kB
          Tsz-wo Sze
        16. archival-storage-testplan.pdf
          45 kB
          Jing Zhao

        Issue Links

          1. Archival Storage: Add block storage policy (Sub-task, Resolved, Tsz-wo Sze)
          2. Archival Storage: Consider block storage policy in replication (Sub-task, Resolved, Tsz-wo Sze)
          3. Archival Storage: Change INodeFile and FSImage to support storage policy (Sub-task, Resolved, Tsz-wo Sze)
          4. Archival Storage: Bump NameNodeLayoutVersion and update editsStored test files (Sub-task, Resolved, Vinayakumar B)
          5. Archival Storage: Use fallback storage types (Sub-task, Resolved, Tsz-wo Sze)
          6. Archival Storage: Consider block storage policy in replica deletion (Sub-task, Resolved, Tsz-wo Sze)
          7. Archival Storage: Add a new data migration tool (Sub-task, Resolved, Tsz-wo Sze)
          8. Archival Storage: Add a new API to set storage policy (Sub-task, Resolved, Jing Zhao)
          9. Archival Storage: Extend HdfsFileStatus to get storage policy (Sub-task, Resolved, Jing Zhao)
          10. Archival Storage: Support storage policy on directories (Sub-task, Resolved, Jing Zhao)
          11. Archival Storage: Support migration for snapshot paths (Sub-task, Resolved, Jing Zhao)
          12. Archival Storage: add user documentation (Sub-task, Resolved, Tsz-wo Sze)
          13. Archival Storage: support migration for a list of specified paths (Sub-task, Resolved, Jing Zhao)
          14. Archival Storage: support set/get storage policy in DFSAdmin (Sub-task, Resolved, Jing Zhao)
          15. Archival Storage: Add more tests for BlockStoragePolicy (Sub-task, Resolved, Tsz-wo Sze)
          16. Archival Storage: check if a block is already scheduled in Mover (Sub-task, Resolved, Tsz-wo Sze)
          17. Archival Storage: check the storage type of delNodeHintStorage when deleting a replica (Sub-task, Resolved, Tsz-wo Sze)
          18. Archival Storage: add retry and termination logic for Mover (Sub-task, Resolved, Jing Zhao)
          19. Archival Storage: BlockPlacementPolicy#chooseTarget should check each valid storage type in each choosing round (Sub-task, Resolved, Jing Zhao)
          20. Archival Storage: INode#getStoragePolicyID should always return the latest storage policy (Sub-task, Resolved, Jing Zhao)
          21. Archival Storage: add more tests for data migration and replication (Sub-task, Resolved, Tsz-wo Sze)
          22. Archival Storage: Mover does not terminate when some storage type is out of space (Sub-task, Resolved, Tsz-wo Sze)
          23. Archival Storage: FSDirectory should not get storage policy id from symlinks (Sub-task, Resolved, Tsz-wo Sze)
          24. Archival Storage: fix TestDFSInotifyEventInputStream and TestDistributedFileSystem (Sub-task, Resolved, Tsz-wo Sze)
          25. Archival Storage: Fix TestBlockPlacement and TestStorageMover (Sub-task, Resolved, Jing Zhao)
          26. Archival Storage: fix Balancer tests (Sub-task, Resolved, Tsz-wo Sze)
          27. Archival Storage: Add Mover into hdfs script (Sub-task, Resolved, Jing Zhao)
          28. Archival Storage: skip under construction block for migration (Sub-task, Resolved, Jing Zhao)
          29. Fix TestBlockManager#testUseDelHint (Sub-task, Resolved, Jing Zhao)
          30. Add new DistributedFileSystem API for getting all the existing storage policies (Sub-task, Closed, Jing Zhao)
          31. Archival Storage: fix TestBalancer and TestBalancerWithMultipleNameNodes (Sub-task, Resolved, Tsz-wo Sze)
          32. TestStorageMover often fails in Jenkins (Sub-task, Closed, Jing Zhao)
          33. Add a tool to list all the existing block storage policies (Sub-task, Closed, Jing Zhao)
          34. Archival Storage: getStoragePolicy should not need superuser privilege (Sub-task, Resolved, Brahma Reddy Battula)

            People

              Assignee: Tsz-wo Sze (szetszwo)
              Reporter: Tsz-wo Sze (szetszwo)
              Votes: 0
              Watchers: 39
