  1. Hadoop HDFS
  2. HDFS-14627

Improvements to make slow archive storage work on HDFS


Details

    • Type: New Feature
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved

    Description

      In our setup, we mount archival storage from a remote system. The write speed is about 20 MB/s, the read speed is about 40 MB/s, and normal file operations such as 'ls' are time-consuming.
      We added some improvements to make this kind of archive storage work in the current HDFS system:

      1. Apply a multiplier to the read/write timeout when a block is stored on archive storage.
      2. Save the replica cache file of archive storage to another, faster disk so that the DataNode restarts quickly; the shutdown hook may not finish if saving takes too long.
      3. Check that the file system is mounted before using mounted archive storage.
      4. Reduce or avoid calls to DF when generating the heartbeat report for archive storage.
      5. Add an option to skip archive blocks during decommission.
      6. Use multiple threads to scan archive storage.
      7. Check for archive storage errors with a number of retries.
      8. Add an option to disable block scanning on archive storage.
      9. Sleep for one heartbeat interval if there are too many differences when calling checkAndUpdate in DirectoryScanner.
      10. An automatic service that scans the fsimage and sets the storage policy of files according to a policy.
      11. An automatic service that calls the Mover to move blocks to the right storage.
      12. Deduplicate files on remote storage if the storage is reliable.
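
      As a rough illustration of items 1 and 7 above, the sketch below shows one way a timeout multiplier and a retried health check could look. It is not taken from the attached patch: the class ArchiveStorageSketch, the enum StorageKind, and the methods scaledTimeoutMs and healthyWithRetries are hypothetical names, and the multiplier and retry count are assumed to come from configuration.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch; none of these names come from HDFS or the attached patch.
class ArchiveStorageSketch {

    /** Simplified stand-in for a storage type; assumed for illustration only. */
    enum StorageKind { DISK, SSD, ARCHIVE }

    /**
     * Item 1: scale the base read/write timeout only for blocks on archive storage.
     * archiveMultiplier is an assumed configuration value (must be >= 1).
     */
    static long scaledTimeoutMs(long baseTimeoutMs, StorageKind kind, int archiveMultiplier) {
        if (archiveMultiplier < 1) {
            throw new IllegalArgumentException("multiplier must be >= 1");
        }
        if (kind == StorageKind.ARCHIVE) {
            // multiplyExact throws ArithmeticException on overflow instead of wrapping.
            return Math.multiplyExact(baseTimeoutMs, (long) archiveMultiplier);
        }
        return baseTimeoutMs;
    }

    /**
     * Item 7: treat the storage as failed only after maxAttempts consecutive
     * failed checks, so a single slow or flaky probe does not mark it bad.
     */
    static boolean healthyWithRetries(BooleanSupplier check, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (check.getAsBoolean()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(scaledTimeoutMs(60_000L, StorageKind.ARCHIVE, 5));
        System.out.println(scaledTimeoutMs(60_000L, StorageKind.DISK, 5));
        System.out.println(healthyWithRetries(() -> false, 3));
    }
}
```

      A real implementation would probably wait between retries and read both values from the DataNode configuration; the sketch omits that to stay short.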

      Attachments

        1. HDFS-14627.patch
          24 kB
          Yang Yun
        2. data_flow_between_datanode_and_aws_s3.jpg
          17 kB
          Yang Yun


            People

              Assignee: Yang Yun (hadoop_yangyun)
              Reporter: Yang Yun (hadoop_yangyun)
              Votes: 0
              Watchers: 4
