Details
- Type: New Feature
- Status: Open
- Priority: Minor
- Resolution: Unresolved
Description
In our setup, we mount archival storage from a remote server. The write speed is about 20 MB/s, the read speed is about 40 MB/s, and normal file operations, for example 'ls', are time-consuming.
We added the following improvements to make this kind of archive storage work in the current HDFS system.
1. Add a multiplier to the read/write timeout if the block is saved on archive storage.
2. Save the replica cache file of archive storage to another, faster disk for quick DataNode restart; the shutdown hook may not execute if saving takes too long.
3. Check the mounted file system before using mounted archive storage.
4. Reduce or avoid calling DF when generating the heartbeat report for archive storage.
5. Add an option to skip archive blocks during decommission.
6. Use multiple threads to scan archive storage.
7. Check archive storage errors with a configurable number of retries.
8. Add an option to disable block scanning on archive storage.
9. Sleep for one heartbeat interval if there are too many differences when calling checkAndUpdate in DirectoryScanner.
10. An auto-service that scans the fsimage and sets the storage policy of files according to policy.
11. An auto-service that calls the Mover to move blocks to the right storage.
12. Dedup files on remote storage if the storage is reliable.
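As a rough illustration of improvement 1, the idea is to scale the base socket timeout by a configurable multiplier whenever the replica lives on ARCHIVE storage. This is only a hedged sketch: the class, method, and parameter names below are hypothetical and do not reflect the actual patch in HDFS.

```java
// Hypothetical sketch of improvement 1: scale read/write timeouts by a
// configurable multiplier when the target replica is on ARCHIVE storage.
// Names here are illustrative, not the real HDFS patch.
public class ArchiveTimeoutExample {
    public enum StorageType { DISK, SSD, ARCHIVE }

    // Returns the effective timeout in milliseconds. Slow remote archive
    // media (~20 MB/s writes in our setup) need a longer timeout than the
    // base value used for local DISK/SSD volumes.
    public static long effectiveTimeout(long baseTimeoutMs, StorageType type,
                                        int archiveMultiplier) {
        return type == StorageType.ARCHIVE
                ? baseTimeoutMs * archiveMultiplier
                : baseTimeoutMs;
    }

    public static void main(String[] args) {
        // With a 60 s base timeout and a 4x archive multiplier:
        System.out.println(effectiveTimeout(60000, StorageType.ARCHIVE, 4)); // 240000
        System.out.println(effectiveTimeout(60000, StorageType.DISK, 4));    // 60000
    }
}
```

In practice the multiplier would come from configuration so operators can tune it per cluster without code changes.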
Attachments
Issue Links
- is a parent of:
  - HDFS-15033 Support to save replica cached files to other place and make expired time configurable (Resolved)
  - HDFS-15221 Add checking of effective filesystem during initializing storage locations (Resolved)
  - HDFS-15022 Add new RPC to transfer data block with external shell script across Datanode (Patch Available)
  - HDFS-15059 Cache finalized replica info during datanode shutdown for fast restarting (Patch Available)
  - HDFS-15188 Add option to set Write/Read timeout extension for different StorageType (Patch Available)
  - HDFS-15412 Add options to set different block scan period for diffrent StorageType (Patch Available)