Details
- Type: Brainstorming
- Status: Closed
- Priority: Major
- Resolution: Implemented
Description
Consider how we might enable tiered HFile storage. If HDFS has the capability, we could create certain files on solid state devices, where they might be frequently accessed, especially for random reads, and others (the default) on spinning media as before. We could also support moving frequently read HFiles from spinning media to solid state. We already have column family (CF) statistics for this and would only need to add the requisite admin interface; we could even consider an autotiering option.
Dhruba Borthakur did some early work in this area and wrote up his findings: http://hadoopblog.blogspot.com/2012/05/hadoop-and-solid-state-drives.html . The findings are important to note, but I suggest most of the recommendations are out of scope for this JIRA. This JIRA seeks an initial use case that produces a reasonable benefit and serves as a testbed for further improvements.
If I may paraphrase Dhruba's findings (any misstatements and errors are mine): First, the DFSClient code paths introduce significant latency, so the HDFS client (and presumably the DataNode, as the next bottleneck) will need significant work to bring that down. We need to investigate optimized (perhaps read-only) DFS clients and server-side read and caching strategies. Second, RegionServers are heavily threaded, which imposes substantial monitor contention and context-switching cost. We need to investigate reducing the number of threads in a RegionServer, as well as nonblocking IO and RPC.
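To make the autotiering idea in the description a little more concrete, here is a minimal sketch of one possible placement decision: rank HFiles by random reads per byte and greedily assign the hottest ones to solid state until a capacity budget is used up. Everything in it is an assumption for illustration only. `HFileStat`, `TierPlanner`, `pickForSsd`, the file names, and the budget are hypothetical and are not HBase or HDFS APIs, and the greedy ranking is just one plausible heuristic, not a design this issue settled on.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical per-HFile statistics; not an HBase class. */
final class HFileStat {
    final String path;
    final long sizeBytes;
    final long randomReads;

    HFileStat(String path, long sizeBytes, long randomReads) {
        this.path = path;
        this.sizeBytes = sizeBytes;
        this.randomReads = randomReads;
    }

    /** Read density: random reads served per stored byte. */
    double readsPerByte() {
        return sizeBytes == 0 ? 0.0 : (double) randomReads / sizeBytes;
    }
}

/** Hypothetical planner that decides which files should live on SSD. */
final class TierPlanner {
    /**
     * Greedy selection: visit files from highest to lowest read density and
     * keep each one that still fits in the remaining SSD budget.
     * Files with no random reads are never promoted.
     */
    static List<String> pickForSsd(List<HFileStat> stats, long ssdBudgetBytes) {
        List<HFileStat> sorted = new ArrayList<>(stats);
        sorted.sort((a, b) -> Double.compare(b.readsPerByte(), a.readsPerByte()));

        List<String> chosen = new ArrayList<>();
        long used = 0;
        for (HFileStat s : sorted) {
            if (s.randomReads == 0) {
                continue; // cold file: leave on spinning media
            }
            if (used + s.sizeBytes <= ssdBudgetBytes) {
                chosen.add(s.path);
                used += s.sizeBytes;
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        // Made-up statistics for four files and a 350-byte SSD budget.
        List<HFileStat> stats = Arrays.asList(
            new HFileStat("hfile-a", 100, 1000),
            new HFileStat("hfile-b", 400, 800),
            new HFileStat("hfile-c", 200, 1000),
            new HFileStat("hfile-d", 300, 0));
        System.out.println(pickForSsd(stats, 350));
    }
}
```

With the made-up numbers in `main`, `hfile-a` and `hfile-c` are selected: they have the highest read density and together fit in the budget, while `hfile-b` is too large for the remaining space and `hfile-d` is never read. Acting on such a plan would presumably go through whatever storage-tier mechanism HDFS exposes (see the linked HDFS issues); that step is deliberately left out of the sketch.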
Issue Links
depends upon:
- HDFS-3672 Expose disk-location information for blocks to enable better scheduling (Closed)
duplicates:
- HBASE-12935 Does any one consider the performance of HBase on SSD? (Closed)
is related to:
- HDFS-6875 Archival Storage: support migration for a list of specified paths (Resolved)
- HDFS-4672 Support tiered storage policies (Resolved)
- HBASE-4755 HBase based block placement in DFS (Closed)
- HBASE-7404 Bucket Cache:A solution about CMS,Heap Fragment and Big Cache on HBASE (Closed)
- HDFS-2832 Enable support for heterogeneous storages in HDFS - DN as a collection of storages (Closed)
- HDFS-5682 Heterogeneous Storage phase 2 - APIs to expose Storage Types (Closed)
- HDFS-6584 Support Archival Storage (Closed)
- HBASE-6799 Store more metadata in HFiles (Closed)
relates to:
- HBASE-18118 Default storage policy if not configured cannot be "NONE" (Resolved)
- HDFS-4672 Support tiered storage policies (Resolved)
- HDFS-3672 Expose disk-location information for blocks to enable better scheduling (Closed)
Sub-Tasks
1. Utilize Flash storage for WAL | Closed | Ted Yu
2. Utilize Flash storage for flushing | Closed | Unassigned
3. Add more docs and a basic check for storage policy handling | Closed | Sean Busbey
4. Support CF-level Storage Policy | Closed | Yu Li
5. Support setting storage policy in bulkload | Closed | Yu Li
6. Add document about Heterogeneous Storage Management (HSM) in hbase | Open | Yu Li