Thanks for filing this one, Sanjay. Here's a bit of input if it'll help.
HDFS-918 is an attempt at moving the datanode away from its (two?) threads per open file – which is a killer for HBase workloads (Mozilla had datanodes running 8k+ threads because they had about 1k regions up on each node of their ~20-node cluster). HBase keeps all its files open to save a round trip to the Namenode inline with every random read. The patch that has been posted has been through many iterations, currently covers the read path only (the important one as far as HBase is concerned), seems to work in basic testing by me and others, and holds lots of promise (or let's just rewrite the datanode – smile). The patch is pretty big; Todd suggests we get it in in smaller pieces, but there is also an argument for dropping the big patch in as-is (related: HDFS-223, HDFS-285, and HDFS-374, which I think can now be closed).
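To make the thread math above concrete, here's a rough back-of-envelope sketch. The 1k regions and two threads per open file are from the report above; the four open store files per region is an assumption of mine for illustration:

```java
// Back-of-envelope estimate of datanode thread count under the current
// threads-per-open-file model. Numbers are illustrative only.
public class ThreadEstimate {
    static int datanodeThreads(int regionsPerServer, int openFilesPerRegion, int threadsPerOpenFile) {
        return regionsPerServer * openFilesPerRegion * threadsPerOpenFile;
    }

    public static void main(String[] args) {
        // ~1k regions per node (reported), ~4 open store files per region
        // (assumed), 2 threads per open file (the model HDFS-918 replaces).
        System.out.println(datanodeThreads(1000, 4, 2)); // 8000 – in the reported 8k+ ballpark
    }
}
```

Even modest changes in open files per region swing this by thousands of threads, which is why multiplexing readers rather than dedicating threads matters so much here.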
Next up would be some kind of keepalive on pread. At the moment we set up a new socket on each pread (HBase uses pread for its random lookups) even though we are often reading from the very block we just read (see HDFS-380). Chatting w/ some of the lads, fixing this – HDFS-941 – is probably the least intrusive of the attached issues, and it would get us a pretty nice improvement.
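The shape of the HDFS-941 fix could look something like the sketch below: cache the reader per (datanode, block) so back-to-back preads against the same block skip the socket setup. Names and types here are illustrative stand-ins, not the actual HDFS API:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of pread keepalive: reuse the connection when
// consecutive preads hit the same block on the same datanode.
public class PreadReaderCache {
    static int connects = 0; // counts simulated socket setups

    static final Map<String, Object> cache = new HashMap<>();

    static Object readerFor(String datanode, long blockId) {
        String key = datanode + "/" + blockId;
        return cache.computeIfAbsent(key, k -> {
            connects++;           // a real implementation would open a socket here
            return new Object();  // stand-in for a block reader
        });
    }

    public static void main(String[] args) {
        readerFor("dn1:50010", 42L); // first pread: sets up the connection
        readerFor("dn1:50010", 42L); // same block: reuses it
        readerFor("dn1:50010", 42L); // ditto
        System.out.println(connects); // 1 connect for 3 preads, not 3
    }
}
```

A real version would also need eviction and staleness handling, but the win is the same: the common HBase pattern of many preads against one hot block stops paying connection setup every time.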
HDFS-347 is radical, but in hackups it has already been demonstrated to make for a massive improvement in both latency AND CPU use (Nathan asked in a chat on Thursday why this makes for such a big win – what is the network read path doing that causes such a slowdown? I think Dhruba makes the same point inline in the issue).
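For anyone not following the issue, the core idea is simple: when the client is on the same host as the datanode, read the block file straight off the local filesystem instead of pulling the bytes through a datanode socket. A minimal sketch of the local read half, with a temp file standing in for a block file (the real work in HDFS-347 is the secure block-path handoff, which this skips entirely):

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Illustrative sketch only: positioned read of a "block file" directly
// from the local filesystem – no socket, no datanode thread, no extra copies.
public class LocalBlockRead {
    static byte[] readLocal(Path blockFile, long offset, int len) throws IOException {
        try (FileChannel ch = FileChannel.open(blockFile, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate(len);
            ch.read(buf, offset); // positioned read, analogous to pread(2)
            return buf.array();
        }
    }

    public static void main(String[] args) throws IOException {
        Path block = Files.createTempFile("blk_", ".data"); // stand-in for a block file
        Files.write(block, "hello block".getBytes());
        System.out.println(new String(readLocal(block, 6, 5))); // prints "block"
        Files.delete(block);
    }
}
```

That's my take on why the win is so big: the network version pays for the socket, the datanode-side thread, checksum streaming, and at least one extra copy of every byte, where the local path is one pread against the page cache.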
HDFS-1034 looks good.
HDFS-236 looks like an effort worth reviving.
That's enough for now.