CanUnbuffer ain't too pretty. Unbufferable is about as ugly. It's fine as is, I suppose.
It's consistent with our other "input stream extension" interfaces such as Syncable, CanSetReadahead, etc. The problem is that we can't add the new APIs to FSInputStream, or else we'd break a bunch of non-HDFS streams (in and out of the tree) that don't implement the new API. I guess Java is adding default implementations for interface functions in some future version... too bad we're not there yet.
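For illustration, here is a minimal sketch of the "optional extension interface" pattern described above. It does not use the real Hadoop classes: `CanUnbufferSketch`, `BufferedSketchStream`, and `UnbufferSketch.tryUnbuffer` are hypothetical names made up for this example. The point is only that callers probe for the capability with `instanceof`, so streams that don't implement it keep compiling and working.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;

// Hypothetical stand-in for an optional "input stream extension" interface.
interface CanUnbufferSketch {
    void unbuffer();
}

// A stream that opts in to the extra capability.
class BufferedSketchStream extends ByteArrayInputStream implements CanUnbufferSketch {
    boolean buffersHeld = true; // pretend this tracks sockets/buffers held by the stream

    BufferedSketchStream(byte[] data) {
        super(data);
    }

    @Override
    public void unbuffer() {
        // Release resources, but leave the stream open and its position untouched.
        buffersHeld = false;
    }
}

class UnbufferSketch {
    // Calls unbuffer() only when the stream supports it.
    // Returns true if the capability was present, false otherwise.
    static boolean tryUnbuffer(InputStream in) {
        if (in instanceof CanUnbufferSketch) {
            ((CanUnbufferSketch) in).unbuffer();
            return true;
        }
        return false;
    }
}
```

A plain `ByteArrayInputStream` passed to `tryUnbuffer` is simply left alone, which is the behavior we want for non-HDFS streams that never heard of the new API.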
In DFSIS#unbuffer, should we be resetting data members back to zero, etc?
I'm not sure what else we'd reset. This isn't changing the closed state, it's not a seek so the pos is not affected, it's not changing the cachingStrategy or fileEncryptionInfo... we certainly don't want to clear the block location info because then we need to do an RPC to the NN to get it again...
Actually I do see one thing we should change. We should set blockEnd to -1. Otherwise, seek may attempt to use blockReader even though it's null. It seems like this is also a problem in closeCurrentBlockReader. And let me add a seek after the unbuffer in testUnbufferClosesSockets to make sure that this doesn't regress.
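To make the blockEnd issue concrete, here is a heavily simplified model of it. This is not the DFSInputStream code: the class `SeekAfterUnbufferSketch`, its `Reader` type, the `BLOCK_SIZE` constant, and the `resetBlockEnd` flag are all assumptions invented for the sketch. It only shows why a stale blockEnd lets seek take the "still inside the current block" path and touch a null reader.

```java
class SeekAfterUnbufferSketch {
    // Hypothetical stand-in for a block reader; it only counts skipped bytes.
    static final class Reader {
        long skipped = 0;

        void skip(long n) {
            skipped += n;
        }
    }

    static final long BLOCK_SIZE = 128; // assumed block size for the model

    long pos = 0;
    long blockEnd = -1;        // -1 means "no current block"
    Reader blockReader = null;
    int reopenCount = 0;

    private void openBlockAt(long target) {
        blockReader = new Reader();
        blockEnd = (target / BLOCK_SIZE + 1) * BLOCK_SIZE - 1;
        pos = target;
        reopenCount++;
    }

    void seek(long target) {
        if (target >= pos && target <= blockEnd) {
            // Fast path: assumes blockReader is valid whenever blockEnd is.
            blockReader.skip(target - pos);
            pos = target;
        } else {
            openBlockAt(target);
        }
    }

    // resetBlockEnd=false models the bug; true models the proposed fix.
    void unbuffer(boolean resetBlockEnd) {
        blockReader = null;
        if (resetBlockEnd) {
            blockEnd = -1;
        }
    }
}
```

With `unbuffer(false)`, a later forward seek within the old block dereferences the null reader; with `unbuffer(true)`, the same seek falls through to reopening the block.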
In testOpenManyFilesViaTcp, we assert that we can read, but is there a reason we would not be able to read without unbuffer? (pardon if dumb question)
Not a dumb question at all. What I was testing here was that opening a lot of files didn't consume too many resources. In my local test environment, I increased NUM_OPENS to be a really big number... I didn't want to burden Jenkins too much, though. testUnbufferClosesSockets is a more "direct" and straightforward test than testOpenManyFilesViaTcp... the latter is perhaps more of a stress test.