remove dead code
in LocalBlockReader.newBlockReader, if the checksum file open fails, then the dataIn stream will be leaked. You need a try..finally to close these if the open wasn't successful
why does LocalBlockReader extend RemoteBlockReader2? It should be its own implementation of BlockReader - it doesn't really share much code with its superclass (looks like only skip, which is trivial anyway). In terms of member variables, it uses some, but ignores the rest.
The cached proxies to the local DN currently don't have the UGI as part of the key. So if a client needs to impersonate different users, the cache will end up caching client proxies associated with previous users, which would be incorrect.
Same is true of different timeouts, right?
The implementation of read is fairly inefficient - reading and verifying checksum on one chunk at a time does a poor job of amortizing JNI/syscall overhead. Should probably read 64KB or 128KB at a time. You can address this in a followup.
+ checksum.verifyChunkedSums(dataBuff, checksumBuff, filename, off);
Wrong offset is passed here - you should be passing the position of dataIn (the block offset), not the offset into the target byte array
The implementation of read looks incorrect to me. If you read less than one checksum chunk, and you want to verify checksums then it will always read an entire checksum chunk, so the next read will have incorrect results.
In DFSClient, it looks very inefficient to be calling isLocalAddress without any caching.. iterating over the network interfaces is not cheap (lots of syscalls).
Also, the way this is currently designed, each seek will result in the files being re-opened and the metadata re-parsed. I imagine this is bad for efficiency.
Why not integrate the short circuit construction into getBlockReader instead of separately at the two call-sites?
please expand imports
BlockLocalPathInfo needs audience/stability annotations. So does BlockMetadataHeader now that it's public.
Also, since we now have the protocol-independence layer, does this .protocol class need to be Writable? Or can it be handled by the translation layer?
Rather than a single user for the local path access, why not use an ACL, given we already have support for parsing/checking ACLs? I can imagine some people may have multiple applications that do local random-read access that could benefit from this.
checkKerberosAuthMethod purports to be a general check, but the error message specifically references getBlockPathInfo.
getBlockLocalPathInfo has some bad indentation. I'd also change those log messages to TRACE level.
In FSDataset.getBlockLocalPathInfo, no need to qualify FSDataset.getMetaFile.
Test wise, you should cover positional reads and differently sized reads. I think this would turn up a bug as pointed out above with reads smaller than a checksum chunk size.