RCFile::sync(long) takes approximately 1 second every time it is called because of the inner loops in the function.
As observed in HDFS-4710, single-byte reads are an order of magnitude slower than reads into a larger 512-byte buffer.
Even when disk I/O is buffered to this size, there is overhead from the synchronized read() methods in the BlockReaderLocal and RemoteBlockReader classes.
Replacing the readByte() calls in RCFile.sync(long) with a readFully() call into a 512-byte buffer should speed this function up by more than 10x.
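
A minimal sketch of the idea, not the actual RCFile code: instead of pulling one byte per call, read in 512-byte chunks and search for the marker in memory, carrying over the last few bytes so that a marker straddling two chunks is still found. The class name SyncScanSketch, the method findMarker, and the 16-byte marker used in main are hypothetical and chosen only for illustration. A plain InputStream stands in for the HDFS stream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical illustration; names and marker layout are assumptions, not Hive's.
final class SyncScanSketch {
    // Chunk size taken from the 512-byte figure in the description.
    static final int CHUNK = 512;

    // Returns the offset, relative to where the stream started, of the first
    // occurrence of marker, or -1 if the stream ends before one is found.
    static long findMarker(InputStream in, byte[] marker) throws IOException {
        if (marker.length == 0) {
            return 0;
        }
        int keep = marker.length - 1;      // tail bytes carried between chunks
        byte[] buf = new byte[keep + CHUNK];
        int filled = 0;                    // number of valid bytes in buf
        long base = 0;                     // stream offset of buf[0]
        while (true) {
            // One bulk read replaces up to CHUNK single-byte reads.
            int n = in.read(buf, filled, buf.length - filled);
            if (n < 0) {
                return -1;
            }
            filled += n;
            int last = filled - marker.length;
            for (int i = 0; i <= last; i++) {
                if (matches(buf, i, marker)) {
                    return base + i;
                }
            }
            // Keep the unscanned tail so a marker split across chunks is not missed.
            int carry = Math.min(keep, filled);
            System.arraycopy(buf, filled - carry, buf, 0, carry);
            base += filled - carry;
            filled = carry;
        }
    }

    private static boolean matches(byte[] buf, int off, byte[] marker) {
        for (int j = 0; j < marker.length; j++) {
            if (buf[off + j] != marker[j]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        byte[] marker = new byte[16];      // illustrative 16-byte marker
        for (int i = 0; i < marker.length; i++) {
            marker[i] = (byte) (i + 1);
        }
        byte[] data = new byte[2000];
        System.arraycopy(marker, 0, data, 520, marker.length);
        System.out.println(findMarker(new ByteArrayInputStream(data), marker));
    }
}
```

The sketch uses read(byte[], int, int) rather than readFully() so that a short final chunk at end of stream does not raise EOFException. An actual patch using readFully() would need to handle the end-of-file case separately.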