Accumulo's tracing functionality records nested regions of timing information for regular operations occurring inside Accumulo, e.g. scans and compactions. The Accumulo monitor provides a basic view of this information. The whole thing can be thought of as a distributed timing infrastructure for Accumulo that uses Accumulo itself to store its data.
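To make "nested regions of timing information" concrete, here is a minimal, self-contained sketch of a span tree. The names `NestedSpanSketch`, `Span`, and `traceFakeScan` are hypothetical and exist only for illustration; this is not Accumulo's tracing API, and the "scan" here does no real work.

```java
import java.util.ArrayList;
import java.util.List;

class NestedSpanSketch {
    /** Hypothetical span type: one timed region that may contain child regions. */
    static final class Span {
        final String description;
        final Span parent;                       // null for the root span
        final List<Span> children = new ArrayList<>();
        private final long startNanos;
        private long stopNanos = -1;             // -1 while the span is still running

        Span(String description, Span parent) {
            this.description = description;
            this.parent = parent;
            this.startNanos = System.nanoTime();
            if (parent != null) {
                parent.children.add(this);
            }
        }

        /** Starts a nested region inside this one. */
        Span child(String description) {
            return new Span(description, this);
        }

        /** Ends the region and returns the parent, so callers can pop back up. */
        Span stop() {
            stopNanos = System.nanoTime();
            return parent;
        }

        long elapsedNanos() {
            if (stopNanos < 0) {
                throw new IllegalStateException("span still running: " + description);
            }
            return stopNanos - startNanos;
        }

        /** Appends one line per span, indented by nesting depth. */
        void render(StringBuilder sb, int depth) {
            for (int i = 0; i < depth; i++) {
                sb.append("  ");
            }
            sb.append(description).append(' ')
              .append(elapsedNanos() / 1000).append("us\n");
            for (Span c : children) {
                c.render(sb, depth + 1);
            }
        }
    }

    /** Simulates a scan with two nested steps; the step names are made up. */
    static Span traceFakeScan() {
        Span scan = new Span("scan", null);
        Span seek = scan.child("seek");
        seek.stop();
        Span read = scan.child("readBlock");
        read.stop();
        scan.stop();
        return scan;
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        traceFakeScan().render(sb, 0);
        System.out.print(sb);
    }
}
```

Each child span starts after and stops before its parent, so the parent's elapsed time bounds the time spent in its children. That containment is what lets a viewer such as the monitor show where the time inside an operation went.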
Currently, this tracing does not extend into HDFS. It would be awesome to trace all the way through the DFSClient, down to the datanode writing to local disk. A large portion of the task would be investigating ways for calling applications (Accumulo, in this case) to pass their trace client through to the Hadoop datanode code, so that the datanode can record the necessary timings.
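One possible shape for this, sketched under heavy assumptions: the caller sends a small trace context (a trace id plus the id of the span that issued the request) along with each request, and the receiving side records its timing as a child of that span. Everything below (`TraceContextSketch`, `TraceContext`, `RemoteSpan`, `handleRequest`, and the `traceId:spanId` text encoding) is hypothetical. It does not reflect any existing Hadoop or Accumulo interface, and how such a context would actually travel through the DFSClient and datanode code is exactly what the task would need to investigate.

```java
import java.util.concurrent.atomic.AtomicLong;

class TraceContextSketch {
    /** Hypothetical context a calling application hands down with a request. */
    static final class TraceContext {
        final long traceId;   // shared by every span in one traced operation
        final long spanId;    // the caller's span, which becomes the parent remotely

        TraceContext(long traceId, long spanId) {
            this.traceId = traceId;
            this.spanId = spanId;
        }

        /** Encodes the context as text so it can ride along with a request. */
        String toHeader() {
            return Long.toHexString(traceId) + ":" + Long.toHexString(spanId);
        }

        /** Decodes a header produced by toHeader(). */
        static TraceContext fromHeader(String header) {
            String[] parts = header.split(":");
            if (parts.length != 2) {
                throw new IllegalArgumentException("malformed trace header: " + header);
            }
            return new TraceContext(Long.parseUnsignedLong(parts[0], 16),
                                    Long.parseUnsignedLong(parts[1], 16));
        }
    }

    /** A timing record as the receiving side might report it back. */
    static final class RemoteSpan {
        final long traceId;
        final long parentSpanId;
        final long spanId;
        final String description;
        final long elapsedNanos;

        RemoteSpan(long traceId, long parentSpanId, long spanId,
                   String description, long elapsedNanos) {
            this.traceId = traceId;
            this.parentSpanId = parentSpanId;
            this.spanId = spanId;
            this.description = description;
            this.elapsedNanos = elapsedNanos;
        }
    }

    private static final AtomicLong NEXT_SPAN_ID = new AtomicLong(1000);

    /** Stand-in for server-side work, timed under the caller's trace. */
    static RemoteSpan handleRequest(String traceHeader, String description, Runnable work) {
        TraceContext ctx = TraceContext.fromHeader(traceHeader);
        long start = System.nanoTime();
        work.run();
        long elapsed = System.nanoTime() - start;
        return new RemoteSpan(ctx.traceId, ctx.spanId,
                              NEXT_SPAN_ID.incrementAndGet(), description, elapsed);
    }

    public static void main(String[] args) {
        TraceContext client = new TraceContext(0xABCDL, 7L);
        RemoteSpan span = handleRequest(client.toHeader(), "writeBlock", () -> { });
        System.out.println(span.description + " trace=" + Long.toHexString(span.traceId)
                + " parent=" + span.parentSpanId);
    }
}
```

Because the remote span carries the caller's trace id and parent span id, timings recorded on the far side can be stitched back into the same nested tree that the calling application already collects.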
Required skills: a good understanding of Java. Some basic knowledge of Apache Hadoop would also be helpful.