[HDFS-8069] Tracing implementation on DFSInputStream seriously degrades performance - ASF JIRA

Details

Type: Bug
Status: Open
Priority: Critical
Resolution: Unresolved
Affects Version/s: 2.7.0
Fix Version/s: None
Component/s: hdfs-client
Labels: None

Description

I've been doing some testing of Accumulo with HDFS 2.7.0 and have noticed a serious performance impact when Accumulo registers itself as a SpanReceiver.

The impact showed up in a test where an Accumulo process reads a series of updates from a write-ahead log, which amounts to reading a series of Writable objects from a file in HDFS. With tracing enabled, I waited for at least 10 minutes and the server still hadn't finished reading a ~300MB file.

Doing a poor-man's inspection via repeated thread dumps, I always see something like the following:

"replication task 2" daemon prio=10 tid=0x0000000002842800 nid=0x794d runnable [0x00007f6c7b1ec000]
   java.lang.Thread.State: RUNNABLE
        at java.util.concurrent.CopyOnWriteArrayList.iterator(CopyOnWriteArrayList.java:959)
        at org.apache.htrace.Tracer.deliver(Tracer.java:80)
        at org.apache.htrace.impl.MilliSpan.stop(MilliSpan.java:177)
        - locked <0x000000077a770730> (a org.apache.htrace.impl.MilliSpan)
        at org.apache.htrace.TraceScope.close(TraceScope.java:78)
        at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:898)
        - locked <0x000000079fa39a48> (a org.apache.hadoop.hdfs.DFSInputStream)
        at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:697)
        - locked <0x000000079fa39a48> (a org.apache.hadoop.hdfs.DFSInputStream)
        at java.io.DataInputStream.readByte(DataInputStream.java:265)
        at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:308)
        at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:329)
        at org.apache.accumulo.core.data.Mutation.readFields(Mutation.java:951)
       ... more accumulo code omitted...

What I'm seeing here is that reading a single byte (in WritableUtils.readVLong) causes a new Span to be created and closed (which includes a flush to the SpanReceiver). This results in an extreme number of spans for DFSInputStream.byteArrayRead just for reading a file from HDFS: over 700k spans for reading a single file of a few hundred MB.
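
The one-span-per-read effect can be illustrated without HDFS or HTrace. The following is a minimal sketch under stated assumptions, not the HDFS code and not the change made in HDFS-8088: `CountingInputStream` and `countSpans` are hypothetical names, and the counter merely stands in for "one span opened and closed per read call". It compares byte-at-a-time reads through a `DataInputStream` with and without a `BufferedInputStream` between the reader and the "traced" stream.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

public class SpanPerReadSketch {

    /** Hypothetical stand-in for a traced stream: every read call counts as one "span". */
    static final class CountingInputStream extends FilterInputStream {
        long spans = 0;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            spans++; // a real tracer would open and close a span here
            return super.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            spans++; // one span per bulk read, regardless of how many bytes it returns
            return super.read(b, off, len);
        }
    }

    /** Reads totalBytes one byte at a time and returns how many "spans" the traced stream saw. */
    public static long countSpans(int totalBytes, boolean buffered) throws IOException {
        CountingInputStream traced =
                new CountingInputStream(new ByteArrayInputStream(new byte[totalBytes]));
        // Assumption: a 64 KiB buffer; the size is arbitrary and only for illustration.
        InputStream source = buffered ? new BufferedInputStream(traced, 64 * 1024) : traced;
        try (DataInputStream in = new DataInputStream(source)) {
            for (int i = 0; i < totalBytes; i++) {
                in.readByte(); // same access pattern as WritableUtils.readVLong in the stack trace
            }
        }
        return traced.spans;
    }

    public static void main(String[] args) throws IOException {
        int totalBytes = 1 << 20; // 1 MiB of dummy data
        System.out.println("unbuffered spans: " + countSpans(totalBytes, false));
        System.out.println("buffered spans:   " + countSpans(totalBytes, true));
    }
}
```

In this sketch the unbuffered case records one "span" per byte, while the buffered case records one per buffer refill. Whether buffering on the caller's side is appropriate for a given HDFS client is a separate question; the sketch only shows why single-byte reads multiply the span count.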

Perhaps there's something different we need to do for the SpanReceiver in Accumulo? I'm not entirely sure, but this was rather unexpected.

cc/ cmccabe

Issue Links

is related to

HDFS-7055 Add tracing to DFSInputStream (Closed)

HDFS-8088 Reduce the number of HTrace spans generated by HDFS reads (Resolved)

People

Assignee: Unassigned

Reporter: Josh Elser

Votes: 0

Watchers: 9

Dates

Created: 06/Apr/15 19:43

Updated: 08/Apr/15 23:22