Details

    • Type: Sub-task Sub-task
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.4.0
    • Component/s: java
    • Labels:
      None
    1. AVRO-595.patch.v3.txt
      87 kB
      Patrick Wendell
    2. AVRO-595.patch.v2.txt
      79 kB
      Patrick Wendell
    3. AVRO-595.patch.v1.txt
      31 kB
      Patrick Wendell

      Issue Links

        Activity

        Hide
        Patrick Wendell added a comment -

        Some initial design thoughts are listed here... interested in hearing feedback. The high level goal is to create a plugin that can be added to Avro clients which performs tracing. The design is largely based on Google's Dapper paper.

        1. What to trace
        Similar to dapper, we will use spans as our tracing primitive. A span is an avro data record that includes:
        long traceID (consistent across an entire RPC call tree)
        long spanID (unique to each host in a call tree)
        long parentSpanID
        Map<Timestamp, eventEnum> events

        When the RPC plugin is called at critical points in the exchange, span event data will be filled in, keyed by timestamp. Trace and span id's are propagated as avro meta-data when one RPC triggers another (allowing us to back out a recursive call tree).

        2. Span storage and collection
        The plugin will store span data locally in an avro data file. To expose this data, it will run an avro server which can be queried.

        To aggregate spans, every trace plugin will start a user-facing control applet. This applet will accept a list of hosts, which will each be probed for log data. That data will be collected from each span, aggregated, and summarized to the user.

        3. Turning on tracing
        The plugin will include options that dictate the level of trace sampling (0 --> 1) and an upper bound on the amount of storage the tracing component can use. The sampling level will be used to decide whether to inject tracing meta-data into a new RPC (the root of a call tree).

        Alternatively, if an RPC request comes in which already has tracing turned on the plugin must always trace.

        Show
        Patrick Wendell added a comment - Some initial design thoughts are listed here... interested in hearing feedback. The high level goal is to create a plugin that can be added to Avro clients which performs tracing. The design is largely based on Google's Dapper paper. 1. What to trace Similar to dapper, we will use spans as our tracing primitive. A span is an avro data record that includes: long traceID (consistent across an entire RPC call tree) long spanID (unique to each host in a call tree) long parentSpanID Map<Timestamp, eventEnum> events When the RPC plugin is called at critical points in the exchange, span event data will be filled in, keyed by timestamp. Trace and span id's are propagated as avro meta-data when one RPC triggers another (allowing us to back out a recursive call tree). 2. Span storage and collection The plugin will store span data locally in an avro data file. To expose this data, it will run an avro server which can be queried. To aggregate spans, every trace plugin will start a user-facing control applet. This applet will accept a list of hosts, which will each be probed for log data. That data will be collected from each span, aggregated, and summarized to the user. 3. Turning on tracing The plugin will include options that dictate the level of trace sampling (0 --> 1) and an upper bound on the amount of storage the tracing component can use. The sampling level will be used to decide whether to inject tracing meta-data into a new RPC (the root of a call tree). Alternatively, if an RPC request comes in which already has tracing turned on the plugin must always trace.
        Hide
        Doug Cutting added a comment -

        Sounds great!

        What's the plan for high-level access to traces? Will there be a master server that queries all nodes in a cluster? Will each node have indexes, to reduce the amount of data transmitted? Or is this something that will be added later?

        Show
        Doug Cutting added a comment - Sounds great! What's the plan for high-level access to traces? Will there be a master server that queries all nodes in a cluster? Will each node have indexes, to reduce the amount of data transmitted? Or is this something that will be added later?
        Hide
        Patrick Wendell added a comment -

        Here is a patch with some initial design ideas for the tracing plugin. So far, I've implemented/tested the propagation of trace data and defined base data types.

        The interface for querying is defined but unimplemented. Wouldn't mind feedback on that as well.

        Show
        Patrick Wendell added a comment - Here is a patch with some initial design ideas for the tracing plugin. So far, I've implemented/tested the propagation of trace data and defined base data types. The interface for querying is defined but unimplemented. Wouldn't mind feedback on that as well.
        Hide
        Doug Cutting added a comment -

        Looks like a good start. Here are some comments. I hope they're not to pedantic for this stage of the implementation.

        avroTrace.avdl

        • spurious //foo comment
        • why is the result of getSpanBlock a byte array rather than an array of spans?

        SpanStorage.java

        • i don't understand the need for the query handles, and they don't appear to be used for anything yet.
        • Why expose spans as ByteBuffers rather than as an iterator or a list or something? Is this an optimization, so they can be efficiently re-serialized into RPCs? I wonder if this is premature.
        • if getAllSpans() is just for testing, does it need to be public, or would package-private suffice?

        TracePlugin.java

        • creating a new encoder or decoder each time you read/write a long is heavyweight. you could instead reuse a single encoder & decoder, switching the bytes they read or write.
        • payload size should be a long instead of an int
        • do we want to serve traces on a separate port, or simply on a different url? this seems perhaps outside the scope of the plugin, which should just record the stats, and how they're published might be configured elsewhere?
        • how bad would a trace ID collision be? RANDOM.nextLong() isn't ideal, but is fast and might be good enough if collisions aren't too terrible. if collisions are to be avoided then it might use something like SecureRandom.nextBytes(new byte[16]).
        • just catch IOException once, rather than around each of the metadata sets. it should never happen anyway.
        • hostname should be computed once and stored in a field, not per event
        • why create 100-element event vectors? shouldn't these grow as events are added?
        Show
        Doug Cutting added a comment - Looks like a good start. Here are some comments. I hope they're not to pedantic for this stage of the implementation. avroTrace.avdl spurious //foo comment why is the result of getSpanBlock a byte array rather than an array of spans? SpanStorage.java i don't understand the need for the query handles, and they don't appear to be used for anything yet. Why expose spans as ByteBuffers rather than as an iterator or a list or something? Is this an optimization, so they can be efficiently re-serialized into RPCs? I wonder if this is premature. if getAllSpans() is just for testing, does it need to be public, or would package-private suffice? TracePlugin.java creating a new encoder or decoder each time you read/write a long is heavyweight. you could instead reuse a single encoder & decoder, switching the bytes they read or write. payload size should be a long instead of an int do we want to serve traces on a separate port, or simply on a different url? this seems perhaps outside the scope of the plugin, which should just record the stats, and how they're published might be configured elsewhere? how bad would a trace ID collision be? RANDOM.nextLong() isn't ideal, but is fast and might be good enough if collisions aren't too terrible. if collisions are to be avoided then it might use something like SecureRandom.nextBytes(new byte [16] ). just catch IOException once, rather than around each of the metadata sets. it should never happen anyway. hostname should be computed once and stored in a field, not per event why create 100-element event vectors? shouldn't these grow as events are added?
        Hide
        Patrick Wendell added a comment -

        >> * why is the result of getSpanBlock a byte array rather than an array of spans?

        We discussed off-line - going to change this, but may change back if there are performance benefits.

        >> * i don't understand the need for the query handles, and they don't appear to be used for anything yet.

        This is just a thought on how we might deal with queries that may persist over multiple calls. I'll settle this interface as we continue to work on this.

        >> * do we want to serve traces on a separate port, or simply on a different url? this seems perhaps outside the scope of the plugin, which should just record the stats, and how they're published might be configured elsewhere?

        I think our idea is to have some basic aggregation/viewing of stats included in the plugin, but also keep things extensible if people want to run their own analyses.

        >> * how bad would a trace ID collision be? RANDOM.nextLong() isn't ideal, but is fast and might be good enough if collisions aren't too terrible. if collisions are to be avoided then it might use something like SecureRandom.nextBytes(new byte[16]).

        Collisions won't break the app, but (sampling is only probabilistic, so we could just toss out traces where collisions happen) - however maybe will just do this anyways.

        Show
        Patrick Wendell added a comment - >> * why is the result of getSpanBlock a byte array rather than an array of spans? We discussed off-line - going to change this, but may change back if there are performance benefits. >> * i don't understand the need for the query handles, and they don't appear to be used for anything yet. This is just a thought on how we might deal with queries that may persist over multiple calls. I'll settle this interface as we continue to work on this. >> * do we want to serve traces on a separate port, or simply on a different url? this seems perhaps outside the scope of the plugin, which should just record the stats, and how they're published might be configured elsewhere? I think our idea is to have some basic aggregation/viewing of stats included in the plugin, but also keep things extensible if people want to run their own analyses. >> * how bad would a trace ID collision be? RANDOM.nextLong() isn't ideal, but is fast and might be good enough if collisions aren't too terrible. if collisions are to be avoided then it might use something like SecureRandom.nextBytes(new byte [16] ). Collisions won't break the app, but (sampling is only probabilistic, so we could just toss out traces where collisions happen) - however maybe will just do this anyways.
        Hide
        Philip Zeyliger added a comment -

        I put up a review on reviewboard:

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        http://review.hbase.org/r/351/#review443
        -----------------------------------------------------------

        Great start. Comments below.

        lang/java/src/java/org/apache/avro/ipc/trace/InMemorySpanStorage.java
        <http://review.hbase.org/r/351/#comment1866>

        queryHandle seems unused here. The query stuff still seems to be baking; I haven't seen if your tests depend on it, but it might make sense to stage it out of this commit and put it back in later when it's more strictly necessary.

        lang/java/src/java/org/apache/avro/ipc/trace/SpanStorage.java
        <http://review.hbase.org/r/351/#comment1864>

        Drop @author's for Apache code

        lang/java/src/java/org/apache/avro/ipc/trace/SpanStorage.java
        <http://review.hbase.org/r/351/#comment1865>

        It's weird that this returns bytes and not objects...

        lang/java/src/java/org/apache/avro/ipc/trace/TracePlugin.java
        <http://review.hbase.org/r/351/#comment1851>

        mention how this might be disabled?

        lang/java/src/java/org/apache/avro/ipc/trace/TracePlugin.java
        <http://review.hbase.org/r/351/#comment1852>

        Could you initialize these guys in-line and mark them final as well?

        lang/java/src/java/org/apache/avro/ipc/trace/TracePlugin.java
        <http://review.hbase.org/r/351/#comment1853>

        I'm not a big fan of using Configuration here. Even though AVRO has a dependency on Hadoop, it seems a bit abusive to use it where there's no Hadoop going on.

        I recommend a POJO represingting TracePluginConfiguration which people can create and pass to this.

        If you were using Hadoop configuration, you'd want the keys to be better namespaced (i.e., avro.trace.foo instead of just foo), since people would want to use the same global configuration mechanism. But my opinion is that the layer that's doing the work should use reasonably typed configuration, and if you want to layer something above it, that can be done above.

        lang/java/src/java/org/apache/avro/ipc/trace/TracePlugin.java
        <http://review.hbase.org/r/351/#comment1854>

        These are static, so you shouldn't need to instantiate them per instance.

        lang/java/src/java/org/apache/avro/ipc/trace/TracePlugin.java
        <http://review.hbase.org/r/351/#comment1855>

        Why Math.abs()? Is spanID a fixed(4) or a long? It'll store more efficiently if it's a fixed(4), though it might be more of a pain to use.

        lang/java/src/java/org/apache/avro/ipc/trace/TracePlugin.java
        <http://review.hbase.org/r/351/#comment1856>

        What's the 100 doing here?

        lang/java/src/java/org/apache/avro/ipc/trace/TracePlugin.java
        <http://review.hbase.org/r/351/#comment1857>

        Cache this?

        lang/java/src/java/org/apache/avro/ipc/trace/TracePlugin.java
        <http://review.hbase.org/r/351/#comment1858>

        Extract "createNewEmptySpan()" into a method; you've done the same thing twice.

        lang/java/src/java/org/apache/avro/ipc/trace/TracePlugin.java
        <http://review.hbase.org/r/351/#comment1859>

        You should document (perhaps in package.html, or in javadoc here) how this system uses the RPC metadata.

        lang/java/src/java/org/apache/avro/ipc/trace/TracePlugin.java
        <http://review.hbase.org/r/351/#comment1860>

        Since these aren't supposed to happen, I tend to escalate them into RuntimeException instead of dropping them on the floor. Just in case

        lang/java/src/java/org/apache/avro/ipc/trace/TracePlugin.java
        <http://review.hbase.org/r/351/#comment1861>

        I'm a little confused as to why you need both span and parent_span, but I might be missing something.

        lang/java/src/java/org/apache/avro/ipc/trace/TracePlugin.java
        <http://review.hbase.org/r/351/#comment1862>

        Log about this; you'd want to know that something is misbehaving.

        lang/java/src/java/org/apache/avro/ipc/trace/TracePlugin.java
        <http://review.hbase.org/r/351/#comment1863>

        You're using childSpan and this.childSpan inconsistency in a few places; may as well make it consistent.

        lang/java/src/test/java/org/apache/avro/ipc/trace/TestBasicTracing.java
        <http://review.hbase.org/r/351/#comment1868>

        Looks like x, y, w here.

        lang/java/src/test/java/org/apache/avro/ipc/trace/TestBasicTracing.java
        <http://review.hbase.org/r/351/#comment1869>

        I wonder if this test would read easier if you used the specific API and generated some classes for it. Not a big concern.

        share/schemas/org/apache/avro/ipc/trace/avroTrace.avdl
        <http://review.hbase.org/r/351/#comment1870>

        Perhaps add a microseconds component to this as well?

        share/schemas/org/apache/avro/ipc/trace/avroTrace.avdl
        <http://review.hbase.org/r/351/#comment1847>

        It's a little weird that "SpanEventType" isn't really a type; it's the event itself.

        Perhaps, rename SpanEventType as SpanEvent?

        share/schemas/org/apache/avro/ipc/trace/avroTrace.avdl
        <http://review.hbase.org/r/351/#comment1848>

        Document each of these fields?

        share/schemas/org/apache/avro/ipc/trace/avroTrace.avdl
        <http://review.hbase.org/r/351/#comment1849>

        It might be useful to have both host and port.

        share/schemas/org/apache/avro/ipc/trace/avroTrace.avpr
        <http://review.hbase.org/r/351/#comment1850>

        Is this generated? Could this be generated by the build instead of checked in?

        • Philip

        On 2010-07-21 13:02:51, Philip Zeyliger wrote:
        >
        > -----------------------------------------------------------
        > This is an automatically generated e-mail. To reply, visit:
        > http://review.hbase.org/r/351/
        > -----------------------------------------------------------
        >
        > (Updated 2010-07-21 13:02:51)

        • Show quoted text -
        Show
        Philip Zeyliger added a comment - I put up a review on reviewboard: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://review.hbase.org/r/351/#review443 ----------------------------------------------------------- Great start. Comments below. lang/java/src/java/org/apache/avro/ipc/trace/InMemorySpanStorage.java < http://review.hbase.org/r/351/#comment1866 > queryHandle seems unused here. The query stuff still seems to be baking; I haven't seen if your tests depend on it, but it might make sense to stage it out of this commit and put it back in later when it's more strictly necessary. lang/java/src/java/org/apache/avro/ipc/trace/SpanStorage.java < http://review.hbase.org/r/351/#comment1864 > Drop @author's for Apache code lang/java/src/java/org/apache/avro/ipc/trace/SpanStorage.java < http://review.hbase.org/r/351/#comment1865 > It's weird that this returns bytes and not objects... lang/java/src/java/org/apache/avro/ipc/trace/TracePlugin.java < http://review.hbase.org/r/351/#comment1851 > mention how this might be disabled? lang/java/src/java/org/apache/avro/ipc/trace/TracePlugin.java < http://review.hbase.org/r/351/#comment1852 > Could you initialize these guys in-line and mark them final as well? lang/java/src/java/org/apache/avro/ipc/trace/TracePlugin.java < http://review.hbase.org/r/351/#comment1853 > I'm not a big fan of using Configuration here. Even though AVRO has a dependency on Hadoop, it seems a bit abusive to use it where there's no Hadoop going on. I recommend a POJO represingting TracePluginConfiguration which people can create and pass to this. If you were using Hadoop configuration, you'd want the keys to be better namespaced (i.e., avro.trace.foo instead of just foo), since people would want to use the same global configuration mechanism. But my opinion is that the layer that's doing the work should use reasonably typed configuration, and if you want to layer something above it, that can be done above. lang/java/src/java/org/apache/avro/ipc/trace/TracePlugin.java < http://review.hbase.org/r/351/#comment1854 > These are static, so you shouldn't need to instantiate them per instance. lang/java/src/java/org/apache/avro/ipc/trace/TracePlugin.java < http://review.hbase.org/r/351/#comment1855 > Why Math.abs()? Is spanID a fixed(4) or a long? It'll store more efficiently if it's a fixed(4), though it might be more of a pain to use. lang/java/src/java/org/apache/avro/ipc/trace/TracePlugin.java < http://review.hbase.org/r/351/#comment1856 > What's the 100 doing here? lang/java/src/java/org/apache/avro/ipc/trace/TracePlugin.java < http://review.hbase.org/r/351/#comment1857 > Cache this? lang/java/src/java/org/apache/avro/ipc/trace/TracePlugin.java < http://review.hbase.org/r/351/#comment1858 > Extract "createNewEmptySpan()" into a method; you've done the same thing twice. lang/java/src/java/org/apache/avro/ipc/trace/TracePlugin.java < http://review.hbase.org/r/351/#comment1859 > You should document (perhaps in package.html, or in javadoc here) how this system uses the RPC metadata. lang/java/src/java/org/apache/avro/ipc/trace/TracePlugin.java < http://review.hbase.org/r/351/#comment1860 > Since these aren't supposed to happen, I tend to escalate them into RuntimeException instead of dropping them on the floor. Just in case lang/java/src/java/org/apache/avro/ipc/trace/TracePlugin.java < http://review.hbase.org/r/351/#comment1861 > I'm a little confused as to why you need both span and parent_span, but I might be missing something. lang/java/src/java/org/apache/avro/ipc/trace/TracePlugin.java < http://review.hbase.org/r/351/#comment1862 > Log about this; you'd want to know that something is misbehaving. lang/java/src/java/org/apache/avro/ipc/trace/TracePlugin.java < http://review.hbase.org/r/351/#comment1863 > You're using childSpan and this.childSpan inconsistency in a few places; may as well make it consistent. lang/java/src/test/java/org/apache/avro/ipc/trace/TestBasicTracing.java < http://review.hbase.org/r/351/#comment1868 > Looks like x, y, w here. lang/java/src/test/java/org/apache/avro/ipc/trace/TestBasicTracing.java < http://review.hbase.org/r/351/#comment1869 > I wonder if this test would read easier if you used the specific API and generated some classes for it. Not a big concern. share/schemas/org/apache/avro/ipc/trace/avroTrace.avdl < http://review.hbase.org/r/351/#comment1870 > Perhaps add a microseconds component to this as well? share/schemas/org/apache/avro/ipc/trace/avroTrace.avdl < http://review.hbase.org/r/351/#comment1847 > It's a little weird that "SpanEventType" isn't really a type; it's the event itself. Perhaps, rename SpanEventType as SpanEvent? share/schemas/org/apache/avro/ipc/trace/avroTrace.avdl < http://review.hbase.org/r/351/#comment1848 > Document each of these fields? share/schemas/org/apache/avro/ipc/trace/avroTrace.avdl < http://review.hbase.org/r/351/#comment1849 > It might be useful to have both host and port. share/schemas/org/apache/avro/ipc/trace/avroTrace.avpr < http://review.hbase.org/r/351/#comment1850 > Is this generated? Could this be generated by the build instead of checked in? Philip On 2010-07-21 13:02:51, Philip Zeyliger wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > http://review.hbase.org/r/351/ > ----------------------------------------------------------- > > (Updated 2010-07-21 13:02:51) Show quoted text -
        Hide
        Patrick Wendell added a comment -

        Here is a patch update based on the preliminary feedback. I dropped the querying stuff since it's not yet used. This patch adds the logic behind aggregating partial spans (which will have been collected from several remote hosts) and creating traces (extracting RPC call trees from a set of spans).

        We talked off-line about redundancy in the data that is collected. We could get away with only recording a spanID and a parentSpanID for each span, then we can back-out each trace by following these data (we get to the root of a trace when we see a null parentSpanID). By adding a traceID unique to each call tree, it is easier to partition the work of collecting spans into traces as we can easily group spans by trace before we try back-out a tree (perhaps this is why Dapper paper describes this method). It also lowers the chance of ID collision since spanID and parentSpanID are specific to one traceID.

        Show
        Patrick Wendell added a comment - Here is a patch update based on the preliminary feedback. I dropped the querying stuff since it's not yet used. This patch adds the logic behind aggregating partial spans (which will have been collected from several remote hosts) and creating traces (extracting RPC call trees from a set of spans). We talked off-line about redundancy in the data that is collected. We could get away with only recording a spanID and a parentSpanID for each span, then we can back-out each trace by following these data (we get to the root of a trace when we see a null parentSpanID). By adding a traceID unique to each call tree, it is easier to partition the work of collecting spans into traces as we can easily group spans by trace before we try back-out a tree (perhaps this is why Dapper paper describes this method). It also lowers the chance of ID collision since spanID and parentSpanID are specific to one traceID.
        Hide
        Philip Zeyliger added a comment -

        Another review via reviewboard:

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        http://review.cloudera.org/r/390/#review521
        -----------------------------------------------------------

        Good stuff, Patrick. A few concerns about InMemorySpanStorage and a few other things in the review below.

        lang/java/src/java/org/apache/avro/ipc/trace/InMemorySpanStorage.java
        <http://review.cloudera.org/r/390/#comment2088>

        nitpick: grammar is wonky here.

        lang/java/src/java/org/apache/avro/ipc/trace/InMemorySpanStorage.java
        <http://review.cloudera.org/r/390/#comment2089>

        Seems like you probably have concurrency issues here: a multi-threaded server shares one span storage, so, unless there's a synchronized somewhere above this, this doesn't work.

        lang/java/src/java/org/apache/avro/ipc/trace/InMemorySpanStorage.java
        <http://review.cloudera.org/r/390/#comment2091>

        This is tricky to use on a running server because the spans are still in play. You could make a copy, which is expensive, but since real clients would use files...

        lang/java/src/java/org/apache/avro/ipc/trace/SpanAggregator.java
        <http://review.cloudera.org/r/390/#comment2102>

        I suspect that SpanEvent.values() is what you're looking for here, instead of enumerating all the enums.

        lang/java/src/java/org/apache/avro/ipc/trace/SpanAggregator.java
        <http://review.cloudera.org/r/390/#comment2109>

        You do these 5 lines twice; may as well make a helper function.

        lang/java/src/java/org/apache/avro/ipc/trace/SpanAggregator.java
        <http://review.cloudera.org/r/390/#comment2110>

        After reading this code a couple of times, I think it may be simpler if you follow more of a map-reduce pattern.

        First group all the spans. Then go through every grouping and see if it's complete. Your code keeps checking to see if the grouping is complete as you go, which makes the logic a little harder to follow.

        Just a suggestion; it works either way.

        lang/java/src/java/org/apache/avro/ipc/trace/SpanAggregator.java
        <http://review.cloudera.org/r/390/#comment2107>

        Look into using EnumSet here. It's probably more efficient than a linkedlist.

        lang/java/src/java/org/apache/avro/ipc/trace/SpanAggregator.java
        <http://review.cloudera.org/r/390/#comment2108>

        is this 'if' ever false if you have a complete span between the two?

        lang/java/src/java/org/apache/avro/ipc/trace/SpanAggregator.java
        <http://review.cloudera.org/r/390/#comment2111>

        Since traces.get(traceID) is the next thing you do, you can probably iterate over the items in traces to begin with.

        lang/java/src/java/org/apache/avro/ipc/trace/Trace.java
        <http://review.cloudera.org/r/390/#comment2112>

        Do you actually need the empty constructor and the setter? If you can avoid two ways of doing something, that's often better. You might need it, of course.

        lang/java/src/java/org/apache/avro/ipc/trace/Trace.java
        <http://review.cloudera.org/r/390/#comment2113>

        You might mention that the graphi is meant to be read with your head askew. I found the encoding a bit hard to read. Perhaps (w (y (z)) (more of an S-expression-style tree) would be easier to read? Your call.

        lang/java/src/java/org/apache/avro/ipc/trace/Trace.java
        <http://review.cloudera.org/r/390/#comment2114>

        Extract 1000000 to a constant. (MICROS_PER_SECOND?)

        lang/java/src/java/org/apache/avro/ipc/trace/Trace.java
        <http://review.cloudera.org/r/390/#comment2115>

        I think our style guide typically recommends

        if (...)

        { ... } else { ... }

        (i.e., put else on the same line as the two braces)

        lang/java/src/java/org/apache/avro/ipc/trace/Trace.java
        <http://review.cloudera.org/r/390/#comment2116>

        Naively, if you structure the getChildren() function nicely, you could get away without managing the root separately, since every subnode is it's own "root". Not sure if that's easy to do in this case without trying it.

        lang/java/src/java/org/apache/avro/ipc/trace/TracePlugin.java
        <http://review.cloudera.org/r/390/#comment2117>

        I think you've just re-implemented Arrays.equals(byte[] a, byte[] b).

        • Philip
        Show
        Philip Zeyliger added a comment - Another review via reviewboard: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/390/#review521 ----------------------------------------------------------- Good stuff, Patrick. A few concerns about InMemorySpanStorage and a few other things in the review below. lang/java/src/java/org/apache/avro/ipc/trace/InMemorySpanStorage.java < http://review.cloudera.org/r/390/#comment2088 > nitpick: grammar is wonky here. lang/java/src/java/org/apache/avro/ipc/trace/InMemorySpanStorage.java < http://review.cloudera.org/r/390/#comment2089 > Seems like you probably have concurrency issues here: a multi-threaded server shares one span storage, so, unless there's a synchronized somewhere above this, this doesn't work. lang/java/src/java/org/apache/avro/ipc/trace/InMemorySpanStorage.java < http://review.cloudera.org/r/390/#comment2091 > This is tricky to use on a running server because the spans are still in play. You could make a copy, which is expensive, but since real clients would use files... lang/java/src/java/org/apache/avro/ipc/trace/SpanAggregator.java < http://review.cloudera.org/r/390/#comment2102 > I suspect that SpanEvent.values() is what you're looking for here, instead of enumerating all the enums. lang/java/src/java/org/apache/avro/ipc/trace/SpanAggregator.java < http://review.cloudera.org/r/390/#comment2109 > You do these 5 lines twice; may as well make a helper function. lang/java/src/java/org/apache/avro/ipc/trace/SpanAggregator.java < http://review.cloudera.org/r/390/#comment2110 > After reading this code a couple of times, I think it may be simpler if you follow more of a map-reduce pattern. First group all the spans. Then go through every grouping and see if it's complete. Your code keeps checking to see if the grouping is complete as you go, which makes the logic a little harder to follow. Just a suggestion; it works either way. lang/java/src/java/org/apache/avro/ipc/trace/SpanAggregator.java < http://review.cloudera.org/r/390/#comment2107 > Look into using EnumSet here. It's probably more efficient than a linkedlist. lang/java/src/java/org/apache/avro/ipc/trace/SpanAggregator.java < http://review.cloudera.org/r/390/#comment2108 > is this 'if' ever false if you have a complete span between the two? lang/java/src/java/org/apache/avro/ipc/trace/SpanAggregator.java < http://review.cloudera.org/r/390/#comment2111 > Since traces.get(traceID) is the next thing you do, you can probably iterate over the items in traces to begin with. lang/java/src/java/org/apache/avro/ipc/trace/Trace.java < http://review.cloudera.org/r/390/#comment2112 > Do you actually need the empty constructor and the setter? If you can avoid two ways of doing something, that's often better. You might need it, of course. lang/java/src/java/org/apache/avro/ipc/trace/Trace.java < http://review.cloudera.org/r/390/#comment2113 > You might mention that the graphi is meant to be read with your head askew. I found the encoding a bit hard to read. Perhaps (w (y (z)) (more of an S-expression-style tree) would be easier to read? Your call. lang/java/src/java/org/apache/avro/ipc/trace/Trace.java < http://review.cloudera.org/r/390/#comment2114 > Extract 1000000 to a constant. (MICROS_PER_SECOND?) lang/java/src/java/org/apache/avro/ipc/trace/Trace.java < http://review.cloudera.org/r/390/#comment2115 > I think our style guide typically recommends if (...) { ... } else { ... } (i.e., put else on the same line as the two braces) lang/java/src/java/org/apache/avro/ipc/trace/Trace.java < http://review.cloudera.org/r/390/#comment2116 > Naively, if you structure the getChildren() function nicely, you could get away without managing the root separately, since every subnode is it's own "root". Not sure if that's easy to do in this case without trying it. lang/java/src/java/org/apache/avro/ipc/trace/TracePlugin.java < http://review.cloudera.org/r/390/#comment2117 > I think you've just re-implemented Arrays.equals(byte[] a, byte[] b). Philip
        Hide
        Philip Zeyliger added a comment -

        Via reviewboard again:

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        http://review.cloudera.org/r/390/#review627
        -----------------------------------------------------------

        Almost there. I tried running the tests and ran into some issues:

        Testcase: testRecursingTrace took 0.006 sec
        Caused an ERROR
        Connection refused
        java.net.ConnectException: Connection refused

        Does "ant test clean" work for you? I think it's trying to re-use a port that it hasn't closed.

        lang/java/src/java/org/apache/avro/ipc/trace/Util.java
        <http://review.cloudera.org/r/390/#comment2366>

        If you can avoid this, I would avoid making this public.

        lang/java/src/java/org/apache/avro/ipc/trace/Util.java
        <http://review.cloudera.org/r/390/#comment2365>

        This doesn't feel public to me

        • Philip
        Show
        Philip Zeyliger added a comment - Via reviewboard again: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/390/#review627 ----------------------------------------------------------- Almost there. I tried running the tests and ran into some issues: Testcase: testRecursingTrace took 0.006 sec Caused an ERROR Connection refused java.net.ConnectException: Connection refused Does "ant test clean" work for you? I think it's trying to re-use a port that it hasn't closed. lang/java/src/java/org/apache/avro/ipc/trace/Util.java < http://review.cloudera.org/r/390/#comment2366 > If you can avoid this, I would avoid making this public. lang/java/src/java/org/apache/avro/ipc/trace/Util.java < http://review.cloudera.org/r/390/#comment2365 > This doesn't feel public to me Philip
        Hide
        Patrick Wendell added a comment -

        We are changing the scope of this particular JIRA, subsequent JIRA's will reflect further legs of this project.

        Show
        Patrick Wendell added a comment - We are changing the scope of this particular JIRA, subsequent JIRA's will reflect further legs of this project.
        Hide
        Patrick Wendell added a comment -

        typo!

        Show
        Patrick Wendell added a comment - typo!
        Hide
        Patrick Wendell added a comment -

        This patch sync's up with most recent version of trunk (some API changes in HTTPServer happened in-between). Also, it improves tear-down for unit tests.

        Show
        Patrick Wendell added a comment - This patch sync's up with most recent version of trunk (some API changes in HTTPServer happened in-between). Also, it improves tear-down for unit tests.
        Hide
        Philip Zeyliger added a comment -

        I just committed this. Thanks, Patrick!

        Show
        Philip Zeyliger added a comment - I just committed this. Thanks, Patrick!

          People

          • Assignee:
            Patrick Wendell
            Reporter:
            Patrick Wendell
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development