Avro
  1. Avro
  2. AVRO-722

IPC Request Metadata written for transmission before processing by plugins

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: 1.5.0
    • Fix Version/s: 1.5.0
    • Component/s: java
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      There is a small issue with the processing order of request metadata in an IPC call. The map is written for transmission before the context is passed to plugins for processing and as a result any changes that client plugins make to the metadata will not get transferred as the empty map has already been written.

      Also when the response metadata is read back in it is mistakenly read into the context's request metadata instead of the response metadata. The attached patch has fixes for both of these issues.

      1. AVRO-722.patch
        5 kB
        Doug Cutting
      2. rpc-plugin.patch
        4 kB
        Stephen Gargan
      3. rpc-plugin.patch
        4 kB
        Stephen Gargan

        Activity

        Stephen Gargan created issue -
        Stephen Gargan made changes -
        Field Original Value New Value
        Attachment rpc-plugin.patch [ 12467145 ]
        Hide
        Doug Cutting added a comment -

        This reverts a change that Patrick Wendell made in AVRO-595 when adding RPC tracing. I'm not sure why Patrick made that change there: perhaps it was essential but perhaps it was not. Unfortunately the trace tests are currently disabled in pom.xml, since they intermittently fail for reasons no one has yet debugged.

        I re-enabled the trace tests in pom.xml and ran tests on unpatched trunk and they passed. Then I applied this patch and ran tests again and trace tests failed three times in a row. I reverted the patch and the trace tests then passed again. So superficially it appears that this patch breaks RPC tracing.

        It would be nice to re-enable the trace tests by default. I think the problems are/were mostly caused by the use of fixed port numbers. In four runs they pass every time for me now.

        Show
        Doug Cutting added a comment - This reverts a change that Patrick Wendell made in AVRO-595 when adding RPC tracing. I'm not sure why Patrick made that change there: perhaps it was essential but perhaps it was not. Unfortunately the trace tests are currently disabled in pom.xml, since they intermittently fail for reasons no one has yet debugged. I re-enabled the trace tests in pom.xml and ran tests on unpatched trunk and they passed. Then I applied this patch and ran tests again and trace tests failed three times in a row. I reverted the patch and the trace tests then passed again. So superficially it appears that this patch breaks RPC tracing. It would be nice to re-enable the trace tests by default. I think the problems are/were mostly caused by the use of fixed port numbers. In four runs they pass every time for me now.
        Doug Cutting made changes -
        Assignee Stephen Gargan [ sgargan ]
        Hide
        Stephen Gargan added a comment -

        I didn't realize that the trace tests had been excluded. I enabled them and worked out what was making them fail. The changes I'd made were effecting order that calls were being made to the plugin. I've added a test to insure that he order is maintained.

        As a note, the TestFileSpanStorage is fairly time and load dependent and could be the source of the intermittent failures you're seeing. Its sleeps for arbitrary amounts waiting for the spans to be written asynchronously to disk. The current wait times are pretty tight and fail periodically on me, (though admittedly my machine is not very powerful).

        What kind of error messages did it give when it failed before?

        Show
        Stephen Gargan added a comment - I didn't realize that the trace tests had been excluded. I enabled them and worked out what was making them fail. The changes I'd made were effecting order that calls were being made to the plugin. I've added a test to insure that he order is maintained. As a note, the TestFileSpanStorage is fairly time and load dependent and could be the source of the intermittent failures you're seeing. Its sleeps for arbitrary amounts waiting for the spans to be written asynchronously to disk. The current wait times are pretty tight and fail periodically on me, (though admittedly my machine is not very powerful). What kind of error messages did it give when it failed before?
        Stephen Gargan made changes -
        Attachment rpc-plugin.patch [ 12467373 ]
        Hide
        Doug Cutting added a comment -

        AVRO-644 reported problems with TestFileSpanStorage. I disabled trace tests in AVRO-646. I also saw failures in other trace tests but can't remember which. Should we re-enable trace testing and try to make TestFileSpanStorage more reliable?

        Show
        Doug Cutting added a comment - AVRO-644 reported problems with TestFileSpanStorage. I disabled trace tests in AVRO-646 . I also saw failures in other trace tests but can't remember which. Should we re-enable trace testing and try to make TestFileSpanStorage more reliable?
        Hide
        Doug Cutting added a comment -

        This works for me now with trace testing enabled. Thanks!

        Here's a new version of the patch that also re-enables trace testing.

        Show
        Doug Cutting added a comment - This works for me now with trace testing enabled. Thanks! Here's a new version of the patch that also re-enables trace testing.
        Doug Cutting made changes -
        Attachment AVRO-722.patch [ 12467476 ]
        Hide
        Doug Cutting added a comment -

        I'll commit this unless there are objections.

        Show
        Doug Cutting added a comment - I'll commit this unless there are objections.
        Doug Cutting made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Hadoop Flags [Reviewed]
        Hide
        Doug Cutting added a comment -

        I found the other trace-related failure that motivated disabling these tests: AVRO-643.

        Show
        Doug Cutting added a comment - I found the other trace-related failure that motivated disabling these tests: AVRO-643 .
        Hide
        Doug Cutting added a comment -

        I committed this.

        Show
        Doug Cutting added a comment - I committed this.
        Doug Cutting made changes -
        Status Patch Available [ 10002 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        Hide
        Doug Cutting added a comment -

        Thanks, Stephen for noticing this bug, providing a fix for it, and including a great unit test!

        Show
        Doug Cutting added a comment - Thanks, Stephen for noticing this bug, providing a fix for it, and including a great unit test!
        Doug Cutting made changes -
        Status Resolved [ 5 ] Closed [ 6 ]

          People

          • Assignee:
            Stephen Gargan
            Reporter:
            Stephen Gargan
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development