Details

    • Type: Sub-task
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 0.9.2-incubating, 0.9.3
    • Fix Version/s: None
    • Component/s: storm-core
    • Labels: None

    Description

      The latest Netty client code attempts to reestablish the connection on failure as part of the send method call. It blocks until the connection is established or a timeout occurs. By default the timeout is about 30 seconds, which is also the default tuple timeout.

      This is exacerbated by the read lock held during the send, which prevents the node->socket mapping from changing while we are sending. The lock exists mostly so that we don't close connections while we are trying to write to them, which would cause an exception. As a result, if multiple workers on a node all get rescheduled, we wait the full 30-second timeout for each worker.
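
The interaction between a blocking send and the read lock can be illustrated with a small, self-contained sketch. This is not Storm's worker code: the class `ReadLockStallSketch`, its methods, and the latch standing in for the reconnect wait are all hypothetical, and the only assumption carried over from the description is that the send holds a read lock while the connection refresh needs the corresponding write lock.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical illustration of the locking pattern described above; not storm-core code.
class ReadLockStallSketch {
    // Guards the (imaginary) node->socket mapping.
    private final ReentrantReadWriteLock endpointLock = new ReentrantReadWriteLock();

    /** Stand-in for a send that blocks inside the read lock until "reconnected" fires. */
    void blockingSend(CountDownLatch lockHeld, CountDownLatch reconnected) {
        endpointLock.readLock().lock();
        try {
            lockHeld.countDown();   // signal that the read lock is now held
            reconnected.await();    // simulates waiting for a reconnect or a timeout
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            endpointLock.readLock().unlock();
        }
    }

    /** Returns true if the mapping could be updated right now, false if a sender holds the lock. */
    boolean tryRefreshConnections() {
        if (!endpointLock.writeLock().tryLock()) {
            return false;
        }
        try {
            return true; // the node->socket mapping would be swapped here
        } finally {
            endpointLock.writeLock().unlock();
        }
    }

    /** Returns {refresh succeeded while send was blocked, refresh succeeded after send returned}. */
    static boolean[] demo() {
        ReadLockStallSketch worker = new ReadLockStallSketch();
        CountDownLatch lockHeld = new CountDownLatch(1);
        CountDownLatch reconnected = new CountDownLatch(1);
        Thread sender = new Thread(() -> worker.blockingSend(lockHeld, reconnected));
        sender.start();
        try {
            lockHeld.await();                       // wait until the sender is inside the lock
            boolean during = worker.tryRefreshConnections();
            reconnected.countDown();                // let the simulated reconnect finish
            sender.join();
            boolean after = worker.tryRefreshConnections();
            return new boolean[] {during, after};
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] result = demo();
        System.out.println("refresh while send blocked: " + result[0]);   // false
        System.out.println("refresh after send returned: " + result[1]);  // true
    }
}
```

In this sketch the refresh cannot proceed for as long as the send is stuck, which is the stall the description attributes to each rescheduled worker.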

      In the current design of the worker, send must be non-blocking. Otherwise it prevents other messages from being delivered and is likely to cause many messages to time out on a reschedule.
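
One possible shape for a non-blocking send is sketched below, purely as an assumption about a direction a fix could take; the issue itself does not prescribe a solution. The class `NonBlockingSenderSketch`, the bounded buffer, and the choice to reject messages when the buffer is full are all hypothetical. Strings stand in for messages, and a list stands in for the socket.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch of a send that never waits on the connection; not storm-core code.
class NonBlockingSenderSketch {
    private final BlockingQueue<String> pending;               // messages awaiting a live connection
    private final List<String> delivered = new ArrayList<>();  // stand-in for the socket
    private volatile boolean connected = false;

    NonBlockingSenderSketch(int capacity) {
        this.pending = new ArrayBlockingQueue<>(capacity);
    }

    /** Never blocks: buffers the message, or returns false when the buffer is full. */
    boolean send(String message) {
        return pending.offer(message);
    }

    /** Would be called by a background reconnect thread once the channel is up. */
    void onConnected() {
        connected = true;
    }

    /** Would be called by a background flusher; returns how many messages were written. */
    synchronized int flush() {
        if (!connected) {
            return 0; // nothing to do until the reconnect completes
        }
        List<String> batch = new ArrayList<>();
        pending.drainTo(batch);
        delivered.addAll(batch); // stand-in for writing to the socket
        return batch.size();
    }

    synchronized List<String> delivered() {
        return new ArrayList<>(delivered);
    }

    public static void main(String[] args) {
        NonBlockingSenderSketch sender = new NonBlockingSenderSketch(2);
        System.out.println(sender.send("a"));   // true, buffered while disconnected
        System.out.println(sender.send("b"));   // true
        System.out.println(sender.send("c"));   // false, buffer is full
        sender.onConnected();
        System.out.println(sender.flush());     // 2
        System.out.println(sender.delivered()); // [a, b]
    }
}
```

Because reconnecting and flushing happen off the caller's thread in this design, the caller of send does not need to hold any lock across a reconnect attempt. Whether a full buffer should drop, reject, or apply back-pressure is a separate design decision that this sketch does not settle.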

            People

              Assignee: Unassigned
              Reporter: Robert Joseph Evans (revans2)
              Votes: 1
              Watchers: 9