ZooKeeper
  1. ZooKeeper
  2. ZOOKEEPER-320

call auth completion in free_completions()

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0, 3.0.1, 3.1.0
    • Fix Version/s: 3.1.1, 3.2.0
    • Component/s: c client
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      If a client calls zoo_add_auth() with an invalid scheme (e.g., "foo") the ZooKeeper server will mark their session expired and close the connection. However, the C client has returned immediately after queuing the new auth data to be sent with a ZOK return code.

      If the client then waits for their auth completion function to be called, they can wait forever, as no session event is ever delivered to that completion function. All other completion functions are notified of session events by free_completions(), which is called by cleanup_bufs() in handle_error() in handle_socket_error_msg().

      In actual fact, what can happen (about 50% of the time, for me) is that the next call by the IO thread to flush_send_queue() calls send() from within send_buffer(), and receives a SIGPIPE signal during this send() call. Because the ZooKeeper C API is a library, it properly does not catch that signal. If the user's code is not catching that signal either, they experience an abort caused by an untrapped signal. If they are ignoring the signal – which is common in context I'm working in, the Apache httpd server – then flush_send_queue()'s error return code is EPIPE, which is logged by handle_socket_error_msg(), and all non-auth completion functions are notified of a session event. However, if the caller is waiting for their auth completion function, they wait forever while the IO thread tries repeatedly to reconnect and is rejected by the server as having an expired session.

      So, first of all, it would be useful to document in the C API portion of the programmer's guide that trapping or ignoring SIGPIPE is important, as this signal may be generated by the C API.

      Next, the two attached patches call the auth completion function, if any, in free_completions(), which fixes this problem for me. The second attached patch includes auth lock/unlock function, as per ZOOKEEPER-319.

      1. ZOOKEEPER-320.patch
        0.4 kB
        Chris Darroch
      2. ZOOKEEPER-320-319.patch
        0.8 kB
        Chris Darroch
      3. ZOOKEEPER-320-319.patch
        0.8 kB
        Chris Darroch

        Activity

        Transition Time In Source Status Execution Times Last Executer Last Execution Date
        Open Open Patch Available Patch Available
        2d 21h 57m 1 Chris Darroch 20/Feb/09 20:45
        Patch Available Patch Available Resolved Resolved
        6d 23h 19m 1 Mahadev konar 27/Feb/09 20:05
        Resolved Resolved Closed Closed
        131d 18m 1 Patrick Hunt 08/Jul/09 21:24
        Patrick Hunt made changes -
        Status Resolved [ 5 ] Closed [ 6 ]
        Hide
        Hudson added a comment -

        Integrated in ZooKeeper-trunk #243 (See http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/243/)
        . call auth completion in free_completions(). (chris darroch via mahadev)

        Show
        Hudson added a comment - Integrated in ZooKeeper-trunk #243 (See http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/243/ ) . call auth completion in free_completions(). (chris darroch via mahadev)
        Mahadev konar made changes -
        Status Patch Available [ 10002 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        Hadoop Flags [Reviewed]
        Hide
        Mahadev konar added a comment -

        I just committed this. Thanks chris.

        Show
        Mahadev konar added a comment - I just committed this. Thanks chris.
        Hide
        Mahadev konar added a comment -

        my compiler is gcc 3.4.4

        Show
        Mahadev konar added a comment - my compiler is gcc 3.4.4
        Hide
        Chris Darroch added a comment -

        Also, please note my suggestion that the docs mention the need to catch, ignore, or otherwise be aware of SIGPIPE signals.

        Show
        Chris Darroch added a comment - Also, please note my suggestion that the docs mention the need to catch, ignore, or otherwise be aware of SIGPIPE signals.
        Chris Darroch made changes -
        Attachment ZOOKEEPER-320-319.patch [ 12401068 ]
        Hide
        Chris Darroch added a comment - - edited

        Updated with a NULL initialization as per the comment on ZOOKEEPER-319:

        https://issues.apache.org/jira/browse/ZOOKEEPER-319#action_12676824

        Out of interest — what compiler gives these errors? My gcc 4.1.2 with -Wall doesn't report any troubles.

        Show
        Chris Darroch added a comment - - edited Updated with a NULL initialization as per the comment on ZOOKEEPER-319 : https://issues.apache.org/jira/browse/ZOOKEEPER-319#action_12676824 Out of interest — what compiler gives these errors? My gcc 4.1.2 with -Wall doesn't report any troubles.
        Mahadev konar made changes -
        Assignee Chris Darroch [ cdarroch ]
        Hide
        Mahadev konar added a comment -

        +1 for the second patch.

        Show
        Mahadev konar added a comment - +1 for the second patch.
        Chris Darroch made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Chris Darroch made changes -
        Attachment ZOOKEEPER-320-319.patch [ 12400361 ]
        Chris Darroch made changes -
        Attachment ZOOKEEPER-320-319.patch [ 12400456 ]
        Hide
        Chris Darroch added a comment -

        This version avoids holding the auth lock while calling the user's auth completion function (which may run for a long time; we don't know).

        Show
        Chris Darroch added a comment - This version avoids holding the auth lock while calling the user's auth completion function (which may run for a long time; we don't know).
        Hide
        Chris Darroch added a comment -

        I should perhaps clarify that the reason one might want to unconditionally wait on the auth completion after calling zoo_add_auth() is so as to provide one's own sync version of that function. The various sync functions such as zoo_wget(), zoo_set2(), etc. are just wrappers of the async versions followed by an unconditional wait on the relevant completion. Similarly, in a context which must be exclusively sync-only, zoo_add_auth() needs to be followed by a wait on its completion.

        Show
        Chris Darroch added a comment - I should perhaps clarify that the reason one might want to unconditionally wait on the auth completion after calling zoo_add_auth() is so as to provide one's own sync version of that function. The various sync functions such as zoo_wget(), zoo_set2(), etc. are just wrappers of the async versions followed by an unconditional wait on the relevant completion. Similarly, in a context which must be exclusively sync-only, zoo_add_auth() needs to be followed by a wait on its completion.
        Chris Darroch made changes -
        Attachment ZOOKEEPER-320-319.patch [ 12400361 ]
        Hide
        Chris Darroch added a comment -

        This patch includes auth data locking as per ZOOKEEPER-319.

        Show
        Chris Darroch added a comment - This patch includes auth data locking as per ZOOKEEPER-319 .
        Chris Darroch made changes -
        Field Original Value New Value
        Attachment ZOOKEEPER-320.patch [ 12400360 ]
        Hide
        Chris Darroch added a comment -

        This patch does not include locking for the auth data, as per ZOOKEEPER-319.

        Show
        Chris Darroch added a comment - This patch does not include locking for the auth data, as per ZOOKEEPER-319 .
        Chris Darroch created issue -

          People

          • Assignee:
            Chris Darroch
            Reporter:
            Chris Darroch
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development