ZooKeeper
ZOOKEEPER-642

"exceeded deadline by N ms" floods logs

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.2.1
    • Fix Version/s: 3.5.0
    • Component/s: c client
    • Labels:
    • Environment:

      virtualized linux - ec2 - ubuntu

      Description

      More important ZooKeeper warnings are drowned out by the following, which appears several times per minute:

      2010-01-12 17:39:57,227:22317(0x4147eb90):ZOO_WARN@zookeeper_interest@1335: Exceeded deadline by 13ms

      Perhaps this is an issue with the way virtualized systems manage gettimeofday results?

      Maybe the current 10ms threshold could be pushed up a bit. I notice that 95% of the messages are below 50ms.

      Is there an obvious configuration change that I can make to fix this?

      config file below:

      # The number of milliseconds of each tick
      tickTime=2000
      # The number of ticks that the initial
      # synchronization phase can take
      initLimit=10
      # The number of ticks that can pass between
      # sending a request and getting an acknowledgement
      syncLimit=5
      # the directory where the snapshot is stored.
      dataDir=/mnt/zookeeper
      # the port at which the clients will connect
      clientPort=2181

      server.1=hbase.1:2888:3888
      server.2=hbase.2:2888:3888
      server.3=hbase.3:2888:3888
      server.4=hbase.4:2888:3888
      server.5=hbase.5:2888:3888

        Activity

        Hudson added a comment -

        Integrated in ZooKeeper-trunk #1552 (See https://builds.apache.org/job/ZooKeeper-trunk/1552/)
        ZOOKEEPER-642. "exceeded deadline by N ms" floods logs (Marc Celani via michim) (Revision 1337029)

        Result = SUCCESS
        michim : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1337029
        Files :

        • /zookeeper/trunk/CHANGES.txt
        • /zookeeper/trunk/src/c/src/zookeeper.c
        Michi Mutsuzaki added a comment -

        +1.

        Marc, thank you for the patch, and sorry it took me this long to get to this.

        --Michi

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12496185/ZOOKEEPER-642.patch
        against trunk revision 1214571.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/832//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/832//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/832//console

        This message is automatically generated.

        Marc Celani added a comment -

        No test is included because this only affects logging, and there isn't a clear way to test it anyway, as the logs were generated by delays in the OS waking up the process.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12496185/ZOOKEEPER-642.patch
        against trunk revision 1173949.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/584//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/584//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/584//console

        This message is automatically generated.

        Marc Celani added a comment -

        This patch alters the maximum amount of time that a deadline can be exceeded before a warning is logged from 10ms to the smaller of 200ms and 10% of the recv timeout in zookeeper.c. This reduces the number of meaningless log messages.
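
        A minimal sketch of what that check amounts to, assuming the corrected expression discussed further down this thread; the surrounding handle struct and the logging call are placeholders, not the actual zookeeper.c code:

        #include <stdio.h>

        /* Placeholder for the client handle; only the field used here is shown. */
        struct zk_handle { int recv_timeout; /* session timeout, in ms */ };

        /* Warn only when the deadline was exceeded by more than
         * min(recv_timeout / 10, 200) ms, instead of the old fixed 10 ms. */
        static void warn_if_deadline_exceeded(struct zk_handle *zh, long exceeded_ms)
        {
            int max_exceed = zh->recv_timeout / 10 > 200 ? 200 : (zh->recv_timeout / 10);
            if (exceeded_ms > max_exceed)
                fprintf(stderr, "ZOO_WARN: Exceeded deadline by %ldms\n", exceeded_ms);
        }

        With a typical 30-second session timeout this caps the threshold at 200ms; with a short 1-second timeout it drops to 100ms.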

        Vishal Kathuria added a comment -

        Thanks for submitting this patch, Marc.
        One comment:

        +int max_exceed = zh->recv_timeout / 10 > 200 ? 200 :
        +    zh->recv_timeout;

        should be

        +int max_exceed = zh->recv_timeout / 10 > 200 ? 200 :
        +    (zh->recv_timeout / 10);

        Your patch missed the /10. Looks good otherwise.

        Also, you need to click on "Submit Patch"; that will kick off the unit tests as well as notify the committers of this patch.

        Marc Celani added a comment -

        I'm not sure how to go about doing this, but I'd like to have the patch I just attached code reviewed.

        Marc Celani added a comment -

        This patch makes the change that Vishal suggested to reduce the number of meaningless log messages.

        Ashish Mishra added a comment -

        I'll work on the fix suggested by Vishal.

        Vishal Kathuria added a comment -

        I have run into this issue as well, while trying to do some scale testing on ZooKeeper. I usually see it when the client machine is under heavy load, but the message doesn't have much diagnostic value unless we miss the deadline by a large margin.

        We could change it to min(timeout/10, 200) instead of 10. In my experience, I was routinely seeing delays of less than 200ms without any connection loss, hence the suggestion of that value.

        Patrick Hunt added a comment -

        Hi Lei,

        tickTime is used by the servers; it doesn't really impact the clients.

        The clients have a "timeout" value they use when connecting to the servers; this establishes the session timeout.

        What I was suggesting in the link you listed was changing the timeout, not the tickTime. 2 seconds is pretty low for a timeout.

        However, in your case none of this matters. The "deadline exceeded" message you are seeing is purely an OS issue:

        1) we select(..., deadline)
        2) when we wake up from select, we check how close we are to the deadline; if it is exceeded by more than 10ms, we log a warning.

        In your case the deadline is being exceeded by a significant amount. This is a warning because it could impact the ability
        of the client to maintain connectivity with the server.
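
        As a rough illustration of that sequence (not the actual client code; the gettimeofday-based measurement and the fixed 10ms constant are assumptions based on this description), the check is essentially:

        #include <stdio.h>
        #include <sys/select.h>
        #include <sys/time.h>

        /* Sketch: block in select() for roughly wait_ms, then report how far
         * past the requested deadline the wakeup actually was. */
        static void timed_wait_and_check(int wait_ms)
        {
            struct timeval before, after, tv;
            tv.tv_sec = wait_ms / 1000;
            tv.tv_usec = (wait_ms % 1000) * 1000;

            gettimeofday(&before, NULL);
            select(0, NULL, NULL, NULL, &tv);   /* no fds: a pure timed wait */
            gettimeofday(&after, NULL);

            long elapsed_ms = (after.tv_sec - before.tv_sec) * 1000
                            + (after.tv_usec - before.tv_usec) / 1000;
            long exceeded_ms = elapsed_ms - wait_ms;
            if (exceeded_ms > 10)               /* the fixed 10ms threshold discussed here */
                fprintf(stderr, "Exceeded deadline by %ldms\n", exceeded_ms);
        }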

        Lei Zhang added a comment -

        Maybe I had misinterpreted what you meant by "timeout" - I was referring to
        http://mail-archives.apache.org/mod_mbox/hadoop-zookeeper-user/200908.mbox/%253C4A8D7B4B.5020701@apache.org%253E

        "typically we suggest timeouts in the 20-30 second range"

        Same result: I'm still seeing the same message every 10 seconds.

        Patrick Hunt added a comment -

        > We have bumped up tickTime to 20000 per Patrick's suggestion in another thread.

        Hm, I'm not sure which thread you are referring to, but tickTime has no impact on this particular issue.

        > "Exceeded deadline by 769ms"
        > ... The message as it is has a fairly low diagnostic value.

        I don't think that's the case here. Nearly 800ms is a pretty significant issue, especially as the VM is unloaded, as you mentioned.

        Can you try using cli_mt instead of cli_st and see if you notice any difference?

        The message "Exceeded deadline by 769ms" is saying that the ZK client asked the operating system, via the select(..., timeout) call, to either select something or wake up after the specified timeout. From the select man page:

        "timeout is an upper bound on the amount of time elapsed before select() returns."

        We are printing this message because select woke up 769ms later than what we asked for, which is pretty significant. Notice this has nothing to do with the server, server settings/config, or server performance; it's all local to the client/select call. This tells me that something is up on your client, probably due to the use of VMware. Perhaps swapping at the host (non-virtualized) level? Try cli_mt; you might also try running without VMware and see what happens, as that will give you a baseline.

        Lei Zhang added a comment -

        I have taken over Dale's responsibility for ZooKeeper. We have bumped up tickTime to 20000 per Patrick's suggestion in another thread. Now we see these "Exceeded deadline by 769ms" messages every 10 seconds. I'm testing using 'cli_st localhost:<port>' on a VMware Linux machine that is mostly idle. I echo Dale's comment:

        The message as it is has a fairly low diagnostic value.

        Since this message is at WARN level, I feel we need to do something. But what:
        o bump up the priority of the zookeeper daemon
        o check for a bug in the client library
        o check for a bug in the zookeeper server

        Somehow this doesn't smell like a real "Exceeded deadline" issue to me.

        Patrick Hunt added a comment -

        @dale basically we are asking to be woken up within a certain amount of time (in ms), and we are being woken up later than we asked for. This is
        to be expected on a heavily loaded system; we are logging in order to provide some insight (say you see a bunch of these followed by
        some session timeout, it might be useful). I agree though, in general it's not particularly useful, but in some cases it might be.

        The selection of 10ms is a mistake though (especially as some systems don't have very good resolution on timing). It should be larger, perhaps
        scaled relative to the timeout. We could output this as debug, but then it would essentially never be available when you wanted it. We could
        latch (only output the first N logs of this type), but then it would get lost over time. Perhaps we could do something where we output
        a max of once every minute (as warn, debug the rest of the time). Might be a good tradeoff (and increase the check from 10ms at the same
        time).
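
        A minimal sketch of that once-per-minute idea, purely as an illustration (the stderr calls and the time source are stand-ins, not the client's real logging macros):

        #include <stdio.h>
        #include <time.h>

        /* Emit the overshoot warning at WARN level at most once per minute;
         * demote the rest to DEBUG so they can still be enabled when needed. */
        static void log_deadline_exceeded(long exceeded_ms)
        {
            static time_t last_warn = 0;
            time_t now = time(NULL);

            if (now - last_warn >= 60) {
                fprintf(stderr, "WARN: Exceeded deadline by %ldms\n", exceeded_ms);
                last_warn = now;
            } else {
                fprintf(stderr, "DEBUG: Exceeded deadline by %ldms\n", exceeded_ms);
            }
        }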

        Dale Johnson added a comment -

        It is a small instance.

        When the machine becomes heavily loaded, the times go up quite a bit on
        these messages, some into the thousands of milliseconds. But the
        smaller values appear on a pretty regular basis.

        Does the warning mean that it hasn't responded within a single tick?

        The message as it is has a fairly low diagnostic value. I guess if these messages happened with enough frequency,
        and that was a signal to add more zookeeper servers or that the client hosts were too overloaded, they
        could be more helpful.

        Patrick Hunt added a comment -

        Unfortunately it's not configurable, and I'm not sure why 10 was chosen (there's nothing in the
        src to indicate why).

        What type of EC2 instance is this, small? Is this at all correlated with the
        client/client host having an increased workload?

        I think we should make some sort of change here, though I'm not sure what. Perhaps this
        should be a % of the timeout rather than fixed. Or perhaps it should be made configurable. Alternatively,
        we might make this debug (or switch to debug if more than some number are output in a period of
        time, etc.)


          People

          • Assignee: Marc Celani
          • Reporter: Dale Johnson
          • Votes: 2
          • Watchers: 5
