ZooKeeper
  1. ZooKeeper
  2. ZOOKEEPER-804

c unit tests failing due to "assertion cptr failed"

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Critical Critical
    • Resolution: Fixed
    • Affects Version/s: 3.4.0
    • Fix Version/s: 3.3.2, 3.4.0
    • Component/s: c client
    • Labels:
      None
    • Environment:

      gcc 4.4.3, ubuntu lucid lynx, dual core laptop (intel)

    • Hadoop Flags:
      Reviewed

      Description

      I'm seeing this frequently:

      [exec] Zookeeper_simpleSystem::testPing : elapsed 18006 : OK
      [exec] Zookeeper_simpleSystem::testAcl : elapsed 1022 : OK
      [exec] Zookeeper_simpleSystem::testChroot : elapsed 3145 : OK
      [exec] Zookeeper_simpleSystem::testAuth ZooKeeper server started : elapsed 25687 : OK
      [exec] zktest-mt: /home/phunt/dev/workspace/gitzk/src/c/src/zookeeper.c:1952: zookeeper_process: Assertion `cptr' failed.
      [exec] make: *** [run-check] Aborted
      [exec] Zookeeper_simpleSystem::testHangingClient

      Mahadev can you take a look?

      1. ZOOKEEPER-804-1.patch
        0.5 kB
        Patrick Hunt
      2. ZOOKEEPER-804-1.patch
        0.8 kB
        Jared Cantwell
      3. ZOOKEEPER-804.patch
        0.8 kB
        Michi Mutsuzaki

        Issue Links

          Activity

          Hide
          Mahadev konar added a comment -

          this is the same issue as ZOOKEEPER-707. It should be very hard to reproduce. THis is surely a bug. I will upload a patch soon and we can get this in 3.3.2.

          Show
          Mahadev konar added a comment - this is the same issue as ZOOKEEPER-707 . It should be very hard to reproduce. THis is surely a bug. I will upload a patch soon and we can get this in 3.3.2.
          Hide
          Patrick Hunt added a comment -

          I ran the c tests 3 times and it failed with the same error each time. Let me know if I can help you verify the fix.

          Show
          Patrick Hunt added a comment - I ran the c tests 3 times and it failed with the same error each time. Let me know if I can help you verify the fix.
          Hide
          Mahadev konar added a comment -

          great... I cant reproduce it on my machine. I rarely see this error.

          Show
          Mahadev konar added a comment - great... I cant reproduce it on my machine. I rarely see this error.
          Hide
          Michi Mutsuzaki added a comment -

          I haven't been able to reproduce it, but it looks like we shouldn't assert(cptr) if zookeeper_close has been called (zookeeper.c line 1950). Can we do something like:

          completion_list_t *cptr = dequeue_completion(&zh->sent_requests);
          if (zh->close_requested == 1) {
          // zookeeper_close has been called. No need to assert cptr. Just free it if it's not NULL
          if (cptr)

          { destroy_completion_entry(cptr); }

          // some more cleanup?
          return ZINVALIDSTATE;
          }

          // zookeeper_close hadn't been called when we called dequeue_completion. cptr must not be NULL.
          assert(cptr);

          --Michi

          Show
          Michi Mutsuzaki added a comment - I haven't been able to reproduce it, but it looks like we shouldn't assert(cptr) if zookeeper_close has been called (zookeeper.c line 1950). Can we do something like: completion_list_t *cptr = dequeue_completion(&zh->sent_requests); if (zh->close_requested == 1) { // zookeeper_close has been called. No need to assert cptr. Just free it if it's not NULL if (cptr) { destroy_completion_entry(cptr); } // some more cleanup? return ZINVALIDSTATE; } // zookeeper_close hadn't been called when we called dequeue_completion. cptr must not be NULL. assert(cptr); --Michi
          Hide
          Michi Mutsuzaki added a comment -

          This patch modifies zookeeper_process() function. After dequeue_completion() is called, it checks if zookeeper_close has been called. If it has been called, it returns ZINVALIDSTATE.

          --Michi

          Show
          Michi Mutsuzaki added a comment - This patch modifies zookeeper_process() function. After dequeue_completion() is called, it checks if zookeeper_close has been called. If it has been called, it returns ZINVALIDSTATE. --Michi
          Hide
          Mahadev konar added a comment -

          +1 for the patch. Pat can you please try it out and see it fixes the test cases? I will go ahead and commit if it passes for you.

          Show
          Mahadev konar added a comment - +1 for the patch. Pat can you please try it out and see it fixes the test cases? I will go ahead and commit if it passes for you.
          Hide
          Patrick Hunt added a comment -

          +1 – I ran this a number of times (~10) and it passed each time. Ship it.

          Show
          Patrick Hunt added a comment - +1 – I ran this a number of times (~10) and it passed each time. Ship it.
          Hide
          Mahadev konar added a comment -

          I just committed this. thanks michi!

          Show
          Mahadev konar added a comment - I just committed this. thanks michi!
          Hide
          Hudson added a comment -

          Integrated in ZooKeeper-trunk #958 (See https://hudson.apache.org/hudson/job/ZooKeeper-trunk/958/)
          ZOOKEEPER-804. c unit tests failing due to "assertion cptr failed" (michi mutsuzaki via mahadev)

          Show
          Hudson added a comment - Integrated in ZooKeeper-trunk #958 (See https://hudson.apache.org/hudson/job/ZooKeeper-trunk/958/ ) ZOOKEEPER-804 . c unit tests failing due to "assertion cptr failed" (michi mutsuzaki via mahadev)
          Hide
          Jared Cantwell added a comment -

          Should the return statement of this patch be:

          return api_epilog(zh,ZINVALIDSTATE);

          Otherwise, it seems possible that the reference counting could get off in this case.

          Show
          Jared Cantwell added a comment - Should the return statement of this patch be: return api_epilog(zh,ZINVALIDSTATE); Otherwise, it seems possible that the reference counting could get off in this case.
          Hide
          Patrick Hunt added a comment -

          Reopening due to issue raised with the patch (see earlier comment).

          Is this an issue that should be addressed or ok to resolve?

          Show
          Patrick Hunt added a comment - Reopening due to issue raised with the patch (see earlier comment). Is this an issue that should be addressed or ok to resolve?
          Hide
          Jared Cantwell added a comment -

          It seems like zookeeper_process unnecessarily calls api_prolog() and api_epilog() to begin with. But given that api_prolog() is called at the beginning, if this new code path executes while a close is requested, then the threads will successfully exit, but the final part of zookeeper_close that releases the memory on the last reference will not execute (since the last reference will never be reached). Unless I'm missing something, this will leak the zkhandle.

          Show
          Jared Cantwell added a comment - It seems like zookeeper_process unnecessarily calls api_prolog() and api_epilog() to begin with. But given that api_prolog() is called at the beginning, if this new code path executes while a close is requested, then the threads will successfully exit, but the final part of zookeeper_close that releases the memory on the last reference will not execute (since the last reference will never be reached). Unless I'm missing something, this will leak the zkhandle.
          Hide
          Jared Cantwell added a comment -

          Not sure what the protocol is here, but I'm gonna go ahead and attach a revised patch.

          Show
          Jared Cantwell added a comment - Not sure what the protocol is here, but I'm gonna go ahead and attach a revised patch.
          Hide
          Michi Mutsuzaki added a comment -

          Jared, thank you for volunteering to fix this. I haven't had a chance to take care of it.

          --Michi

          Show
          Michi Mutsuzaki added a comment - Jared, thank you for volunteering to fix this. I haven't had a chance to take care of it. --Michi
          Hide
          Patrick Hunt added a comment -

          Thanks Jared, I missed your question on IRC (you had quit before I got back). What you've submitted is fine, in general though you should just submit a patch against the current svn (so a diff against code which already had the original patch committed to it).

          Michi, can you (also ben/mahadev) review this and let me know if it's ok to commit, I'll commit it if you guys sign off. Thanks all!

          Show
          Patrick Hunt added a comment - Thanks Jared, I missed your question on IRC (you had quit before I got back). What you've submitted is fine, in general though you should just submit a patch against the current svn (so a diff against code which already had the original patch committed to it). Michi, can you (also ben/mahadev) review this and let me know if it's ok to commit, I'll commit it if you guys sign off. Thanks all!
          Hide
          Michi Mutsuzaki added a comment -

          The patch looks good, but I can't seem to apply it. Has anybody seen this error?

          $ cd branch-3.3
          $ patch -p0 < ZOOKEEPER-804.patch
          patching file src/c/src/zookeeper.c
          Hunk #1 FAILED at 1947.
          1 out of 1 hunk FAILED – saving rejects to file src/c/src/zookeeper.c.rej

          --Michi

          Show
          Michi Mutsuzaki added a comment - The patch looks good, but I can't seem to apply it. Has anybody seen this error? $ cd branch-3.3 $ patch -p0 < ZOOKEEPER-804 .patch patching file src/c/src/zookeeper.c Hunk #1 FAILED at 1947. 1 out of 1 hunk FAILED – saving rejects to file src/c/src/zookeeper.c.rej --Michi
          Hide
          Jared Cantwell added a comment -

          That probably failed because the first patch is already applied, so this patch doesn't exactly apply to your trunk. I can open a new bug and submit a patch that way if its preferred.

          Show
          Jared Cantwell added a comment - That probably failed because the first patch is already applied, so this patch doesn't exactly apply to your trunk. I can open a new bug and submit a patch that way if its preferred.
          Hide
          Michi Mutsuzaki added a comment -

          +1.

          > I can open a new bug and submit a patch that way if its preferred.

          No worry, it's not a big deal since this is a one line change.

          Thanks again, Jared!
          --Michi

          Show
          Michi Mutsuzaki added a comment - +1. > I can open a new bug and submit a patch that way if its preferred. No worry, it's not a big deal since this is a one line change. Thanks again, Jared! --Michi
          Hide
          Patrick Hunt added a comment -

          Updated patch to apply against latest trunk (hopefully branch too).

          Show
          Patrick Hunt added a comment - Updated patch to apply against latest trunk (hopefully branch too).
          Hide
          Patrick Hunt added a comment -

          +1 on the second patch. Tested and it seems fine, committed to trunk/branch33 both.

          Show
          Patrick Hunt added a comment - +1 on the second patch. Tested and it seems fine, committed to trunk/branch33 both.
          Hide
          Hudson added a comment -

          Integrated in ZooKeeper-trunk #973 (See https://hudson.apache.org/hudson/job/ZooKeeper-trunk/973/)
          ZOOKEEPER-804. c unit tests failing due to "assertion cptr failed" (second patch)

          Show
          Hudson added a comment - Integrated in ZooKeeper-trunk #973 (See https://hudson.apache.org/hudson/job/ZooKeeper-trunk/973/ ) ZOOKEEPER-804 . c unit tests failing due to "assertion cptr failed" (second patch)

            People

            • Assignee:
              Michi Mutsuzaki
              Reporter:
              Patrick Hunt
            • Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development