Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 3.3.1
    • Fix Version/s: 3.3.2, 3.4.0
    • Component/s: contrib-bindings
    • Labels:
      None
    • Environment:

      vmware workstation - guest OS:Linux python:2.4.3

    • Hadoop Flags:
      Reviewed

      Description

      We recently upgraded zookeeper from 3.2.1 to 3.3.1, now we are seeing less client deadlock on session expiration, which is a definite plus!

      Unfortunately we are seeing memory leak that requires our zk clients to be restarted every half-day. Valgrind result:

      ==8804== 25 (12 direct, 13 indirect) bytes in 1 blocks are definitely lost in loss record 255 of 670
      ==8804== at 0x4021C42: calloc (vg_replace_malloc.c:418)
      ==8804== by 0x5047B42: parse_acls (zookeeper.c:369)
      ==8804== by 0x5047EF6: pyzoo_create (zookeeper.c:1009)
      ==8804== by 0x40786CC: PyCFunction_Call (in /usr/lib/libpython2.4.so.1.0)
      ==8804== by 0x40B31DC: PyEval_EvalFrame (in /usr/lib/libpython2.4.so.1.0)
      ==8804== by 0x40B4485: PyEval_EvalCodeEx (in /usr/lib/libpython2.4.so.1.0)

      1. ZOOKEEPER-792.patch
        1 kB
        Lei Zhang
      2. ZOOKEEPER-792.patch
        1 kB
        Henry Robinson
      3. ZOOKEEPER-792.patch
        1 kB
        Henry Robinson

        Activity

        Hide
        Lei Zhang added a comment -

        Plug memory leak in pyzoo_get() and pyzoo_create().

        Show
        Lei Zhang added a comment - Plug memory leak in pyzoo_get() and pyzoo_create().
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12447995/ZOOKEEPER-792.patch
        against trunk revision 953041.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no tests are needed for this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/120/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/120/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/120/console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12447995/ZOOKEEPER-792.patch against trunk revision 953041. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/120/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/120/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/120/console This message is automatically generated.
        Hide
        Mahadev konar added a comment -

        henry, can you take a look at this?

        Show
        Mahadev konar added a comment - henry, can you take a look at this?
        Hide
        Lei Zhang added a comment -

        We've been using this patch in production (16-node cluster) for over a month. I'd like to have it go into never release. Can somebody please code review?

        Show
        Lei Zhang added a comment - We've been using this patch in production (16-node cluster) for over a month. I'd like to have it go into never release. Can somebody please code review?
        Hide
        Henry Robinson added a comment -

        Hi -

        Sorry for the slow response! I just took a look over the patch - good catches.

        +1. I'll commit within the day.

        Henry

        Show
        Henry Robinson added a comment - Hi - Sorry for the slow response! I just took a look over the patch - good catches. +1. I'll commit within the day. Henry
        Hide
        Henry Robinson added a comment -

        Just to update - I've found that zkpython tests are failing in trunk, and I don't want to commit a patch when the tests are broken. I'll be creating a JIRA shortly to address the problem once I've looked into it slightly further.

        Show
        Henry Robinson added a comment - Just to update - I've found that zkpython tests are failing in trunk, and I don't want to commit a patch when the tests are broken. I'll be creating a JIRA shortly to address the problem once I've looked into it slightly further.
        Hide
        Henry Robinson added a comment -

        Aha - I think I have found the problem, and it was related to this patch.

        PyObject *ret = Py_BuildValue( "(s#,N)", buffer,buffer_len, stat_dict );
        + free_pywatcher(pw);
        free(buffer);

        We shouldn't free the pywatcher_t object here because it may be called later. This was what was causing the segfault I was seeing. I'll upload a new patch with this line removed; I hope it will still fix your memory consumption issues.

        Show
        Henry Robinson added a comment - Aha - I think I have found the problem, and it was related to this patch. PyObject *ret = Py_BuildValue( "(s#,N)", buffer,buffer_len, stat_dict ); + free_pywatcher(pw); free(buffer); We shouldn't free the pywatcher_t object here because it may be called later. This was what was causing the segfault I was seeing. I'll upload a new patch with this line removed; I hope it will still fix your memory consumption issues.
        Hide
        Henry Robinson added a comment -

        Updated patch to remove that bug.

        Show
        Henry Robinson added a comment - Updated patch to remove that bug.
        Hide
        Henry Robinson added a comment -

        I forgot --no-prefix. Plus ca change, plus c'est la meme chose.

        Show
        Henry Robinson added a comment - I forgot --no-prefix. Plus ca change, plus c'est la meme chose.
        Hide
        Lei Zhang added a comment -

        Thanks a lot.

        Show
        Lei Zhang added a comment - Thanks a lot.
        Hide
        Henry Robinson added a comment -

        I just committed this! Thanks Lei Zhang!

        Show
        Henry Robinson added a comment - I just committed this! Thanks Lei Zhang!
        Hide
        Hudson added a comment -

        Integrated in ZooKeeper-trunk #936 (See https://hudson.apache.org/hudson/job/ZooKeeper-trunk/936/)

        Show
        Hudson added a comment - Integrated in ZooKeeper-trunk #936 (See https://hudson.apache.org/hudson/job/ZooKeeper-trunk/936/ )

          People

          • Assignee:
            Lei Zhang
            Reporter:
            Lei Zhang
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development