Bookkeeper
  1. Bookkeeper
  2. BOOKKEEPER-569

Critical performance bug in InterleavedLedgerStorage

    Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Blocker Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.2.1, 4.3.0
    • Component/s: None
    • Labels:
      None

      Description

      There's a synchronization on InterleavedLedgerStorage#flush(), which kills performance when you're writing to many ledgers on a single bookie. Both #flush and #addEntry are synchronized, which blocks any adds being serviced while the sync thread is running.

      The sync on #addEntry has always been there, but on #flush it has only existed since BOOKKEEPER-293. The addition was obviously a mistake.

      Fix is simply to remove it.

        Activity

        Hide
        Hudson added a comment -

        Integrated in bookkeeper-trunk #106 (See https://builds.apache.org/job/bookkeeper-trunk/106/)
        BOOKKEEPER-569: Critical performance bug in InterleavedLedgerStorage (ivank via fpj) (Revision 1446902)

        Result = SUCCESS
        fpj :
        Files :

        • /zookeeper/bookkeeper/trunk/CHANGES.txt
        • /zookeeper/bookkeeper/trunk/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/InterleavedLedgerStorage.java
        Show
        Hudson added a comment - Integrated in bookkeeper-trunk #106 (See https://builds.apache.org/job/bookkeeper-trunk/106/ ) BOOKKEEPER-569 : Critical performance bug in InterleavedLedgerStorage (ivank via fpj) (Revision 1446902) Result = SUCCESS fpj : Files : /zookeeper/bookkeeper/trunk/CHANGES.txt /zookeeper/bookkeeper/trunk/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/InterleavedLedgerStorage.java
        Hide
        Flavio Junqueira added a comment -

        Trunk: Committed revision 1446902.
        Branch 4.2: Committed revision 1446900.

        Show
        Flavio Junqueira added a comment - Trunk: Committed revision 1446902. Branch 4.2: Committed revision 1446900.
        Hide
        Flavio Junqueira added a comment -

        +1, looks good.

        Show
        Flavio Junqueira added a comment - +1, looks good.
        Hide
        Sijie Guo added a comment -

        +1 for this change. Flavio Junqueira how is your opinion on Ivan's replies? If there is +1 from you, the patch is ready to be in.

        Show
        Sijie Guo added a comment - +1 for this change. Flavio Junqueira how is your opinion on Ivan's replies? If there is +1 from you, the patch is ready to be in.
        Hide
        Hadoop QA added a comment -

        Testing JIRA BOOKKEEPER-569

        Patch 0001-BOOKKEEPER-569-Critical-performance-issue-in-Interle.patch downloaded at Wed Feb 13 17:22:11 UTC 2013

        ----------------------------

        +1 PATCH_APPLIES
        +1 CLEAN
        -1 RAW_PATCH_ANALYSIS
        . +1 the patch does not introduce any @author tags
        . +1 the patch does not introduce any tabs
        . +1 the patch does not introduce any trailing spaces
        . +1 the patch does not introduce any line longer than 120
        . -1 the patch does not add/modify any testcase
        +1 RAT
        . +1 the patch does not seem to introduce new RAT warnings
        +1 JAVADOC
        . +1 the patch does not seem to introduce new Javadoc warnings
        +1 COMPILE
        . +1 HEAD compiles
        . +1 patch compiles
        . +1 the patch does not seem to introduce new javac warnings
        +1 FINDBUGS
        . +1 the patch does not seem to introduce new Findbugs warnings
        +1 TESTS
        . Tests run: 815
        +1 DISTRO
        . +1 distro tarball builds with the patch

        ----------------------------
        -1 Overall result, please check the reported -1(s)

        The full output of the test-patch run is available at

        . https://builds.apache.org/job/bookkeeper-trunk-precommit-build/269/

        Show
        Hadoop QA added a comment - Testing JIRA BOOKKEEPER-569 Patch 0001-BOOKKEEPER-569-Critical-performance-issue-in-Interle.patch downloaded at Wed Feb 13 17:22:11 UTC 2013 ---------------------------- +1 PATCH_APPLIES +1 CLEAN -1 RAW_PATCH_ANALYSIS . +1 the patch does not introduce any @author tags . +1 the patch does not introduce any tabs . +1 the patch does not introduce any trailing spaces . +1 the patch does not introduce any line longer than 120 . -1 the patch does not add/modify any testcase +1 RAT . +1 the patch does not seem to introduce new RAT warnings +1 JAVADOC . +1 the patch does not seem to introduce new Javadoc warnings +1 COMPILE . +1 HEAD compiles . +1 patch compiles . +1 the patch does not seem to introduce new javac warnings +1 FINDBUGS . +1 the patch does not seem to introduce new Findbugs warnings +1 TESTS . Tests run: 815 +1 DISTRO . +1 distro tarball builds with the patch ---------------------------- -1 Overall result, please check the reported -1(s) The full output of the test-patch run is available at . https://builds.apache.org/job/bookkeeper-trunk-precommit-build/269/
        Hide
        Ivan Kelly added a comment -

        1. Why is addEntry synchronized? Other than flush, it seems to be the only other synchronized method.

        It doesn't necessarily have to be, as EntryLogger#addEntry is synchronized and LedgerCacheImpl seems quite threadsafe. It has always been synchronized though, so I don't want to change it without good reason.

        2. Both addEntry and flush access ledgerCache. Are you sure that making flush unsynchronized will not cause any race?

        ledgerCache is threadsafe as far as I can see. Also, up until Jan 4, flush was no unsynchronized, as it didn't present a problem.

        3. What exactly makes you think it is an obvious mistake, other than you being the author of the 293 patch?

        Its completely unrelated to the patch. Usually, for a change like that, unrelated to the issue, I'd have noted why I did it in the jira or added a comment. I think I had been touching that code to make it accessible for the tests, and changed it back wrong. I need to check at home, as thats the repo the original changes were in.

        Show
        Ivan Kelly added a comment - 1. Why is addEntry synchronized? Other than flush, it seems to be the only other synchronized method. It doesn't necessarily have to be, as EntryLogger#addEntry is synchronized and LedgerCacheImpl seems quite threadsafe. It has always been synchronized though, so I don't want to change it without good reason. 2. Both addEntry and flush access ledgerCache. Are you sure that making flush unsynchronized will not cause any race? ledgerCache is threadsafe as far as I can see. Also, up until Jan 4, flush was no unsynchronized, as it didn't present a problem. 3. What exactly makes you think it is an obvious mistake, other than you being the author of the 293 patch? Its completely unrelated to the patch. Usually, for a change like that, unrelated to the issue, I'd have noted why I did it in the jira or added a comment. I think I had been touching that code to make it accessible for the tests, and changed it back wrong. I need to check at home, as thats the repo the original changes were in.
        Hide
        Flavio Junqueira added a comment -

        I have a few questions here:

        1. Why is addEntry synchronized? Other than flush, it seems to be the only other synchronized method.
        2. Both addEntry and flush access ledgerCache. Are you sure that making flush unsynchronized will not cause any race?
        3. What exactly makes you think it is an obvious mistake, other than you being the author of the 293 patch?
        Show
        Flavio Junqueira added a comment - I have a few questions here: Why is addEntry synchronized? Other than flush, it seems to be the only other synchronized method. Both addEntry and flush access ledgerCache. Are you sure that making flush unsynchronized will not cause any race? What exactly makes you think it is an obvious mistake, other than you being the author of the 293 patch?

          People

          • Assignee:
            Ivan Kelly
            Reporter:
            Ivan Kelly
          • Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development