Bookkeeper
  1. Bookkeeper
  2. BOOKKEEPER-190

Add entries would fail when number of open ledgers reaches more than openFileLimit.

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.1.0
    • Component/s: bookkeeper-server
    • Labels:
      None

      Description

      when the number of open ledgers reaches more than openFileLimit, a file info will be closed and removed from opened ledgers list. And after BOOKKEEPER-137, the ledger index file creation delayed until necessary.

      suppose ledger l is removed from opened ledger list, and its index file haven't been created.
      new add entries operations of other ledgers came into bookie server, a new page need to be grab for them. so bookie server may need to flush the dirty pages of ledger l(when page cache is full). and the flush would fail due to NoLedgerException (no index file found).

      actually the ledger l isn't lost, it could be recovered if restarting bookie server, but the bookie server would not work well on adding entries.

      a proposal solution is that we need to force index creation when the ledger is evicted from open ledgers list.

      2012-03-21 14:00:42,989 - DEBUG - [NIOServerFactory-5000:LedgerCache@235] - New ledger index file created for ledgerId: 4
      2012-03-21 14:00:42,990 - INFO  - [NIOServerFactory-5000:LedgerCache@241] - Ledger 2 is evicted from file info cache.
      2012-03-21 14:00:42,990 - DEBUG - [New I/O client worker #1-1:PerChannelBookieClient$2@255] - Successfully wrote request for adding entry: 0 ledger-id: 4 bookie: /10.82.129.173:5000 entry length: 70
      2012-03-21 14:00:42,990 - ERROR - [NIOServerFactory-5000:BookieServer@361] - Error writing 0@4
      org.apache.bookkeeper.bookie.Bookie$NoLedgerException: Ledger 2 not found
              at org.apache.bookkeeper.bookie.LedgerCache.getFileInfo(LedgerCache.java:228)
              at org.apache.bookkeeper.bookie.LedgerCache.flushLedger(LedgerCache.java:359)
              at org.apache.bookkeeper.bookie.LedgerCache.flushLedger(LedgerCache.java:292)
              at org.apache.bookkeeper.bookie.LedgerCache.grabCleanPage(LedgerCache.java:447)
              at org.apache.bookkeeper.bookie.LedgerCache.putEntryOffset(LedgerCache.java:157)
              at org.apache.bookkeeper.bookie.LedgerDescriptor.addEntry(LedgerDescriptor.java:130)
              at org.apache.bookkeeper.bookie.Bookie.addEntryInternal(Bookie.java:1059)
              at org.apache.bookkeeper.bookie.Bookie.addEntry(Bookie.java:1099)
              at org.apache.bookkeeper.proto.BookieServer.processPacket(BookieServer.java:357)
              at org.apache.bookkeeper.proto.NIOServerFactory$Cnxn.readRequest(NIOServerFactory.java:315)
              at org.apache.bookkeeper.proto.NIOServerFactory$Cnxn.doIO(NIOServerFactory.java:213)
              at org.apache.bookkeeper.proto.NIOServerFactory.run(NIOServerFactory.java:124)
      2012-03-21 14:00:42,991 - DEBUG - [pool-3-thread-1:PerChannelBookieClient@576] - Got response for add request from bookie: /10.82.129.173:5000 for ledger: 4 entry: 0 rc: 101
      2012-03-21 14:00:42,991 - ERROR - [pool-3-thread-1:PerChannelBookieClient@594] - Add for ledger: 4, entry: 0 failed on bookie: /10.82.129.173:5000 with code: 101
      2012-03-21 14:00:42,991 - WARN  - [pool-3-thread-1:PendingAddOp@142] - Write did not succeed: 4, 0
      
      
      1. BOOKKEEPER-190.diff
        12 kB
        Sijie Guo
      2. BOOKKEEPER-190.diff_v2
        11 kB
        Sijie Guo
      3. BOOKKEEPER-190.diff_v3
        20 kB
        Sijie Guo

        Activity

        Sijie Guo created issue -
        Hide
        Sijie Guo added a comment -

        attach a patch to fix this issue.

        the patch is quite simple, just force index creation when the ledger is evicted from open ledgers list.

        also it fixed the reference counting issue on FileInfo, to release useCount after use it. otherwise, the index file is not closed actually, will cause 'Too many opened files' problem.

        Show
        Sijie Guo added a comment - attach a patch to fix this issue. the patch is quite simple, just force index creation when the ledger is evicted from open ledgers list. also it fixed the reference counting issue on FileInfo, to release useCount after use it. otherwise, the index file is not closed actually, will cause 'Too many opened files' problem.
        Sijie Guo made changes -
        Field Original Value New Value
        Attachment BOOKKEEPER-190.diff [ 12519192 ]
        Sijie Guo made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Hide
        Flavio Junqueira added a comment -

        hi sijie, it looks good, but let me propose a slightly different approach for the test. I think that it should be more focused and should be really just exercising the cache here instead of emulating a full run. We could start a single bookie, create ledgers through the bookie interface, and verify that we don't get an error when evicting the ledger. How does it sound to you?

        Show
        Flavio Junqueira added a comment - hi sijie, it looks good, but let me propose a slightly different approach for the test. I think that it should be more focused and should be really just exercising the cache here instead of emulating a full run. We could start a single bookie, create ledgers through the bookie interface, and verify that we don't get an error when evicting the ledger. How does it sound to you?
        Hide
        Sijie Guo added a comment -

        @Flavio.

        I agreed. since the test is related to cache eviction, so I moved the test case to LedgerCacheTest, and changed it as what you suggested. a new patch is attached.

        Show
        Sijie Guo added a comment - @Flavio. I agreed. since the test is related to cache eviction, so I moved the test case to LedgerCacheTest, and changed it as what you suggested. a new patch is attached.
        Sijie Guo made changes -
        Attachment BOOKKEEPER-190.diff_v2 [ 12519402 ]
        Hide
        Flavio Junqueira added a comment -

        +1, looks great, Sijie. Thanks for changing the test.

        Show
        Flavio Junqueira added a comment - +1, looks great, Sijie. Thanks for changing the test.
        Hide
        Ivan Kelly added a comment -

        I have a few small comments on the patch.

        1. In evictFileInfoIfNecessary() you should use {}, so that the string isn't constructed is info level is turned off. It's not an issue here really, but it's a good habit to get into.
        2. Why construct a Bookie at all? The LedgerCache can be used as a standalone component now, so we should test it as such.
        Show
        Ivan Kelly added a comment - I have a few small comments on the patch. In evictFileInfoIfNecessary() you should use {}, so that the string isn't constructed is info level is turned off. It's not an issue here really, but it's a good habit to get into. Why construct a Bookie at all? The LedgerCache can be used as a standalone component now, so we should test it as such.
        Hide
        Sijie Guo added a comment -

        > 1. In evictFileInfoIfNecessary() you should use {}, so that the string isn't constructed is info level is turned off

        OK. I will did the change.

        > Why construct a Bookie at all? The LedgerCache can be used as a standalone component now, so we should test it as such.

        yeah. seems that testLedgerEviction could use LedgerCache#putEntryOffset to test.

        but I am not so clear that LedgerCacheTest#testAddEntryException tries to test what kind of case. seems that it tried to populate the ledger cache to force eviction, which is similar with testLedgerEviction. could any one explain it?

        Show
        Sijie Guo added a comment - > 1. In evictFileInfoIfNecessary() you should use {}, so that the string isn't constructed is info level is turned off OK. I will did the change. > Why construct a Bookie at all? The LedgerCache can be used as a standalone component now, so we should test it as such. yeah. seems that testLedgerEviction could use LedgerCache#putEntryOffset to test. but I am not so clear that LedgerCacheTest#testAddEntryException tries to test what kind of case. seems that it tried to populate the ledger cache to force eviction, which is similar with testLedgerEviction. could any one explain it?
        Hide
        Flavio Junqueira added a comment -

        Hi Sijie, We introduced LedgerCacheTest (and LedgerCacheTest#testAddEntryException) in BOOKKEEPER-22.

        Show
        Flavio Junqueira added a comment - Hi Sijie, We introduced LedgerCacheTest (and LedgerCacheTest#testAddEntryException) in BOOKKEEPER-22 .
        Hide
        Sijie Guo added a comment -

        thanks, Flavio.

        attach a new patch.

        in this patch, I modified the LedgerCacheTest according to Ivan's suggestions, which make the test focus on testing ledger cache itself.

        BTW, it remove/modify 'System.out' code in LedgerCache, and also fix a simple bug LedgerCache#grabCleanPage (we don't remove a ledger from cleanLedger list if we found that there is no clean page in it. then clean ledger list would not be empty, so no flush will be trigger).

        Show
        Sijie Guo added a comment - thanks, Flavio. attach a new patch. in this patch, I modified the LedgerCacheTest according to Ivan's suggestions, which make the test focus on testing ledger cache itself. BTW, it remove/modify 'System.out' code in LedgerCache, and also fix a simple bug LedgerCache#grabCleanPage (we don't remove a ledger from cleanLedger list if we found that there is no clean page in it. then clean ledger list would not be empty, so no flush will be trigger).
        Sijie Guo made changes -
        Attachment BOOKKEEPER-190.diff_v3 [ 12519921 ]
        Hide
        Ivan Kelly added a comment - - edited

        Committed as r1306839. Thanks Sijie.

        Show
        Ivan Kelly added a comment - - edited Committed as r1306839. Thanks Sijie.
        Ivan Kelly made changes -
        Status Patch Available [ 10002 ] Resolved [ 5 ]
        Assignee Sijie Guo [ hustlmsp ] Ivan Kelly [ ikelly ]
        Resolution Fixed [ 1 ]
        Hide
        Hudson added a comment -

        Integrated in bookkeeper-trunk #433 (See https://builds.apache.org/job/bookkeeper-trunk/433/)
        BOOKKEEPER-190: Add entries would fail when number of open ledgers reaches more than openFileLimit. (sijie via ivank) (Revision 1306839)

        Result = ABORTED
        ivank :
        Files :

        • /zookeeper/bookkeeper/trunk/CHANGES.txt
        • /zookeeper/bookkeeper/trunk/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/Bookie.java
        • /zookeeper/bookkeeper/trunk/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/FileInfo.java
        • /zookeeper/bookkeeper/trunk/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/LedgerCacheImpl.java
        • /zookeeper/bookkeeper/trunk/bookkeeper-server/src/main/java/org/apache/bookkeeper/conf/ServerConfiguration.java
        • /zookeeper/bookkeeper/trunk/bookkeeper-server/src/test/java/org/apache/bookkeeper/bookie/BookieJournalTest.java
        • /zookeeper/bookkeeper/trunk/bookkeeper-server/src/test/java/org/apache/bookkeeper/bookie/LedgerCacheTest.java
        • /zookeeper/bookkeeper/trunk/bookkeeper-server/src/test/java/org/apache/bookkeeper/bookie/UpgradeTest.java
        • /zookeeper/bookkeeper/trunk/bookkeeper-server/src/test/java/org/apache/bookkeeper/client/LedgerCacheTest.java
        Show
        Hudson added a comment - Integrated in bookkeeper-trunk #433 (See https://builds.apache.org/job/bookkeeper-trunk/433/ ) BOOKKEEPER-190 : Add entries would fail when number of open ledgers reaches more than openFileLimit. (sijie via ivank) (Revision 1306839) Result = ABORTED ivank : Files : /zookeeper/bookkeeper/trunk/CHANGES.txt /zookeeper/bookkeeper/trunk/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/Bookie.java /zookeeper/bookkeeper/trunk/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/FileInfo.java /zookeeper/bookkeeper/trunk/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/LedgerCacheImpl.java /zookeeper/bookkeeper/trunk/bookkeeper-server/src/main/java/org/apache/bookkeeper/conf/ServerConfiguration.java /zookeeper/bookkeeper/trunk/bookkeeper-server/src/test/java/org/apache/bookkeeper/bookie/BookieJournalTest.java /zookeeper/bookkeeper/trunk/bookkeeper-server/src/test/java/org/apache/bookkeeper/bookie/LedgerCacheTest.java /zookeeper/bookkeeper/trunk/bookkeeper-server/src/test/java/org/apache/bookkeeper/bookie/UpgradeTest.java /zookeeper/bookkeeper/trunk/bookkeeper-server/src/test/java/org/apache/bookkeeper/client/LedgerCacheTest.java
        Ivan Kelly made changes -
        Status Resolved [ 5 ] Closed [ 6 ]

          People

          • Assignee:
            Ivan Kelly
            Reporter:
            Sijie Guo
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development