Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.0.0-M3
    • Component/s: None
    • Labels:
      None

      Description

      During my experiments and tests of removing one-level and sub-level indices at least one integration test "SearchAuthorizationIT" failed (the test fails recursivelyDelete()). A debugging session showed that the follwing:

      • in recursivelyDelete() multiple search requests are done which leads to multiple open cursors in the XDBM search engine
      • an entry is deleted
      • when the open cursors are advanced wrong/unexpected entries are returned

      I was able to create a small test that shows the problem:

      • the index contains six tuples:
        (a,1)
        (b,2)
        (c,3)
        (d,4)
        (e,5)
        (f,6)
      • a cursor over the index is created and advanced two times, the expected tuples (a,1) and (b,2) were returned
      • now tuple (c,3) is deleted
      • when the cursor is advanced again the tuple (b,2) is returned again! I had expected (d,4).

      Note that this doesn't happen with AvlIndex.

      1. jdbm5.diff
        76 kB
        Selcuk Aya
      2. jdbm4.diff
        30 kB
        Selcuk Aya
      3. jdbm3.diff
        30 kB
        Selcuk Aya
      4. jdbm2.diff
        1 kB
        Selcuk Aya
      5. jdbm1.diff
        130 kB
        Selcuk Aya
      6. IndexTest.java
        4 kB
        Stefan Seelmann
      7. DIRSERVER-1642.patch
        25 kB
        Stefan Seelmann

        Activity

        Stefan Seelmann created issue -
        Hide
        Stefan Seelmann added a comment -

        Attached test case

        Show
        Stefan Seelmann added a comment - Attached test case
        Stefan Seelmann made changes -
        Field Original Value New Value
        Attachment IndexTest.java [ 12490570 ]
        Hide
        Kiran Ayyagari added a comment -

        I have recently discussed this sort of behavior with Emmanuel while trying to consider JDBM for replication journal.
        The javadoc of browse() method in BTree says:

        WARNING: If you make structural modifications to the BTree during browsing, you will get inconsistent browing results.

        Show
        Kiran Ayyagari added a comment - I have recently discussed this sort of behavior with Emmanuel while trying to consider JDBM for replication journal. The javadoc of browse() method in BTree says: WARNING: If you make structural modifications to the BTree during browsing, you will get inconsistent browing results.
        Hide
        Emmanuel Lecharny added a comment -

        Kiran is right here, but we must assume that even if we delete some element in the BTree, the cursor we have built on top of it should still be able to return the correct next element, as it stores the current position.

        I think the problem has more to do with the way we manipulate the cursor's internal pointers than with the way JDBM deal with BTree modifications.

        I will investigate this aspect anyway.

        Show
        Emmanuel Lecharny added a comment - Kiran is right here, but we must assume that even if we delete some element in the BTree, the cursor we have built on top of it should still be able to return the correct next element, as it stores the current position. I think the problem has more to do with the way we manipulate the cursor's internal pointers than with the way JDBM deal with BTree modifications. I will investigate this aspect anyway.
        Hide
        Stefan Seelmann added a comment -

        I think Kiran is right and the problem is in JDBM. I looked into BPage.Browser. There an integer "int index" is used to store the current position. But if the BTree is modified the elements may be re-ordered in the BPage and the pointer now points to the wrong element. One idea I have is to modify the BPage.Browser to not use an int pointer, but to store the key element. Then for the next call of getNext() or getPrevious() the stored key element is used to lookup the actual index in the BTree. Of course such a lookup is more expensive than just incrementing an integer. I'm too tired to test that now.

        Show
        Stefan Seelmann added a comment - I think Kiran is right and the problem is in JDBM. I looked into BPage.Browser. There an integer "int index" is used to store the current position. But if the BTree is modified the elements may be re-ordered in the BPage and the pointer now points to the wrong element. One idea I have is to modify the BPage.Browser to not use an int pointer, but to store the key element. Then for the next call of getNext() or getPrevious() the stored key element is used to lookup the actual index in the BTree. Of course such a lookup is more expensive than just incrementing an integer. I'm too tired to test that now.
        Hide
        Emmanuel Lecharny added a comment -

        I think it's a bit more complex than just using the Key in the Browser. IMO, the remove (or add) operation must update the browser so that the call to a getNextTuple or getPreviousTuple still return the correct value. One idea would be to retain the last returned key, so that we can reinitialize the index. There is a corner case though for a delete : if the removed element is the last returned one, we must have a way to set the correct position by using the key we returned two calls before.

        And i'm afraid it's not enough : for big indexes, the BTree will be span on multiple BPage, and during the remove operation, we don't have access to the browser.

        Removing an element in a BTree while browsing it seems a bad idea...

        Otherwise, the ultimate solution is to drop JDBM and to use a MVCC system, where elements are not deleted when in use. But this is another story...

        Show
        Emmanuel Lecharny added a comment - I think it's a bit more complex than just using the Key in the Browser. IMO, the remove (or add) operation must update the browser so that the call to a getNextTuple or getPreviousTuple still return the correct value. One idea would be to retain the last returned key, so that we can reinitialize the index. There is a corner case though for a delete : if the removed element is the last returned one, we must have a way to set the correct position by using the key we returned two calls before. And i'm afraid it's not enough : for big indexes, the BTree will be span on multiple BPage, and during the remove operation, we don't have access to the browser. Removing an element in a BTree while browsing it seems a bad idea... Otherwise, the ultimate solution is to drop JDBM and to use a MVCC system, where elements are not deleted when in use. But this is another story...
        Hide
        Emmanuel Lecharny added a comment -

        Damn... The more I think about the issue, the more I find it critical.

        If we remove an entry while someone is doing a search, the search will fail. Also we have some problem during replication, just when we try to replicate some deletion, and it might prefectly explain why we get those issues.

        Show
        Emmanuel Lecharny added a comment - Damn... The more I think about the issue, the more I find it critical. If we remove an entry while someone is doing a search, the search will fail. Also we have some problem during replication, just when we try to replicate some deletion, and it might prefectly explain why we get those issues.
        Hide
        Stefan Seelmann added a comment -

        First try of a fix.

        • the BPage.Browser stores the last key that was returned
        • in case the BTree was modified the stored key is used to recover the BPage.Browser state
        • to find out if the BTress was modified a modification counter is used
        • added a new test class which should cover all cases
        Show
        Stefan Seelmann added a comment - First try of a fix. the BPage.Browser stores the last key that was returned in case the BTree was modified the stored key is used to recover the BPage.Browser state to find out if the BTress was modified a modification counter is used added a new test class which should cover all cases
        Stefan Seelmann made changes -
        Attachment DIRSERVER-1642.patch [ 12490878 ]
        Hide
        Selcuk Aya added a comment -

        A basic working version for the jdbm versioned Btree.
        SnapshotBtree: unit test for the versioned btree. Demonstrates consistent browsing actions while tree is modified.

        LRUCache:implements a versioned lru cache. As new versions are created, part of the cache is used for storing previous versions of Btree pages. Concurrent read/write to a BPage is synchronized here.

        ActionContext.java, ActionVersioning.java: basically keeps track of the versionassigned to the action. When client enters an action, its action context is stored in a thread local variable. ActionVersioning keeps track of the next readwrite action version and the minimum read version.

        BPage and BTree:as updates are done to a page, they are copied on write at thislevel. Also browser keeps track of its action context.

        Show
        Selcuk Aya added a comment - A basic working version for the jdbm versioned Btree. SnapshotBtree: unit test for the versioned btree. Demonstrates consistent browsing actions while tree is modified. LRUCache:implements a versioned lru cache. As new versions are created, part of the cache is used for storing previous versions of Btree pages. Concurrent read/write to a BPage is synchronized here. ActionContext.java, ActionVersioning.java: basically keeps track of the versionassigned to the action. When client enters an action, its action context is stored in a thread local variable. ActionVersioning keeps track of the next readwrite action version and the minimum read version. BPage and BTree:as updates are done to a page, they are copied on write at thislevel. Also browser keeps track of its action context.
        Selcuk Aya made changes -
        Attachment jdbm1.diff [ 12491600 ]
        Hide
        Selcuk Aya added a comment -

        modified BTree.java for the existing unit and integration tests to pass. Note that, jdbm-partition is not modified to use the new versioned B+tree and there are only a couple of unit tests for the versioned B+tree yet.

        Show
        Selcuk Aya added a comment - modified BTree.java for the existing unit and integration tests to pass. Note that, jdbm-partition is not modified to use the new versioned B+tree and there are only a couple of unit tests for the versioned B+tree yet.
        Selcuk Aya made changes -
        Attachment jdbm2.diff [ 12491668 ]
        Hide
        Selcuk Aya added a comment -

        changes for jdbm partition to use the versioned Btree. THere are some failures with the server integration tests

        Show
        Selcuk Aya added a comment - changes for jdbm partition to use the versioned Btree. THere are some failures with the server integration tests
        Selcuk Aya made changes -
        Attachment jdbm3.diff [ 12491782 ]
        Selcuk Aya made changes -
        Attachment jdbm4.diff [ 12491784 ]
        Hide
        Selcuk Aya added a comment -

        I changed jdbm-partition to use the versioned Btree. Test cases except StoredProcedureIT are passing. I think the issue with StoredProcedureIT is an exisiting one with serialization of the store procedure and I added an ignore for this test case.

        Show
        Selcuk Aya added a comment - I changed jdbm-partition to use the versioned Btree. Test cases except StoredProcedureIT are passing. I think the issue with StoredProcedureIT is an exisiting one with serialization of the store procedure and I added an ignore for this test case.
        Selcuk Aya made changes -
        Attachment jdbm5.diff [ 12492042 ]
        Hide
        Emmanuel Lecharny added a comment -

        Fixed applying Selcuk's patches

        Show
        Emmanuel Lecharny added a comment - Fixed applying Selcuk's patches
        Emmanuel Lecharny made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Fix Version/s 2.0.0-M3 [ 12316467 ]
        Resolution Fixed [ 1 ]

          People

          • Assignee:
            Unassigned
            Reporter:
            Stefan Seelmann
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development