Bookkeeper
  1. Bookkeeper
  2. BOOKKEEPER-112

Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.1.0
    • Component/s: None
    • Labels:
      None

      Description

      Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.

      Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
      Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

      1. BK-112.patch
        18 kB
        Sijie Guo
      2. BOOKKEEPER-112.patch
        54 kB
        Sijie Guo
      3. BOOKKEEPER-112.patch_v2
        31 kB
        Sijie Guo
      4. BOOKKEEPER-112.patch_v3
        33 kB
        Sijie Guo
      5. BOOKKEEPER-112.patch_v4
        44 kB
        Sijie Guo
      6. BOOKKEEPER-112.patch_v5
        45 kB
        Sijie Guo
      7. bk-112.pdf
        37 kB
        Flavio Junqueira
      8. bk-112.pdf
        43 kB
        Flavio Junqueira
      9. BOOKKEEPER-112.patch_v6
        39 kB
        Sijie Guo
      10. BOOKKEEPER-112.patch_v7
        39 kB
        Sijie Guo
      11. BOOKKEEPER-112.patch_v8
        39 kB
        Sijie Guo

        Issue Links

          Activity

          Hide
          Flavio Junqueira added a comment -

          I'm not sure I have a good way to guarantee that it always happens this way, but my feeling is that we shouldn't try to recover a bookie that participated in a ledger ensemble for which the ledger is still open.

          One way is to check that all ledger fragments to recover are from ledgers that have been closed already. We can do it by checking the ledger metadata stored in zookeeper. Bookie recovery proceeds only if all ledger are closed.

          Show
          Flavio Junqueira added a comment - I'm not sure I have a good way to guarantee that it always happens this way, but my feeling is that we shouldn't try to recover a bookie that participated in a ledger ensemble for which the ledger is still open. One way is to check that all ledger fragments to recover are from ledgers that have been closed already. We can do it by checking the ledger metadata stored in zookeeper. Bookie recovery proceeds only if all ledger are closed.
          Hide
          Ivan Kelly added a comment -

          Moved to 4.1.0, as fixing this will be part of a large review of bookie recovery.

          Show
          Ivan Kelly added a comment - Moved to 4.1.0, as fixing this will be part of a large review of bookie recovery.
          Hide
          Sijie Guo added a comment -

          Attach a patch.

          The idea is when ledger handle encounters BadVersion metadata, it should reread a updated metadata, and compare it with the old one to check whether it can resolve the confliction or not.

          I will put this patch on review board for discussion.

          Show
          Sijie Guo added a comment - Attach a patch. The idea is when ledger handle encounters BadVersion metadata, it should reread a updated metadata, and compare it with the old one to check whether it can resolve the confliction or not. I will put this patch on review board for discussion.
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/3472/
          -----------------------------------------------------------

          Review request for bookkeeper.

          Summary
          -------

          Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.

          Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
          Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

          This addresses bug BOOKKEEPER-112.
          https://issues.apache.org/jira/browse/BOOKKEEPER-112

          Diffs


          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java 37623dc
          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java 8de20c9
          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java c353e46
          bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookieRecoveryTest.java 80e46b9
          bookkeeper-server/src/test/java/org/apache/bookkeeper/client/TestFencing.java e1e8449

          Diff: https://reviews.apache.org/r/3472/diff

          Testing
          -------

          Thanks,

          Sijie

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3472/ ----------------------------------------------------------- Review request for bookkeeper. Summary ------- Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata. Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery() This addresses bug BOOKKEEPER-112 . https://issues.apache.org/jira/browse/BOOKKEEPER-112 Diffs bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java 37623dc bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java 8de20c9 bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java c353e46 bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookieRecoveryTest.java 80e46b9 bookkeeper-server/src/test/java/org/apache/bookkeeper/client/TestFencing.java e1e8449 Diff: https://reviews.apache.org/r/3472/diff Testing ------- Thanks, Sijie
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/3472/
          -----------------------------------------------------------

          (Updated 2012-01-29 09:58:30.513187)

          Review request for bookkeeper.

          Changes
          -------

          we need to check the ledger metadata status before proceed recovery action.

          for those OPENED ledgers,
          1) whose last ensemble contains the failed bookie, we should not proceed recovery action. since we can't promise last entry to be fully replicated. (also there may be other side effects)
          2) whose last ensemble doesn't contain the failed bookie, it is safe to proceed recovery action.

          for those IN_RECOVERY ledgers, we have to check whether last ensemble contains the failed bookie. if it is, the recovery tool has to help closing this ledger, since the normal bookkeeper client may fail to close it. (a corn case: 3 bookies (bk1, bk2, bk3), quorum size 3, ensemble size 3. no entry is written. bk3 is failed. bk1 and bk2 returns NoEntry, bk3 returns HandleNotAvailable. ledger can't be closed.)

          for 2) case of OPENED ledgers, both PendingAddOp and BookKeeperAdmin needs to rereadMetadata when encountering BADVERSION and try to resolve such confliction to avoid #close it.

          Summary
          -------

          Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.

          Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
          Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

          This addresses bug BOOKKEEPER-112.
          https://issues.apache.org/jira/browse/BOOKKEEPER-112

          Diffs (updated)


          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeper.java 5bb37c3
          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java 37623dc
          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java 547e240
          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java b403aa1
          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerOpenOp.java 56186ab
          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerRecoveryOp.java 4625bbb
          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java 29070eb
          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/ReadLastConfirmedOp.java 43e999d
          bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookieRecoveryTest.java 8526db5
          bookkeeper-server/src/test/java/org/apache/bookkeeper/client/TestFencing.java 015e4e4
          bookkeeper-server/src/test/java/org/apache/bookkeeper/test/LedgerOpenTest.java PRE-CREATION

          Diff: https://reviews.apache.org/r/3472/diff

          Testing
          -------

          Thanks,

          Sijie

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3472/ ----------------------------------------------------------- (Updated 2012-01-29 09:58:30.513187) Review request for bookkeeper. Changes ------- we need to check the ledger metadata status before proceed recovery action. for those OPENED ledgers, 1) whose last ensemble contains the failed bookie, we should not proceed recovery action. since we can't promise last entry to be fully replicated. (also there may be other side effects) 2) whose last ensemble doesn't contain the failed bookie, it is safe to proceed recovery action. for those IN_RECOVERY ledgers, we have to check whether last ensemble contains the failed bookie. if it is, the recovery tool has to help closing this ledger, since the normal bookkeeper client may fail to close it. (a corn case: 3 bookies (bk1, bk2, bk3), quorum size 3, ensemble size 3. no entry is written. bk3 is failed. bk1 and bk2 returns NoEntry, bk3 returns HandleNotAvailable. ledger can't be closed.) for 2) case of OPENED ledgers, both PendingAddOp and BookKeeperAdmin needs to rereadMetadata when encountering BADVERSION and try to resolve such confliction to avoid #close it. Summary ------- Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata. Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery() This addresses bug BOOKKEEPER-112 . https://issues.apache.org/jira/browse/BOOKKEEPER-112 Diffs (updated) bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeper.java 5bb37c3 bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java 37623dc bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java 547e240 bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java b403aa1 bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerOpenOp.java 56186ab bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerRecoveryOp.java 4625bbb bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java 29070eb bookkeeper-server/src/main/java/org/apache/bookkeeper/client/ReadLastConfirmedOp.java 43e999d bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookieRecoveryTest.java 8526db5 bookkeeper-server/src/test/java/org/apache/bookkeeper/client/TestFencing.java 015e4e4 bookkeeper-server/src/test/java/org/apache/bookkeeper/test/LedgerOpenTest.java PRE-CREATION Diff: https://reviews.apache.org/r/3472/diff Testing ------- Thanks, Sijie
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/3472/#review4745
          -----------------------------------------------------------

          I had a quick look over this. The exclude bookies stuff needs to be removed since BOOKKEEPER-23 gets rid of the need for it & it exposes internal details through the public api which is bad. The other fix in here looks ok, but I found it hard to pick it out with the excludeBookies stuff in there also.

          • Ivan

          On 2012-01-29 09:58:30, Sijie Guo wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/3472/

          -----------------------------------------------------------

          (Updated 2012-01-29 09:58:30)

          Review request for bookkeeper.

          Summary

          -------

          Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.

          Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done

          Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

          This addresses bug BOOKKEEPER-112.

          https://issues.apache.org/jira/browse/BOOKKEEPER-112

          Diffs

          -----

          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeper.java 5bb37c3

          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java 37623dc

          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java 547e240

          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java b403aa1

          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerOpenOp.java 56186ab

          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerRecoveryOp.java 4625bbb

          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java 29070eb

          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/ReadLastConfirmedOp.java 43e999d

          bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookieRecoveryTest.java 8526db5

          bookkeeper-server/src/test/java/org/apache/bookkeeper/client/TestFencing.java 015e4e4

          bookkeeper-server/src/test/java/org/apache/bookkeeper/test/LedgerOpenTest.java PRE-CREATION

          Diff: https://reviews.apache.org/r/3472/diff

          Testing

          -------

          Thanks,

          Sijie

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3472/#review4745 ----------------------------------------------------------- I had a quick look over this. The exclude bookies stuff needs to be removed since BOOKKEEPER-23 gets rid of the need for it & it exposes internal details through the public api which is bad. The other fix in here looks ok, but I found it hard to pick it out with the excludeBookies stuff in there also. Ivan On 2012-01-29 09:58:30, Sijie Guo wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3472/ ----------------------------------------------------------- (Updated 2012-01-29 09:58:30) Review request for bookkeeper. Summary ------- Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata. Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery() This addresses bug BOOKKEEPER-112 . https://issues.apache.org/jira/browse/BOOKKEEPER-112 Diffs ----- bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeper.java 5bb37c3 bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java 37623dc bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java 547e240 bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java b403aa1 bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerOpenOp.java 56186ab bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerRecoveryOp.java 4625bbb bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java 29070eb bookkeeper-server/src/main/java/org/apache/bookkeeper/client/ReadLastConfirmedOp.java 43e999d bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookieRecoveryTest.java 8526db5 bookkeeper-server/src/test/java/org/apache/bookkeeper/client/TestFencing.java 015e4e4 bookkeeper-server/src/test/java/org/apache/bookkeeper/test/LedgerOpenTest.java PRE-CREATION Diff: https://reviews.apache.org/r/3472/diff Testing ------- Thanks, Sijie
          Hide
          jiraposter@reviews.apache.org added a comment -

          On 2012-02-01 15:01:34, Ivan Kelly wrote:

          > I had a quick look over this. The exclude bookies stuff needs to be removed since BOOKKEEPER-23 gets rid of the need for it & it exposes internal details through the public api which is bad. The other fix in here looks ok, but I found it hard to pick it out with the excludeBookies stuff in there also.

          if the failed bookie existed in the quorum set which maxAddConfirmed entry belongs to, readLastConfirmed could not succeed, either open and openNoRecovery would fail. (this issue has been reported in BOOKKEEPER-152, but it seems that it is hard to separate into two patches. so I put them in this patch.)

          so recovery tool needs failed bookies information as hints to tell readLastConfirmed to skip failed bookies. I think to change excludedBookies related api to protected, not expose to public.

          what is your opinion?

          • Sijie

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/3472/#review4745
          -----------------------------------------------------------

          On 2012-01-29 09:58:30, Sijie Guo wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/3472/

          -----------------------------------------------------------

          (Updated 2012-01-29 09:58:30)

          Review request for bookkeeper.

          Summary

          -------

          Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.

          Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done

          Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

          This addresses bug BOOKKEEPER-112.

          https://issues.apache.org/jira/browse/BOOKKEEPER-112

          Diffs

          -----

          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeper.java 5bb37c3

          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java 37623dc

          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java 547e240

          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java b403aa1

          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerOpenOp.java 56186ab

          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerRecoveryOp.java 4625bbb

          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java 29070eb

          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/ReadLastConfirmedOp.java 43e999d

          bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookieRecoveryTest.java 8526db5

          bookkeeper-server/src/test/java/org/apache/bookkeeper/client/TestFencing.java 015e4e4

          bookkeeper-server/src/test/java/org/apache/bookkeeper/test/LedgerOpenTest.java PRE-CREATION

          Diff: https://reviews.apache.org/r/3472/diff

          Testing

          -------

          Thanks,

          Sijie

          Show
          jiraposter@reviews.apache.org added a comment - On 2012-02-01 15:01:34, Ivan Kelly wrote: > I had a quick look over this. The exclude bookies stuff needs to be removed since BOOKKEEPER-23 gets rid of the need for it & it exposes internal details through the public api which is bad. The other fix in here looks ok, but I found it hard to pick it out with the excludeBookies stuff in there also. if the failed bookie existed in the quorum set which maxAddConfirmed entry belongs to, readLastConfirmed could not succeed, either open and openNoRecovery would fail. (this issue has been reported in BOOKKEEPER-152 , but it seems that it is hard to separate into two patches. so I put them in this patch.) so recovery tool needs failed bookies information as hints to tell readLastConfirmed to skip failed bookies. I think to change excludedBookies related api to protected, not expose to public. what is your opinion? Sijie ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3472/#review4745 ----------------------------------------------------------- On 2012-01-29 09:58:30, Sijie Guo wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3472/ ----------------------------------------------------------- (Updated 2012-01-29 09:58:30) Review request for bookkeeper. Summary ------- Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata. Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery() This addresses bug BOOKKEEPER-112 . https://issues.apache.org/jira/browse/BOOKKEEPER-112 Diffs ----- bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeper.java 5bb37c3 bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java 37623dc bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java 547e240 bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java b403aa1 bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerOpenOp.java 56186ab bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerRecoveryOp.java 4625bbb bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java 29070eb bookkeeper-server/src/main/java/org/apache/bookkeeper/client/ReadLastConfirmedOp.java 43e999d bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookieRecoveryTest.java 8526db5 bookkeeper-server/src/test/java/org/apache/bookkeeper/client/TestFencing.java 015e4e4 bookkeeper-server/src/test/java/org/apache/bookkeeper/test/LedgerOpenTest.java PRE-CREATION Diff: https://reviews.apache.org/r/3472/diff Testing ------- Thanks, Sijie
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/3472/
          -----------------------------------------------------------

          (Updated 2012-02-22 14:43:04.529989)

          Review request for bookkeeper.

          Changes
          -------

          attach a new patch to remove excludedBookies related codes.

          Summary
          -------

          Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.

          Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
          Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

          This addresses bug BOOKKEEPER-112.
          https://issues.apache.org/jira/browse/BOOKKEEPER-112

          Diffs (updated)


          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java 37623dc
          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java a94a0e5
          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java b403aa1
          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerRecoveryOp.java c67a79c
          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java 29070eb
          bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookieRecoveryTest.java 99258ac
          bookkeeper-server/src/test/java/org/apache/bookkeeper/client/TestFencing.java 015e4e4
          bookkeeper-server/src/test/java/org/apache/bookkeeper/test/BaseTestCase.java dada67a

          Diff: https://reviews.apache.org/r/3472/diff

          Testing
          -------

          Thanks,

          Sijie

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3472/ ----------------------------------------------------------- (Updated 2012-02-22 14:43:04.529989) Review request for bookkeeper. Changes ------- attach a new patch to remove excludedBookies related codes. Summary ------- Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata. Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery() This addresses bug BOOKKEEPER-112 . https://issues.apache.org/jira/browse/BOOKKEEPER-112 Diffs (updated) bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java 37623dc bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java a94a0e5 bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java b403aa1 bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerRecoveryOp.java c67a79c bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java 29070eb bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookieRecoveryTest.java 99258ac bookkeeper-server/src/test/java/org/apache/bookkeeper/client/TestFencing.java 015e4e4 bookkeeper-server/src/test/java/org/apache/bookkeeper/test/BaseTestCase.java dada67a Diff: https://reviews.apache.org/r/3472/diff Testing ------- Thanks, Sijie
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/3472/
          -----------------------------------------------------------

          (Updated 2012-02-24 17:09:54.127303)

          Review request for bookkeeper.

          Changes
          -------

          discussed with Ivan offline, it seems that it is OK to recover those opened / in recovery status ledgers. for those in recovery status ledgers, if the last entry on the failed bookie, the recovery add should change ensemble. the last entry would be written on the other bookies. the tricky case is for opened ledgers, since readEntries for openNoRecovery can't read more than lastAddConfirmed, we change ledgerHandle#readEntries to use PendingReadOp to skip boundary checking to try to read lastAddConfirmed + 1 to avoid missing replicating this entry.

          Summary
          -------

          Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.

          Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
          Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

          This addresses bug BOOKKEEPER-112.
          https://issues.apache.org/jira/browse/BOOKKEEPER-112

          Diffs (updated)


          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java 37623dc
          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java a94a0e5
          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java b403aa1
          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerRecoveryOp.java c67a79c
          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java 29070eb
          bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookieRecoveryTest.java 99258ac
          bookkeeper-server/src/test/java/org/apache/bookkeeper/client/TestFencing.java 015e4e4
          bookkeeper-server/src/test/java/org/apache/bookkeeper/test/BaseTestCase.java dada67a

          Diff: https://reviews.apache.org/r/3472/diff

          Testing
          -------

          Thanks,

          Sijie

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3472/ ----------------------------------------------------------- (Updated 2012-02-24 17:09:54.127303) Review request for bookkeeper. Changes ------- discussed with Ivan offline, it seems that it is OK to recover those opened / in recovery status ledgers. for those in recovery status ledgers, if the last entry on the failed bookie, the recovery add should change ensemble. the last entry would be written on the other bookies. the tricky case is for opened ledgers, since readEntries for openNoRecovery can't read more than lastAddConfirmed, we change ledgerHandle#readEntries to use PendingReadOp to skip boundary checking to try to read lastAddConfirmed + 1 to avoid missing replicating this entry. Summary ------- Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata. Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery() This addresses bug BOOKKEEPER-112 . https://issues.apache.org/jira/browse/BOOKKEEPER-112 Diffs (updated) bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java 37623dc bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java a94a0e5 bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java b403aa1 bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerRecoveryOp.java c67a79c bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java 29070eb bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookieRecoveryTest.java 99258ac bookkeeper-server/src/test/java/org/apache/bookkeeper/client/TestFencing.java 015e4e4 bookkeeper-server/src/test/java/org/apache/bookkeeper/test/BaseTestCase.java dada67a Diff: https://reviews.apache.org/r/3472/diff Testing ------- Thanks, Sijie
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/3472/#review5324
          -----------------------------------------------------------

          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java
          <https://reviews.apache.org/r/3472/#comment11620>

          I don't think going up to lastAddConfirmed + 1 is enough. lastAddConfirmed can trail behind the last written entry by an undefined amount. For example, if we asyncAdd 10 entries, they will each have the same lastAddConfirmed in the packet if the 10th is sent before the 1st is acknowledged.

          The main problem is that bookie recovery reads each entry in parallel, while for this last bit, we need to read sequentially. I think after each ledger fragment successfully completes, this should kick off another operation, which steps up from lastAddConfirmed until it gets NoSuchEntry.

          • Ivan

          On 2012-02-24 17:09:54, Sijie Guo wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/3472/

          -----------------------------------------------------------

          (Updated 2012-02-24 17:09:54)

          Review request for bookkeeper.

          Summary

          -------

          Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.

          Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done

          Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

          This addresses bug BOOKKEEPER-112.

          https://issues.apache.org/jira/browse/BOOKKEEPER-112

          Diffs

          -----

          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java 37623dc

          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java a94a0e5

          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java b403aa1

          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerRecoveryOp.java c67a79c

          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java 29070eb

          bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookieRecoveryTest.java 99258ac

          bookkeeper-server/src/test/java/org/apache/bookkeeper/client/TestFencing.java 015e4e4

          bookkeeper-server/src/test/java/org/apache/bookkeeper/test/BaseTestCase.java dada67a

          Diff: https://reviews.apache.org/r/3472/diff

          Testing

          -------

          Thanks,

          Sijie

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3472/#review5324 ----------------------------------------------------------- bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java < https://reviews.apache.org/r/3472/#comment11620 > I don't think going up to lastAddConfirmed + 1 is enough. lastAddConfirmed can trail behind the last written entry by an undefined amount. For example, if we asyncAdd 10 entries, they will each have the same lastAddConfirmed in the packet if the 10th is sent before the 1st is acknowledged. The main problem is that bookie recovery reads each entry in parallel, while for this last bit, we need to read sequentially. I think after each ledger fragment successfully completes, this should kick off another operation, which steps up from lastAddConfirmed until it gets NoSuchEntry. Ivan On 2012-02-24 17:09:54, Sijie Guo wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3472/ ----------------------------------------------------------- (Updated 2012-02-24 17:09:54) Review request for bookkeeper. Summary ------- Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata. Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery() This addresses bug BOOKKEEPER-112 . https://issues.apache.org/jira/browse/BOOKKEEPER-112 Diffs ----- bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java 37623dc bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java a94a0e5 bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java b403aa1 bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerRecoveryOp.java c67a79c bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java 29070eb bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookieRecoveryTest.java 99258ac bookkeeper-server/src/test/java/org/apache/bookkeeper/client/TestFencing.java 015e4e4 bookkeeper-server/src/test/java/org/apache/bookkeeper/test/BaseTestCase.java dada67a Diff: https://reviews.apache.org/r/3472/diff Testing ------- Thanks, Sijie
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/3472/
          -----------------------------------------------------------

          (Updated 2012-02-28 11:09:19.006294)

          Review request for bookkeeper.

          Changes
          -------

          adding a readforward operation if lastEnsemble need to be repaired, which addresses Ivan's comment.

          Summary
          -------

          Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.

          Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
          Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

          This addresses bug BOOKKEEPER-112.
          https://issues.apache.org/jira/browse/BOOKKEEPER-112

          Diffs (updated)


          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java 37623dc
          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java a94a0e5
          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java b403aa1
          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerRecoveryOp.java c67a79c
          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java 29070eb
          bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookieRecoveryTest.java 99258ac
          bookkeeper-server/src/test/java/org/apache/bookkeeper/client/TestFencing.java 015e4e4
          bookkeeper-server/src/test/java/org/apache/bookkeeper/test/BaseTestCase.java dada67a

          Diff: https://reviews.apache.org/r/3472/diff

          Testing
          -------

          Thanks,

          Sijie

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3472/ ----------------------------------------------------------- (Updated 2012-02-28 11:09:19.006294) Review request for bookkeeper. Changes ------- adding a readforward operation if lastEnsemble need to be repaired, which addresses Ivan's comment. Summary ------- Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata. Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery() This addresses bug BOOKKEEPER-112 . https://issues.apache.org/jira/browse/BOOKKEEPER-112 Diffs (updated) bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java 37623dc bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java a94a0e5 bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java b403aa1 bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerRecoveryOp.java c67a79c bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java 29070eb bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookieRecoveryTest.java 99258ac bookkeeper-server/src/test/java/org/apache/bookkeeper/client/TestFencing.java 015e4e4 bookkeeper-server/src/test/java/org/apache/bookkeeper/test/BaseTestCase.java dada67a Diff: https://reviews.apache.org/r/3472/diff Testing ------- Thanks, Sijie
          Hide
          Sijie Guo added a comment -

          new patch addressed Ivan's comment.

          Show
          Sijie Guo added a comment - new patch addressed Ivan's comment.
          Hide
          Ivan Kelly added a comment -

          Another corner case came to me while reading through this. What happens if a client has a ledger open and is writing to it. The srcBookie(B1) isn't actually failed, so the client can continue to write to it. The recovery comes in, copies all the entries(up to entry X), and updates the metadata, but the client keep writing entries(up to entry Y), unaware of the recovery process. Anything between X & Y would be underreplicated, as B1 is no longer in the ensemble for the fragment. Im not sure what the best course of action would be in this case, maybe we can force an ensemble change, or force the ledger closed.

          Show
          Ivan Kelly added a comment - Another corner case came to me while reading through this. What happens if a client has a ledger open and is writing to it. The srcBookie(B1) isn't actually failed, so the client can continue to write to it. The recovery comes in, copies all the entries(up to entry X), and updates the metadata, but the client keep writing entries(up to entry Y), unaware of the recovery process. Anything between X & Y would be underreplicated, as B1 is no longer in the ensemble for the fragment. Im not sure what the best course of action would be in this case, maybe we can force an ensemble change, or force the ledger closed.
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/3472/#review5494
          -----------------------------------------------------------

          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java
          <https://reviews.apache.org/r/3472/#comment11911>

          Why not use the SafeOrderedExecutor as BookKeeper does?

          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java
          <https://reviews.apache.org/r/3472/#comment11912>

          bitwise &, I guess you meant && here.

          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java
          <https://reviews.apache.org/r/3472/#comment11913>

          I dont like boolean flags like readForward being passed to the constructor here. I think it would be better to have a class called SingleFragmentCallbackWithForwardRead which inherits from SingleFragmentCallback. You can then overload processResult on that.

          • Ivan

          On 2012-02-28 11:09:19, Sijie Guo wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/3472/

          -----------------------------------------------------------

          (Updated 2012-02-28 11:09:19)

          Review request for bookkeeper.

          Summary

          -------

          Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.

          Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done

          Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

          This addresses bug BOOKKEEPER-112.

          https://issues.apache.org/jira/browse/BOOKKEEPER-112

          Diffs

          -----

          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java 37623dc

          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java a94a0e5

          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java b403aa1

          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerRecoveryOp.java c67a79c

          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java 29070eb

          bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookieRecoveryTest.java 99258ac

          bookkeeper-server/src/test/java/org/apache/bookkeeper/client/TestFencing.java 015e4e4

          bookkeeper-server/src/test/java/org/apache/bookkeeper/test/BaseTestCase.java dada67a

          Diff: https://reviews.apache.org/r/3472/diff

          Testing

          -------

          Thanks,

          Sijie

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3472/#review5494 ----------------------------------------------------------- bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java < https://reviews.apache.org/r/3472/#comment11911 > Why not use the SafeOrderedExecutor as BookKeeper does? bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java < https://reviews.apache.org/r/3472/#comment11912 > bitwise &, I guess you meant && here. bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java < https://reviews.apache.org/r/3472/#comment11913 > I dont like boolean flags like readForward being passed to the constructor here. I think it would be better to have a class called SingleFragmentCallbackWithForwardRead which inherits from SingleFragmentCallback. You can then overload processResult on that. Ivan On 2012-02-28 11:09:19, Sijie Guo wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3472/ ----------------------------------------------------------- (Updated 2012-02-28 11:09:19) Review request for bookkeeper. Summary ------- Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata. Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery() This addresses bug BOOKKEEPER-112 . https://issues.apache.org/jira/browse/BOOKKEEPER-112 Diffs ----- bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java 37623dc bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java a94a0e5 bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java b403aa1 bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerRecoveryOp.java c67a79c bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java 29070eb bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookieRecoveryTest.java 99258ac bookkeeper-server/src/test/java/org/apache/bookkeeper/client/TestFencing.java 015e4e4 bookkeeper-server/src/test/java/org/apache/bookkeeper/test/BaseTestCase.java dada67a Diff: https://reviews.apache.org/r/3472/diff Testing ------- Thanks, Sijie
          Hide
          Flavio Junqueira added a comment -

          There is a distinction between bookie recovery and ledger recovery. I don't think we should do bookie recovery on an open ledger.

          Show
          Flavio Junqueira added a comment - There is a distinction between bookie recovery and ledger recovery. I don't think we should do bookie recovery on an open ledger.
          Hide
          Sijie Guo added a comment -

          > Anything between X & Y would be underreplicated, as B1 is no longer in the ensemble for the fragment.

          I am doubting why B1 is still being written. If BookieRecoveryTool is run to re-replicate entries of B1, the admin guy would ensure B1 is down then do re-replication. If we want to ensure B1 is not alive, maybe we can do some checking before re-replication, like checking whether B1 in available list or not, trying setup connection and do some operations.

          Show
          Sijie Guo added a comment - > Anything between X & Y would be underreplicated, as B1 is no longer in the ensemble for the fragment. I am doubting why B1 is still being written. If BookieRecoveryTool is run to re-replicate entries of B1, the admin guy would ensure B1 is down then do re-replication. If we want to ensure B1 is not alive, maybe we can do some checking before re-replication, like checking whether B1 in available list or not, trying setup connection and do some operations.
          Hide
          Sijie Guo added a comment -

          > There is a distinction between bookie recovery and ledger recovery. I don't think we should do bookie recovery on an open ledger.

          If we don't bookie recovery on open ledgers, the user may find each time he run BookieRecoveryTool it would not success since there was opened ledgers not being recovered. The user may have no idea when to stop running this tool.

          Show
          Sijie Guo added a comment - > There is a distinction between bookie recovery and ledger recovery. I don't think we should do bookie recovery on an open ledger. If we don't bookie recovery on open ledgers, the user may find each time he run BookieRecoveryTool it would not success since there was opened ledgers not being recovered. The user may have no idea when to stop running this tool.
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/3472/
          -----------------------------------------------------------

          (Updated 2012-03-02 11:27:34.221927)

          Review request for bookkeeper.

          Changes
          -------

          new patch remove readForward flag from constructor. and also remove ScheduledExecutorService, since we don't need to care about long-chain, since the callback will be triggered in bookie client's netty thread not in same thread.

          Summary
          -------

          Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.

          Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
          Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

          This addresses bug BOOKKEEPER-112.
          https://issues.apache.org/jira/browse/BOOKKEEPER-112

          Diffs (updated)


          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java 37623dc
          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java f71e53f
          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java b403aa1
          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerRecoveryOp.java c67a79c
          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java 29070eb
          bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookieRecoveryTest.java 99258ac
          bookkeeper-server/src/test/java/org/apache/bookkeeper/client/TestFencing.java 015e4e4
          bookkeeper-server/src/test/java/org/apache/bookkeeper/test/BaseTestCase.java dada67a

          Diff: https://reviews.apache.org/r/3472/diff

          Testing
          -------

          Thanks,

          Sijie

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3472/ ----------------------------------------------------------- (Updated 2012-03-02 11:27:34.221927) Review request for bookkeeper. Changes ------- new patch remove readForward flag from constructor. and also remove ScheduledExecutorService, since we don't need to care about long-chain, since the callback will be triggered in bookie client's netty thread not in same thread. Summary ------- Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata. Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery() This addresses bug BOOKKEEPER-112 . https://issues.apache.org/jira/browse/BOOKKEEPER-112 Diffs (updated) bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java 37623dc bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java f71e53f bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java b403aa1 bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerRecoveryOp.java c67a79c bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java 29070eb bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookieRecoveryTest.java 99258ac bookkeeper-server/src/test/java/org/apache/bookkeeper/client/TestFencing.java 015e4e4 bookkeeper-server/src/test/java/org/apache/bookkeeper/test/BaseTestCase.java dada67a Diff: https://reviews.apache.org/r/3472/diff Testing ------- Thanks, Sijie
          Hide
          Ivan Kelly added a comment -

          Flavio and I discussed this a little yesterday evening and after thinking about it for a little bit afterwards, the problem seems clearer to me now.

          So, what is under discussion here is what do we do with the final fragment of an open ledger. This actually boils down to the same problem we have for fencing. By recovering the bookie, we are introducing a second writer, violating our 1-writer assumption. Since we now have more than one writer, it is necessary for there to be a consensus among all writers on where the ledger fragment ends.

          There are 3 situations which this open final fragment can occur.

          1. the original writer crashed, then bookie crashed before ledger recovery
          2. the original writer has the ledger open, but has not written anything since the bookie crashed
          3. the bookie being recovered isn't actually down

          One solution proposed by Flavio yesterday was that we should wait until no open final fragments exist before updating the ZK metadata. This works for 2. However, for 1 & 3, the recovery will wait forever.

          One way im leaning towards now, is to replicate all entries in the fragment, and then ensure that no more entries are added to this specific fragment. This would require a change to how fencing works. Instead of fencing by ledger id, we would have to fence by fragment id. When the original writer tries to write, the write will fail, and then try to replace the bookie to which the write failed to write to (all bookies in this case). This deals with 1, because all entries written before the writer crash will be replicated. It works for 2, because the next write by the writer will see that its current ledger fragment is fenced and that the crashed bookie is down, so it will build a new ensemble and start writing a new fragment. It deals with 3, as the current fragment will be rereplicated and any further attempts by the writer will force it to rebuild its ensemble.

          Show
          Ivan Kelly added a comment - Flavio and I discussed this a little yesterday evening and after thinking about it for a little bit afterwards, the problem seems clearer to me now. So, what is under discussion here is what do we do with the final fragment of an open ledger. This actually boils down to the same problem we have for fencing. By recovering the bookie, we are introducing a second writer, violating our 1-writer assumption. Since we now have more than one writer, it is necessary for there to be a consensus among all writers on where the ledger fragment ends. There are 3 situations which this open final fragment can occur. the original writer crashed, then bookie crashed before ledger recovery the original writer has the ledger open, but has not written anything since the bookie crashed the bookie being recovered isn't actually down One solution proposed by Flavio yesterday was that we should wait until no open final fragments exist before updating the ZK metadata. This works for 2. However, for 1 & 3, the recovery will wait forever. One way im leaning towards now, is to replicate all entries in the fragment, and then ensure that no more entries are added to this specific fragment. This would require a change to how fencing works. Instead of fencing by ledger id, we would have to fence by fragment id. When the original writer tries to write, the write will fail, and then try to replace the bookie to which the write failed to write to (all bookies in this case). This deals with 1, because all entries written before the writer crash will be replicated. It works for 2, because the next write by the writer will see that its current ledger fragment is fenced and that the crashed bookie is down, so it will build a new ensemble and start writing a new fragment. It deals with 3, as the current fragment will be rereplicated and any further attempts by the writer will force it to rebuild its ensemble.
          Hide
          Sijie Guo added a comment -

          yeah, fencing by fragment id is a very good solution, although I am not so clear how fencing handles following case.

          we have 5 bookies, bk[1-5], suppose bk5 is down, ledger l is opened, with last fragment is bk3, bk4, bk5. the recovery tool fence bk3 & bk4, further attempts from ledger l will force it to rebuild new ensemble, suppose (bk1, bk2, bk3). bk3 is a common bookie between the old ensemble and the new ensemble. how to deal with writing to such bookie? because bookie server has no knowledge about ledger distribution info, it doesn't know the writing is to an old ensemble or to a new ensemble. unless we also send fragment id in the addEntry request.

          Show
          Sijie Guo added a comment - yeah, fencing by fragment id is a very good solution, although I am not so clear how fencing handles following case. we have 5 bookies, bk [1-5] , suppose bk5 is down, ledger l is opened, with last fragment is bk3, bk4, bk5. the recovery tool fence bk3 & bk4, further attempts from ledger l will force it to rebuild new ensemble, suppose (bk1, bk2, bk3). bk3 is a common bookie between the old ensemble and the new ensemble. how to deal with writing to such bookie? because bookie server has no knowledge about ledger distribution info, it doesn't know the writing is to an old ensemble or to a new ensemble. unless we also send fragment id in the addEntry request.
          Hide
          Flavio Junqueira added a comment -

          There are two occasions during the lifetime of a ledger that we use consensus through zookeeper: when we change the ensemble and when we close the ledger. By design, the former is only proposed by the writer, whereas the latter can be proposed by either the writer or another client trying to recover it. Trying to change the design so that we can have multiple clients proposing changes to the ensemble of a ledger would be difficult and prone to errors, so I suggest we keep this part of the design the way it is.

          One way to perform the fencing for recovery and still keep the original design as is with respect to ensemble changes is to wait for the writer to mark in the ledger metadata such a change. Say that we externally detect that a bookie C has crashed. If the writer of a given ledger L removes C from its configuration and writes to ZooKeeper, then we can safely recover the ledger fragment of C for L. If the writer of L never makes such a change, then we assume that the writer can still talk to C, and consequently we don't care.

          We can monitor ensemble changes by watching the node in ZooKeeper. Using watches will make it difficult to implement an HBase backend for metadata as we proposed in another jira.

          Show
          Flavio Junqueira added a comment - There are two occasions during the lifetime of a ledger that we use consensus through zookeeper: when we change the ensemble and when we close the ledger. By design, the former is only proposed by the writer, whereas the latter can be proposed by either the writer or another client trying to recover it. Trying to change the design so that we can have multiple clients proposing changes to the ensemble of a ledger would be difficult and prone to errors, so I suggest we keep this part of the design the way it is. One way to perform the fencing for recovery and still keep the original design as is with respect to ensemble changes is to wait for the writer to mark in the ledger metadata such a change. Say that we externally detect that a bookie C has crashed. If the writer of a given ledger L removes C from its configuration and writes to ZooKeeper, then we can safely recover the ledger fragment of C for L. If the writer of L never makes such a change, then we assume that the writer can still talk to C, and consequently we don't care. We can monitor ensemble changes by watching the node in ZooKeeper. Using watches will make it difficult to implement an HBase backend for metadata as we proposed in another jira.
          Hide
          Flavio Junqueira added a comment -

          The attached document reflects the discussion Sijie, Ivan, and myself had on the design of bookie recovery to solve the issue described in this jira.

          Show
          Flavio Junqueira added a comment - The attached document reflects the discussion Sijie, Ivan, and myself had on the design of bookie recovery to solve the issue described in this jira.
          Hide
          Sijie Guo added a comment -

          I would modify the patch according to the document ASAP.

          Show
          Sijie Guo added a comment - I would modify the patch according to the document ASAP.
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/3472/
          -----------------------------------------------------------

          (Updated 2012-03-23 15:30:15.796027)

          Review request for bookkeeper.

          Changes
          -------

          modify the patch according to the document.

          Summary
          -------

          Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.

          Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
          Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

          This addresses bug BOOKKEEPER-112.
          https://issues.apache.org/jira/browse/BOOKKEEPER-112

          Diffs (updated)


          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java 37623dc
          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java f71e53f
          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java b403aa1
          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerRecoveryOp.java c67a79c
          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java 539d6b2
          bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookieRecoveryTest.java b8923e8
          bookkeeper-server/src/test/java/org/apache/bookkeeper/client/TestFencing.java 7de1c10
          bookkeeper-server/src/test/java/org/apache/bookkeeper/test/BookKeeperClusterTestCase.java 0b882c6
          bookkeeper-server/src/test/java/org/apache/bookkeeper/test/CloseTest.java e28d32c

          Diff: https://reviews.apache.org/r/3472/diff

          Testing
          -------

          Thanks,

          Sijie

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3472/ ----------------------------------------------------------- (Updated 2012-03-23 15:30:15.796027) Review request for bookkeeper. Changes ------- modify the patch according to the document. Summary ------- Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata. Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery() This addresses bug BOOKKEEPER-112 . https://issues.apache.org/jira/browse/BOOKKEEPER-112 Diffs (updated) bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java 37623dc bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java f71e53f bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java b403aa1 bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerRecoveryOp.java c67a79c bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java 539d6b2 bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookieRecoveryTest.java b8923e8 bookkeeper-server/src/test/java/org/apache/bookkeeper/client/TestFencing.java 7de1c10 bookkeeper-server/src/test/java/org/apache/bookkeeper/test/BookKeeperClusterTestCase.java 0b882c6 bookkeeper-server/src/test/java/org/apache/bookkeeper/test/CloseTest.java e28d32c Diff: https://reviews.apache.org/r/3472/diff Testing ------- Thanks, Sijie
          Hide
          Sijie Guo added a comment -

          attach a new patch fixed according to the document.

          Show
          Sijie Guo added a comment - attach a new patch fixed according to the document.
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/3472/#review6515
          -----------------------------------------------------------

          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java
          <https://reviews.apache.org/r/3472/#comment14178>

          Perhaps we should close lh here. It's a noop really, as a non recovery lh is a ReadOnlyLedgerHandle, where noop is just a stub. We should close it anyhow, just in case it changes to not being a noop in the future.

          • Ivan

          On 2012-03-23 15:30:15, Sijie Guo wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/3472/

          -----------------------------------------------------------

          (Updated 2012-03-23 15:30:15)

          Review request for bookkeeper.

          Summary

          -------

          Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.

          Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done

          Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

          This addresses bug BOOKKEEPER-112.

          https://issues.apache.org/jira/browse/BOOKKEEPER-112

          Diffs

          -----

          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java 37623dc

          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java f71e53f

          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java b403aa1

          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerRecoveryOp.java c67a79c

          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java 539d6b2

          bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookieRecoveryTest.java b8923e8

          bookkeeper-server/src/test/java/org/apache/bookkeeper/client/TestFencing.java 7de1c10

          bookkeeper-server/src/test/java/org/apache/bookkeeper/test/BookKeeperClusterTestCase.java 0b882c6

          bookkeeper-server/src/test/java/org/apache/bookkeeper/test/CloseTest.java e28d32c

          Diff: https://reviews.apache.org/r/3472/diff

          Testing

          -------

          Thanks,

          Sijie

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3472/#review6515 ----------------------------------------------------------- bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java < https://reviews.apache.org/r/3472/#comment14178 > Perhaps we should close lh here. It's a noop really, as a non recovery lh is a ReadOnlyLedgerHandle, where noop is just a stub. We should close it anyhow, just in case it changes to not being a noop in the future. Ivan On 2012-03-23 15:30:15, Sijie Guo wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3472/ ----------------------------------------------------------- (Updated 2012-03-23 15:30:15) Review request for bookkeeper. Summary ------- Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata. Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery() This addresses bug BOOKKEEPER-112 . https://issues.apache.org/jira/browse/BOOKKEEPER-112 Diffs ----- bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java 37623dc bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java f71e53f bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java b403aa1 bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerRecoveryOp.java c67a79c bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java 539d6b2 bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookieRecoveryTest.java b8923e8 bookkeeper-server/src/test/java/org/apache/bookkeeper/client/TestFencing.java 7de1c10 bookkeeper-server/src/test/java/org/apache/bookkeeper/test/BookKeeperClusterTestCase.java 0b882c6 bookkeeper-server/src/test/java/org/apache/bookkeeper/test/CloseTest.java e28d32c Diff: https://reviews.apache.org/r/3472/diff Testing ------- Thanks, Sijie
          Hide
          jiraposter@reviews.apache.org added a comment -

          On 2012-03-29 12:44:18, Ivan Kelly wrote:

          > bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java, line 444

          > <https://reviews.apache.org/r/3472/diff/7/?file=94988#file94988line444>

          >

          > Perhaps we should close lh here. It's a noop really, as a non recovery lh is a ReadOnlyLedgerHandle, where noop is just a stub. We should close it anyhow, just in case it changes to not being a noop in the future.

          otherwise the patch looks good to me. +1

          • Ivan

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/3472/#review6515
          -----------------------------------------------------------

          On 2012-03-23 15:30:15, Sijie Guo wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/3472/

          -----------------------------------------------------------

          (Updated 2012-03-23 15:30:15)

          Review request for bookkeeper.

          Summary

          -------

          Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.

          Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done

          Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

          This addresses bug BOOKKEEPER-112.

          https://issues.apache.org/jira/browse/BOOKKEEPER-112

          Diffs

          -----

          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java 37623dc

          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java f71e53f

          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java b403aa1

          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerRecoveryOp.java c67a79c

          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java 539d6b2

          bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookieRecoveryTest.java b8923e8

          bookkeeper-server/src/test/java/org/apache/bookkeeper/client/TestFencing.java 7de1c10

          bookkeeper-server/src/test/java/org/apache/bookkeeper/test/BookKeeperClusterTestCase.java 0b882c6

          bookkeeper-server/src/test/java/org/apache/bookkeeper/test/CloseTest.java e28d32c

          Diff: https://reviews.apache.org/r/3472/diff

          Testing

          -------

          Thanks,

          Sijie

          Show
          jiraposter@reviews.apache.org added a comment - On 2012-03-29 12:44:18, Ivan Kelly wrote: > bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java, line 444 > < https://reviews.apache.org/r/3472/diff/7/?file=94988#file94988line444 > > > Perhaps we should close lh here. It's a noop really, as a non recovery lh is a ReadOnlyLedgerHandle, where noop is just a stub. We should close it anyhow, just in case it changes to not being a noop in the future. otherwise the patch looks good to me. +1 Ivan ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3472/#review6515 ----------------------------------------------------------- On 2012-03-23 15:30:15, Sijie Guo wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3472/ ----------------------------------------------------------- (Updated 2012-03-23 15:30:15) Review request for bookkeeper. Summary ------- Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata. Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery() This addresses bug BOOKKEEPER-112 . https://issues.apache.org/jira/browse/BOOKKEEPER-112 Diffs ----- bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java 37623dc bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java f71e53f bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java b403aa1 bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerRecoveryOp.java c67a79c bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java 539d6b2 bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookieRecoveryTest.java b8923e8 bookkeeper-server/src/test/java/org/apache/bookkeeper/client/TestFencing.java 7de1c10 bookkeeper-server/src/test/java/org/apache/bookkeeper/test/BookKeeperClusterTestCase.java 0b882c6 bookkeeper-server/src/test/java/org/apache/bookkeeper/test/CloseTest.java e28d32c Diff: https://reviews.apache.org/r/3472/diff Testing ------- Thanks, Sijie
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/3472/
          -----------------------------------------------------------

          (Updated 2012-03-29 13:29:23.834764)

          Review request for bookkeeper.

          Changes
          -------

          close opened non-recovery ledger handle as Ivan's suggestion.

          Summary
          -------

          Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.

          Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
          Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

          This addresses bug BOOKKEEPER-112.
          https://issues.apache.org/jira/browse/BOOKKEEPER-112

          Diffs (updated)


          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java 37623dc
          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java f71e53f
          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java b403aa1
          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerRecoveryOp.java c67a79c
          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java 539d6b2
          bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookieRecoveryTest.java b8923e8
          bookkeeper-server/src/test/java/org/apache/bookkeeper/client/TestFencing.java 7de1c10
          bookkeeper-server/src/test/java/org/apache/bookkeeper/test/BookKeeperClusterTestCase.java 0b882c6
          bookkeeper-server/src/test/java/org/apache/bookkeeper/test/CloseTest.java e28d32c

          Diff: https://reviews.apache.org/r/3472/diff

          Testing
          -------

          Thanks,

          Sijie

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3472/ ----------------------------------------------------------- (Updated 2012-03-29 13:29:23.834764) Review request for bookkeeper. Changes ------- close opened non-recovery ledger handle as Ivan's suggestion. Summary ------- Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata. Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery() This addresses bug BOOKKEEPER-112 . https://issues.apache.org/jira/browse/BOOKKEEPER-112 Diffs (updated) bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java 37623dc bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java f71e53f bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java b403aa1 bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerRecoveryOp.java c67a79c bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java 539d6b2 bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookieRecoveryTest.java b8923e8 bookkeeper-server/src/test/java/org/apache/bookkeeper/client/TestFencing.java 7de1c10 bookkeeper-server/src/test/java/org/apache/bookkeeper/test/BookKeeperClusterTestCase.java 0b882c6 bookkeeper-server/src/test/java/org/apache/bookkeeper/test/CloseTest.java e28d32c Diff: https://reviews.apache.org/r/3472/diff Testing ------- Thanks, Sijie
          Hide
          Sijie Guo added a comment -

          new patch to close opened non-recovery ledger handle as Ivan's suggestion.

          Show
          Sijie Guo added a comment - new patch to close opened non-recovery ledger handle as Ivan's suggestion.
          Hide
          Sijie Guo added a comment -

          attach a new patch to add missing line which releasing permit when submitting callback in PeadingReadOp, similar what BOOKKEEPER-186 did.

          Show
          Sijie Guo added a comment - attach a new patch to add missing line which releasing permit when submitting callback in PeadingReadOp, similar what BOOKKEEPER-186 did.
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/3472/
          -----------------------------------------------------------

          (Updated 2012-03-30 12:12:54.459832)

          Review request for bookkeeper.

          Changes
          -------

          adding one line to release permit when submit callback in PendingReadOp. similar as BOOKKEEPER-186.

          Summary
          -------

          Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.

          Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
          Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

          This addresses bug BOOKKEEPER-112.
          https://issues.apache.org/jira/browse/BOOKKEEPER-112

          Diffs (updated)


          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java 37623dc
          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java f71e53f
          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java b403aa1
          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerRecoveryOp.java c67a79c
          bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java 539d6b2
          bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookieRecoveryTest.java b8923e8
          bookkeeper-server/src/test/java/org/apache/bookkeeper/client/TestFencing.java 7de1c10
          bookkeeper-server/src/test/java/org/apache/bookkeeper/test/BookKeeperClusterTestCase.java 0b882c6
          bookkeeper-server/src/test/java/org/apache/bookkeeper/test/CloseTest.java e28d32c

          Diff: https://reviews.apache.org/r/3472/diff

          Testing
          -------

          Thanks,

          Sijie

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3472/ ----------------------------------------------------------- (Updated 2012-03-30 12:12:54.459832) Review request for bookkeeper. Changes ------- adding one line to release permit when submit callback in PendingReadOp. similar as BOOKKEEPER-186 . Summary ------- Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata. Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery() This addresses bug BOOKKEEPER-112 . https://issues.apache.org/jira/browse/BOOKKEEPER-112 Diffs (updated) bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java 37623dc bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java f71e53f bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java b403aa1 bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerRecoveryOp.java c67a79c bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java 539d6b2 bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookieRecoveryTest.java b8923e8 bookkeeper-server/src/test/java/org/apache/bookkeeper/client/TestFencing.java 7de1c10 bookkeeper-server/src/test/java/org/apache/bookkeeper/test/BookKeeperClusterTestCase.java 0b882c6 bookkeeper-server/src/test/java/org/apache/bookkeeper/test/CloseTest.java e28d32c Diff: https://reviews.apache.org/r/3472/diff Testing ------- Thanks, Sijie
          Hide
          Ivan Kelly added a comment -

          lgtm +1

          Show
          Ivan Kelly added a comment - lgtm +1
          Hide
          Sijie Guo added a comment -

          committed as r1307743. thanks Ivan, Flavio for discussion. thanks Ivan for reviewing.

          Show
          Sijie Guo added a comment - committed as r1307743. thanks Ivan, Flavio for discussion. thanks Ivan for reviewing.
          Hide
          Hudson added a comment -

          Integrated in bookkeeper-trunk #438 (See https://builds.apache.org/job/bookkeeper-trunk/438/)
          BOOKKEEPER-112: Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail (sijie) (Revision 1307743)

          Result = ABORTED
          sijie :
          Files :

          • /zookeeper/bookkeeper/trunk/CHANGES.txt
          • /zookeeper/bookkeeper/trunk/bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java
          • /zookeeper/bookkeeper/trunk/bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java
          • /zookeeper/bookkeeper/trunk/bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java
          • /zookeeper/bookkeeper/trunk/bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerRecoveryOp.java
          • /zookeeper/bookkeeper/trunk/bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java
          • /zookeeper/bookkeeper/trunk/bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookieRecoveryTest.java
          • /zookeeper/bookkeeper/trunk/bookkeeper-server/src/test/java/org/apache/bookkeeper/client/TestFencing.java
          • /zookeeper/bookkeeper/trunk/bookkeeper-server/src/test/java/org/apache/bookkeeper/test/BookKeeperClusterTestCase.java
          • /zookeeper/bookkeeper/trunk/bookkeeper-server/src/test/java/org/apache/bookkeeper/test/CloseTest.java
          Show
          Hudson added a comment - Integrated in bookkeeper-trunk #438 (See https://builds.apache.org/job/bookkeeper-trunk/438/ ) BOOKKEEPER-112 : Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail (sijie) (Revision 1307743) Result = ABORTED sijie : Files : /zookeeper/bookkeeper/trunk/CHANGES.txt /zookeeper/bookkeeper/trunk/bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java /zookeeper/bookkeeper/trunk/bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java /zookeeper/bookkeeper/trunk/bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java /zookeeper/bookkeeper/trunk/bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerRecoveryOp.java /zookeeper/bookkeeper/trunk/bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java /zookeeper/bookkeeper/trunk/bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookieRecoveryTest.java /zookeeper/bookkeeper/trunk/bookkeeper-server/src/test/java/org/apache/bookkeeper/client/TestFencing.java /zookeeper/bookkeeper/trunk/bookkeeper-server/src/test/java/org/apache/bookkeeper/test/BookKeeperClusterTestCase.java /zookeeper/bookkeeper/trunk/bookkeeper-server/src/test/java/org/apache/bookkeeper/test/CloseTest.java
          Hide
          Flavio Junqueira added a comment -

          Hi Sijie, I'm not done reviewing it and I'd like to before having it in. Is it ok?

          Show
          Flavio Junqueira added a comment - Hi Sijie, I'm not done reviewing it and I'd like to before having it in. Is it ok?
          Hide
          Flavio Junqueira added a comment -

          +1, it looks good. There are some parts in which the format is not looking good and there are some mispellings, but I don't think it is worth reverting. I'll fix these in a separate patch.

          Show
          Flavio Junqueira added a comment - +1, it looks good. There are some parts in which the format is not looking good and there are some mispellings, but I don't think it is worth reverting. I'll fix these in a separate patch.
          Hide
          Sijie Guo added a comment -

          oh, sorry Flavio. I may miss some words on the sync up meeting.

          Show
          Sijie Guo added a comment - oh, sorry Flavio. I may miss some words on the sync up meeting.

            People

            • Assignee:
              Sijie Guo
              Reporter:
              Flavio Junqueira
            • Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development