Jackrabbit Oak / OAK-5446

leaseUpdateThread might be blocked by leaseUpdateCheck

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.4, 1.5.14
    • None
    • None
    • None

    Description


      Moved over to OAK-5528 due to internal Jira issues. Please do not delete this ticket while the problem is being investigated.

      We are fighting with cluster nodes losing their lease and shutting down oak-core in a cloud environment. For reasons still unknown, the whole process seems to skip about two minutes of real time.

      This is a situation from which Oak currently does not recover. Code analysis shows that ClusterNodeInfo is handed the LeaseCheckDocumentStoreWrapper instance to use as its store. This is fatal, since any store action that renewLease() attempts will first invoke performLeaseCheck(). Once the FailureMargin is reached, the lease check stalls the renewLease() thread for 5 retry attempts and then declares the lease lost. In other words, the thread that is supposed to extend the lease is blocked by the check that waits for the lease to be extended.

      The ClusterNodeInfo should instead be using the "real" DocumentStore, not the wrapped one, IMO.
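
      To illustrate the problem, here is a minimal, self-contained sketch. It does not use Oak's actual classes: Store, RealStore, LeaseCheckingStore and the constants are hypothetical stand-ins, and the lease duration and failure margin values are assumed for illustration only. It shows a renewal that is routed through a lease-checking wrapper failing once the clock is inside the failure margin, while the same renewal against the unwrapped store succeeds.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

public class LeaseSketch {
    // Illustrative values only; assumptions, not read from Oak's configuration.
    static final long LEASE_DURATION_MS = 120_000;
    static final long FAILURE_MARGIN_MS = 20_000;
    static final int MAX_RETRIES = 5;

    /** Hypothetical, minimal stand-in for a document store. */
    interface Store {
        void writeLeaseEnd(long leaseEndMs);
    }

    /** The "real" store: persists the lease end without any checks. */
    static final class RealStore implements Store {
        final AtomicLong leaseEnd = new AtomicLong();

        @Override
        public void writeLeaseEnd(long leaseEndMs) {
            leaseEnd.set(leaseEndMs);
        }
    }

    /** Wrapper that runs a lease check before delegating each call. */
    static final class LeaseCheckingStore implements Store {
        private final RealStore delegate;
        private final LongSupplier clock;

        LeaseCheckingStore(RealStore delegate, LongSupplier clock) {
            this.delegate = delegate;
            this.clock = clock;
        }

        @Override
        public void writeLeaseEnd(long leaseEndMs) {
            performLeaseCheck();
            delegate.writeLeaseEnd(leaseEndMs);
        }

        private void performLeaseCheck() {
            for (int attempt = 0; attempt < MAX_RETRIES; attempt++) {
                if (clock.getAsLong() < delegate.leaseEnd.get() - FAILURE_MARGIN_MS) {
                    return; // lease is still comfortably valid
                }
                // A real implementation would wait here for the lease update
                // thread. If the caller IS that thread, nobody extends the lease.
            }
            throw new IllegalStateException("lease lost");
        }
    }

    /** What the lease update thread does: push the lease end forward. */
    static void renewLease(Store store, long nowMs) {
        store.writeLeaseEnd(nowMs + LEASE_DURATION_MS);
    }

    public static void main(String[] args) {
        RealStore real = new RealStore();
        real.leaseEnd.set(LEASE_DURATION_MS); // lease valid until t = 120 s
        long now = 110_000;                   // process resumes inside the failure margin
        Store wrapped = new LeaseCheckingStore(real, () -> now);

        try {
            renewLease(wrapped, now);
        } catch (IllegalStateException e) {
            System.out.println("via wrapper: " + e.getMessage());
        }

        renewLease(real, now);
        System.out.println("via real store: lease end = " + real.leaseEnd.get());
    }
}
```

      In this sketch the renewal through the wrapper never reaches the delegate, so the lease end stays at its old value. The renewal against the unwrapped store moves it forward, which is the behaviour proposed above.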

      Attachments

        1. OAK-5446.diff
          4 kB
          Julian Reschke
        2. OAK-5446.testcase
          4 kB
          Stefan Egli
        3. OAK-5446.testcase.v3
          5 kB
          Stefan Egli
        4. OAK-5446-jr.diff
          5 kB
          Julian Reschke

            People

              Assignee: Unassigned
              Reporter: Stefan Eissing (stefan.eissing)
              Votes: 0
              Watchers: 3
