RIVER-142

concurrency problem in DGC lease expiration handling

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: jtsk_2.0
    • Fix Version/s: River_2.2.0
    • Component/s: net_jini_jeri
    • Labels:
      None
    • Bugtraq ID:
      4848840

      Description

      Bugtraq ID 4848840

      In the server-side DGC implementation's thread that checks for lease expirations (com.sun.jini.jeri.internal.runtime.ObjectTable.LeaseChecker.run), the check is made while synchronized on the overall lease table, but notification of the expired leases' individual registered Targets is delayed until after the lease table lock has been released. This approach was taken from the JRMP implementation, which works that way because of the fix for 4118056 (a previous deadlock bug, although I now think that the JRMP implementation has this bug too).

      The problem seems to be that, after the lease table lock is released, another lease renewal/request can come in (from the same DGC client and for the same remote object) and then be invalidated by the subsequent Target notification made by the lease expiration check thread, so the client's lease renewal (for that remote object) is forgotten. It would appear that the synchronization approach here needs to be reconsidered.
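The interleaving described above can be reproduced deterministically with a small model. This is a minimal sketch, not the River source: LeaseTableSketch, its fields and its methods are hypothetical names, and a single set stands in for a Target's referenced set. The point is only that the expiry check and the Target notification are two separate critical sections.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical, simplified model of the lease table; not the River implementation. */
class LeaseTableSketch {
    private final Map<String, Long> leases = new HashMap<>();   // clientId -> expiry time
    private final Set<String> referencedSet = new HashSet<>();  // stands in for a Target's referenced set

    /** Dirty call: grant or renew the lease and record the client as a referrer. */
    synchronized void dirty(String clientId, long expiresAt) {
        leases.put(clientId, expiresAt);
        referencedSet.add(clientId);
    }

    /** Checker, step 1: find and drop expired leases while holding the table lock. */
    synchronized Set<String> collectExpired(long now) {
        Set<String> expired = new HashSet<>();
        leases.entrySet().removeIf(e -> {
            if (e.getValue() <= now) {
                expired.add(e.getKey());
                return true;
            }
            return false;
        });
        return expired;
    }

    /** Checker, step 2: notify the Target in a separate critical section, after the lock was released. */
    synchronized void notifyExpired(Set<String> expired) {
        referencedSet.removeAll(expired);
    }

    synchronized boolean hasLease(String clientId) { return leases.containsKey(clientId); }

    synchronized boolean isReferenced(String clientId) { return referencedSet.contains(clientId); }

    /** Replays the problematic order of events; returns true if the renewal was lost. */
    static boolean demonstrateLostRenewal() {
        LeaseTableSketch table = new LeaseTableSketch();
        table.dirty("client", 100);
        Set<String> expired = table.collectExpired(200); // checker sees the expiry, releases the lock
        table.dirty("client", 300);                      // renewal arrives inside the window
        table.notifyExpired(expired);                    // stale notification undoes the renewal
        return table.hasLease("client") && !table.isReferenced("client");
    }
}
```

In this model the client ends up holding a valid lease while being absent from the referenced set, which is the inconsistency the report describes.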

      ( Comments note: )

      In addition to the basic problem of the expired-then-renewed client being removed from the referenced set, there is also the problem of the sequence table entry being forgotten-- which prevents detection of a "late clean call".

      Normally, late clean calls are not a problem because sequence numbers are retained while the client is in the referenced set (and there is no such thing as a "strong dirty"). But in this case, with the following order of events on the server side:

      1. dirty, seqNo=2
      2. (lease expiration)
      3. clean, seqNo=1

      The primary bug here is that the first two events will leave the client missing from the referenced set. But the secondary bug is that even if that's fixed, with the sequence number forgotten, the third event (the "late clean call") will still cause the client to be removed from the referenced set.
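The secondary bug can be illustrated in the same spirit. The sketch below is an assumption-laden model rather than River's sequence table: SequenceTableSketch and forgetSequenceNumber are hypothetical names, calls carrying a lower sequence number than the last one recorded are treated as stale, and "strong clean" handling is left out. forgetSequenceNumber stands in for the side effect of the expiration when the primary bug is assumed fixed, so the client stays in the referenced set but its sequence entry is gone.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical model of per-client sequence numbers; not the River implementation. */
class SequenceTableSketch {
    private final Map<String, Long> lastSeqNo = new HashMap<>();
    private final Set<String> referencedSet = new HashSet<>();

    synchronized void dirty(String clientId, long seqNo) {
        Long last = lastSeqNo.get(clientId);
        if (last != null && seqNo < last) {
            return; // stale dirty call, ignored
        }
        lastSeqNo.put(clientId, seqNo);
        referencedSet.add(clientId);
    }

    synchronized void clean(String clientId, long seqNo) {
        Long last = lastSeqNo.get(clientId);
        if (last != null && seqNo < last) {
            return; // late clean call detected and ignored
        }
        lastSeqNo.remove(clientId);
        referencedSet.remove(clientId);
    }

    /** Models the expiration dropping the sequence entry while the client stays referenced. */
    synchronized void forgetSequenceNumber(String clientId) {
        lastSeqNo.remove(clientId);
    }

    synchronized boolean isReferenced(String clientId) { return referencedSet.contains(clientId); }

    /** Replays dirty(seqNo=2), optional expiration, clean(seqNo=1); true if the client was removed. */
    static boolean lateCleanRemovesClient(boolean sequenceForgotten) {
        SequenceTableSketch table = new SequenceTableSketch();
        table.dirty("client", 2);
        if (sequenceForgotten) {
            table.forgetSequenceNumber("client");
        }
        table.clean("client", 1);
        return !table.isReferenced("client");
    }
}
```

With the sequence number retained, the clean call with seqNo=1 is recognized as late and ignored; with it forgotten, the same call removes the client.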

      1. River-142.patch
        68 kB
        Peter Firmstone

        Activity

        Peter Firmstone added a comment -

        Fix attached for review

        Hudson added a comment -

        Integrated in River-trunk #493 (See https://builds.apache.org/job/River-trunk/493/)
        River-142 Slightly different from the original patch, this commit fixes delayed garbage collection synchronization issues by processing expired leases immediately, without locking the entire object table. Lease is now responsible for expiry, notification and processing (on the garbage collection thread), and is synchronized internally. A Lease in the object table cannot be renewed once it expires and must be replaced; it is removed from the table after it is marked expired, to prevent garbage collection of potentially active leases. Internal classes have been separated from ObjectTable and BasicExportTable to encapsulate or simplify synchronization and locking. Target is now more faithful to Exporter.unexport's documented behaviour and, where possible, interrupts dispatched method calls when force is true.

        I wasn't able to create a test that simulates the original failure condition. Doing so requires a large number of leases to be processed (to create a time window in which leases are garbage collected after the table lock is released) and precise timing of dirty calls, garbage collection and clean calls. The new code processes each lease immediately and isn't subject to the time window.
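The behaviour described in this commit message, a lease that decides its own expiry under its own lock and can never be renewed once expired, might look roughly like the following. This is only a sketch of the idea and not the committed code: ExpiringLease, ExpiringLeaseTable and the use of ConcurrentHashMap are assumptions made for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical lease that owns its expiry state; not the River class. */
class ExpiringLease {
    private long expiresAt;
    private boolean expired;

    ExpiringLease(long expiresAt) { this.expiresAt = expiresAt; }

    /** Returns false once the lease has expired; the caller must then install a new lease. */
    synchronized boolean renew(long newExpiry) {
        if (expired) {
            return false;
        }
        expiresAt = Math.max(expiresAt, newExpiry);
        return true;
    }

    /** Marks the lease expired if it is due; expiry processing would happen here, under the same lock. */
    synchronized boolean expireIfDue(long now) {
        if (!expired && expiresAt <= now) {
            expired = true;
        }
        return expired;
    }
}

/** Hypothetical table that never locks all leases at once. */
class ExpiringLeaseTable {
    private final ConcurrentMap<String, ExpiringLease> leases = new ConcurrentHashMap<>();

    /** Dirty call: renew the current lease, or replace it if it is missing or already expired. */
    void dirty(String clientId, long expiresAt) {
        leases.compute(clientId, (id, current) ->
                (current != null && current.renew(expiresAt)) ? current : new ExpiringLease(expiresAt));
    }

    /** Expiry check: remove a lease only if the table still maps the client to that same expired lease. */
    void checkExpirations(long now) {
        for (Map.Entry<String, ExpiringLease> e : leases.entrySet()) {
            if (e.getValue().expireIfDue(now)) {
                leases.remove(e.getKey(), e.getValue());
            }
        }
    }

    ExpiringLease leaseFor(String clientId) { return leases.get(clientId); }
}
```

Because a renewal either succeeds before the lease is marked expired or fails and installs a replacement, and because the two-argument remove only deletes the exact expired instance, a renewal arriving during the expiry check is not discarded in this model.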

        Peter Firmstone added a comment -

        Fix committed

        Tom Hobbs added a comment -

        Moved into River 2.2.0

        Hudson added a comment -

        Integrated in River-trunk #504 (See https://builds.apache.org/job/River-trunk/504/)
        River-142

        Affects only River 2.2.0; this bug causes a memory leak by creating new threads unnecessarily.

        Hudson added a comment -

        Integrated in River-trunk-jdk7 #5 (See https://builds.apache.org/job/River-trunk-jdk7/5/)
        River-142

        Affects only River 2.2.0; this bug causes a memory leak by creating new threads unnecessarily.

        Peter Firmstone added a comment -

        The last commit was to resolve issue River-403

        Peter Firmstone added a comment -

        Bug Fixed


          People

          • Assignee: Peter Firmstone
          • Reporter: Peter Jones
          • Votes: 0
          • Watchers: 2
