GEODE-5631

failedBatchRemovalMessageKeys never cleared

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.7.0
    • Component/s: wan

    Description

      Experiment setup:

      • Region A is created with an async event listener attached to it.
      • For every event the async listener processes, a new entry is put into another region, Region B.
      • A client performs 1 million operations on 1500 keys in Region A (to trigger conflation).
      • The cluster consists of 3 servers, 1 locator and 1 client.

      Issue:
      After upgrading to 1.6.0, we confirmed an increase in the memory footprint after all operations had completed.

      Cause:

      • A data structure called failedBatchRemovalMessageKeys stores all the queue removal messages that arrive while the secondary is in the process of GII.
      • Two removal messages are sent to the secondary for a single event: one from the processor that processed the event, and another from the conflation thread that conflated the event and therefore wants the secondary to remove it.
      • Whichever of the two messages arrives first removes the event from the queue.
      • When the second message arrives and tries to remove the event from the queue, it hits an EntryNotFoundException. The message handling interprets this as the secondary being in GII, so it stores the message in failedBatchRemovalMessageKeys and expects it to be processed once GII is complete.
      • But GII completed long before, so failedBatchRemovalMessageKeys keeps accumulating messages that are never removed, which results in a large memory footprint.
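
      The sequence above can be modelled in a few lines. This is a minimal single-threaded sketch, not Geode's actual code: the class RemovalLeakSketch and its methods are hypothetical, a plain map stands in for the secondary's queue, and a null result from remove stands in for the EntryNotFoundException.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of a secondary's queue; these are not Geode's real classes.
class RemovalLeakSketch {
  private final Map<Long, String> queue = new HashMap<>();
  // Plays the role of failedBatchRemovalMessageKeys from the description.
  private final Set<Long> failedBatchRemovalMessageKeys = new HashSet<>();

  void enqueue(long key, String event) {
    queue.put(key, event);
  }

  // Handles one removal message. A missing entry (an EntryNotFoundException in
  // the description) is taken to mean "GII in progress", so the key is parked.
  void handleRemovalMessage(long key) {
    if (queue.remove(key) == null) {
      failedBatchRemovalMessageKeys.add(key);
    }
  }

  int queueSize() {
    return queue.size();
  }

  int parkedKeyCount() {
    return failedBatchRemovalMessageKeys.size();
  }

  // Simulates the given number of events, each removed by two messages.
  static RemovalLeakSketch simulate(int events) {
    RemovalLeakSketch secondary = new RemovalLeakSketch();
    for (long key = 0; key < events; key++) {
      secondary.enqueue(key, "event-" + key);
      secondary.handleRemovalMessage(key); // message from the processor
      secondary.handleRemovalMessage(key); // message from the conflation thread
    }
    return secondary;
  }

  public static void main(String[] args) {
    RemovalLeakSketch secondary = simulate(1000);
    System.out.println("queue size: " + secondary.queueSize());
    System.out.println("parked keys: " + secondary.parkedKeyCount());
  }
}
```

      In this model the queue ends up empty while one key stays parked per event, because nothing ever drains the set once GII is already over.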

      Fix:
      The data structure failedBatchRemovalMessageKeys is no longer used once it has been processed, because GII happens only once in a server’s lifecycle.
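
      One way to picture the fix is a one-time flag that is set when the parked keys are drained after GII. This is a sketch under assumptions, not the actual patch: the class RemovalFixSketch, the method onGiiComplete and the flag parkedKeysProcessed are hypothetical names, and the model is single-threaded, so it ignores the synchronization a real implementation would need.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the fixed behaviour; these are not Geode's real classes.
class RemovalFixSketch {
  private final Map<Long, String> queue = new HashMap<>();
  private final Set<Long> failedBatchRemovalMessageKeys = new HashSet<>();
  // Assumed flag: true once the parked keys have been processed after GII.
  private boolean parkedKeysProcessed = false;

  void enqueue(long key, String event) {
    queue.put(key, event);
  }

  // Parks a key for a missing entry only while GII may still be in progress.
  void handleRemovalMessage(long key) {
    if (queue.remove(key) == null && !parkedKeysProcessed) {
      failedBatchRemovalMessageKeys.add(key);
    }
  }

  // Called once when GII finishes: apply the parked removals, then stop parking.
  void onGiiComplete() {
    for (Long key : failedBatchRemovalMessageKeys) {
      queue.remove(key);
    }
    failedBatchRemovalMessageKeys.clear();
    parkedKeysProcessed = true;
  }

  int queueSize() {
    return queue.size();
  }

  int parkedKeyCount() {
    return failedBatchRemovalMessageKeys.size();
  }
}
```

      After onGiiComplete has run, a duplicate removal message for an already removed event is simply dropped, so the set cannot grow again.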

            People

              Assignee: Unassigned
              Reporter: Nabarun Nag (nnag)

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 1h 20m