Ignite / IGNITE-13358

Improvements for partition clearing related parts



    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • None
    • Fix Version/s: 2.10
    • None
    • Docs Required, Release Notes Required


      We have several issues related to partition clearing that are worth fixing.

      1. PartitionsEvictManager doesn't provide clear correctness guarantees when a node or a cache group is stopped while partitions are being cleared concurrently.

      2. GridDhtLocalPartition#awaitDestroy is called while holding the topology write lock, which is deadlock-prone because we currently require the write lock to destroy a partition.

      3. GridDhtLocalPartition contains a lot of messy code related to partition clearing, most notably ClearFuture, even though the clearing itself is done by PartitionsEvictManager. We should get rid of the clearing code in GridDhtLocalPartition. This should also improve code readability and make it easier to understand what happens during clearing.

      4. Currently, moving partitions are cleared before rebalancing in an order that differs from rebalanceOrder, breaking the contract. It is better to submit such partitions for clearing to the rebalancing pool before each group starts to rebalance. This will allow faster rebalancing (according to the configured rebalance pool size) and will provide rebalanceOrder guarantees.

      5. The clearing logic for moving partitions (before rebalancing) seems incorrect: it's possible to lose updates received during clearing.

      6. To clear partitions before full rebalancing, we use the same threads as for partition eviction. This can slow down rebalancing even when resources are available. It is better to clear partitions in the rebalance pool (explicitly dedicated by the user).

      7. It's possible to reserve a renting partition, which makes no sense. All operations on a renting partition (except clearing) are a waste of resources.

      8. Partition eviction causes task starvation in the system pool if the pool has only one thread. This can break crucial functionality.
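
      The ordering proposed in items 4 and 6 can be illustrated with a minimal, JDK-only sketch. It is not Ignite code: the Group class, the run method, and the log strings are hypothetical stand-ins, and a plain fixed-size ExecutorService plays the role of the rebalance pool. The sketch only shows the intended sequencing: groups are processed in ascending rebalanceOrder, the moving partitions of each group are cleared in the dedicated pool, and the group starts rebalancing only after its clearing tasks have finished.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ClearBeforeRebalanceSketch {
    /** Hypothetical stand-in for a cache group; not an Ignite class. */
    public static final class Group {
        final String name;
        final int rebalanceOrder;
        final List<Integer> movingParts;

        public Group(String name, int rebalanceOrder, List<Integer> movingParts) {
            this.name = name;
            this.rebalanceOrder = rebalanceOrder;
            this.movingParts = movingParts;
        }
    }

    /** Returns the sequence of simulated "clear" and "rebalance" events. */
    public static List<String> run(List<Group> groups, int rebalancePoolSize) {
        List<String> log = Collections.synchronizedList(new ArrayList<>());

        // Process groups in ascending rebalanceOrder to honor the contract.
        List<Group> ordered = new ArrayList<>(groups);
        ordered.sort(Comparator.comparingInt(g -> g.rebalanceOrder));

        // Assumption: a fixed-size pool stands in for the user-configured rebalance pool.
        ExecutorService rebalancePool = Executors.newFixedThreadPool(rebalancePoolSize);

        try {
            for (Group g : ordered) {
                List<Future<?>> clearing = new ArrayList<>();

                // Submit clearing of every moving partition of this group to the pool.
                for (int p : g.movingParts)
                    clearing.add(rebalancePool.submit(() -> { log.add("clear " + g.name + ":" + p); }));

                // Wait until all partitions of the group are cleared.
                for (Future<?> f : clearing)
                    f.get();

                // Only now does the group start to rebalance (simulated by a log entry).
                log.add("rebalance " + g.name);
            }
        }
        catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        catch (ExecutionException e) {
            throw new IllegalStateException(e);
        }
        finally {
            rebalancePool.shutdown();
        }

        return new ArrayList<>(log);
    }
}
```

      With a pool size greater than one, partitions of the same group are cleared in parallel, while the blocking wait keeps groups strictly ordered. A real implementation would likely chain futures instead of blocking the calling thread; the blocking wait is used here only to keep the sketch short.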





            Assignee: Alexey Scherbakov (ascherbakov)
            Reporter: Alexey Scherbakov (ascherbakov)
            Votes: 0
            Watchers: 3



              Time Tracking

              Original Estimate - Not Specified
              Remaining Estimate - 0h
              Time Spent - 1h 10m

