  1. Ignite
  2. IGNITE-13358

Improvements for partition clearing related parts



    • Improvement
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • None
    • 2.10
    • None
    • Docs Required, Release Notes Required


      We have several issues related to partition clearing that are worth fixing.

      1. PartitionsEvictManager doesn't provide clear correctness guarantees when a node or a cache group is stopped while partitions are being cleared concurrently.

      2. GridDhtLocalPartition#awaitDestroy is called while holding the topology write lock, which is deadlock-prone because destroying a partition currently requires the write lock.

      3. GridDhtLocalPartition contains a lot of messy code related to partition clearing, most notably ClearFuture, even though the clearing itself is done by PartitionsEvictManager. We should get rid of the clearing code in GridDhtLocalPartition. This should also improve code readability and make it easier to understand what happens during clearing.

      4. Currently, moving partitions are cleared before rebalancing in an order that differs from rebalanceOrder, which breaks the contract. It is better to submit such partitions for clearing to the rebalancing pool before each group starts to rebalance. This will allow faster rebalancing (according to the configured rebalance pool size) and will provide rebalanceOrder guarantees.

      5. The clearing logic for moving partitions (before rebalancing) seems incorrect: it's possible to lose updates received during clearing.

      6. To clear partitions before full rebalancing we use the same threads as for partition eviction. This can slow down rebalancing even if resources are available. It is better to clear partitions in the rebalance pool (explicitly dedicated by the user).

      7. It's possible to reserve a renting partition, which makes no sense. All operations on a renting partition (except clearing) are a waste of resources.

      8. Partition eviction causes starvation of system pool tasks if the system pool has only one thread. This can break crucial functionality.
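
      The idea behind items 4 and 6 can be illustrated with a minimal sketch. It is not the actual Ignite implementation: the class OrderedClearingSketch, the Group type and the run method are hypothetical names introduced only for this illustration, and the "clearing" and "rebalancing" steps are placeholders. It only shows the scheduling idea: groups are processed in ascending rebalanceOrder, and the moving partitions of each group are cleared in a dedicated pool (standing in for the rebalance pool) before that group starts to rebalance.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch; none of these types exist in Ignite. */
final class OrderedClearingSketch {
    /** Hypothetical stand-in for a cache group that has MOVING partitions. */
    static final class Group {
        final String name;
        final int rebalanceOrder;
        final List<Integer> movingParts;

        Group(String name, int rebalanceOrder, List<Integer> movingParts) {
            this.name = name;
            this.rebalanceOrder = rebalanceOrder;
            this.movingParts = movingParts;
        }
    }

    /**
     * Processes groups in ascending rebalanceOrder. For each group, all moving
     * partitions are cleared in the dedicated pool first, then the group is
     * "rebalanced". Returns the sequence of events for inspection.
     */
    static List<String> run(List<Group> groups, int rebalancePoolSize) {
        // Assumption: a fixed pool stands in for the user-configured rebalance pool.
        ExecutorService rebalancePool = Executors.newFixedThreadPool(rebalancePoolSize);
        List<String> log = new ArrayList<>(); // Touched only by the calling thread.

        try {
            List<Group> ordered = new ArrayList<>(groups);
            ordered.sort(Comparator.comparingInt(grp -> grp.rebalanceOrder));

            for (Group g : ordered) {
                List<Future<Integer>> clears = new ArrayList<>();

                for (int p : g.movingParts) {
                    // Placeholder for the real clearing work of partition p.
                    Callable<Integer> clearTask = () -> p;
                    clears.add(rebalancePool.submit(clearTask));
                }

                // Wait until every partition of this group is cleared.
                for (Future<Integer> f : clears)
                    f.get();

                log.add("clear " + g.name + " (" + clears.size() + " parts)");
                log.add("rebalance " + g.name); // Placeholder for the real rebalance.
            }
        }
        catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("Clearing failed", e);
        }
        finally {
            rebalancePool.shutdown();
        }

        return log;
    }

    public static void main(String[] args) {
        List<Group> groups = List.of(
            new Group("A", 2, List.of(0)),
            new Group("B", 1, List.of(0, 1)));

        // Group B has the lower rebalanceOrder, so it is cleared and rebalanced first.
        run(groups, 2).forEach(System.out::println);
    }
}
```

      In this sketch the clearing parallelism within a group is bounded by the pool size, while the order between groups follows rebalanceOrder because the next group is not touched until the previous one has been cleared and rebalanced.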


              ascherbakov Alexey Scherbakov



                Time Tracking

                  Original Estimate - Not Specified
                  Remaining Estimate - 0h
                  Time Spent - 1h 10m