  1. Ignite
  2. IGNITE-13358

Improvements for partition clearing related parts

Details

    • Improvement
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • None
    • 2.10
    • None
    • Docs Required, Release Notes Required

    Description

      Several issues related to partition clearing are worth fixing.

      1. PartitionsEvictManager doesn't provide clear correctness guarantees when a node or a cache group is stopped while partitions are being cleared concurrently.

      2. GridDhtLocalPartition#awaitDestroy is called while holding the topology write lock, which is deadlock-prone because destroying a partition currently requires the write lock.

      3. GridDhtLocalPartition contains a lot of messy code related to partition clearing, most notably ClearFuture, even though the clearing itself is done by PartitionsEvictManager. We should remove the clearing code from GridDhtLocalPartition. This should also improve code readability and make it easier to understand what happens during clearing.

      4. Currently, moving partitions are cleared before rebalancing in an order that differs from rebalanceOrder, which breaks the contract. It is better to submit such partitions for clearing to the rebalancing pool before each group starts to rebalance. This will allow faster rebalancing (according to the configured rebalance pool size) and will provide rebalanceOrder guarantees.

      5. The clearing logic for moving partitions (before rebalancing) seems incorrect: updates received during clearing can be lost.

      6. To clear partitions before full rebalancing, we use the same threads as for partition eviction. This can slow rebalancing even when resources are available. It is better to clear partitions in the rebalance pool (explicitly dedicated by the user).

      7. It's possible to reserve a renting partition, which makes no sense. All operations on a renting partition (except clearing) are a waste of resources.

      8. Partition eviction causes starvation of system pool tasks if the system pool has only one thread. This can break crucial functionality.
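
The hazard in item 2 can be reproduced outside Ignite with a simplified model. The sketch below is not Ignite code: `AwaitUnderLockSketch`, `TOP_LOCK`, and `destroyAsync` are hypothetical stand-ins for the topology lock and the partition destroy path. A timeout is used only so the demonstration terminates; without it, the first pattern would block forever.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Simplified model, not Ignite code: all names here are hypothetical. */
class AwaitUnderLockSketch {
    /** Stands in for the topology lock. */
    static final ReentrantReadWriteLock TOP_LOCK = new ReentrantReadWriteLock();

    /** Starts a "destroy" on another thread; it can only finish after taking the write lock. */
    static CompletableFuture<Void> destroyAsync() {
        return CompletableFuture.runAsync(() -> {
            TOP_LOCK.writeLock().lock();
            try {
                // Partition data structures would be destroyed here.
            } finally {
                TOP_LOCK.writeLock().unlock();
            }
        });
    }

    /** Problematic pattern: wait for destroy while still holding the write lock. */
    static boolean awaitWhileHoldingLock(long timeoutMs) throws Exception {
        TOP_LOCK.writeLock().lock();
        try {
            return await(destroyAsync(), timeoutMs);
        } finally {
            TOP_LOCK.writeLock().unlock();
        }
    }

    /** Safer pattern: start destroy under the lock, release the lock, then wait. */
    static boolean awaitAfterReleasingLock(long timeoutMs) throws Exception {
        CompletableFuture<Void> destroyed;
        TOP_LOCK.writeLock().lock();
        try {
            destroyed = destroyAsync();
        } finally {
            TOP_LOCK.writeLock().unlock();
        }
        return await(destroyed, timeoutMs);
    }

    /** Returns true if the future completed in time, false on timeout. */
    static boolean await(CompletableFuture<Void> fut, long timeoutMs) throws Exception {
        try {
            fut.get(timeoutMs, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(awaitWhileHoldingLock(200));    // destroy cannot finish in time
        System.out.println(awaitAfterReleasingLock(5000)); // destroy finishes once the lock is free
    }
}
```

In the first method the waiter holds the lock that the destroyer needs, so the wait can only end by timeout. Moving the wait outside the locked section removes the cycle.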
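
The starvation described in item 8 can be illustrated with a plain single-threaded executor. This is a minimal sketch under stated assumptions, not the Ignite implementation and not necessarily the approach taken by the fix: `BatchedClearingSketch`, `run`, and `clearBatch` are hypothetical names, and splitting the work into batches that re-queue themselves is just one common way to let other tasks run in between.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical sketch, not Ignite code. */
class BatchedClearingSketch {
    /** Clears totalRows rows in batches of batchSize on a one-thread pool; returns the execution order. */
    static List<String> run(int totalRows, int batchSize) throws InterruptedException {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        List<String> log = Collections.synchronizedList(new ArrayList<>());
        CountDownLatch queued = new CountDownLatch(1); // released once both tasks are queued
        CountDownLatch done = new CountDownLatch(2);   // clearing finished + crucial task finished

        pool.execute(() -> {
            try {
                queued.await(); // demo-only gate that makes the interleaving deterministic
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            clearBatch(pool, log, done, 0, totalRows, batchSize);
        });
        // A task that must not be delayed for long, queued behind the clearing task.
        pool.execute(() -> {
            log.add("crucial");
            done.countDown();
        });
        queued.countDown();

        done.await();
        pool.shutdown();
        return new ArrayList<>(log);
    }

    /** Clears one batch, then re-queues the remainder at the tail of the pool's queue. */
    static void clearBatch(ExecutorService pool, List<String> log, CountDownLatch done,
                           int from, int totalRows, int batchSize) {
        int to = Math.min(from + batchSize, totalRows);
        log.add("clear[" + from + "," + to + ")");
        if (to < totalRows)
            pool.execute(() -> clearBatch(pool, log, done, to, totalRows, batchSize));
        else
            done.countDown();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(300, 300)); // one monolithic clearing task
        System.out.println(run(300, 100)); // clearing split into three batches
    }
}
```

With a single monolithic task, the crucial task runs only after all rows are cleared. With batches, it runs right after the first batch, because each remaining batch is queued behind it.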

            People

              Assignee: Alexey Scherbakov (ascherbakov)
              Reporter: Alexey Scherbakov (ascherbakov)
              Votes: 0
              Watchers: 3

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 1h 10m