  1. Hadoop Map/Reduce
  2. MAPREDUCE-2572

Throttle the deletion of data from the distributed cache

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 0.23.0
    • Fix Version/s: 0.23.0
    • Component/s: distributed-cache
    • Labels:
      None

      Description

      Entries in the distributed cache are deleted in a background thread. Currently, once the size limit of the distributed cache is reached, all unused entries are deleted. MAPREDUCE-2494 changes this so that entries are deleted in LRU order until the usage falls below a given threshold. In either case we periodically flood a disk with delete requests, which can slow down all IO operations on that drive. It would be better to throttle this deletion so that it is spread out over a longer period of time. This jira is to add that throttling.
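
      As an illustration of the throttling idea only, here is a minimal sketch in Java. It is not taken from either attached patch. The class name `ThrottledDeleter` and the knobs `batchSize` and `pauseMillis` are hypothetical, and the sketch assumes each cache entry is a single plain file (a real entry may be a directory tree that needs a recursive delete).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

/**
 * Hypothetical sketch: deletes files in small bursts and sleeps between
 * bursts, so the disk is not flooded with delete requests all at once.
 */
final class ThrottledDeleter {
    private final int batchSize;     // assumed knob: max deletes per burst
    private final long pauseMillis;  // assumed knob: sleep between bursts

    ThrottledDeleter(int batchSize, long pauseMillis) {
        if (batchSize <= 0 || pauseMillis < 0) {
            throw new IllegalArgumentException("batchSize must be > 0 and pauseMillis >= 0");
        }
        this.batchSize = batchSize;
        this.pauseMillis = pauseMillis;
    }

    /**
     * Deletes every path in order (plain files only in this sketch) and
     * returns how many pauses were taken between bursts.
     */
    int deleteAll(List<Path> paths) throws IOException, InterruptedException {
        int pauses = 0;
        for (int i = 0; i < paths.size(); i++) {
            Files.deleteIfExists(paths.get(i));
            boolean endOfBatch = (i + 1) % batchSize == 0;
            boolean moreToDo = i + 1 < paths.size();
            if (endOfBatch && moreToDo) {
                // Spread the remaining deletes out over time.
                Thread.sleep(pauseMillis);
                pauses++;
            }
        }
        return pauses;
    }
}
```

      Called from the existing background deletion thread, something like `new ThrottledDeleter(10, 500L).deleteAll(paths)` would issue at most ten deletes per half second. Those numbers are placeholders, not recommended values.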

      On investigation, it seems much simpler to backport MAPREDUCE-2494 to 20S before implementing this change, rather than trying to implement it without LRU deletion, because LRU goes a long way towards reducing the load on the disk anyway.
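
      To make clear why LRU deletion already reduces disk load, the behaviour described above (delete in LRU order until usage falls to a threshold) can be sketched roughly as follows. This is an illustrative sketch, not the MAPREDUCE-2494 code: the class `LruCacheIndex` and its methods are hypothetical, it tracks sizes in memory only, and it ignores whether an entry is still in use.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory index of cache entries, ordered by last access. */
final class LruCacheIndex {
    // accessOrder = true: iteration runs from least to most recently accessed.
    private final LinkedHashMap<String, Long> sizes = new LinkedHashMap<>(16, 0.75f, true);
    private long totalBytes = 0;

    /** Adds or replaces an entry and keeps the running total in sync. */
    void put(String name, long bytes) {
        Long old = sizes.put(name, bytes);
        totalBytes += bytes - (old == null ? 0 : old);
    }

    /** Marks an entry as recently used (get() reorders in access-order mode). */
    void touch(String name) {
        sizes.get(name);
    }

    long totalBytes() {
        return totalBytes;
    }

    /**
     * Removes least recently used entries until the total is at or below
     * targetBytes. Returns the evicted names in eviction order; the caller
     * would then delete the corresponding data from disk.
     */
    List<String> evictDownTo(long targetBytes) {
        List<String> evicted = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = sizes.entrySet().iterator();
        while (totalBytes > targetBytes && it.hasNext()) {
            Map.Entry<String, Long> e = it.next();
            totalBytes -= e.getValue();
            evicted.add(e.getKey());
            it.remove();
        }
        return evicted;
    }
}
```

      Because eviction stops as soon as usage reaches the target, each pass removes only part of the cache rather than every unused entry, so fewer deletes hit the disk at once.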

        Attachments

        1. THROTTLING-security-v1.patch
          20 kB
          Robert Joseph Evans
        2. MR-2572-trunk-v1.patch
          1.0 kB
          Robert Joseph Evans

            People

            • Assignee: Robert Joseph Evans (revans2)
            • Reporter: Robert Joseph Evans (revans2)
            • Votes: 0
            • Watchers: 7
