  1. Hadoop Map/Reduce
  2. MAPREDUCE-2494

Make the distributed cache delete entries using LRU priority

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.20.205.0, 0.21.0
    • Fix Version/s: 0.20.205.0, 0.23.0
    • Component/s: distributed-cache
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Added config option mapreduce.tasktracker.cache.local.keep.pct to the TaskTracker. It is the target percentage of the local distributed cache that should be kept in between garbage collection runs. In practice it will delete unused distributed cache entries in LRU order until the size of the cache is less than mapreduce.tasktracker.cache.local.keep.pct of the maximum cache size. This is a floating point value between 0.0 and 1.0. The default is 0.95.
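The trimming rule in the release note can be sketched as follows. This is a minimal illustration, not the code from the attached patches: the `Entry` class, its fields, and `selectForDeletion` are hypothetical names, and the assumption that a cleanup pass starts only once the cache exceeds its maximum size is taken from the issue description rather than from the patch itself. The `keepPct` parameter stands in for mapreduce.tasktracker.cache.local.keep.pct.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class LruCacheTrimSketch {

    /** Hypothetical stand-in for one localized distributed cache entry. */
    static final class Entry {
        final String name;
        final long sizeBytes;
        final long lastUsedMillis;
        final int refCount; // greater than 0 means a task is still using the entry

        Entry(String name, long sizeBytes, long lastUsedMillis, int refCount) {
            this.name = name;
            this.sizeBytes = sizeBytes;
            this.lastUsedMillis = lastUsedMillis;
            this.refCount = refCount;
        }
    }

    /**
     * Returns the names of the entries one cleanup pass would delete, in deletion order.
     * Assumption: the pass does nothing unless the cache is over maxCacheBytes.
     */
    static List<String> selectForDeletion(List<Entry> entries, long maxCacheBytes, double keepPct) {
        long total = 0;
        for (Entry e : entries) {
            total += e.sizeBytes;
        }
        List<String> toDelete = new ArrayList<>();
        if (total <= maxCacheBytes) {
            return toDelete; // cache is within its limit, nothing to do
        }

        // Target size: keepPct of the maximum cache size.
        long target = (long) (maxCacheBytes * keepPct);

        // Only entries that no task is using may be deleted.
        List<Entry> unused = new ArrayList<>();
        for (Entry e : entries) {
            if (e.refCount == 0) {
                unused.add(e);
            }
        }
        // Least recently used first.
        unused.sort(Comparator.comparingLong(e -> e.lastUsedMillis));

        for (Entry e : unused) {
            if (total < target) {
                break; // cache is now below the target size
            }
            toDelete.add(e.name);
            total -= e.sizeBytes;
        }
        return toDelete;
    }

    public static void main(String[] args) {
        List<Entry> entries = Arrays.asList(
                new Entry("a", 40, 100, 0),
                new Entry("b", 30, 200, 1), // in use, never deleted
                new Entry("c", 30, 300, 0),
                new Entry("d", 20, 50, 0));
        // 120 bytes cached, limit 100, target 75: delete d, then a, and stop at 60 bytes.
        System.out.println(selectForDeletion(entries, 100, 0.75)); // prints [d, a]
    }
}
```

In this example the most recently used idle entry, `c`, survives the pass, which is the point of keeping part of the cache between runs.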

      Description

      Currently the distributed cache waits until a cache directory exceeds a preconfigured threshold, at which point it deletes all entries that are not currently in use. We would likely get far fewer cache misses if we kept some of those entries around, even when they are not being used. We should add a configurable percentage as the goal for how much of the cache should remain clear when not in use, and select objects to delete based on how recently they were used, and possibly also on how large they are or how difficult it is to download them again.
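For illustration, the existing behavior described above (purge every unused entry once the threshold is crossed) might look like the sketch below. The class, the `Entry` type, and `purgeAllUnused` are hypothetical names chosen for this example and do not come from the Hadoop source.

```java
import java.util.ArrayList;
import java.util.List;

final class PurgeAllUnusedSketch {

    /** Hypothetical stand-in for one localized distributed cache entry. */
    static final class Entry {
        final String name;
        final long sizeBytes;
        final int refCount; // greater than 0 means a task is still using the entry

        Entry(String name, long sizeBytes, int refCount) {
            this.name = name;
            this.sizeBytes = sizeBytes;
            this.refCount = refCount;
        }
    }

    /** Returns the names of all entries deleted once the cache exceeds thresholdBytes. */
    static List<String> purgeAllUnused(List<Entry> entries, long thresholdBytes) {
        long total = 0;
        for (Entry e : entries) {
            total += e.sizeBytes;
        }
        List<String> deleted = new ArrayList<>();
        if (total <= thresholdBytes) {
            return deleted; // below the threshold, nothing is touched
        }
        // Above the threshold: every entry not in use is removed,
        // regardless of how recently it was used.
        for (Entry e : entries) {
            if (e.refCount == 0) {
                deleted.add(e.name);
            }
        }
        return deleted;
    }
}
```

Because recency is ignored, an entry used moments ago is deleted along with stale ones, and the next job that needs it must download it again.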

        Attachments

        1. MAPREDUCE-2494-V2.patch
          16 kB
          Robert Joseph Evans
        2. MAPREDUCE-2494-V1.patch
          15 kB
          Robert Joseph Evans
        3. MAPREDUCE-2494-20.20X-V3.patch
          17 kB
          Robert Joseph Evans
        4. MAPREDUCE-2494-20.20X-V1.patch
          17 kB
          Robert Joseph Evans

              People

              • Assignee:
                revans2 Robert Joseph Evans
                Reporter:
                revans2 Robert Joseph Evans
              • Votes:
                0
                Watchers:
                15