Hadoop Map/Reduce
MAPREDUCE-2494

Make the distributed cache delete entries using LRU priority

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.20.205.0, 0.21.0
    • Fix Version/s: 0.20.205.0, 0.23.0
    • Component/s: distributed-cache
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Added config option mapreduce.tasktracker.cache.local.keep.pct to the TaskTracker. It is the target percentage of the local distributed cache that should be kept in between garbage collection runs. In practice it will delete unused distributed cache entries in LRU order until the size of the cache is less than mapreduce.tasktracker.cache.local.keep.pct of the maximum cache size. This is a floating point value between 0.0 and 1.0. The default is 0.95.
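
      As a rough illustration of how the option described in the release note could be interpreted, the sketch below reads the value from a plain java.util.Properties object and derives the size a cleanup pass would aim for. Only the property name, the 0.0 to 1.0 range, and the 0.95 default come from the release note. The class name, the method names, the range check, and the use of Properties in place of Hadoop's own configuration class are assumptions made for this example, not the patch's code.

```java
import java.util.Properties;

public class KeepPctExample {
    // Property name and default value as stated in the release note.
    static final String KEEP_PCT_KEY = "mapreduce.tasktracker.cache.local.keep.pct";
    static final double DEFAULT_KEEP_PCT = 0.95;

    /**
     * Reads the keep fraction, falling back to the default when unset.
     * Rejecting out-of-range values is an assumption of this sketch.
     */
    static double readKeepPct(Properties props) {
        String raw = props.getProperty(KEEP_PCT_KEY);
        double pct = (raw == null) ? DEFAULT_KEEP_PCT : Double.parseDouble(raw.trim());
        if (pct < 0.0 || pct > 1.0) {
            throw new IllegalArgumentException(
                KEEP_PCT_KEY + " must be between 0.0 and 1.0, got " + pct);
        }
        return pct;
    }

    /** Size in bytes that a cleanup pass tries to bring the cache below. */
    static long targetSizeBytes(long maxCacheSizeBytes, double keepPct) {
        return Math.round(maxCacheSizeBytes * keepPct);
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(KEEP_PCT_KEY, "0.5");
        long maxBytes = 1000L; // hypothetical maximum cache size
        System.out.println(targetSizeBytes(maxBytes, readKeepPct(props)));
    }
}
```

      With a hypothetical maximum cache size of 1000 bytes, the default of 0.95 gives a target of 950 bytes, so a cleanup pass would stop deleting once the cache drops below that size.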

      Description

      Currently the distributed cache waits until a cache directory grows above a preconfigured threshold, at which point it deletes all entries that are not currently in use. We would likely get far fewer cache misses if we kept some of them around, even when they are not being used. We should add a configurable percentage as a goal for how much of the cache should remain clear when not in use, and select objects to delete based on how recently they were used, and possibly also on how large they are or how difficult it is to download them again.
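
      The selection policy proposed above can be sketched as follows: among entries that no task is using, pick the least recently used first and stop once the cache is below the target fraction of its maximum size. This is a minimal sketch, not the code from the attached patches. The Entry class, its fields, and selectForDeletion are hypothetical names introduced for illustration, and the sketch ignores entry size and download cost when ordering.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class LruCacheCleanupSketch {
    /** Hypothetical stand-in for one localized distributed cache entry. */
    static final class Entry {
        final String name;
        final long sizeBytes;
        final long lastUsedMillis;
        final int refCount; // greater than 0 means a task is still using the entry

        Entry(String name, long sizeBytes, long lastUsedMillis, int refCount) {
            this.name = name;
            this.sizeBytes = sizeBytes;
            this.lastUsedMillis = lastUsedMillis;
            this.refCount = refCount;
        }
    }

    /**
     * Returns the unused entries to delete, least recently used first,
     * stopping once the remaining size is below keepPct * maxSizeBytes.
     * Entries that are still in use are never selected.
     */
    static List<Entry> selectForDeletion(List<Entry> entries, long maxSizeBytes, double keepPct) {
        long total = 0;
        List<Entry> unused = new ArrayList<>();
        for (Entry e : entries) {
            total += e.sizeBytes;
            if (e.refCount == 0) {
                unused.add(e);
            }
        }
        // Oldest last-use time first, i.e. LRU order.
        unused.sort(Comparator.comparingLong((Entry e) -> e.lastUsedMillis));

        long target = Math.round(maxSizeBytes * keepPct);
        List<Entry> toDelete = new ArrayList<>();
        for (Entry e : unused) {
            if (total < target) {
                break; // cache is already below the target size
            }
            toDelete.add(e);
            total -= e.sizeBytes;
        }
        return toDelete;
    }
}
```

      Compared with deleting every unused entry once the threshold is crossed, this keeps the most recently used entries on disk, which is where the expected reduction in cache misses comes from.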

      1. MAPREDUCE-2494-20.20X-V3.patch
        17 kB
        Robert Joseph Evans
      2. MAPREDUCE-2494-20.20X-V1.patch
        17 kB
        Robert Joseph Evans
      3. MAPREDUCE-2494-V2.patch
        16 kB
        Robert Joseph Evans
      4. MAPREDUCE-2494-V1.patch
        15 kB
        Robert Joseph Evans

        Issue Links

          Activity

          Robert Joseph Evans created issue -
          Robert Joseph Evans made changes -
          Field Original Value New Value
          Attachment MAPREDUCE-2494-V1.patch [ 12479370 ]
          Robert Joseph Evans made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Release Note Added config option mapreduce.tasktracker.cache.local.keep.pct to the TaskTracker. It is the minimum percentage of the local distributed cache that should be kept in between garbage collection runs. This is a floating point value between 0.0 and 1.0. The default is 0.75.
          Robert Joseph Evans made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Robert Joseph Evans made changes -
          Attachment MAPREDUCE-2494-V2.patch [ 12480159 ]
          Robert Joseph Evans made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Chris Douglas made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Hadoop Flags [Reviewed]
          Fix Version/s 0.23.0 [ 12315570 ]
          Resolution Fixed [ 1 ]
          Todd Lipcon made changes -
          Link This issue breaks MAPREDUCE-2573 [ MAPREDUCE-2573 ]
          Robert Joseph Evans made changes -
          Resolution Fixed [ 1 ]
          Status Resolved [ 5 ] Reopened [ 4 ]
          Robert Joseph Evans made changes -
          Attachment MAPREDUCE-2494-20.20X-V1.patch [ 12481842 ]
          Robert Joseph Evans made changes -
          Status Reopened [ 4 ] Patch Available [ 10002 ]
          Robert Joseph Evans made changes -
          Affects Version/s 0.20.205.0 [ 12316391 ]
          Robert Joseph Evans made changes -
          Fix Version/s 0.20.205.0 [ 12316391 ]
          Robert Joseph Evans made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Robert Joseph Evans made changes -
          Attachment MAPREDUCE-2494-20.20X-V3.patch [ 12487708 ]
          Robert Joseph Evans made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Release Note (original value): Added config option mapreduce.tasktracker.cache.local.keep.pct to the TaskTracker. It is the minimum percentage of the local distributed cache that should be kept in between garbage collection runs. This is a floating point value between 0.0 and 1.0. The default is 0.75.
          Release Note (new value): Added config option mapreduce.tasktracker.cache.local.keep.pct to the TaskTracker. It is the minimum percentage of the local distributed cache that should be kept in between garbage collection runs. This is a floating point value between 0.0 and 1.0. The default is 0.95.
          Robert Joseph Evans made changes -
          Release Note (original value): Added config option mapreduce.tasktracker.cache.local.keep.pct to the TaskTracker. It is the minimum percentage of the local distributed cache that should be kept in between garbage collection runs. This is a floating point value between 0.0 and 1.0. The default is 0.95.
          Release Note (new value): Added config option mapreduce.tasktracker.cache.local.keep.pct to the TaskTracker. It is the target percentage of the local distributed cache that should be kept in between garbage collection runs. In practice it will delete unused distributed cache entries in LRU order until the size of the cache is less than mapreduce.tasktracker.cache.local.keep.pct of the maximum cache size. This is a floating point value between 0.0 and 1.0. The default is 0.95.
          Mahadev konar made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Resolution Fixed [ 1 ]
          Matt Foley made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Robert Joseph Evans made changes -
          Link This issue breaks HDFS-3843 [ HDFS-3843 ]

            People

            • Assignee:
              Robert Joseph Evans
              Reporter:
              Robert Joseph Evans
            • Votes:
              0
              Watchers:
              15
