Hadoop Map/Reduce / MAPREDUCE-1909

TrackerDistributedCacheManager takes a blocking lock for a loop that executes 10K times


Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

    Description

      In TrackerDistributedCacheManager.java, in the portion where the cache is cleaned up, the lock is taken on the main hash table and then every entry is scanned to see whether it can be deleted. That is a long time to hold the lock: the table is likely to have 10K entries.
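      For context, the cleanup pass as described does something like the following. This is a simplified sketch of the pattern, not the actual Hadoop source; CacheStatus is reduced to just a refcount, and the class and method names are illustrative.

        import java.util.HashMap;
        import java.util.Iterator;
        import java.util.Map;

        class CacheStatus {
          int refcount; // number of tasks currently using this cache entry
        }

        class CurrentCleanupSketch {
          private final Map<String, CacheStatus> cachedArchives = new HashMap<>();

          // The whole scan, and the per-entry deletions, happen under one lock.
          // With ~10K entries, this blocks every other cache operation meanwhile.
          void cleanCache() {
            synchronized (cachedArchives) {
              Iterator<Map.Entry<String, CacheStatus>> it =
                  cachedArchives.entrySet().iterator();
              while (it.hasNext()) {
                CacheStatus status = it.next().getValue();
                if (status.refcount == 0) {
                  // delete the localized files for this entry, still holding the lock
                  it.remove();
                }
              }
            }
          }
        }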

      I would like to reduce the longest lock duration by maintaining the set of CacheStatus objects to delete incrementally (see the sketch after the steps below):

      1: Let there be a new HashSet, deleteSet, that's protected under synchronized(cachedArchives)

      2: When refcount is decreased to 0, move the CacheStatus from cachedArchives to deleteSet

      3: When looking up an existing CacheStatus, check deleteSet if it isn't in cachedArchives

      4: When refcount is increased from 0 to 1 in a pre-existing CacheStatus [see step 3 above], move the CacheStatus from deleteSet to cachedArchives

      5: When we clean the cache, under synchronized(cachedArchives), move deleteSet into a local variable and create a new empty HashSet in its place. This is constant time.
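      Here is a minimal sketch of the five steps, under the same simplifying assumptions as above. One deliberate variation: deleteSet is a HashMap keyed the same way as cachedArchives rather than a raw HashSet, so the lookup in step 3 stays O(1); the method names are illustrative, not from the actual source.

        import java.util.HashMap;
        import java.util.Map;

        class CacheStatus {
          int refcount; // number of tasks currently using this cache entry
        }

        class IncrementalCleanupSketch {
          // Live entries; every access is guarded by synchronized(cachedArchives).
          private final Map<String, CacheStatus> cachedArchives = new HashMap<>();
          // Step 1: zero-refcount entries awaiting deletion, same lock.
          private Map<String, CacheStatus> deleteSet = new HashMap<>();

          // Steps 3 and 4: look up an entry and take a reference on it,
          // resurrecting it from deleteSet if necessary.
          // (Creation of brand-new entries is omitted from this sketch.)
          CacheStatus getCacheStatus(String key) {
            synchronized (cachedArchives) {
              CacheStatus status = cachedArchives.get(key);
              if (status == null) {
                status = deleteSet.remove(key);       // step 3
                if (status != null) {
                  cachedArchives.put(key, status);    // step 4: refcount goes 0 -> 1
                }
              }
              if (status != null) {
                status.refcount++;
              }
              return status;
            }
          }

          // Step 2: when the last reference is released, move the entry
          // from cachedArchives to deleteSet.
          void releaseCache(String key) {
            synchronized (cachedArchives) {
              CacheStatus status = cachedArchives.get(key);
              if (status != null && --status.refcount == 0) {
                cachedArchives.remove(key);
                deleteSet.put(key, status);
              }
            }
          }

          // Step 5: swap deleteSet out under the lock (constant time), then
          // do the actual file deletions with no lock held.
          void cleanCache() {
            Map<String, CacheStatus> toDelete;
            synchronized (cachedArchives) {
              toDelete = deleteSet;
              deleteSet = new HashMap<>();
            }
            for (CacheStatus status : toDelete.values()) {
              // delete the localized files for this entry, outside the lock
            }
          }
        }

      With this scheme the time spent under the lock in the cleanup pass no longer depends on how many entries are deletable; the expensive disk I/O happens with no lock held.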


People

    Assignee: Dick King (dking)
    Reporter: Dick King (dking)
    Votes: 0
    Watchers: 1

Dates

    Created:
    Updated: