  1. Solr
  2. SOLR-665

FIFO Cache (Unsynchronized): 9x performance boost

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 1.3
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Environment:

      JRockit R27 (Java 6)

      Description

      Attached is a modified version of LRUCache where:
      1. map = new LinkedHashMap(initialSize, 0.75f, false) - so that access-order/true (the performance bottleneck of LRU) is replaced with insertion-order/false (so that it becomes FIFO)
      2. Almost all (absolutely unnecessary) synchronized statements are commented out

      See discussion at http://www.nabble.com/LRUCache---synchronized%21--td16439831.html
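For reference, the third LinkedHashMap constructor argument the patch flips is accessOrder; a minimal sketch of the difference (class and method names hypothetical):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class OrderingDemo {
    // Returns the key order observed after touching "a" with get().
    static String keyOrder(boolean accessOrder) {
        Map<String, Integer> map = new LinkedHashMap<>(16, 0.75f, accessOrder);
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        map.get("a"); // with accessOrder=true this relinks "a" to the tail
        return String.join(",", map.keySet());
    }

    public static void main(String[] args) {
        // insertion order (FIFO): get() leaves the link order untouched
        System.out.println(keyOrder(false)); // a,b,c
        // access order (LRU): get() relinks the entry, i.e. a structural write
        System.out.println(keyOrder(true));  // b,c,a
    }
}
```

The access-order relinking on every get() is exactly why the LRU variant must hold a lock in the read path.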

      Performance metrics (taken from SOLR Admin):

      LRU
      Requests: 7638
      Average Time-Per-Request: 15300
      Average Request-per-Second: 0.06

      FIFO:
      Requests: 3355
      Average Time-Per-Request: 1610
      Average Request-per-Second: 0.11

      Performance increased 9 times, which roughly corresponds to the number of CPUs in the system. http://www.tokenizer.org/ (Shopping Search Engine at Tokenizer.org)

      Current number of documents: 7494689
      name: filterCache
      class: org.apache.solr.search.LRUCache
      version: 1.0
      description: LRU Cache(maxSize=10000000, initialSize=1000)
      stats: lookups : 15966954582
      hits : 16391851546
      hitratio : 0.102
      inserts : 4246120
      evictions : 0
      size : 2668705
      cumulative_lookups : 16415839763
      cumulative_hits : 16411608101
      cumulative_hitratio : 0.99
      cumulative_inserts : 4246246
      cumulative_evictions : 0

      Thanks

      1. SimplestConcurrentLRUCache.java
        2 kB
        Fuad Efendi
      2. FIFOCache.java
        8 kB
        Fuad Efendi
      3. ConcurrentLRUWeakCache.java
        2 kB
        Fuad Efendi
      4. ConcurrentLRUCache.java
        3 kB
        Noble Paul
      5. ConcurrentFIFOCache.java
        1 kB
        Noble Paul
      6. ConcurrentFIFOCache.java
        2 kB
        Noble Paul

        Activity

        funtick Fuad Efendi added a comment -

        I renamed to FIFOCache just before opening an issue; in a local system it is (modified) LRUCache so that filterCache has reference to 'old' name...

        funtick Fuad Efendi added a comment -

        Almost forgot: I am estimating performance gains based on a real in-production, multithreaded application: Tomcat 6.0.16 & JRockit R27 (Java 6) & SLES10 & two quad-core Opteron 2350 (8 CPUs total) & 25Gb for SOLR...
        Also, the first queries run long, more than a minute (warming up caches with the faceted query id:[* TO *]); average Time-Per-Request decreases over time and is now 1591.5232, giving a 10x performance boost.
        Facets are highly distributed, as you can see from the website and filterCache... HTTP caching is supported - to see real timing you should execute real queries...
        ConcurrentHashMap is not applicable - we are not modifying the cached item, indeed... FIFO is without the 'Out' if we have enough memory.

        yseeley@gmail.com Yonik Seeley added a comment -

        This implementation isn't thread safe (due to the removed synchronization).

        noble.paul Noble Paul added a comment -

        Why do we have almost the entire get() method synchronized on the map in LRUCache?
        The stats increments could possibly be moved out of the synchronized block.

        yseeley@gmail.com Yonik Seeley added a comment -

        Why do we have almost the entire get() method synchronized on the map in LRUCache .

        It simplified the code and reduced the number of branches. What's your proposed version?
        Here is the current method:

          public Object get(Object key) {
            synchronized (map) {
              Object val = map.get(key);
              if (state == State.LIVE) {
                // only increment lookups and hits if we are live.
                lookups++;
                stats.lookups.incrementAndGet();
                if (val != null) {
                  hits++;
                  stats.hits.incrementAndGet();
                }
              }
              return val;
            }
          }
        
        noble.paul Noble Paul added a comment - - edited

        I guess this is safe and faster

        public Object get(Object key) {
            Object val;  // declared outside so it is in scope after the sync block
            synchronized (map) {
              val = map.get(key);
              if (state == State.LIVE) {
                // only increment lookups and hits if we are live.
                lookups++;
              }
            }
            stats.lookups.incrementAndGet();
            if (val != null) {
              //hits++; this must be removed
              stats.hits.incrementAndGet();
            }
            return val;
          }
        

        let us remove the field hits and use stats.hits wherever we need it

        yseeley@gmail.com Yonik Seeley added a comment - - edited

        Your version changes what the method does in a couple of respects though.

        let us remove the field hits and use stats.hits wherever we need it

        stats.hits is for all caches of this type (it's shared across searchers). hits is local to this cache instance.

        funtick Fuad Efendi added a comment -

        Regarding Thread Safety:

        • yes, we need a synchronized get() method for the LRU cache, because each access reorders the LinkedHashMap.
        • absolutely no need to synchronize the get() method for FIFO!
        • probably we need to deal with insertion, but do not synchronize it: instead, extend LinkedHashMap and make only the 'removal' synchronized... With caches large enough to store all objects we do not need it. We probably do not need to synchronize 'removal' at all - it removes the entry but does not remove/finalize the referenced object.

        From the JavaDoc: "Note that this implementation is not synchronized. If multiple threads access a linked hash map concurrently, and at least
        one of the threads modifies the map structurally, it must be synchronized externally."
        However, we do not modify the cache structurally during an iteration loop or any other 'structure' access (we do not use an Iterator!) - so the advice from the JavaDoc is not applicable.

        We should synchronize only removeEntryForKey of HashMap; unfortunately we can't do that without rewriting HashMap. Perhaps we could use ConcurrentHashMap as a base class for LinkedHashMap, but I don't know the answer yet... I am guessing that unsynchronized entry removal won't be a significant issue in a multithreaded environment.
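The extend-LinkedHashMap idea above could be sketched as follows (class name hypothetical; note that, as later comments in this issue argue, leaving get() unsynchronized still violates the Map contract):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the proposal: insertion-ordered LinkedHashMap with a
// bounded size, synchronizing writes only. NOT thread-safe as written -- the
// inherited, unsynchronized get() is exactly the contested part.
public class FIFOMap<K, V> extends LinkedHashMap<K, V> {
    private final int maxSize;

    public FIFOMap(int initialSize, int maxSize) {
        super(initialSize, 0.75f, false); // false = insertion order (FIFO)
        this.maxSize = maxSize;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxSize; // evict the oldest insertion on overflow
    }

    @Override
    public synchronized V put(K key, V value) { // writes are synchronized
        return super.put(key, value);
    }

    public static void main(String[] args) {
        FIFOMap<String, Integer> m = new FIFOMap<>(4, 2);
        m.put("a", 1);
        m.put("b", 2);
        m.put("c", 3); // triggers eviction of "a", the eldest insertion
        System.out.println(m.keySet()); // [b, c]
    }
}
```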

        yseeley@gmail.com Yonik Seeley added a comment -

        absolutely no need to synchronize get() method for FIFO!

        A cache is not a read-only object. Gets need to be synchronized because other threads can be changing the cache.
        If anyone wants to learn more about thread safety and concurrency, I'd recommend "Java concurrency in practice"
        http://www.javaconcurrencyinpractice.com/

        funtick Fuad Efendi added a comment -
        • Classes from java.util.concurrent.atomic are designed NOT to be synchronized; the per-instance stats should be replaced with AtomicLong instead of long:
          {{ private long lookups;
          private long hits;
          private long inserts;
          private long evictions;}}
        • the get() method of FIFO does not need any synchronization
        • the get() method of LRU reorders the LinkedHashMap; unsynchronized access may cause orphan Entry objects
        • synchronized insertion for FIFO won't cause performance degradation because get() is unsynchronized.
        yseeley@gmail.com Yonik Seeley added a comment -

        per-instance stats should be replaced with AtomicLong instead of long.

        Since we need to synchronize anyway, it's more efficient to just stick things like hits++ inside the sync block instead of having a separate AtomicLong.
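Yonik's point can be illustrated with a sketch (class and counters are hypothetical): once a lock is held anyway, a plain long increment inside the synchronized block is cheaper than paying for an additional AtomicLong CAS.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

public class CounterStyles {
    private final Map<Object, Object> map = new HashMap<>();
    private long hits;                              // guarded by the map's lock
    private final AtomicLong atomicHits = new AtomicLong();

    // One lock acquisition covers both the lookup and the counter update.
    Object getWithPlainCounter(Object key) {
        synchronized (map) {
            Object val = map.get(key);
            if (val != null) hits++;
            return val;
        }
    }

    // The AtomicLong adds a second atomic operation on top of the lock
    // we already had to take for the map access.
    Object getWithAtomicCounter(Object key) {
        Object val;
        synchronized (map) {
            val = map.get(key);
        }
        if (val != null) atomicHits.incrementAndGet();
        return val;
    }

    public static void main(String[] args) {
        CounterStyles c = new CounterStyles();
        System.out.println(c.getWithPlainCounter("missing"));  // null
        System.out.println(c.getWithAtomicCounter("missing")); // null
    }
}
```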

        funtick Fuad Efendi added a comment -

        bq. absolutely no need to synchronize get() method for FIFO!

        A cache is not a read-only object. Gets need to be synchronized because other threads can be changing the cache.

        I am familiar with Doug Lea's findings (he wrote his book in 1996, and it became part of java.util.concurrent after 10(!!!) years).

        changing the cache

        • what is a 'cache change'?
          Changing the stored Key or changing the referenced object - never ever happens in SOLR. Removal of an object - yes. More correctly: removal of a key. get(MyKey) will return null OR will return MyValue, and a ConcurrentModificationException will never be thrown in SOLR (we are not using an Iterator!). We can insert the same (key, value) pairs concurrently - not a problem.
        yseeley@gmail.com Yonik Seeley added a comment -

        what is 'cache change'?

        Adding a new key/value pair to the cache, or removal of a key/value pair from the cache.

        get(MyKey) will return null OR will return MyValue, and a ConcurrentModificationException will never be thrown in SOLR

        This would only be under very specific usage patterns - everything pre-cached, and no adds or removes once the cache is "LIVE" (accessed by more than one thread concurrently).
        ConcurrentModificationException is best effort, not guaranteed. You can still get incorrect results or other exceptions from code that isn't thread safe, even when a ConcurrentModificationException isn't thrown.

        timmsc Sean Timm added a comment -

        Fuad--

        I agree with Yonik here. From the HashMap Javadoc:

        If multiple threads access this map concurrently, and at least one of the threads modifies the map structurally, it must be synchronized externally. (A structural modification is any operation that adds or deletes one or more mappings; merely changing the value associated with a key that an instance already contains is not a structural modification.)

        funtick Fuad Efendi added a comment -

        If multiple threads access this map concurrently, and at least one of the threads modifies the map structurally, it must be synchronized externally. (A structural modification is any operation that adds or deletes one or more mappings; merely changing the value associated with a key that an instance already contains is not a structural modification.)

        I already commented it. Just try to avoid 'books' and 'authors', also try to find meaning in JavaDocs and try to browse JavaSource instead...

        • structural modification, related ConcurrentModificationException, and related Iterator: not applicable to SOLR Cache.

        Will never happen. We need to synchronize inserts because each insert may calculate the size and remove the 'eldest' entry, and we need to avoid OOMs. We need to synchronize retrieves for LRU because the 'Linked' HashMap (with access order) will change links (will reorder the Map). And we do not need to synchronize retrieves from an insertion-ordered LinkedHashMap (FIFO). It is classic... I'd like to research java.util.concurrent more.

        yseeley@gmail.com Yonik Seeley added a comment -

        Closing, since it's pretty much impossible to avoid all forms of synchronization and memory barriers for get() on a shared Map object.

        funtick Fuad Efendi added a comment -

        Why? We have PoC at least, and we know the bottleneck!
        We need improvement: avoid some synchronization; it is extremely easy with FIFO. We may try to improve LRU too.
        Not everything in Java is extremely good: for instance, synchronization. Even for a single-threaded application, it adds roughly 600 extra CPU cycles (I've read that somewhere for Sun Java 1.3 on Windows).

        Yonik, please allow some time to think / to discuss.

        I'll try also to provide 'concurrent' LRU; but this issue is FIFO related.

        funtick Fuad Efendi added a comment -

        The get() method does not need to be synchronized for a FIFO cache. Unsynchronized object retrieval is not a structural modification of an insertion-ordered LinkedHashMap. An unsynchronized cache improves the performance of multithreaded applications linearly with the number of CPUs.

        noble.paul Noble Paul added a comment -

        OK. Let us try to study the failure cases.
        get() does no state modification.

        So if the get is unsynchronized, the worst-case scenario is that we may get an Exception (I don't know which - ConcurrentModificationException? ArrayIndexOutOfBoundsException?). How often does it happen? What is the probability?

        What is the big deal? (The system won't crash.) We can catch the Exception and return null as if the entry was not found. That means that Solr may have to recompute results where it did not have to (this is a cost).

        Here we totally eliminated the cost of synchronization (benefit). I guess if you do a cost-benefit analysis this does not turn out to be as bad as it is projected to be.

        And after all, the user is knowingly using this, fully aware of the cost. Is there anything else I have not considered?

        funtick Fuad Efendi added a comment -

        ...and at least one of the threads modifies the map structurally, it must be synchronized externally.

        • so the thread performing the insert must be synchronized, not the thread performing the retrieve. Again, we need to synchronize inserts only to avoid memory leaks, nothing more. SOLR does not iterate over the Map.
        funtick Fuad Efendi added a comment -

        ConcurrentModificationException is thrown only when we iterate over the Map and another thread modified its structure; with LRU each get() modifies the structure, with FIFO only inserts do...

        shalinmangar Shalin Shekhar Mangar added a comment -

        I think Fuad has a point. In an Insertion ordered LinkedHashMap, get makes no structural modification and if we synchronize put/remove, we should be fine. The cache warming is already thread-safe and we don't have iterators anywhere. Am I missing something here?

        markrmiller@gmail.com Mark Miller added a comment -

        Sent to the mailing List:

        >> Depends on if you want to go by the javadoc or not - it says that if any of the threads accessing the map concurrently make a structural change, the map must be synchronized. It's clear that get does not make a structural change when using insertion order, but can any other threads possibly make a mapping change while calling get? Sounds like yes, and so the contract would seem to indicate you synchronize...

        >> Put can modify shared variables that get accesses - sounds like dangerous ground to me. If it works, its got to be sneaky enough to warrant code smell...

        >> - Mark

        Further:

        Here is just one of possibly many problems - a put call can resize and make an entirely new table array. A get call can do something like the following with the table array:

        table[indexFor(hash, table.length)]

        Due to execution reordering / memory barriers, it would seem to me that the table being indexed into and the table.length may not be the values you were hoping for...

        klaasm Mike Klaas added a comment -

        I haven't looked at the proposed code at all, but it is possible to design this kind of datastructure, with much care:

        http://www.ddj.com/hpc-high-performance-computing/208801974

        shalinmangar Shalin Shekhar Mangar added a comment -

        Yes, I'm sorry for speaking out of turn here.

        noble.paul Noble Paul added a comment -

        OK, I take my words back. LinkedHashMap is backed by a HashMap. The table can get resized during the lifetime of the map (and hence during a get operation too). So get() may return a wrong value (which is unacceptable) and it may never throw an Exception.

        If we ever want to have an LRUCache with a non-synchronized get(), we must have it backed by a ConcurrentHashMap.
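A minimal sketch of what a ConcurrentHashMap-backed FIFO cache could look like (hypothetical; this is not the attached ConcurrentFIFOCache.java): reads go through ConcurrentHashMap, which provides the needed memory barriers without an explicit lock, while a ConcurrentLinkedQueue tracks insertion order for eviction.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch: a bounded FIFO cache whose get() needs no explicit
// lock because ConcurrentHashMap guarantees safe publication of entries.
public class ConcurrentFIFOSketch<K, V> {
    private final ConcurrentHashMap<K, V> map = new ConcurrentHashMap<>();
    private final Queue<K> order = new ConcurrentLinkedQueue<>(); // insertion order
    private final int maxSize;

    public ConcurrentFIFOSketch(int maxSize) {
        this.maxSize = maxSize;
    }

    public V get(K key) {
        return map.get(key); // lock-free read path
    }

    public void put(K key, V value) {
        if (map.putIfAbsent(key, value) == null) {
            order.add(key);
            while (map.size() > maxSize) {   // evict the oldest insertions
                K eldest = order.poll();
                if (eldest == null) break;
                map.remove(eldest);
            }
        }
    }

    public static void main(String[] args) {
        ConcurrentFIFOSketch<String, Integer> c = new ConcurrentFIFOSketch<>(2);
        c.put("a", 1);
        c.put("b", 2);
        c.put("c", 3); // "a", the oldest insertion, is evicted
        System.out.println(c.get("a")); // null
        System.out.println(c.get("c")); // 3
    }
}
```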

        funtick Fuad Efendi added a comment - - edited

        Mark, thanks for going deeper. Yes, a resize may change the (Entry[]) table, the key will disappear from its bucket, and get() returns null:

            /**
             * The table, resized as necessary. Length MUST Always be a power of two.
             */
            transient Entry[] table;
        ...
            public V get(Object key) {
                if (key == null)
                    return getForNullKey();
                int hash = hash(key.hashCode());
                for (Entry<K,V> e = table[indexFor(hash, table.length)];
                     e != null;
                     e = e.next) {
                    Object k;
                    if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
                        return e.value;
                }
                return null;
            }

        You might get null instead of MyValue (during a resize by another thread); not a big deal! It is still thread-safe. We will add a new entry (overwrite the existing one) in such extremely rare cases.

        • get() will never return a wrong value (the Entry object is immutable in our case):
          if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
              return e.value;
        shalinmangar Shalin Shekhar Mangar added a comment -

        Fuad, before going further on this, we must also consider users who use alternate JVM implementations. Since the contract is to synchronize, we cannot base our decision on Sun JDK's implementation.

        markrmiller@gmail.com Mark Miller added a comment -

        I don't have time to look at this right now, but I'll certainly spend some time on it tonight, Efendi.

        A quick note on my point though: I mentioned something that might be done in an implementation that could be an issue - you came back with a specific implementation that may (or may not - I don't have time to check at the moment) work anyway. We can't program to an implementation though - we must program to the contract.

        yseeley@gmail.com Yonik Seeley added a comment -

        resize() is only the most obvious problem... there are tons of ways this can fail, even if you get around the resize (and your "null" fix for resize() won't work). Some of the failures can be in the form of incorrect results rather than null pointers or exceptions (so you can't just retry).

        I'll reiterate:

        it's pretty much impossible to avoid all forms of synchronization and memory barriers for get() on a shared Map object.

        funtick Fuad Efendi added a comment -

        Shalin, we are already using AtomicLong from Java 5; the JVM is a different story... JRockit R27 is a JVM from BEA-Oracle, and it is JDK 6-powered (rt.jar comes from SUN).
        I just tried comparing synchronized with unsynchronized and found that synchronization is the main problem for faceting...
        Another problem: somehow faceting recalculates each time (using filterCache during the repeated recalculations); queryCache is not enough...

        funtick Fuad Efendi added a comment -

        We need to invite Doug Lea to this discussion...
        http://en.wikipedia.org/wiki/Doug_Lea
        http://gee.cs.oswego.edu/dl/index.html

        We may simply use java.util.concurrent.locks instead of heavy synchronized... we may also use Executor framework instead of single-thread faceting... We may even base SOLR on Apache MINA project.

        shalinmangar Shalin Shekhar Mangar added a comment -

        I just tried to compare synchronized with unsynchronized and found it is the main problem for faceting...

        Fuad, I am completely with you on this. Everybody will agree that a more efficient implementation will be a very useful addition to Solr. However, Yonik's concerns on the current patch are valid and we cannot go ahead with the current one.

        funtick Fuad Efendi added a comment - - edited

        The concerns are probably due to a misunderstanding of the contract...

            /**
             * The table, resized as necessary. Length MUST Always be a power of two.
             */
            transient Entry[] table;
        
            void resize(int newCapacity) {
                Entry[] oldTable = table;
                int oldCapacity = oldTable.length;
                if (oldCapacity == MAXIMUM_CAPACITY) {
                    threshold = Integer.MAX_VALUE;
                    return;
                }
        
                Entry[] newTable = new Entry[newCapacity];
                transfer(newTable);
                table = newTable;
                threshold = (int)(newCapacity * loadFactor);
            }
        
        
            public V get(Object key) {
                if (key == null)
                    return getForNullKey();
                int hash = hash(key.hashCode());
                for (Entry<K,V> e = table[indexFor(hash, table.length)];
                     e != null;
                     e = e.next) {
                    Object k;
                    if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
                        return e.value;
                }
                return null;
            }
        
        
        • in the worst case we will have a pointer to the old table, and even with a new one of smaller size we won't get any ArrayIndexOutOfBoundsException.
        • There is no contract requiring synchronization on get() of HashMap or LinkedHashMap; it IS application-specific.
        • we will never get wrong results because Entry is immutable
            /** 
             * Transfer all entries from current table to newTable.
             */
            void transfer(Entry[] newTable) {
                Entry[] src = table;
                int newCapacity = newTable.length;
                for (int j = 0; j < src.length; j++) {
                    Entry<K,V> e = src[j];
                    if (e != null) {
                        src[j] = null;
                        do {
                            Entry<K,V> next = e.next;
                            int i = indexFor(e.hash, newCapacity);  
                            e.next = newTable[i];
                            newTable[i] = e;
                            e = next;
                        } while (e != null);
                    }
                }
            }
        
        • We won't even get a NullPointerException after src[j] = null.

        P.S.
        Of course, I agree - these are Java internals, not the public Map interface contract - so should we avoid using the implementation then? I believe it is specified somewhere in a JSR too...

         * @author  Doug Lea
         * @author  Josh Bloch
         * @author  Arthur van Hoff
         * @author  Neal Gafter
         * @version 1.65, 03/03/05
        

        P.P.S.
        Do not forget to look at the top of this discussion:

        description: xxx Cache(maxSize=10000000, initialSize=1000) 
        size : 2668705
        cumulative_inserts : 4246246
        
        • cumulative_inserts is almost double of size, which shows that double-inserts are real
        • I checked catalina.out: no NullPointerException, ArrayIndexOutOfBoundsException, etc.
        • I don't think we should worry too much about a possible change of the Map implementation by SUN... in that case we should use neither java.lang.String nor java.util.Date (some are placed in wrong packages).
        • about thread safety: some participants wrongly assume that making object access totally synchronized will make their code thread-safe... deadlock.
        klaasm Mike Klaas added a comment -

        We may simply use java.util.concurrent.locks instead of heavy synchronized... we may also use Executor framework instead of single-thread faceting... We may even base SOLR on Apache MINA project.

        Simply replacing synchronized with java.util.concurrent.locks doesn't increase performance. There needs to be a specific strategy for employing these locks in a way that makes sense.

        For instance, one idea would be to create a read/write lock with the put()'s covered by write and get()'s covered by read. This would allow multiple parallel reads and will be thread-safe. Another is to create something like ConcurrentLinkedHashMap.

        These strategies should be tested before trying to create a lock-free get() version, which, if even possible, would rely deeply on the implementation (such a structure would have to be created from scratch, I believe). I'd expect anyone who is able to create such a thing to be familiar enough with memory barriers and such issues to be able to deeply explain the problems with double-checked locking off the top of their head (and immediately see such problems in other code).
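The read/write-lock idea described above can be sketched in a few lines. This is a hedged sketch, not an attached patch, and all class and method names are hypothetical. The key constraint: get() under the shared read lock is only safe for an insertion-ordered (FIFO) LinkedHashMap, because an access-ordered one mutates on every get().

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class RwLockFifoCache<K, V> {
    private final Map<K, V> map;
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    public RwLockFifoCache(final int maxEntries) {
        // false = insertion order (FIFO): get() does not mutate the map,
        // so it is safe to run under the shared read lock
        map = new LinkedHashMap<K, V>(16, 0.75f, false) {
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries; // standard JDK hook for bounding size
            }
        };
    }

    public V get(K key) {
        lock.readLock().lock();
        try { return map.get(key); } finally { lock.readLock().unlock(); }
    }

    public V put(K key, V value) {
        lock.writeLock().lock();
        try { return map.put(key, value); } finally { lock.writeLock().unlock(); }
    }
}
```

With this split, many reader threads proceed in parallel and only put() serializes, which matches the workload described in this issue (reads vastly outnumbering inserts).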

        funtick Fuad Efendi added a comment -

        Simply replacing synchronized with java.util.concurrent.locks doesn't increase performance. There needs to be a specific strategy for employing these locks in a way that makes sense.

        I absolutely agree...

        ConcurrentHashMap is based on some level of acceptable safety, for specific tasks only

        They do not throw ConcurrentModificationException. However, iterators are designed to be used by only one thread at a time.

        We can try to design specific caches directly implementing the Map or ConcurrentMap interfaces. We should define 'safety' levels (for instance, null is not a problem if the cache already contains this object added concurrently by another thread; cache elements are explicitly immutable objects; etc.)
        FIFO looks simplest and it does work indeed; for LRU we need reordering on each get(), OR we can make it weaker using weak (approximate) reordering somehow...

        funtick Fuad Efendi added a comment - - edited

        This is an extremely simple concurrent LRU; I spent an hour creating it, and it is based on ConcurrentHashMap. I don't use java.util.concurrent.locks, and I am trying to focus on the requirements only, avoiding implementing unnecessary methods of the Map interface (so I am not following the contract - very sorry!)

        import java.util.Map;
        import java.util.concurrent.ConcurrentHashMap;
        
        public class ConcurrentLRU<K, V> {
        
        	protected ConcurrentHashMap<K, ValueWrapper<V>> map;
        	protected int maxEntries;
        
        	public ConcurrentLRU(int maxEntries) {
        		map = new ConcurrentHashMap<K, ValueWrapper<V>>();
        		this.maxEntries = maxEntries;
        	}
        
        	public V put(K key, V value) {
        		ValueWrapper<V> wrapper = map.put(key, new ValueWrapper<V>(value));
        		checkRemove();
        		return value;
        	}
        
        	void checkRemove() {
        		if (map.size() <= maxEntries)
        			return;
        		Map.Entry<K, ValueWrapper<V>> eldestEntry = null;
        		for (Map.Entry<K, ValueWrapper<V>> entry : map.entrySet()) {
        			long popularity = entry.getValue().popularity;
        			if (eldestEntry == null || eldestEntry.getValue().popularity > popularity) {
        				eldestEntry = entry;
        			}
        		}
        		map.remove(eldestEntry.getKey(), eldestEntry.getValue());
        	}
        
        	public V get(Object key) {
        		ValueWrapper<V> wrapper = map.get(key);
        		return wrapper == null ? null : wrapper.getValue();
        	}
        
        	public final static class ValueWrapper<V> {
        		static volatile long popularityCounter;
        		volatile long popularity;
        		V value;
        
        		ValueWrapper(V value) {
        			this.value = value;
        			popularity = popularityCounter++;
        		}
        
        		public boolean equals(Object o) {
        			if (!(o instanceof ValueWrapper)) {
        				return false;
        			}
        			return (value == null ? ((ValueWrapper) o).value == null : value.equals(((ValueWrapper) o).value));
        		}
        
        		public int hashCode() {
        			return value.hashCode();
        		}
        
        		public V getValue() {
        			popularity = popularityCounter++;
        			return value;
        		}
        
        	}
        
        }
        

        P.S.
        Do we need to synchronize put() or checkRemove()? The only hypothetical problem is an OutOfMemoryError. But this is just a first draft, very simplified... we do not need to sort an array.
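One way to answer the P.S. question without fully synchronizing put(): keep the ConcurrentHashMap lock-free for get(), and guard only the O(n) eviction scan with a tryLock() so at most one thread pays for it at a time. This is a hedged sketch with hypothetical names; to keep it short, eviction removes an arbitrary entry rather than scanning for the least popular one, so it illustrates the locking question only.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

public class EvictOnceCache<K, V> {
    private final ConcurrentHashMap<K, V> map = new ConcurrentHashMap<K, V>();
    private final ReentrantLock evictLock = new ReentrantLock();
    private final int maxEntries;

    public EvictOnceCache(int maxEntries) { this.maxEntries = maxEntries; }

    public V get(K key) { return map.get(key); }

    public int size() { return map.size(); }

    public V put(K key, V value) {
        V old = map.put(key, value);
        // only one thread at a time pays for the eviction scan;
        // losers of the tryLock race skip it and let the map overshoot briefly
        if (map.size() > maxEntries && evictLock.tryLock()) {
            try {
                while (map.size() > maxEntries) {
                    K victim = map.keySet().iterator().next(); // arbitrary entry
                    map.remove(victim);
                }
            } finally {
                evictLock.unlock();
            }
        }
        return old;
    }
}
```

tryLock() bounds the overshoot without ever blocking a writer, which is a middle ground between the fully unsynchronized draft and a synchronized checkRemove().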

        yseeley@gmail.com Yonik Seeley added a comment -

        Fuad, you are on the right track now by using something that is thread-safe (ConcurrentHashMap). A couple of minor points:

        • I don't see the point of the static popularityCounter... that looks like a bug.
        • increments aren't atomic, but losing an increment once in a while should be fine in this scenario

        Anyway, if this works for you, use it!
        It's likely to be very specific to your use case though (with millions of cache entries, millions of lookups per request, and no evictions). The eviction code looks like it would be relatively expensive.
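The counter fix that the first point above suggests could look like this. It is a sketch with hypothetical names, not part of any attached patch: make the clock an AtomicLong owned by the cache instance instead of a static volatile long on the wrapper, so increments are atomic and independent caches do not share state.

```java
import java.util.concurrent.atomic.AtomicLong;

public final class ValueWrapper<V> {
    private final AtomicLong clock; // supplied by the owning cache, one per cache
    public volatile long popularity;
    private final V value;

    public ValueWrapper(V value, AtomicLong clock) {
        this.value = value;
        this.clock = clock;
        // getAndIncrement() is atomic, unlike `volatile long counter++`,
        // which is a separate read and write and can lose increments
        this.popularity = clock.getAndIncrement();
    }

    public V getValue() {
        popularity = clock.getAndIncrement(); // never loses a tick under contention
        return value;
    }
}
```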

        larsko Lars Kotthoff added a comment -

        Not everything in JAVA is extremely good: for instance, synchronization. Even for single-threaded application, it needs additionally 600 CPU cycles (I've read it somewhere for SUN Java 1.3 on Windows)

        That's probably not true for modern JVMs though – cf. http://www.ibm.com/developerworks/library/j-jtp04223.html.

        ...9x times performance boost...

        How did you measure that exactly? The Solr admin pages will not give you exact measurements. Could you describe the test setup in detail? I'm guessing that you're caching the results of all queries in memory such that no disk access is necessary. Are you using highlighting or anything else that might be CPU-intensive at all? From my personal experience with Solr I wouldn't expect synchronization for the caches to be that big of a performance penalty. In some of my tests with a several-GB index, where all results were cached and highlighting was turned on, I've seen throughputs in excess of 400 searches per second. I think that the performance bottleneck in this case was the network interface for sending the replies.

        absolutely no need to synchronize get() method for FIFO!

        Consider the following case: thread A performs a synchronized put, thread B performs an unsynchronized get on the same key. B gets scheduled before A completes, the returned value will be undefined. Yes, we could do sanity checks to minimise these cases, but that would probably end up being more expensive than the synchronization.

        From JavaDoc: "Note that this implementation is not synchronized. If multiple threads access a linked hash map concurrently, and at least one of the threads modifies the map structurally, it must be synchronized externally."

        That's exactly the case here – the update thread modifies the map structurally! It doesn't do this at all times, probably even never after the cache has been populated, but there's no way to know for sure unless you explicitly remove the put method.

        I'm not convinced that we should change the current implementation for the following reasons:

        • Concurrency is traditionally a discipline which is very hard to get right. Furthermore the serious bugs tend to show up only when you really get race conditions and the like, i.e. when the machine is under heavy load and any disruption will hit you seriously.
        • You've already started to amend your implementation with sanity checks and the like – as I've said before, this might end up being more expensive than synchronization.
        • A FIFO cache might become a bottleneck itself – if the cache is very large and the most frequently accessed item is inserted just after the cache is created, all accesses will need to traverse all the other entries before getting that item.

        That said, if you can show conclusively (e.g. with a profiler) that the synchronized access is indeed the bottleneck and incurs a heavy penalty on performance, then I'm all for investigating this further.
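For comparison, the externally synchronized variant that the quoted javadoc prescribes is short: a bounded insertion-ordered LinkedHashMap behind Collections.synchronizedMap. This is an illustrative sketch (hypothetical names); removeEldestEntry() is the standard JDK hook for bounding the size.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

public class SynchronizedFifo {
    // one lock around every access, exactly as the javadoc asks;
    // false = insertion order, so the cache evicts oldest-inserted first (FIFO)
    public static <K, V> Map<K, V> create(final int maxEntries) {
        return Collections.synchronizedMap(new LinkedHashMap<K, V>(16, 0.75f, false) {
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        });
    }
}
```

This is essentially what the original LRUCache does (with accessOrder=true for LRU), which is why the whole debate reduces to whether that single lock is an acceptable cost.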

        funtick Fuad Efendi added a comment - - edited

        The Solr admin pages will not give you exact measurements.

        Yes, and I do not need exact measurements! It gives me averageTimePerRequest, which improved almost 10 times on the production server. Should I write JUnit tests and execute them in a single-threaded environment? Better would be to use The Grinder, but I don't have time and spare CPUs.

        I've seen throughputs in excess of 400 searches per second.

        But 'searches per second' is not the same as 'average response time'!!!

        Are you using highlighting or anything else that might be CPU-intensive at all?

        Yes, I am using highlighting. You can see it at http://www.tokenizer.org

        I'm guessing that you're caching the results of all queries in memory such that no disk access is necessary.

        But this is another bug of SOLR!!! I am using extremely large caches but SOLR still recalculates facet intersections. - SOLR-669

        A FIFO cache might become a bottleneck itself - if the cache is very large and the most frequently accessed item is inserted just after the cache is created, all accesses will need to traverse all the other entries before getting that item.

        • sorry, I didn't understand... yes, if the cache contains 100000 entries and the 'most popular item' is removed... Why 'traverse all the other entries before getting that item'? Why are 99999 items less popular (cumulatively) than a single one (absolutely)?

        You probably mean 'LinkedList traversal', but this is not the case. This is why we need to browse the Java source... LinkedHashMap extends HashMap and there is no 'traversal':

            public V get(Object key) {
                Entry<K,V> e = (Entry<K,V>)getEntry(key);
                if (e == null)
                    return null;
                e.recordAccess(this);
                return e.value;
            }
        

        Consider the following case: thread A performs a synchronized put, thread B performs an unsynchronized get on the same key. B gets scheduled before A completes, the returned value will be undefined.

        the returned value is well defined: it is either null or correct value.

        That's exactly the case here - the update thread modifies the map structurally!

        Who cares? We are not iterating the map!

        That said, if you can show conclusively (e.g. with a profiler) that the synchronized access is indeed the bottleneck and incurs a heavy penalty on performance, then I'm all for investigating this further.

        What?!!

        Anyway, I believe simplified ConcurrentLRU backed by ConcurrentHashMap is easier to understand and troubleshoot...

        I don't see the point of the static popularityCounter... that looks like a bug.

        No, it is not a bug. It is virtually a "checkpoint", like a timer - one timer for all instances. We could use System.currentTimeMillis() instead, but a static volatile long is faster.

        About the specific use case: yes... if someone gets 0.5-second response times for faceted queries, I am very happy... I had 15 seconds before going with FIFO.

        funtick Fuad Efendi added a comment -

        The eviction code looks like it would be relatively expensive

        but the get() method of LinkedHashMap reorders the whole map!!! (Of course, the CPU load is evenly distributed across many get() calls, so we can't see it.) Other implementations even use Arrays.sort() or something similar. I don't see an easier solution than that... probably some random-access policy with a predictable range of "popularity", so we can evict anything 'old' and not necessarily the 'eldest'...

        yseeley@gmail.com Yonik Seeley added a comment -

        but get() method of LinkedHashMap reorders whole map!!!

        No it doesn't... think linked-list. It moves a single item, which is pretty fast.

        funtick Fuad Efendi added a comment - - edited

        Lars, I used FIFO because it is extremely simple to get unsynchronized get():

        map = new LinkedHashMap(initialSize, 0.75f, true)  - LRU Cache
        (and we need synchronized get())
        map = new LinkedHashMap(initialSize, 0.75f, false) - FIFO
        (and we do not need synchronized get()) 
        

        Yonik, I'll try to improve ConcurrentLRU and to share findings... of course FIFO is not what we need.

        No it doesn't... think linked-list. It moves a single item, which is pretty fast.

        yes, that is why I wrote 'evenly distributed between several get() so we can't see it' - it keeps the list ordered, and we can't leave get() unsynchronized without all the consequences!!!
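The two constructor flags quoted in this comment can be wrapped in a minimal bounded cache built on LinkedHashMap.removeEldestEntry (a hedged sketch: the class name is illustrative and this is not the attached FIFOCache.java):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// The third LinkedHashMap constructor argument selects the ordering under
// discussion:
//   true  -> access-order    (LRU: get() relinks the entry, a structural change)
//   false -> insertion-order (FIFO: get() does not touch the linked list)
class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxSize;

    BoundedCache(int maxSize, boolean accessOrder) {
        super(16, 0.75f, accessOrder); // accessOrder=false gives the FIFO behavior
        this.maxSize = maxSize;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxSize; // evict the eldest entry once the cap is exceeded
    }
}
```

In FIFO mode a get() never mutates the list, which is what makes the unsynchronized-read idea tempting; in LRU mode every get() is a structural relink and must be synchronized.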

        larsko Lars Kotthoff added a comment -

        the returned value is well defined: it is either null or correct value.

        It is if nothing is modifying the map during the get. If something is modifying the map you don't know how the implementation handles the insert of a new value. It might copy the object, and you'd end up with half an object or even an invalid memory location. That's why the javadoc says that you must synchronize accesses if anything modifies the map – this is not limited to iterators.

        yseeley@gmail.com Yonik Seeley added a comment -

        the returned value is well defined: it is either null or correct value.

        No, it is not! Your analysis seems to ignore the java memory model (partially constructed objects and all that). I don't know how many different ways to say it.... please do yourself a favor and read up on the java memory model (and the book I previously referenced is great for this). This is hard stuff (at the lowest levels).

        funtick Fuad Efendi added a comment -

        It is if nothing is modifying the map during the get. If something is modifying the map you don't know how the implementation handles the insert of a new value. It might copy the object, and you'd end up with half an object or even an invalid memory location. That's why the javadoc says that you must synchronize accesses if anything modifies the map - this is not limited to iterators.

        JavaDoc does not say that. Instead, it says (I am repeating):

        ..and at least one of the threads modifies the map structurally, it must be synchronized externally.

        • only the thread doing structural modification must be synchronized. In the case of LinkedHashMap, for instance, we need to synchronize inserts in order to avoid Entry instances referencing themselves (orphans).

        you don't know how the implementation handles the insert of a new value

        I know exactly: SOLR does not modify the 'value' during an 'insert', Map.Entry instances are immutable in SOLR, etc. Table resize is the main problem - but after analyzing the source code I don't see any problem there. The concern that 'a wrong value will be returned for a key' is not applicable, and the JavaDoc does not say anything about that. The collections internally use Map.Entry in an immutable way and do not change it.

        funtick Fuad Efendi added a comment -

        No, it is not! Your analysis seems to ignore the java memory model (partially constructed objects and all that). I don't know how many different ways to say it.... please do yourself a favor and read up on the java memory model (and the book I previously referenced is great for this). This is hard stuff (at the lowest levels).

        Ok. Maybe I can get a reference to a wrong object type, or even an object scheduled for finalization... But we are not inserting 'partially constructed objects' into the Map, are we?

        Simplest scenario: Thread A tries to read a variable (4 bytes of address in the JVM) pointing to object O. Another thread B concurrently assigns null to that variable. Isn't this solved at the CPU level yet? Or maybe on a 64-bit system thread B assigns zero to the first 2 bytes, and then to the other 2 bytes?

        I need to study this book... BTW, I am running the JVM with the '-server' option.
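The disagreement here can be made concrete with a small sketch (an assumed example, not code from the patch): in Java a reference write itself is atomic, with no torn pointers even on 64-bit JVMs, but the visibility of the write and of the object's fields is only guaranteed across a happens-before edge such as a volatile access or Thread.start()/join():

```java
// Safe publication sketch: the volatile write plus join() guarantee that the
// reader sees a fully constructed Box, never a partially constructed one.
class Publication {
    static final class Box {
        int value; // deliberately non-final: final fields get extra JMM guarantees
        Box(int value) { this.value = value; }
    }

    static volatile Box shared; // volatile supplies the happens-before edge

    static int publishAndRead() {
        Thread writer = new Thread(() -> shared = new Box(42));
        writer.start();
        try {
            writer.join(); // join() makes all of the writer's writes visible here
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        return shared.value;
    }
}
```

Without the volatile (or some other synchronization), the memory model permits a concurrent reader to see shared as null, or to see a Box whose non-final fields are not yet initialized; that is the 'partially constructed objects' problem referenced above.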

        larsko Lars Kotthoff added a comment -

        The way I understand the javadoc is

        ..and at least one of the threads modifies the map structurally, it must be synchronized externally.

        The 'it' references the map. It says explicitly that the map must be synchronized, i.e. all operations on it. It does not say that only the thread modifying it must be synchronized. Consider the case where only one thread modifies the map. You're saying that in this case synchronization would not be necessary, as only modifications themselves need to be synchronized and there is only one thread doing that. The javadoc explicitly says that it is necessary.

        funtick Fuad Efendi added a comment -

        The 'it' references the map. It says explicitly that the map must be synchronized.

        • I agree, thanks for pointing to it. Synchronize!

        BTW, Joshua Bloch developed Arrays.sort(), and a bug was found after 9 years. Nothing is perfect.

        ConcurrentLRU looks extremely simple and easy to improve. Should we check Sun's bug database before using ConcurrentHashMap? It has some related entries...

        craigmcc Craig McClanahan added a comment -

        Just a quick comment from the Peanut Gallery.

        Instead of arguing about the semantics of what the JavaDocs for java.util.Map and friends actually say, and worrying about whether particular implementations actually follow the rules but provide reasonable performance, this seems like a good opportunity to design and build an application-specific data structure that has the behavioral characteristics you want (Fuad is after fastest-possible reads; everybody is after reasonable behavior in the face of concurrent writes). My only recommendation in this regard would be this: don't make a custom implementation say "implements java.util.LinkedHashMap" (again, or whatever implementation is currently in use) unless it does actually implement the documented semantics for java.util.LinkedHashMap.

        It's fine to have a custom FifoCacheMap (or whatever name you like) class that does not implement java.util.Map. It's not fine to say you implement an interface but then break the documented contract.

        yseeley@gmail.com Yonik Seeley added a comment -

        It's fine to have a custom FifoCacheMap (or whatever name you like) class that does not implement java.util.Map.

        Right. Any solr cache must currently implement the SolrCache interface, and LinkedHashMap is simply an implementation detail of the LRUCache implementation of SolrCache (which has no relationship to the Map interface).

        funtick Fuad Efendi added a comment - - edited

        Fuad is after fastest-possible reads, everybody is after reasonable behavior in the face of concurrent writes

        Thanks and sorry for runtime errors;

        FIFO looks strange at first, but... for a large cache (100000 items) the most popular item can be mistakenly removed... but I don't think there are any 'most popular facets' etc.; access is evenly distributed in most cases.

        Another issue: SOLR always tries to recalculate facets even with extremely large filterCache & queryResultCache; even the same faceted query always shows the same long response times.

        It is if nothing is modifying the map during the get. If something is modifying the map you don't know how the implementation handles the insert of a new value. It might copy the object, and you'd end up with half an object or even an invalid memory location. That's why the javadoc says that you must synchronize accesses if anything modifies the map - this is not limited to iterators.

        I agree of course... However, we are not dealing with an unknown implementation of java.util.Map that somehow clones (java.lang.Cloneable) objects or uses some weird object introspection etc....

        funtick Fuad Efendi added a comment -

        BTW, there is almost no functional difference between LRU and FIFO. But there is a huge difference between LRU (Least Recently Used) and LFU (Least Frequently Used).
        It's easy to implement ConcurrentLFU based on the provided ConcurrentLRU template; of course, following the main contract of org.apache.solr.search.SolrCache.
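The ConcurrentLFU idea mentioned here can be sketched roughly as follows (a hedged illustration, not one of the attached files; all names are made up): each entry carries an atomic hit counter, get() only increments it, and eviction drops the entry with the fewest hits (frequency, not recency):

```java
import java.util.Comparator;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Naive LFU sketch backed by ConcurrentHashMap: get() is lock-free and only
// bumps a counter; put() evicts the least frequently used entry when full.
class ConcurrentLFUCache<K, V> {
    private static final class Counted<V> {
        final V value;
        final AtomicLong hits = new AtomicLong();
        Counted(V value) { this.value = value; }
    }

    private final ConcurrentHashMap<K, Counted<V>> map = new ConcurrentHashMap<>();
    private final int maxSize;

    ConcurrentLFUCache(int maxSize) { this.maxSize = maxSize; }

    V get(K key) {
        Counted<V> c = map.get(key);
        if (c == null) return null;
        c.hits.incrementAndGet(); // count frequency; order of access is irrelevant
        return c.value;
    }

    void put(K key, V value) {
        map.put(key, new Counted<>(value));
        while (map.size() > maxSize) {
            // Linear scan for the fewest hits; fine for a sketch, not for production.
            map.entrySet().stream()
               .min(Comparator.comparingLong(e -> e.getValue().hits.get()))
               .ifPresent(e -> map.remove(e.getKey()));
        }
    }
}
```

Note that a naive LFU like this will happily evict the entry that was just inserted (it has zero hits), which is one of the classic behavioral differences from LRU that the comment alludes to.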

        noble.paul Noble Paul added a comment -

        Hi Fuad. This is a worthwhile effort to undertake. Why don't you implement an LRUCache and post it (as a Java file)? Let everyone review it and see if it can be improved.

        funtick Fuad Efendi added a comment -

        Paul, I want to do it... understand me, since February I have been having constant performance problems with faceted queries (15-20 second response times). I ordered a new server with 32 Gb & 2x quad-core (8x more power!) but it didn't improve performance; finally I commented out the sync in LRUCache and made it FIFO... I was very impatient with this post, just tried to share very real stuff...

        noble.paul Noble Paul added a comment -

        A FIFOCache using ConcurrentHashMap.
        Please give feedback

        noble.paul Noble Paul added a comment -

        Another one; this is LRU, using ConcurrentHashMap again.

        • Gets are free
        • Puts are expensive (very expensive) after the high watermark; until then they are free
        • To lighten the load on put, an extra thread is employed to do a concurrent mark and sweep
        • There is a high-water-mark and a low-water-mark. The extra thread cleans entries if the low-water-mark is crossed. A put must remove entries if the level crosses the high-water-mark
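The scheme described in these bullets can be sketched as follows (an illustrative approximation, not the attached ConcurrentLRUCache.java; for brevity the sweep runs inline on the putting thread rather than in a separate cleaner thread):

```java
import java.util.Comparator;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Watermark LRU sketch: lock-free gets that only stamp a logical access time;
// a put that crosses the high watermark triggers a mark-and-sweep that evicts
// the oldest-accessed entries down to the low watermark.
class WatermarkLRUCache<K, V> {
    private static final class CacheEntry<V> {
        final V value;
        volatile long lastAccessed;
        CacheEntry(V value, long stamp) { this.value = value; this.lastAccessed = stamp; }
    }

    private final ConcurrentHashMap<K, CacheEntry<V>> map = new ConcurrentHashMap<>();
    private final AtomicLong clock = new AtomicLong(); // logical time, not wall time
    private final int lowWatermark, highWatermark;

    WatermarkLRUCache(int lowWatermark, int highWatermark) {
        this.lowWatermark = lowWatermark;
        this.highWatermark = highWatermark;
    }

    V get(K key) {
        CacheEntry<V> e = map.get(key);
        if (e == null) return null;
        e.lastAccessed = clock.incrementAndGet(); // cheap: no relinking, no lock
        return e.value;
    }

    void put(K key, V value) {
        map.put(key, new CacheEntry<>(value, clock.incrementAndGet()));
        if (map.size() > highWatermark) sweep(); // only puts pay for eviction
    }

    // Mark and sweep: order entries by access stamp and drop the
    // oldest-accessed ones until we are back at the low watermark.
    private void sweep() {
        map.entrySet().stream()
           .sorted(Comparator.comparingLong(e -> e.getValue().lastAccessed))
           .limit(Math.max(0, map.size() - lowWatermark))
           .forEach(e -> map.remove(e.getKey()));
    }

    int size() { return map.size(); }
}
```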
        noble.paul Noble Paul added a comment -

        bug fix

        funtick Fuad Efendi added a comment -

        Thanks Paul, I am always trying to simplify... AtomicReference is a pointer to (approximately) least recently used 'key'...

        funtick Fuad Efendi added a comment -

        bug fix

        funtick Fuad Efendi added a comment -

        another bug... and AtomicReference is generic... I never used it before. It could be even 'weaker' if we use the 'hashcode', which is long (and atomic), instead of the key, which is an object (and unsafe), as long as the distribution of hashcode is ok...

        noble.paul Noble Paul added a comment - - edited

        There are a few obvious issues w/ your patch.
        Efendi, your implementation is wrong:

        if (wrapper.lastAccessed < pointer.lastAccessed) {
            ref.set(new Pointer(key, wrapper.lastAccessed));
        }

        will always evaluate to false, and the reference will always have one value.
        We must be removing the entry which was accessed first (not last). To get that you must maintain a linked list the way LinkedHashMap does. There is no other shortcut; keeping one reference will not work.

        In that implementation there is always contention on the head, so gets are bound to be slow. That is why I did not go that route.
        And the static volatile counter is not threadsafe.
        static in general is not a good idea

        There is no need to use a WeakReference anywhere. This is a cache, so it must maintain strong references.

        noble.paul Noble Paul added a comment -

        I have opened SOLR-667 for LRU implementations

        funtick Fuad Efendi added a comment - - edited

        Noble, thanks for feedback!

        Of course my code is buggy, but I only wanted to illustrate the simplest idea; I am extremely busy with other stuff (Liferay) and can't focus on SOLR improvements... maybe during the weekend.

        ...will always evaluate to false. And the reference will always have one value

        • yes, this is a bug. There are other bugs too...

        We must be removing the entry which was accessed first (not last)..

        I meant the same (and so does the code); probably wrong wording

        And the static volatile counter is not threadsafe.

        Do we really-really need thread safety here? By using 'volatile' I only prevent some JVMs from trying to optimize some code (and causing problems).

        There is no need to use a WeakReference anywhere

        Agree...

        To get that you must maintian a linkedlist the way linkedhashmap maintains. No other shortcut.

        Maybe... but it looks similar to Arrays.sort(), or TreeSet, etc.... I am trying to avoid this. 'No other shortcut' - maybe, but I am unsure.

        Thanks!

        funtick Fuad Efendi added a comment -

        I don't think ConcurrentHashMap will improve performance, and ConcurrentMap is not what SOLR needs:

            V putIfAbsent(K key, V value);
            V replace(K key, V value);
            boolean replace(K key, V oldValue, V newValue);
        

        There is also some overhead with oldValue and the state of the hash table at some point; additional memory requirements; etc... Can we design something plainly simpler, focused on SOLR-specific requirements, without all the functionality of Map?

        funtick Fuad Efendi added a comment -

        Guys at LingPipe (Natural Language Processing) http://alias-i.com/ are using excellent Map implementations with an optimistic concurrency strategy:
        http://alias-i.com/lingpipe/docs/api/com/aliasi/util/FastCache.html
        http://alias-i.com/lingpipe/docs/api/com/aliasi/util/HardFastCache.html

        yseeley@gmail.com Yonik Seeley added a comment -

        closing as duplicate since we have a concurrent cache impl now.


          People

          • Assignee: Unassigned
          • Reporter: funtick Fuad Efendi
          • Votes: 0
          • Watchers: 5

          Time Tracking

          • Original Estimate: 672h
          • Remaining Estimate: 672h
          • Time Spent: Not Specified