Solr / SOLR-5661

PriorityQueue has OOM (Requested array size exceeds VM limit) issue

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.3.1, 4.4, 4.5, 4.5.1, 4.6
    • Fix Version/s: 4.7, 6.0
    • Labels:
      None
    • Environment:

      JDK 7

      Description

      It looks like JDK 7 changed the max_array_length logic: the maximum is no longer max_jint, but max_jint - header_size(type).

      If you pass Integer.MAX_VALUE when creating the PriorityQueue and have enough memory, it works on JVM 6 but not on JVM 7.

      JVM 7 throws an OutOfMemoryError while doing the array range check.

      This appears to be a compatibility issue between JVM 6 and JVM 7.

      We probably need a guard in the code logic; throwing an OOM looks like a serious issue for customers.
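
      The JDK 7 behavior described above can be demonstrated in a few lines. A minimal sketch (class and method names are hypothetical); on a HotSpot JVM 7 or later the allocation fails with the VM-limit message before any heap-space check:

```java
public class MaxArrayDemo {
    // Attempts an Integer.MAX_VALUE-element allocation and returns the
    // OutOfMemoryError message, or null if the allocation succeeded.
    static String oversizedArrayError() {
        try {
            // Integer.MAX_VALUE exceeds max_jint - header_size(type) on JDK 7+,
            // so HotSpot rejects the request before trying to allocate.
            Object[] a = new Object[Integer.MAX_VALUE];
            return a.length == 0 ? "" : null; // unreachable on HotSpot
        } catch (OutOfMemoryError e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(oversizedArrayError());
    }
}
```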

      1. patch-5661.txt
        1 kB
        Raintung Li

        Activity

        Michael McCandless added a comment -

        Hmm, how were you able to get an OOME from Solr?

        Lucene's IndexSearcher tries to prevent this: when you ask for Integer.MAX_VALUE as the topN hits to the search method, it drops that to the maxDoc for the reader.

        Still, we should fix oal.util.PriorityQueue to use ArrayUtil.MAX_ARRAY_LENGTH, not Integer.MAX_VALUE.
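
        The fix described here amounts to a bounds check in the queue constructor. A minimal sketch, with MAX_ARRAY_LENGTH redefined locally as an assumption standing in for Lucene's ArrayUtil.MAX_ARRAY_LENGTH (the real change lives in oal.util.PriorityQueue):

```java
public class BoundedQueueSketch {
    // Stand-in for Lucene's ArrayUtil.MAX_ARRAY_LENGTH: leave a little
    // slack below Integer.MAX_VALUE for the VM's array header words.
    static final int MAX_ARRAY_LENGTH = Integer.MAX_VALUE - 8;

    final Object[] heap;

    BoundedQueueSketch(int maxSize) {
        // Reject sizes the VM cannot allocate instead of letting the array
        // allocation die with "Requested array size exceeds VM limit".
        if (maxSize < 0 || maxSize >= MAX_ARRAY_LENGTH) {
            throw new IllegalArgumentException(
                "maxSize must be >= 0 and < " + MAX_ARRAY_LENGTH + "; got: " + maxSize);
        }
        heap = new Object[maxSize + 1]; // slot 0 unused: 1-based binary heap
    }

    // Hypothetical helper used only to exercise the guard.
    static boolean rejects(int maxSize) {
        try {
            new BoundedQueueSketch(maxSize);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(rejects(Integer.MAX_VALUE)); // true
        System.out.println(rejects(16));                // false
    }
}
```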

        Raintung Li added a comment -

        If you have multiple shards for one collection, sending a query URL asking for the max integer number of rows reproduces it easily.
        Caused by: java.lang.OutOfMemoryError: Requested array size exceeds VM limit
        at org.apache.lucene.util.PriorityQueue.<init>(PriorityQueue.java:64)
        at org.apache.lucene.util.PriorityQueue.<init>(PriorityQueue.java:37)
        at org.apache.solr.handler.component.ShardFieldSortedHitQueue.<init>(ShardDoc.java:113)
        at org.apache.solr.handler.component.QueryComponent.mergeIds(QueryComponent.java:790)
        at org.apache.solr.handler.component.QueryComponent.handleRegularResponses(QueryComponent.java:649)
        at org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:628)
        at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:311)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1859)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:710)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:413)

        Raintung Li added a comment -

        Added the protection logic.

        Michael McCandless added a comment -

        I think that's a good low-level fix for Lucene; but I'm not sure how to fix Solr here: somewhere it should limit the size it passes to the PQ constructor?

        ASF subversion and git services added a comment -

        Commit 1561369 from Michael McCandless in branch 'dev/trunk'
        [ https://svn.apache.org/r1561369 ]

        SOLR-5661: catch too-large priority queue

        ASF subversion and git services added a comment -

        Commit 1561370 from Michael McCandless in branch 'dev/branches/branch_4x'
        [ https://svn.apache.org/r1561370 ]

        SOLR-5661: catch too-large priority queue

        Michael McCandless added a comment -

        Thanks Raintung, I committed your patch (tweaked the message) to 4.7 & 5.0, but I'll leave this open for someone else to fix the Solr part ...

        Shalin Shekhar Mangar added a comment -

        If you have multiple shard for one collection, send the query url for max integer rowid, it can easy replicate.
        Caused by: java.lang.OutOfMemoryError: Requested array size exceeds VM limit
        at org.apache.lucene.util.PriorityQueue.<init>(PriorityQueue.java:64)
        at org.apache.lucene.util.PriorityQueue.<init>(PriorityQueue.java:37)
        at org.apache.solr.handler.component.ShardFieldSortedHitQueue.<init>(ShardDoc.java:113)
        at org.apache.solr.handler.component.QueryComponent.mergeIds(QueryComponent.java:790)

        Raintung, please correct me if I am wrong but the max integer value is being used in the constructor only because you have explicitly asked for that many rows? Solr doesn't pass such a big value to the PriorityQueue constructor by itself.

        Also, I think the fix committed to Lucene takes care of the problem on Solr's side as well. Even if we implement a check on Solr's side, the error message can't be any different.

        Raintung Li added a comment -

        Yes, I asked for that many rows (max integer). The root node collects the results from the different nodes (shards) to combine and merge them, and in this case it directly creates the queue.

        Fixing the Lucene side is enough.

        For the Solr issue, the question is how to handle a very large rows value; that case needs different logic, since we can't directly create the biggest queue.
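
        One way to realize the "different logic" mentioned here is to size the merge queue from the data rather than from the raw rows parameter. A hypothetical sketch (the method and names below are not Solr APIs):

```java
public class RowClamp {
    // Hypothetical helper: cap the user-requested rows at the hits actually
    // available across shards, and below the VM's maximum array length,
    // before sizing the merge priority queue.
    static int mergeQueueSize(int requestedRows, long totalHitsAcrossShards) {
        long capped = Math.min((long) requestedRows, Math.max(0L, totalHitsAcrossShards));
        return (int) Math.min(capped, Integer.MAX_VALUE - 8L);
    }

    public static void main(String[] args) {
        System.out.println(mergeQueueSize(Integer.MAX_VALUE, 1000L)); // 1000
    }
}
```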

        Raintung Li added a comment -

        I created another issue, SOLR-5674, to track the large-rows issue separately.

        Uwe Schindler added a comment -

        Move issue to Solr 4.9.

        Shalin Shekhar Mangar added a comment -

        This was resolved in 4.7.


          People

          • Assignee:
            Michael McCandless
            Reporter:
            Raintung Li
          • Votes:
            0
            Watchers:
            5
