Solr
SOLR-4479

TermVectorComponent NPE when running Solr Cloud

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.1
    • Fix Version/s: 5.1, 6.0
    • Component/s: None
    • Labels: None

      Description

      When running SolrCloud (just 2 shards, as described in the wiki), I got an NPE:
      java.lang.NullPointerException
      at org.apache.solr.handler.component.TermVectorComponent.finishStage(TermVectorComponent.java:437)
      at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:317)
      at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
      at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:242)
      at org.apache.solr.core.SolrCore.execute(SolrCore.java:1816)
      at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:448)
      at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:269)
      at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1307)
      at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:453)
      at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
      at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:560)
      at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
      at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1072)
      at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:382)
      at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
      at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1006)
      at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
      at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
      ..... Skipped

      To reproduce, follow the guide in the wiki (http://wiki.apache.org/solr/SolrCloud), add some documents, and then request http://localhost:8983/solr/collection1/tvrh?q=*%3A*

      If I include the term vector component in the search handler, I get (on the second shard):
      SEVERE: null:java.lang.NullPointerException
      at org.apache.solr.handler.component.TermVectorComponent.process(TermVectorComponent.java:321)
      at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:206)
      at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
      at org.apache.solr.core.SolrCore.execute(SolrCore.java:1699)

      1. SOLR-4479.patch (1 kB, Timothy Potter)

        Activity

        Dimitris Karas added a comment -

        I got exactly the same error by doing a search on multiple shards with the term vector component on.
        Mark Miller added a comment -

        Not sure this component is distrib aware. Anyone?
        Dimitris Karas added a comment -

        The wiki (http://wiki.apache.org/solr/DistributedSearch#Distributed_Searching) says that this component supports distributed search.
        Dimitris Karas added a comment -

        It seems the problem in my case was that I was using 4.2 with an index created by Solr 4.1. After reindexing with Solr 4.2 I got term vectors in my responses. Cheers.
        Yakov added a comment -

        Hi,

        The problem still exists on 4.4. Reindexing, as Dimitris suggested, does not fix the problem in my case. On a fresh index it produces:

        "trace":"java.lang.NullPointerException
            at org.apache.solr.handler.component.TermVectorComponent.finishStage(TermVectorComponent.java:437)
            at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:317)
            at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
            at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:241)
            at org.apache.solr.core.SolrCore.execute(SolrCore.java:1904)
            at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:659)
            at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:362)
            at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158)
            at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
            at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
            at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
            at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
            at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
            at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
            at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:936)
            at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
            at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407)
            at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1004)
            at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)
            at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
            at java.lang.Thread.run(Thread.java:722)
        ",
        "code":500

        Actually, this problem is really important for me. How could I help to fix it faster?
        Raúl Grande added a comment -

        Hi,

        We have the same problem using SolrCloud 4.6.1.

        Any solution to this so far?
        Stanislav Sandalnikov added a comment -

        Hi Raúl,

        In our case this was solved by adding shards.qt=/tvrh to the query.
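        For anyone scripting against this work-around: a minimal sketch (not from the ticket itself; the host, collection, and default parameters are placeholder assumptions) that builds a /tvrh request URL carrying shards.qt=/tvrh, using only the Python standard library.

        ```python
        from urllib.parse import urlencode

        def tvrh_url(base="http://localhost:8983/solr/collection1", **extra):
            """Build a term-vector request URL that forwards shard sub-requests
            to /tvrh instead of the default /select (the work-around above).
            Base URL and defaults are illustrative placeholders."""
            params = {
                "q": "*:*",
                "tv": "true",
                "shards.qt": "/tvrh",  # work-around: route shard requests to /tvrh
            }
            params.update(extra)
            return f"{base}/tvrh?{urlencode(params)}"

        print(tvrh_url(wt="json"))
        ```

        The same parameter can equally be baked into the request handler's defaults in solrconfig.xml instead of being passed on every query.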
        Raúl Grande added a comment -

        Thanks a lot!

        It worked fine with that param!
        Pierre Gossé added a comment -

        Hi,

        I'm having the same issue on 4.8.1.

        I have only one field with term vectors, and around half of the documents have that field.
        Queries with tv=false display the error; queries with tv=true don't.

        shards.qt=/tvrh doesn't solve the issue.
        Yakov added a comment -

        Try removing the / from the parameter, like shards.qt=tvrh.
        Timothy Potter added a comment -

        I'm bumping into this with some of the Spark integration work I'm doing and know that Shalin is super busy with other stuff, so I'll take it up.
        Shalin Shekhar Mangar added a comment -

        Thanks Tim!
        Timothy Potter added a comment -

        Adding shards.qt to the query params avoids this NPE, i.e. the following works for me on the 4.10 branch:

        bin/solr -e cloud -noprompt
        java -Durl=http://localhost:8983/solr/gettingstarted/update -jar example/exampledocs/post.jar example/exampledocs/*.xml
        curl "http://localhost:8983/solr/gettingstarted_shard1_replica1/tvrh?q=*%3A*&wt=json&indent=true&tv.fl=name&rows=50&shards.qt=/tvrh&shards.info=true"

        So the easiest solution here would be to update the /tvrh requestHandler definition in solrconfig.xml to be:

          <requestHandler name="/tvrh" class="solr.SearchHandler" startup="lazy">
            <lst name="defaults">
              <str name="df">text</str>
              <bool name="tv">true</bool>
              <str name="shards.qt">/tvrh</str>
            </lst>
            <arr name="last-components">
              <str>tvComponent</str>
            </arr>
          </requestHandler>

        But this has me thinking about whether there's a bigger bug at play here. Specifically, if Solr is in distributed mode, then the shards.qt parameter should default to the same path as the top-level request handler (/tvrh in this example). I tried the same with the /spell request handler with the same result: the underlying distributed shard requests all went to /select, and since the spell-checking component is not wired into /select by default, there's really no spell checking happening on each shard.

        In other words, if you send a distributed query to /tvrh without the shards.qt parameter, the underlying shard requests are sent to /select and not /tvrh on each replica. The work-around is simple, but it seems like the default behavior should be to work without shards.qt.
        Shalin Shekhar Mangar added a comment -

        But this has me thinking about whether there's a bigger bug at play here? Specifically, if Solr is in distributed mode, then the shards.qt parameter should default to the same path as the top-level request handler (/tvrh in this example). I tried the same with the /spell request handler and same result, the underlying distributed shard requests all went to /select and since the SpellChecking component is not wired into /select by default, there's really no spell checking happening on each shard.
        In other words, if you send a distributed query to /tvrh without the shards.qt parameter, then the underlying shard requests are sent to /select and not /tvrh on each replica. The work-around is simple but seems like the default behavior should be to work without shards.qt???

        I think that makes sense.

        The reason behind having shards.qt is that in old-style distributed search, people would put shards=abc,xyz,pqr in the defaults section of the request handler, and therefore they needed shards.qt to send the non-distributed query to a different handler that does not hard-code the shards parameter. Anyone in that situation should already be specifying a shards.qt parameter different from qt, so defaulting shards.qt to the same value as qt makes sense.

        Noble Paul added a comment -

        If the RequestHandler is an instanceof SearchHandler, we can make shards.qt=qt by default. For others, let the request handler set the value or let the user configure it explicitly.
        Timothy Potter added a comment -

        Patch that sets the qt param on the shard request to the path pulled from the request context, if shards.qt is not set and the path for the top-level request is not /select. This addresses the general issue described in the comments of this ticket, and all tests pass on trunk with the patch applied. This approach definitely needs review, though, as it changes the default behavior of query requests.
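        The rule the patch describes can be sketched in miniature. This is illustrative only: the real patch operates on Solr's shard-request classes in Java, while this sketch models the request params as a plain dict and the handler path as a string.

        ```python
        def default_shards_qt(params, path):
            """If shards.qt is absent and the top-level handler path is not
            /select, route shard sub-requests to that same handler path.
            (Sketch of the defaulting rule discussed in this ticket.)"""
            if "shards.qt" not in params and path != "/select":
                # Return a copy so the caller's params are left untouched.
                params = dict(params, **{"shards.qt": path})
            return params

        # A request to /tvrh with no shards.qt gets one pointing back at /tvrh;
        # a request to /select is left alone, preserving the old behavior.
        print(default_shards_qt({"q": "*:*"}, "/tvrh"))
        print(default_shards_qt({"q": "*:*"}, "/select"))
        ```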
        Varun Thacker added a comment -

        Hi Tim,

        Not entirely sure, but does this comment apply here as well? https://issues.apache.org/jira/browse/SOLR-6311?focusedCommentId=14085258&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14085258
        Timothy Potter added a comment -

        Yes, it applies. I think we have to do something here, so I'm thinking the SearchHandler can keep track of whether it has any custom search components and, if so, apply the path as the default if shards.qt is not supplied.
        Mark Miller added a comment -

        Good to fix this stuff. Getting this to work right used to be kind of like a special secret.
        Timothy Potter added a comment -

        Fixed with SOLR-6311.
        Timothy Potter added a comment -

        Bulk close after 5.1 release.

          People

          • Assignee: Timothy Potter
          • Reporter: Vitali Kviatkouski
          • Votes: 4
          • Watchers: 12