Solr / SOLR-4479

TermVectorComponent NPE when running Solr Cloud

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.1
    • Fix Version/s: 5.1, 6.0
    • Component/s: None
    • Labels: None

    Description

      When running SolrCloud (just 2 shards, as described in the wiki), I got an NPE:
      java.lang.NullPointerException
      at org.apache.solr.handler.component.TermVectorComponent.finishStage(TermVectorComponent.java:437)
      at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:317)
      at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
      at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:242)
      at org.apache.solr.core.SolrCore.execute(SolrCore.java:1816)
      at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:448)
      at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:269)
      at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1307)
      at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:453)
      at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
      at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:560)
      at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
      at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1072)
      at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:382)
      at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
      at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1006)
      at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
      at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
      ..... Skipped

      To reproduce, follow the guide in the wiki (http://wiki.apache.org/solr/SolrCloud), add some documents, and then request http://localhost:8983/solr/collection1/tvrh?q=*%3A*
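      A minimal shell sketch of those steps, assuming the two-shard example from the wiki is already running on port 8983 (collection name and port are the wiki defaults, not confirmed here):

      # index the bundled example documents into the cloud collection
      cd example/exampledocs
      java -Durl=http://localhost:8983/solr/collection1/update -jar post.jar *.xml
      # request term vectors for all documents; this triggers the NPE
      curl "http://localhost:8983/solr/collection1/tvrh?q=*%3A*"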

      If I include the term vector component in the search handler, I get (on the second shard):
      SEVERE: null:java.lang.NullPointerException
      at org.apache.solr.handler.component.TermVectorComponent.process(TermVectorComponent.java:321)
      at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:206)
      at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
      at org.apache.solr.core.SolrCore.execute(SolrCore.java:1699)

      Attachments

        1. SOLR-4479.patch (1 kB) - Timothy Potter

        Activity

          arisre82 Dimitris Karas added a comment -

          I got exactly the same error when doing a search on multiple shards with the term vector component on.

          markrmiller@gmail.com Mark Miller added a comment -

          Not sure this component is distrib aware. Anyone?

          arisre82 Dimitris Karas added a comment -

          The wiki (http://wiki.apache.org/solr/DistributedSearch#Distributed_Searching) says that this component supports distributed search.

          arisre82 Dimitris Karas added a comment -

          It seems the problem in my case was that I was using 4.2 but with an index created by Solr 4.1. By reindexing with Solr 4.2 I got term vectors in my responses. Cheers.

          lodary Yakov added a comment -

          Hi,

          the problem still exists on 4.4. Reindexing, as Dimitris suggested, does not fix the problem in my case. On a fresh index it produces:

          "trace":"java.lang.NullPointerException\n\tat 
          org.apache.solr.handler.component.TermVectorComponent.finishStage(TermVectorComponent.java:437)\n\tat 
          org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:317)\n\tat 
          org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)\n\tat 
          org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:241)\n\tat
           org.apache.solr.core.SolrCore.execute(SolrCore.java:1904)\n\tat 
          org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:659)\n\tat 
          org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:362)\n\tat 
          org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158)\n\tat 
          org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)\n\tat 
          org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)\n\tat 
          org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)\n\tat 
          org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)\n\tat 
          org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)\n\tat 
          org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)\n\tat 
          org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:936)\n\tat 
          org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)\n\tat 
          org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407)\n\tat 
          org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1004)\n\tat 
          org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)\n\tat 
          org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)\n\tat 
          java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)\n\tat 
          java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)\n\tat 
          java.lang.Thread.run(Thread.java:722)\n",
          
              "code":500
          

          Actually, this problem is really important for me; how can I help get it fixed faster?

          raulgrande83 Raúl Grande added a comment -

          Hi,

          We have the same problem using SolrCloud v. 4.6.1

          Any solution to this so far?


          sndl Stanislav Sandalnikov added a comment -

          Hi Raul,

          In our case this was solved by adding shards.qt=/tvrh to the query.
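          For example (the collection name is hypothetical; the essential part is the shards.qt=/tvrh parameter):

          curl "http://localhost:8983/solr/collection1/tvrh?q=*%3A*&shards.qt=/tvrh"
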
          raulgrande83 Raúl Grande added a comment -

          Thanks a lot!

          It worked fine with that param!


          pigo Pierre Gossé added a comment -

          Hi,

          I'm having the same issue on 4.8.1.

          I have only one field with term vectors, and around half of the documents have that field.
          Queries with tv=false produce the error; queries with tv=true don't.

          shards.qt=/tvrh doesn't solve the issue.
          lodary Yakov added a comment -

          Try removing the / from the parameter, i.e. shards.qt=tvrh

          thelabdude Timothy Potter added a comment -

          I'm bumping into this with some of the Spark integration work I'm doing and know that Shalin is super busy with other stuff, so I'll take it up.


          shalin Shalin Shekhar Mangar added a comment -

          Thanks Tim!
          thelabdude Timothy Potter added a comment -

          Adding shards.qt to the query params avoids this NPE; e.g., the following works for me on the 4.10 branch:

          bin/solr -e cloud -noprompt
          java -Durl=http://localhost:8983/solr/gettingstarted/update -jar example/exampledocs/post.jar example/exampledocs/*.xml
          curl "http://localhost:8983/solr/gettingstarted_shard1_replica1/tvrh?q=*%3A*&wt=json&indent=true&tv.fl=name&rows=50&shards.qt=/tvrh&shards.info=true"
          

          So the easiest solution here would be to update the /tvrh requestHandler definition in solrconfig.xml to be:

            <requestHandler name="/tvrh" class="solr.SearchHandler" startup="lazy">
              <lst name="defaults">
                <str name="df">text</str>
                <bool name="tv">true</bool>
                <str name="shards.qt">/tvrh</str>
              </lst>
              <arr name="last-components">
                <str>tvComponent</str>
              </arr>
            </requestHandler>
          

          But this has me wondering whether there's a bigger bug at play here. Specifically, if Solr is in distributed mode, then the shards.qt parameter should default to the same path as the top-level request handler (/tvrh in this example). I tried the same with the /spell request handler and got the same result: the underlying distributed shard requests all went to /select, and since the SpellCheck component is not wired into /select by default, there's really no spell checking happening on each shard.

          In other words, if you send a distributed query to /tvrh without the shards.qt parameter, then the underlying shard requests are sent to /select and not to /tvrh on each replica. The work-around is simple, but it seems like the default behavior should be to work without shards.qt.


          shalin Shalin Shekhar Mangar added a comment -

          > But this has me wondering whether there's a bigger bug at play here. Specifically, if Solr is in distributed mode, then the shards.qt parameter should default to the same path as the top-level request handler (/tvrh in this example). I tried the same with the /spell request handler and got the same result: the underlying distributed shard requests all went to /select, and since the SpellCheck component is not wired into /select by default, there's really no spell checking happening on each shard.
          > In other words, if you send a distributed query to /tvrh without the shards.qt parameter, then the underlying shard requests are sent to /select and not to /tvrh on each replica. The work-around is simple, but it seems like the default behavior should be to work without shards.qt.

          I think that makes sense.

          The reason behind having shards.qt is that in old-style distributed search, people would put shards=abc,xyz,pqr in the defaults section of the request handler, and therefore they need shards.qt to send the non-distributed query to a different handler which does not hard-code the shards parameter. So anyone who is in this situation currently should already specify a shards.qt parameter different from qt. Defaulting shards.qt to the same value as qt therefore makes sense.

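          For illustration, an old-style (pre-SolrCloud) setup of the kind described above might look roughly like this; the handler names and hosts below are made up, not taken from any shipped solrconfig.xml:

            <requestHandler name="/distrib" class="solr.SearchHandler">
              <lst name="defaults">
                <!-- hard-coded shard list for old-style distributed search -->
                <str name="shards">host1:8983/solr/core1,host2:8983/solr/core1</str>
                <!-- send the per-shard (non-distributed) requests to a handler
                     that does not hard-code the shards parameter -->
                <str name="shards.qt">/local</str>
              </lst>
            </requestHandler>
            <requestHandler name="/local" class="solr.SearchHandler"/>
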
          noble.paul Noble Paul added a comment -

          If the RequestHandler is an instance of SearchHandler, we can make shards.qt=qt by default. For others, let the request handler set the value or let the user configure it explicitly.

          thelabdude Timothy Potter added a comment -

          Patch that sets the qt param on the shard request to the path pulled from the request context if shards.qt is not set and the path for the top-level request is not /select. This addresses the general issue described in the comments of this ticket, and all tests pass on trunk with this applied. Definitely need a review of this approach though, as it changes the default behavior of query requests.

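          A rough sketch (not the actual patch) of the behavior described above: default the shard-level qt to the top-level handler path when shards.qt is absent and the path is not /select. The class and field names approximate Solr internals and are assumptions, not the committed change.

            import org.apache.solr.common.params.CommonParams;
            import org.apache.solr.common.params.ShardParams;
            import org.apache.solr.handler.component.ResponseBuilder;
            import org.apache.solr.handler.component.ShardRequest;

            class ShardQtDefaulting {
              // Apply the proposed default before the shard request is sent.
              static void applyDefaultQt(ResponseBuilder rb, ShardRequest sreq) {
                String shardsQt = rb.req.getParams().get(ShardParams.SHARDS_QT);
                if (shardsQt != null) {
                  // An explicit shards.qt always wins, preserving existing behavior.
                  sreq.params.set(CommonParams.QT, shardsQt);
                  return;
                }
                // "path" is the request path recorded in the request context, e.g. "/tvrh"
                String path = (String) rb.req.getContext().get("path");
                if (path != null && !"/select".equals(path)) {
                  // Route the per-shard requests to the same handler as the top-level request.
                  sreq.params.set(CommonParams.QT, path);
                }
              }
            }
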
          varun Varun Thacker added a comment -

          Hi Tim,

          Not entirely sure, but does this comment apply here as well? https://issues.apache.org/jira/browse/SOLR-6311?focusedCommentId=14085258&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14085258
          thelabdude Timothy Potter added a comment -

          Yes, it applies. I think we have to do something here, so I'm thinking the SearchHandler can keep track of whether it has any custom search components and, if so, apply the path as the default when shards.qt is not supplied.

          markrmiller@gmail.com Mark Miller added a comment -

          Good to fix this stuff. Getting this to work right used to be kind of like a special secret.

          thelabdude Timothy Potter added a comment -

          Fixed with SOLR-6311

          thelabdude Timothy Potter added a comment -

          Bulk close after 5.1 release


          People

            Assignee: thelabdude Timothy Potter
            Reporter: vkviatkouski Vitali Kviatkouski
            Votes: 4
            Watchers: 12
