Solr / SOLR-5700

Improve error handling of remote queries (proxied requests)

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.7, 6.0
    • Component/s: SolrCloud
    • Labels:
      None

      Description

      The current remoteQuery code in SolrDispatchFilter yields error messages like the following:

      org.apache.solr.servlet.SolrDispatchFilter: null:org.apache.solr.common.SolrException: Error trying to proxy request for url: http://localhost:8983/solr/myCollection/update
      at org.apache.solr.servlet.SolrDispatchFilter.remoteQuery(SolrDispatchFilter.java:580)
      at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:288)
      at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:169)
      at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
      at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
      at org.apache.solr.servlet.ProxyUserFilter.doFilter(ProxyUserFilter.java:241)
      at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
      at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
      at org.apache.solr.servlet.SolrHadoopAuthenticationFilter$2.doFilter(SolrHadoopAuthenticationFilter.java:140)
      at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:384)
      at org.apache.solr.servlet.SolrHadoopAuthenticationFilter.doFilter(SolrHadoopAuthenticationFilter.java:145)
      at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
      at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
      at org.apache.solr.servlet.HostnameFilter.doFilter(HostnameFilter.java:86)
      at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
      at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
      at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
      at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
      at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
      at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
      at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
      at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
      at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:861)
      at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:606)
      at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
      at java.lang.Thread.run(Thread.java:724)
      Caused by: java.io.IOException: Server returned HTTP response code: 401 for URL: http://search-testing-c4-secure-4.ent.cloudera.com:8983/solr/sentryCollection/update?stream.body=%3Cadd%3E%3Cdoc%3E%3Cfield+name%3D%22id%22%3E1383855038349doc1%3C%2Ffield%3E%3Cfield+name%3D%22description%22%3Efirst+test+document+1383855038349%3C%2Ffield%3E%3C%2Fdoc%3E%3C%2Fadd%3E&doAs=user1
      at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
      at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
      at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
      at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
      at sun.net.www.protocol.http.HttpURLConnection$6.run(HttpURLConnection.java:1674)
      at sun.net.www.protocol.http.HttpURLConnection$6.run(HttpURLConnection.java:1672)
      at java.security.AccessController.doPrivileged(Native Method)
      at sun.net.www.protocol.http.HttpURLConnection.getChainedException(HttpURLConnection.java:1670)
      at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1243)
      at org.apache.solr.servlet.SolrDispatchFilter.remoteQuery(SolrDispatchFilter.java:567)
      ... 25 more
      Caused by: java.io.IOException: Server returned HTTP response code: 401 for URL: http://localhost:8983/solr/myCollection/update?stream.body=%3Cadd%3E%3Cdoc%3E%3Cfield+name%3D%22id%22%3E1383855038349doc1%3C%2Ffield%3E%3Cfield+name%3D%22description%22%3Efirst+test+document+1383855038349%3C%2Ffield%3E%3C%2Fdoc%3E%3C%2Fadd%3E&doAs=user1
      at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1625)
      at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:468)
      at org.apache.solr.servlet.SolrDispatchFilter.remoteQuery(SolrDispatchFilter.java:550)
      ... 25 more

      In this case, the request handler threw an exception, but all the user got back was an error code with no message. They would have to dig through the logs on the remote machine to see the actual error.

      I tried for a bit to get the error message with HttpURLConnection, but wasn't successful. Instead, I used httpclient, as SolrServer does. This works, since SolrServer already gives reasonable error messages.

      This approach of using httpclient has another advantage as well: because the httpclient is created via the HttpClientUtil in the same way as the other http clients, any configuration settings are automatically picked up. For example, I have an HttpClientConfigurer that I wrote to handle kerberos connections; with this approach, the forwarded requests just work with kerberos. With the old approach, I would have to modify the remoteQuery code to do kerberos-specific things.
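
To make the goal concrete, here is a minimal sketch of a proxy step that relays the remote node's status code and error body instead of collapsing them into a bare exception. It is not the code from the patch: it uses the JDK's own java.net.http client (Java 11+) rather than Apache httpclient so that it runs without extra dependencies, and the class name ProxyErrorSketch, the RemoteResult holder, the stub server, and the sample error text are all hypothetical.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class ProxyErrorSketch {

    /** Status and body of the remote response, kept together so both can be relayed. */
    static final class RemoteResult {
        final int status;
        final String body;

        RemoteResult(int status, String body) {
            this.status = status;
            this.body = body;
        }
    }

    /** Hypothetical stand-in for the remote node: always answers with the given status and message. */
    static HttpServer startStubServer(int status, String message) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            byte[] bytes = message.getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(status, bytes.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(bytes);
            }
        });
        server.start();
        return server;
    }

    /**
     * Forwards a GET to the remote URL. The body is read whatever the status is,
     * so an error response still carries its message back to the caller.
     */
    static RemoteResult forward(String url) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        return new RemoteResult(response.statusCode(), response.body());
    }

    public static void main(String[] args) throws Exception {
        HttpServer server = startStubServer(500, "request handler failed: bad field 'id'");
        try {
            int port = server.getAddress().getPort();
            RemoteResult result =
                    forward("http://127.0.0.1:" + port + "/solr/myCollection/update");
            // A proxy would copy both values onto its own response here.
            System.out.println(result.status + " " + result.body);
        } finally {
            server.stop(0);
        }
    }
}
```

The point of the sketch is only the shape of forward(): a non-2xx status is treated as a response to relay, not as a reason to throw before the body has been read.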

      1. SOLR-5700.patch
        11 kB
        Steve Davids
      2. SOLR-5700.patch
        11 kB
        Gregory Chanan
      3. SOLR-5700v2.patch
        11 kB
        Gregory Chanan

          Activity

          Gregory Chanan added a comment -

          Here's a patch against trunk, as well as a test case that checks that reasonable exceptions are returned.

          Mark Miller added a comment -

          Hmm...I'm running into an issue with BasicDistributedZk2Test#testNodeWithoutCollectionForwarding - consistent fail.

          I also think you want to use method.abort in any case where the streams won't actually be fully read.

          Gregory Chanan added a comment -

          Thanks Mark, I'll look into the test failure and method.abort. I haven't run the tests in a few days.

          Gregory Chanan added a comment -

          Here's another rev at this patch.

          This fixes the test failure and calls abort if the remote query is not successful.
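
For readers unfamiliar with why this matters: a pooled HTTP connection can generally only be reused once its response stream has been read to the end, so a caller that stops reading early either has to drain the remainder or abort the request. The sketch below illustrates the "drain in a finally block" half of that choice using plain java.io streams (Java 11+). It is an analogy rather than the patch's code; the class DrainSketch and its methods are hypothetical, and in HttpClient 4.x terms the drain step roughly corresponds to consuming the HttpEntity (for example via EntityUtils.consumeQuietly), while abort() on the request is the alternative when draining is not worthwhile.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class DrainSketch {

    /** Reads and discards whatever is left in the stream; returns the number of bytes discarded. */
    static long drainQuietly(InputStream in) {
        long discarded = 0;
        byte[] buf = new byte[4096];
        try {
            int n;
            while ((n = in.read(buf)) != -1) {
                discarded += n;
            }
        } catch (IOException ignored) {
            // Best effort only: a failure while discarding is not worth surfacing.
        }
        return discarded;
    }

    /**
     * Reads at most 'limit' bytes, as a caller that bails out early would,
     * and then drains the rest so the stream always ends up fully consumed.
     */
    static byte[] readPrefixThenDrain(InputStream in, int limit) throws IOException {
        try {
            return in.readNBytes(limit);
        } finally {
            drainQuietly(in);
        }
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(new byte[] {1, 2, 3, 4, 5, 6, 7, 8, 9, 10});
        byte[] head = readPrefixThenDrain(in, 4);
        System.out.println("kept " + head.length + " bytes, stream exhausted: " + (in.read() == -1));
    }
}
```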

          Mark Miller added a comment -

          Hmm...I've seen some weird test failures while testing this out. I don't know whether it's just bad luck or this patch, so I'll spend some more time later running the tests. I need to see whether the same things happen with a clean checkout.

          Gregory Chanan added a comment -

          Interesting. FWIW I put a divide by zero in the remoteQuery code and ran the tests and the following tests failed:
          org.apache.solr.cloud.AliasIntegrationTest.testDistribSearch
          org.apache.solr.cloud.BasicDistributedZk2Test.testDistribSearch
          org.apache.solr.cloud.hdfs.StressHdfsTest.testDistribSearch

          So if you are seeing these tests fail, it could be due to this patch, otherwise probably not.

          Steve Davids added a comment -

          Created a new patch that always consumes the response's HttpEntity and cleaned up the test a little.

          After applying the patch to the trunk there was only one error that occurred for me:

             [junit4] Tests with failures:
             [junit4]   - org.apache.solr.cloud.OverseerRolesTest.testDistribSearch
             [junit4] 
             [junit4] 
             [junit4] JVM J0:     2.43 ..   720.78 =   718.34s
             [junit4] JVM J1:     2.43 ..   720.87 =   718.44s
             [junit4] JVM J2:     1.94 ..   718.97 =   717.03s
             [junit4] JVM J3:     2.18 ..   721.04 =   718.86s
             [junit4] JVM J4:     2.43 ..   720.07 =   717.64s
             [junit4] JVM J5:     2.43 ..   723.96 =   721.53s
             [junit4] JVM J6:     2.43 ..   719.30 =   716.88s
             [junit4] JVM J7:     2.43 ..   730.55 =   728.12s
             [junit4] Execution time total: 12 minutes 10 seconds
             [junit4] Tests summary: 369 suites, 1596 tests, 1 error, 26 ignored (13 assumptions)
          

          I believe OverseerRolesTest is a known flakey test associated with SOLR-5476.

          Mark Miller added a comment -

          I've had some freaky luck with that patch applied - but I've convinced myself it's some wicked coincidence. I'll commit and if there is indeed some crazy interaction, the jenkins cluster will ferret it out pretty quickly.

          ASF subversion and git services added a comment -

          Commit 1566174 from Mark Miller in branch 'dev/trunk'
          [ https://svn.apache.org/r1566174 ]

          SOLR-5700: Improve error handling of remote queries (proxied requests).

          Mark Miller added a comment (edited) -

          Steve Davids - whoops - didn't see your comment - I was on an old page. Thanks! I'll merge in your work.

          "OverseerRolesTest is a known flakey test"

          Yeah, we should probably ignore it until something is done. It's failing on jenkins regularly and on my machine over 50% of the time.

          ASF subversion and git services added a comment -

          Commit 1566176 from Mark Miller in branch 'dev/branches/branch_4x'
          [ https://svn.apache.org/r1566176 ]

          SOLR-5700: Improve error handling of remote queries (proxied requests).

          ASF subversion and git services added a comment -

          Commit 1566179 from Mark Miller in branch 'dev/trunk'
          [ https://svn.apache.org/r1566179 ]

          SOLR-5700: Always consumes the response's HttpEntity and cleaned up the test a little.

          ASF subversion and git services added a comment -

          Commit 1566180 from Mark Miller in branch 'dev/branches/branch_4x'
          [ https://svn.apache.org/r1566180 ]

          SOLR-5700: Always consumes the response's HttpEntity and cleaned up the test a little.

          Mark Miller added a comment -

          Thanks guys, much appreciated!


            People

            • Assignee:
              Mark Miller
              Reporter:
              Gregory Chanan
            • Votes:
              0
              Watchers:
              5
