Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 6.4, 7.0
    • Component/s: metrics
    • Security Level: Public (Default Security Level. Issues are Public)
    • Labels:
      None

      Description

      Use instrumented equivalents of PoolingHttpClientConnectionManager and other classes from the metrics-httpclient library.

      1. SOLR_9877_branch_6x_hostport_fix.patch
        2 kB
        Shalin Shekhar Mangar
      2. SOLR-9877_branch_6x.patch
        27 kB
        Shalin Shekhar Mangar
      3. SOLR-9877.patch
        22 kB
        Shalin Shekhar Mangar
      4. SOLR-9877.patch
        23 kB
        Shalin Shekhar Mangar
      5. solr-http-metrics.png
        99 kB
        Shalin Shekhar Mangar

        Activity

        shalinmangar Shalin Shekhar Mangar added a comment -

        This patch adds instrumentation for HttpShardHandlerFactory. I'm going to add metrics to UpdateShardHandler along similar lines. The patch adds the metrics-http library to Solr, but I am going to remove it since it is not flexible enough for our API. Instead, I've added Solr-specific subclasses of PoolingHttpClientConnectionManager and HttpRequestExecutor which implement the SolrMetricProducer interface.

        shalinmangar Shalin Shekhar Mangar added a comment -

        Attached is a screenshot of the HTTP metrics collected by the last patch.

        shalinmangar Shalin Shekhar Mangar added a comment -
        • Added instrumentation to update shard handler
        • Removed dependency on dropwizard-httpclient library

        Precommit passes. This is ready.

        jira-bot ASF subversion and git services added a comment -

        Commit 254473bf33ee7ce33a47c9229396902e812736e5 in lucene-solr's branch refs/heads/master from Shalin Shekhar Mangar
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=254473b ]

        SOLR-9877: Use instrumented http client and connection pool

        mkhludnev Mikhail Khludnev added a comment -

        Hello,

        I noticed a reproducible failure: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/18637/testReport/

           [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=SolrCmdDistributorTest -Dtests.method=test -Dtests.seed=234F7262BA762194 -Dtests.multiplier=3 -Dtests.slow=true -Dtests.locale=ki-KE -Dtests.timezone=Etc/GMT+4 -Dtests.asserts=true -Dtests.file.encoding=ISO-8859-1
           [junit4] FAILURE 7.82s J1 | SolrCmdDistributorTest.test <<<
           [junit4]    > Throwable #1: java.lang.AssertionError: expected:<1> but was:<0>
           [junit4]    > 	at __randomizedtesting.SeedInfo.seed([234F7262BA762194:AB1B4DB8148A4C6C]:0)
           [junit4]    > 	at org.apache.solr.update.SolrCmdDistributorTest.test(SolrCmdDistributorTest.java:169)
        

        The test has been failing since https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/18631/changes
        It appears together with this error:

           [junit4]   2> 1122459 ERROR (updateExecutor-1740-thread-1) [    ] o.a.s.u.StreamingSolrClients error
           [junit4]   2> java.lang.AssertionError
           [junit4]   2> 	at org.apache.solr.util.stats.InstrumentedHttpRequestExecutor.execute(InstrumentedHttpRequestExecutor.java:56)
           [junit4]   2> 	at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:271)
           [junit4]   2> 	at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:184)
           [junit4]   2> 	at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:88)
           [junit4]   2> 	at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
           [junit4]   2> 	at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184)
           [junit4]   2> 	at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
           [junit4]   2> 	at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:55)
           [junit4]   2> 	at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.sendUpdateStream(ConcurrentUpdateSolrClient.java:323)
           [junit4]   2> 	at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:186)
           [junit4]   2> 	at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
           [junit4]   2> 	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1161)
           [junit4]   2> 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
           [junit4]   2> 	at java.base/java.lang.Thread.run(Thread.java:844)
           [junit4]   2> 1122461 INFO  (qtp979734836-10317) [    ] o.a.s.c.S.Request [collection1]  webapp=/_z path=/select params={q=*:*&wt=javabin&version=2} hits=0 status=0 QTime=0
        

        Could it be related to this JIRA issue?

        shalinmangar Shalin Shekhar Mangar added a comment -

        Yes, I'll fix, thanks!

        jira-bot ASF subversion and git services added a comment -

        Commit c2292faaf1f4993bf1cec666f4286ac71f786506 in lucene-solr's branch refs/heads/master from Shalin Shekhar Mangar
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=c2292fa ]

        SOLR-9877: Remove assertion because many tests use UpdateShardHandler without metrics

        markrmiller@gmail.com Mark Miller added a comment -

        SolrCmdDistributorTest seems to be exposing a race of some kind. It fails on 99% of my local runs.

        InstrumentedHttpRequestExecutor hits an NPE in Timer timer(HttpRequest request). I'd guess that is because the metricsRegistry has not been assigned yet.

        jira-bot ASF subversion and git services added a comment -

        Commit 662be93ed11abebaff1d13711f3bffca478ba61e in lucene-solr's branch refs/heads/master from Shalin Shekhar Mangar
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=662be93 ]

        SOLR-9877: Null check for metric registry before attempting to use it
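
        The kind of guard this commit message describes can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual Solr change: SketchRegistry and SketchInstrumentedExecutor are hypothetical stand-ins written against the JDK only, and the sketch counts requests instead of timing them to stay short.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Supplier;

// Hypothetical stand-in for a metrics registry; not the Dropwizard or Solr class.
final class SketchRegistry {
    private final Map<String, LongAdder> counts = new ConcurrentHashMap<>();

    void record(String name) {
        counts.computeIfAbsent(name, key -> new LongAdder()).increment();
    }

    long count(String name) {
        LongAdder adder = counts.get(name);
        return adder == null ? 0L : adder.sum();
    }
}

// Hypothetical executor whose registry is assigned some time after construction.
final class SketchInstrumentedExecutor {
    // volatile so that an assignment made on another thread is visible here
    private volatile SketchRegistry registry;

    void initializeMetrics(SketchRegistry registry) {
        this.registry = registry;
    }

    <T> T execute(String metricName, Supplier<T> request) {
        // Read the field once so the null check and the use see the same value.
        SketchRegistry current = registry;
        T result = request.get();
        if (current != null) {
            // Metrics are recorded only when a registry has been assigned.
            current.record(metricName);
        }
        return result;
    }
}
```

        With a guard like this, callers that never assign a registry (as many tests reportedly do with UpdateShardHandler) still get their request executed; it is simply not recorded.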

        shalinmangar Shalin Shekhar Mangar added a comment -

        Here's a patch that applies on branch_6x. It required a different approach than master because 6.x uses an older deprecated http client API. I ended up extending DefaultHttpClient to override the createRequestExecutor() method which creates and sets up the InstrumentedHttpRequestExecutor.

        This does not use the HttpClientFactory and its methods introduced in SOLR-6625. However, the factory's static setter is never used in Solr, and I'll open an issue to get rid of it completely.
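
        The approach of overriding createRequestExecutor() in a client subclass can be sketched roughly as below. All types here are hypothetical stand-ins (SketchClient plays the role of the base client, SketchCountingExecutor the role of the instrumented executor); the sketch only shows the factory-method override, not the real HttpClient or Solr classes.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a plain request executor.
class SketchExecutor {
    String execute(String request) {
        return "response to " + request;
    }
}

// Hypothetical instrumented executor: counts every request it executes.
final class SketchCountingExecutor extends SketchExecutor {
    private final AtomicLong requests = new AtomicLong();

    @Override
    String execute(String request) {
        requests.incrementAndGet();
        return super.execute(request);
    }

    long requestCount() {
        return requests.get();
    }
}

// Hypothetical base client that creates its executor lazily through a factory method.
class SketchClient {
    private SketchExecutor executor;

    protected SketchExecutor createRequestExecutor() {
        return new SketchExecutor();
    }

    final synchronized SketchExecutor getRequestExecutor() {
        if (executor == null) {
            executor = createRequestExecutor();
        }
        return executor;
    }

    String send(String request) {
        return getRequestExecutor().execute(request);
    }
}

// The subclass swaps in the instrumented executor by overriding the factory method.
final class SketchInstrumentedClient extends SketchClient {
    private final SketchCountingExecutor counting = new SketchCountingExecutor();

    @Override
    protected SketchExecutor createRequestExecutor() {
        return counting;
    }

    long requestCount() {
        return counting.requestCount();
    }
}
```

        The design choice illustrated here is that callers keep using the client as before; only the executor it creates internally changes.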

        jira-bot ASF subversion and git services added a comment -

        Commit a50ebcb412b1a884b826b62418e9f5d8b3c1f40c in lucene-solr's branch refs/heads/branch_6x from Shalin Shekhar Mangar
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=a50ebcb ]

        SOLR-9877: Use instrumented http client and connection pool

        jira-bot ASF subversion and git services added a comment -

        Commit f65dc06180bdb02cfbfa048e2f08d1183c250d5d in lucene-solr's branch refs/heads/branch_6x from Shalin Shekhar Mangar
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=f65dc06 ]

        SOLR-9877: Null check for metric registry before attempting to use it

        (cherry picked from commit 662be93)

        shalinmangar Shalin Shekhar Mangar added a comment -

        Re-opening because the host/port is not recorded correctly for outgoing HTTP requests on branch_6x, since the httpclient API there differs from the one used on master.

        shalinmangar Shalin Shekhar Mangar added a comment -

        Patch which unwraps the EntityEnclosingRequestWrapper to get the right URI that has host/port information.
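
        A rough sketch of the unwrapping idea, assuming a wrapper that exposes only the request path while the original request still carries the absolute URI. SketchRequest, SketchPlainRequest, SketchWrappedRequest and SketchMetricNames are hypothetical names built on java.net.URI; they are not the HttpClient or Solr types, and the null check on the original request is included as a defensive assumption.

```java
import java.net.URI;

// Hypothetical request abstraction.
interface SketchRequest {
    URI uri();
}

// Hypothetical plain request that carries the absolute URI.
final class SketchPlainRequest implements SketchRequest {
    private final URI uri;

    SketchPlainRequest(URI uri) {
        this.uri = uri;
    }

    @Override
    public URI uri() {
        return uri;
    }
}

// Hypothetical wrapper that exposes only the path, so host and port are lost.
final class SketchWrappedRequest implements SketchRequest {
    private final SketchRequest original;

    SketchWrappedRequest(SketchRequest original) {
        this.original = original;
    }

    SketchRequest original() {
        return original;
    }

    @Override
    public URI uri() {
        return original == null ? URI.create("/") : URI.create(original.uri().getRawPath());
    }
}

final class SketchMetricNames {
    // Returns "host:port" for the request, unwrapping a wrapper when possible.
    static String hostPort(SketchRequest request) {
        SketchRequest target = request;
        if (request instanceof SketchWrappedRequest) {
            SketchRequest original = ((SketchWrappedRequest) request).original();
            if (original != null) { // the original may be missing
                target = original;
            }
        }
        URI uri = target.uri();
        if (uri.getHost() == null) {
            return "unknown";
        }
        // getPort() returns -1 when the URI does not specify a port
        return uri.getPort() == -1 ? uri.getHost() : uri.getHost() + ":" + uri.getPort();
    }
}
```

        Without the unwrapping step, the wrapper in this sketch would yield "unknown" for every request, which mirrors the symptom of host/port not being recorded.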

        jira-bot ASF subversion and git services added a comment -

        Commit fd2c8cb125c1955940bd33f19ee06b4230f38a36 in lucene-solr's branch refs/heads/branch_6x from Shalin Shekhar Mangar
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=fd2c8cb ]

        SOLR-9877: Unwrap the EntityEnclosingRequestWrapper to get the right URI which has host/port information

        jira-bot ASF subversion and git services added a comment -

        Commit 3eab1b4839e30d5a82923afeff1bc19bf8e6b25f in lucene-solr's branch refs/heads/master from Shalin Shekhar Mangar
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=3eab1b4 ]

        SOLR-9877: Add a null check for target

        shalinmangar Shalin Shekhar Mangar added a comment -

        I added a null check for the original request on branch_6x and for the target on master in this commit.


          People

          • Assignee:
            shalinmangar Shalin Shekhar Mangar
            Reporter:
            shalinmangar Shalin Shekhar Mangar
          • Votes:
            0
            Watchers:
            5
