"Memory Pool not found" error when reporting JVM metrics

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 6.5, 7.0
    • Fix Version/s: 6.6, 7.0
    • Component/s: metrics
    • Security Level: Public (Default Security Level. Issues are Public)
    • Labels:
      None

      Description

      These test failures are likely caused by a JVM bug. We should catch the error and work around it so that we can still report the other existing metrics:

      https://jenkins.thetaphi.de/job/Lucene-Solr-6.x-Linux/3138/testReport/junit/org.apache.solr.handler.admin/MetricsHandlerTest/testCompact/
      
      java.lang.InternalError: Memory Pool not found
      	at __randomizedtesting.SeedInfo.seed([8F4813A324434093:A1485FF45CBE4A6C]:0)
      	at sun.management.MemoryPoolImpl.getUsage0(Native Method)
      	at sun.management.MemoryPoolImpl.getUsage(MemoryPoolImpl.java:96)
      	at com.codahale.metrics.jvm.MemoryUsageGaugeSet$18.getValue(MemoryUsageGaugeSet.java:177)
      	at com.codahale.metrics.jvm.MemoryUsageGaugeSet$18.getValue(MemoryUsageGaugeSet.java:174)
      	at org.apache.solr.util.stats.MetricUtils.convertGauge(MetricUtils.java:215)
      	at org.apache.solr.util.stats.MetricUtils.lambda$toMaps$4(MetricUtils.java:142)
      	at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
      	at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
      	at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
      	at java.util.TreeMap$KeySpliterator.forEachRemaining(TreeMap.java:2746)
      	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
      	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
      	at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
      	at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
      	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
      	at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
      	at org.apache.solr.util.stats.MetricUtils.toMaps(MetricUtils.java:135)
      	at org.apache.solr.util.stats.MetricUtils.toNamedList(MetricUtils.java:117)
      	at org.apache.solr.handler.admin.MetricsHandler.handleRequestBody(MetricsHandler.java:85)
      	at org.apache.solr.handler.admin.MetricsHandlerTest.testCompact(MetricsHandlerTest.java:160)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      

      See here for a possible explanation (thanks Hoss!): https://bugs.openjdk.java.net/browse/JDK-8025089

      1. SOLR-10362.patch
        4 kB
        Andrzej Bialecki

        Activity

        ab Andrzej Bialecki added a comment -

        This happens when accessing a gauge implementation that is provided by metrics (MemoryUsageGaugeSet), so we can catch this only in MetricUtils.convertGauge - it would be better to catch this earlier but that would mean reimplementing the gauge set.

        ab Andrzej Bialecki added a comment - - edited

        Tentative workaround that catches the error and logs it. Uwe Schindler, this may be worth mentioning to the Oracle guys since it happened in a JDK9 run.
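
        A minimal sketch of what such a catch-and-log workaround could look like. This is not the attached patch: SketchGauge and GaugeConversionSketch are hypothetical stand-ins (the real code uses the metrics library's gauge type inside MetricUtils.convertGauge), and logging goes to stderr only to keep the example self-contained.

```java
// Hypothetical stand-in for the metrics library's gauge type,
// declared here only so the sketch compiles on its own.
interface SketchGauge<T> {
    T getValue();
}

final class GaugeConversionSketch {
    // Reads a gauge value. If the JVM raises InternalError while reading,
    // the error is logged and null is returned so that the remaining
    // metrics can still be reported.
    static Object convertGauge(SketchGauge<?> gauge) {
        try {
            return gauge.getValue();
        } catch (InternalError ie) {
            // Assumption: the real code would use the project's logger.
            System.err.println("Error reading gauge value, skipping it: " + ie);
            return null;
        }
    }
}
```

        With this shape a failing memory-pool gauge shows up as a missing value rather than aborting the whole metrics response. Note that it swallows every InternalError, whatever its cause.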

        jira-bot ASF subversion and git services added a comment -

        Commit cb20eae1789442286f680f8dcfaf914394aed7a3 in lucene-solr's branch refs/heads/master from Andrzej Bialecki
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=cb20eae ]

        SOLR-10362: "Memory Pool not found" error when reporting JVM metrics.

        jira-bot ASF subversion and git services added a comment -

        Commit d010a9da75cdec1f7e4fd8a906e7ff2114aea33d in lucene-solr's branch refs/heads/branch_6x from Andrzej Bialecki
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=d010a9d ]

        SOLR-10362: "Memory Pool not found" error when reporting JVM metrics.

        hossman Hoss Man added a comment -

        "This happens when accessing a gauge implementation that is provided by metrics (MemoryUsageGaugeSet), so we can catch this only in MetricUtils.convertGauge - it would be better to catch this earlier but that would mean reimplementing the gauge set."

        InternalError is a really broad scope of error to ignore/log just because we assume it must be this specific situation ... can we at least improve the catch block to check the name/class of the Gauge to confirm it is in fact part of the MemoryUsageGaugeSet, and if not, re-throw as is?

        Or at the very least: check that the InternalError message mentions "Memory Pool"?
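
        The narrower check suggested above might look roughly like the following sketch. It is not the committed change: PoolGauge and NarrowCatchSketch are hypothetical names, and only the message check is shown. A check on the gauge's class name could be added to the same condition.

```java
// Hypothetical stand-in for the metrics library's gauge type.
interface PoolGauge<T> {
    T getValue();
}

final class NarrowCatchSketch {
    // Swallows InternalError only when its message mentions "Memory Pool";
    // any other InternalError is re-thrown unchanged.
    static Object readGauge(PoolGauge<?> gauge) {
        try {
            return gauge.getValue();
        } catch (InternalError ie) {
            String message = ie.getMessage();
            if (message != null && message.contains("Memory Pool")) {
                // Assumption: the real code would use the project's logger.
                System.err.println("Skipping gauge, JVM reported: " + message);
                return null;
            }
            throw ie; // unrelated failure: do not hide it
        }
    }
}
```

        The point of the extra condition is that an unrelated InternalError still surfaces to the caller instead of being silently logged.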

        jira-bot ASF subversion and git services added a comment -

        Commit 7c2e908fc2ad4089d1c36b820131641e6b696de7 in lucene-solr's branch refs/heads/branch_6x from Andrzej Bialecki
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=7c2e908 ]

        SOLR-10362 Be more specific when catching this exception.

        jira-bot ASF subversion and git services added a comment -

        Commit 30f7914c3b8ed990fcc0812f10de21722e96469f in lucene-solr's branch refs/heads/master from Andrzej Bialecki
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=30f7914 ]

        SOLR-10362 Be more specific when catching this exception.


          People

          • Assignee:
            ab Andrzej Bialecki
            Reporter:
            ab Andrzej Bialecki