  2. HBASE-11767

[0.94] Unnecessary garbage produced by schema metrics during scanning

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.94.23
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Near the end of StoreScanner.next(...) we find this gem:

          } finally {
            if (cumulativeMetric > 0 && metric != null) {
              RegionMetricsStorage.incrNumericMetric(this.metricNamePrefix + metric,
                  cumulativeMetric);
            }
          }
      

      So for each row generated we build a new metric name string, even though it is identical for every invocation on a given StoreScanner (a store scanner is valid for at most one region and one operation).
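
      A minimal sketch of why this pattern produces garbage. The class and method names below are hypothetical stand-ins, not HBase code; only the concatenation pattern is taken from the snippet above. Concatenating two non-constant strings at run time always yields a newly created String, so every call allocates a fresh key even though the contents never change:

```java
public class MetricNameGarbageDemo {
    // Hypothetical helper mirroring "this.metricNamePrefix + metric" from the
    // snippet above: both operands are non-constant, so each call allocates
    // a new String (plus the temporary builder and char[] behind it).
    public static String buildKey(String prefix, String metric) {
        return prefix + metric;
    }

    public static void main(String[] args) {
        String prefix = "tbl.t1.cf.f1."; // illustrative prefix, not a real HBase value
        String metric = "nextsize";      // illustrative metric name

        String first = buildKey(prefix, metric);
        String second = buildKey(prefix, metric);

        // Same contents, but two distinct objects: one is garbage per call.
        System.out.println("equal contents: " + first.equals(second));
        System.out.println("same object:    " + (first == second));
    }
}
```

      Calling buildKey once per row therefore allocates at least one short-lived String per row, which is the overhead this issue is about.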

      1. 11767.txt
        3 kB
        Lars Hofhansl

        Activity

        Hudson added a comment -

        FAILURE: Integrated in HBase-0.94-security #512 (See https://builds.apache.org/job/HBase-0.94-security/512/)
        HBASE-11767 [0.94] Unnecessary garbage produced by schema metrics during scanning. (larsh: rev 0ea4b86b07b32d46b23f4f35de032370d64dd021)

        • src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
        • src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
        Hudson added a comment -

        SUCCESS: Integrated in HBase-0.94 #1402 (See https://builds.apache.org/job/HBase-0.94/1402/)
        HBASE-11767 [0.94] Unnecessary garbage produced by schema metrics during scanning. (larsh: rev 0ea4b86b07b32d46b23f4f35de032370d64dd021)

        • src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
        • src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
        Hudson added a comment -

        FAILURE: Integrated in HBase-0.94-JDK7 #171 (See https://builds.apache.org/job/HBase-0.94-JDK7/171/)
        HBASE-11767 [0.94] Unnecessary garbage produced by schema metrics during scanning. (larsh: rev 0ea4b86b07b32d46b23f4f35de032370d64dd021)

        • src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
        • src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
        Lars Hofhansl added a comment -

        Some more stats. This creates 5 unneeded objects per row. The char[] array created under the hood (because the StringBuilder has to be resized) is the largest source of garbage during scanning. This patch removes that completely.

        Lars Hofhansl added a comment -

        Committed to 0.94.
        Thanks for having a look, Matteo Bertozzi.

        Lars Hofhansl added a comment -

        Verified that String/StringBuilder/char[] objects are no longer created per row, saving a whopping 520 MB of garbage in my 20 million row scan test.
        (With the recommended young gen size for HBase, around 256 MB, this saves two entire young collections.)

        Matteo Bertozzi added a comment -

        +1

        Lars Hofhansl added a comment -

        Not pretty (because I cannot change the API in 0.94).

        Exploits the fact that we pass SchemaMetrics.METRIC_NEXTSIZE through the layers, and hence uses an identity check to detect it and then prebuilds the metric name for the duration of the scanner.
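
        The idea can be sketched roughly as follows. This is an illustration under assumptions, not the committed patch: the class, the field metricNameNextSize, the method metricKey, and the value of the constant are all hypothetical; only the identity check against a shared constant and the prebuilt name come from the description above.

```java
public class PrebuiltMetricNameSketch {
    // Stand-in for SchemaMetrics.METRIC_NEXTSIZE; the real value may differ.
    public static final String METRIC_NEXTSIZE = "nextsize";

    private final String metricNamePrefix;
    // Built once when the scanner is created, reused for every row.
    private final String metricNameNextSize;

    public PrebuiltMetricNameSketch(String metricNamePrefix) {
        this.metricNamePrefix = metricNamePrefix;
        this.metricNameNextSize = metricNamePrefix + METRIC_NEXTSIZE;
    }

    // Returns the full metric key. When the caller passes the shared constant
    // itself (checked by reference, not equals), no new String is allocated.
    public String metricKey(String metric) {
        if (metric == METRIC_NEXTSIZE) {
            return metricNameNextSize;
        }
        // Fallback for any other metric: same behavior as before the change.
        return metricNamePrefix + metric;
    }

    public static void main(String[] args) {
        PrebuiltMetricNameSketch scanner = new PrebuiltMetricNameSketch("tbl.t1.cf.f1.");
        String a = scanner.metricKey(METRIC_NEXTSIZE);
        String b = scanner.metricKey(METRIC_NEXTSIZE);
        System.out.println("reused object: " + (a == b));
    }
}
```

        The reference comparison is cheap and leaves the method signature untouched, which fits the constraint that the API cannot change in 0.94. Any metric other than the shared constant still takes the original concatenation path.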


          People

          • Assignee:
            Lars Hofhansl
            Reporter:
            Lars Hofhansl
          • Votes:
            0
            Watchers:
            3
