HBase / HBASE-26108

Add an option to disable scanMetrics in TableSnapshotInputFormat

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.3.5
    • Fix Version/s: 2.3.6, 3.0.0-alpha-2, 2.4.5
    • Component/s: None
    • Labels: None

    Description

      When running a Spark job with TableSnapshotInputFormat, we found that scans are very slow. scanMetrics is hardcoded as enabled, and Spark's newAPIHadoopRDD uses Hadoop's DummyReporter, which causes the exception in the stack trace below; 80% of CPU time is spent handling this exception.

      We need to provide an option to disable scanMetrics.
      java.base@11.0.5/java.lang.Throwable.fillInStackTrace(Native Method)
      java.base@11.0.5/java.lang.Throwable.fillInStackTrace(Throwable.java:787) => holding Monitor(java.util.MissingResourceException@258206255})
      java.base@11.0.5/java.lang.Throwable.<init>(Throwable.java:292)
      java.base@11.0.5/java.lang.Exception.<init>(Exception.java:84)
      java.base@11.0.5/java.lang.RuntimeException.<init>(RuntimeException.java:80)
      java.base@11.0.5/java.util.MissingResourceException.<init>(MissingResourceException.java:85)
      java.base@11.0.5/java.util.ResourceBundle.throwMissingResourceException(ResourceBundle.java:2055)
      java.base@11.0.5/java.util.ResourceBundle.getBundleImpl(ResourceBundle.java:1689)
      java.base@11.0.5/java.util.ResourceBundle.getBundleImpl(ResourceBundle.java:1593)
      java.base@11.0.5/java.util.ResourceBundle.getBundle(ResourceBundle.java:1284)
      app//org.apache.hadoop.mapreduce.util.ResourceBundles.getBundle(ResourceBundles.java:37)
      app//org.apache.hadoop.mapreduce.util.ResourceBundles.getValue(ResourceBundles.java:56) => holding Monitor(java.lang.Class@545605549})
      app//org.apache.hadoop.mapreduce.util.ResourceBundles.getCounterGroupName(ResourceBundles.java:77)
      app//org.apache.hadoop.mapreduce.counters.CounterGroupFactory.newGroup(CounterGroupFactory.java:94)
      app//org.apache.hadoop.mapreduce.counters.AbstractCounters.getGroup(AbstractCounters.java:227)
      app//org.apache.hadoop.mapreduce.counters.AbstractCounters.findCounter(AbstractCounters.java:154)
      app//org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl$DummyReporter.getCounter(TaskAttemptContextImpl.java:110)
      app//org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl.getCounter(TaskAttemptContextImpl.java:76)
      org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl.updateCounters(TableRecordReaderImpl.java:311)
      org.apache.hadoop.hbase.mapreduce.TableSnapshotInputFormat$TableSnapshotRegionRecordReader.nextKeyValue(TableSnapshotInputFormat.java:167)
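
The cost pattern in the stack trace above can be reproduced with the JDK alone. The following is a minimal sketch, not the HBase or Hadoop code: the class name, the `scanMetricsEnabled` flag, and the bundle name `HypotheticalScanMetricsGroup` are all assumptions made for illustration. It shows that a `ResourceBundle` lookup for a missing bundle builds a `MissingResourceException` on every call, and that a simple boolean guard (standing in for the requested option) avoids the lookup entirely.

```java
import java.util.MissingResourceException;
import java.util.ResourceBundle;

public class ScanMetricsGuardSketch {
    // Counts how many times a bundle lookup failed with an exception.
    static int failedLookups = 0;

    // Stand-in for the counter-group lookup in the stack trace: the bundle
    // does not exist, so getBundle throws MissingResourceException each call.
    static String lookupGroupName(String bundleName) {
        try {
            return ResourceBundle.getBundle(bundleName).getString("CounterGroupName");
        } catch (MissingResourceException e) {
            failedLookups++;
            return bundleName; // fall back to the raw name
        }
    }

    // Hypothetical per-row hook: only touches counters when metrics are enabled.
    static void updateCounters(boolean scanMetricsEnabled) {
        if (!scanMetricsEnabled) {
            return; // the guard that the requested option would provide
        }
        lookupGroupName("HypotheticalScanMetricsGroup");
    }

    // Simulates scanning `rows` rows and returns the number of exceptions raised.
    static int failuresFor(int rows, boolean scanMetricsEnabled) {
        failedLookups = 0;
        for (int i = 0; i < rows; i++) {
            updateCounters(scanMetricsEnabled);
        }
        return failedLookups;
    }

    public static void main(String[] args) {
        System.out.println("exceptions with metrics enabled:  " + failuresFor(1000, true));
        System.out.println("exceptions with metrics disabled: " + failuresFor(1000, false));
    }
}
```

With the guard off, the number of exceptions grows with the number of rows scanned, which matches the per-row `nextKeyValue` → `updateCounters` path in the trace; with the guard on, no exception is constructed at all.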

          People

            Assignee: Huaxiang Sun (huaxiangsun)
            Reporter: Huaxiang Sun (huaxiangsun)
            Votes: 0
            Watchers: 4
