[HBASE-20628] SegmentScanner does over-comparing when one flushing - ASF JIRA

Details

Type: Sub-task
Status: Resolved
Priority: Critical
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 3.0.0-alpha-1, 2.1.0, 2.0.1
Component/s: Performance
Labels:
None

Hadoop Flags:

Reviewed

Description

Flushing the memstore is taking too long. At flush time we appear to be doing a lot of extra comparing inside a facility that is new in hbase2, the SegmentScanner.

Below is a patch from anoop.hbase. I had a similar, more hacky version. Both undo the extra comparing we were seeing in perf tests.

anastas and eshcar, I need your help please.

As I read it, we are trying to flush the memstore snapshot (the default, no-IMC case). There is only ever going to be one Segment involved (even if IMC is enabled): the snapshot Segment. But getScanners returns a list (of one) scanner, and the scan goes via the generic SegmentScanner, which carries a lot of machinery we don't need when doing a flush, so it seems to do more work than is necessary. It also supports scanning backwards, which is not needed when flushing the memstore.

Do you see a problem with doing a version of Anoop's patch (whether IMC is enabled or not)? It makes a big difference in general throughput when the below patch is in place. Thanks.

diff --git a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreSnapshot.java b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreSnapshot.java
index cbd60e5da3..c3dd972254 100644
--- a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreSnapshot.java
+++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreSnapshot.java
@@ -40,7 +40,8 @@ public class MemStoreSnapshot implements Closeable {
     this.cellsCount = snapshot.getCellsCount();
     this.memStoreSize = snapshot.getMemStoreSize();
     this.timeRangeTracker = snapshot.getTimeRangeTracker();
-    this.scanners = snapshot.getScanners(Long.MAX_VALUE, Long.MAX_VALUE);
+    //this.scanners = snapshot.getScanners(Long.MAX_VALUE, Long.MAX_VALUE);
+    this.scanners = snapshot.getScannersForSnapshot();
     this.tagsPresent = snapshot.isTagsPresent();
   }

diff --git a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Segment.java b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Segment.java
index 70074bf3b4..279c4e50c8 100644
--- a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Segment.java
+++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Segment.java
@@ -33,6 +33,7 @@ import org.apache.hadoop.hbase.KeyValueUtil;
 import org.apache.hadoop.hbase.io.TimeRange;
 import org.apache.hadoop.hbase.util.Bytes;
 import org.apache.hadoop.hbase.util.ClassSize;
+import org.apache.hadoop.hbase.util.CollectionBackedScanner;
 import org.apache.yetus.audience.InterfaceAudience;
 import org.slf4j.Logger;
 import org.apache.hbase.thirdparty.com.google.common.annotations.VisibleForTesting;
@@ -130,6 +131,10 @@ public abstract class Segment {
     return Collections.singletonList(new SegmentScanner(this, readPoint, order));
   }

+  public List<KeyValueScanner> getScannersForSnapshot() {
+    return Collections.singletonList(new CollectionBackedScanner(this.cellSet.get(), comparator));
+  }
+
   /**
    * @return whether the segment has any cells
    */
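
The kind of overhead the patch is after can be illustrated outside HBase with a toy comparison. The sketch below is not HBase code: `FlushScanSketch`, `CountingComparator`, `seekingScan`, and `iteratorScan` are hypothetical names, and a `TreeSet<String>` stands in for the snapshot's cell set. It counts comparator calls for a scan that locates each next element by lookup (a stand-in for a general-purpose scanner doing per-cell comparator work) versus a scan that simply walks the already-sorted set in order, which, per the description above, is all a single-segment flush should need.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;
import java.util.concurrent.atomic.AtomicLong;

public class FlushScanSketch {

  /** Hypothetical helper: wraps String ordering and counts every compare() call. */
  public static final class CountingComparator implements Comparator<String> {
    public final AtomicLong calls = new AtomicLong();

    @Override
    public int compare(String a, String b) {
      calls.incrementAndGet();
      return a.compareTo(b);
    }
  }

  /**
   * Generic-style scan (assumption, for illustration only): each step finds
   * the next element with a comparator-driven lookup into the sorted set.
   */
  public static List<String> seekingScan(TreeSet<String> cells) {
    List<String> out = new ArrayList<>();
    String current = cells.isEmpty() ? null : cells.first();
    while (current != null) {
      out.add(current);
      current = cells.higher(current); // tree lookup, uses the comparator
    }
    return out;
  }

  /** Flush-style scan: walk the sorted set front to back with its iterator. */
  public static List<String> iteratorScan(TreeSet<String> cells) {
    List<String> out = new ArrayList<>();
    for (String cell : cells) { // in-order traversal, no comparator calls
      out.add(cell);
    }
    return out;
  }

  public static void main(String[] args) {
    CountingComparator cmp = new CountingComparator();
    TreeSet<String> cells = new TreeSet<>(cmp);
    for (int i = 0; i < 100_000; i++) {
      cells.add(String.format("row-%06d", i));
    }

    cmp.calls.set(0); // ignore the comparisons spent building the set
    seekingScan(cells);
    System.out.println("seeking scan comparisons:  " + cmp.calls.get());

    cmp.calls.set(0);
    iteratorScan(cells);
    System.out.println("iterator scan comparisons: " + cmp.calls.get());
  }
}
```

Both scans return the same cells in the same order; only the amount of comparator work differs. How closely this mirrors the real SegmentScanner cost is an assumption, so treat it as intuition for why a plain collection-backed scanner can be cheaper at flush time, not as a measurement.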

Attachments

HBASE-20628.branch-2.0.001.patch
23/May/18 18:16
2 kB
Michael Stack
HBASE-20628.branch-2.0.001.patch
23/May/18 18:13
2 kB
Michael Stack
HBASE-20628.branch-2.0.001 (1).patch
25/May/18 15:52
2 kB
Michael Stack
HBASE-20628.branch-2.0.002.patch
27/May/18 04:09
4 kB
Michael Stack
HBASE-20628.branch-2.0.003.patch
01/Jun/18 20:39
6 kB
Michael Stack
HBASE-20628.branch-2.0.004.patch
04/Jun/18 04:54
6 kB
Michael Stack
hits.003.png
01/Jun/18 21:00
12 kB
Michael Stack
hits-20628.png
23/May/18 18:15
12 kB
Michael Stack
Screen Shot 2018-05-25 at 9.38.00 AM.png
25/May/18 16:38
89 kB
Michael Stack

Issue Links

is related to

HBASE-20483 [PERFORMANCE] Flushing is 2x slower in hbase2.

Resolved

links to

Review Board (branch-2.0)

People

Assignee: Michael Stack

Reporter: Michael Stack

Votes: 0

Watchers: 8

Dates

Created: 23/May/18 17:17

Updated: 07/Jun/18 16:08

Resolved: 04/Jun/18 16:51