Details
-
Sub-task
-
Status: Resolved
-
Critical
-
Resolution: Fixed
-
None
-
None
-
Reviewed
Description
Flushing memstore is taking too long. It looks like we are doing a bunch of comparing out of a new facility in hbase2, the Segment scanner at flush time.
Below is a patch from anoop.hbase. I had a similar more hacky version. Both undo the extra comparing we were seeing in perf tests.
anastas and eshcar. Need your help please.
As I read it, we are trying to flush the memstore snapshot (default, no IMC case). There is only ever going to be one Segment involved (even if IMC is enabled); the snapshot Segment. But the getScanners is returning a list (of one) Scanners and the scan is via the generic SegmentScanner which is all about a bunch of stuff we don't need when doing a flush so it seems to do more work than is necessary. It also supports scanning backwards which is not needed when trying to flush memstore.
Do you see a problem doing a version of Anoops patch (whether IMC or not)? It makes a big difference in general throughput when the below patch is in place. Thanks.
diff --git a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreSnapshot.java b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreSnapshot.java index cbd60e5da3..c3dd972254 100644 --- a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreSnapshot.java +++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreSnapshot.java @@ -40,7 +40,8 @@ public class MemStoreSnapshot implements Closeable { this.cellsCount = snapshot.getCellsCount(); this.memStoreSize = snapshot.getMemStoreSize(); this.timeRangeTracker = snapshot.getTimeRangeTracker(); - this.scanners = snapshot.getScanners(Long.MAX_VALUE, Long.MAX_VALUE); + //this.scanners = snapshot.getScanners(Long.MAX_VALUE, Long.MAX_VALUE); + this.scanners = snapshot.getScannersForSnapshot(); this.tagsPresent = snapshot.isTagsPresent(); } diff --git a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Segment.java b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Segment.java index 70074bf3b4..279c4e50c8 100644 --- a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Segment.java +++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Segment.java @@ -33,6 +33,7 @@ import org.apache.hadoop.hbase.KeyValueUtil; import org.apache.hadoop.hbase.io.TimeRange; import org.apache.hadoop.hbase.util.Bytes; import org.apache.hadoop.hbase.util.ClassSize; +import org.apache.hadoop.hbase.util.CollectionBackedScanner; import org.apache.yetus.audience.InterfaceAudience; import org.slf4j.Logger; import org.apache.hbase.thirdparty.com.google.common.annotations.VisibleForTesting; @@ -130,6 +131,10 @@ public abstract class Segment { return Collections.singletonList(new SegmentScanner(this, readPoint, order)); } + public List<KeyValueScanner> getScannersForSnapshot() { + return Collections.singletonList(new CollectionBackedScanner(this.cellSet.get(), comparator)); + } + /** * @return whether the segment has any cells */
Attachments
Attachments
Issue Links
- is related to
-
HBASE-20483 [PERFORMANCE] Flushing is 2x slower in hbase2.
- Resolved
- links to