Uploaded image for project: 'HBase'
  1. HBase
  2. HBASE-26488

Memory leak when MemStore retry flushing

    XMLWordPrintableJSON

Details

    • Reviewed

    Description

      HStore.flushCache use following DefaultStoreFlusher.flushSnapshot to flush memStore and retry this method when this method throws exception(the default retry count is 10),but increasing reference count of MemStoreLABImpl which used by DefaultMemStore.snapshot in following line 55 and decreasing reference count of MemStoreLABImpl in following line 82 is asymmetrical:

      47 public List<Path> flushSnapshot(MemStoreSnapshot snapshot, long cacheFlushId,
      48      MonitoredTask status, ThroughputController throughputController,
      49      FlushLifeCycleTracker tracker) throws IOException {
      50    ArrayList<Path> result = new ArrayList<>();
      51   int cellsCount = snapshot.getCellsCount();
      52    if (cellsCount == 0) return result; // don't flush if there are no entries
      53
      54    // Use a store scanner to find which rows to flush.
      55    InternalScanner scanner = createScanner(snapshot.getScanners(), tracker);
      56    StoreFileWriter writer;
      57    try {
      58      // TODO:  We can fail in the below block before we complete adding this flush to
      59      //        list of store files.  Add cleanup of anything put on filesystem if we fail.
      60      synchronized (flushLock) {
      61        status.setStatus("Flushing " + store + ": creating writer");
      62       // Write the map out to the disk
      63        writer = store.createWriterInTmp(cellsCount,
      64            store.getColumnFamilyDescriptor().getCompressionType(), false, true,
      65           snapshot.isTagsPresent(), false);
      66        IOException e = null;
      67        try {
      68          performFlush(scanner, writer, throughputController);
      69        } catch (IOException ioe) {
      70          e = ioe;
      71          // throw the exception out
      72          throw ioe;
      73        } finally {
      74          if (e != null) {
      75            writer.close();
      76          } else {
      77            finalizeWriter(writer, cacheFlushId, status);
      78          }
      79        }
      80      }
      81    } finally {
      82      scanner.close();
      83    }
      ......
      

      The asymmetry because above line 55 snapshot.getScanners ,which is following MemStoreSnapshot.getScanners

             public List<KeyValueScanner> getScanners() {
                     return scanners;
             }
      

      MemStoreSnapshot.getScanners does not create new SegmentScanner, instead it reuses the SegmentScanner in following line 43 created in MemStoreSnapshot ctor:

      38 public MemStoreSnapshot(long id, ImmutableSegment snapshot) {
      39    this.id = id;
      40    this.cellsCount = snapshot.getCellsCount();
      41    this.memStoreSize = snapshot.getMemStoreSize();
      42    this.timeRangeTracker = snapshot.getTimeRangeTracker();
      43    this.scanners = snapshot.getSnapshotScanners();
      44    this.tagsPresent = snapshot.isTagsPresent();
      45 }
      

      So when flushing MemStore by DefaultStoreFlusher.flushSnapshot at first time, the reference count of MemStoreLABImpl which used by DefaultMemStore.snapshot is 1, and if DefaultStoreFlusher.flushSnapshot throws exception, the SegmentScanner of DefaultMemStore.snapshot is closed at
      the end of DefaultStoreFlusher.flushSnapshot , and the reference count of corresponding MemStoreLABImpl is decreased to 0.

      When retry flushing MemStore by DefaultStoreFlusher.flushSnapshot at second time, if it success, the SegmentScanner of DefaultMemStore.snapshot is closed again at the end of DefaultStoreFlusher.flushSnapshot , and the reference count of corresponding MemStoreLABImpl is decreased to -1. When flush is completed, DefaultMemStore.clearSnapshot is called to close DefaultMemStore.snapshot and corresponding MemStoreLABImpl , but in the following line 268, reference count is -1 not 0, so MemStoreLABImpl.recycleChunks is not called and the Chunks used by this MemStoreLABImpl is leaked.

      261  public void close() {
      262    if (!this.closed.compareAndSet(false, true)) {
      263      return;
      264    }
      265    // We could put back the chunks to pool for reusing only when there is no
      266    // opening scanner which will read their data
      267    int count  = openScannerCount.get();
      268    if(count == 0) {
      269      recycleChunks();
      270    }
      271  }
      

      My fix is for MemStoreSnapshot.getScanners, we call snapshot.getSnapshotScanners to create a new SegmentScanner, and we do not create SegmentScanner in MemStoreSnapshot ctor.

      Attachments

        Issue Links

          Activity

            People

              comnetwork chenglei
              comnetwork chenglei
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: