  1. HBase
  2. HBASE-22324

Massive data loss when the sequenceId of cells exceeds Integer.MAX_VALUE, because MemStoreMergerSegmentsIterator cannot merge segments


    Details

    • Hadoop Flags:
      Reviewed

      Description

      If the memstore type is CompactingMemStore, MemStoreMergerSegmentsIterator cannot merge memstore segments once the seqId of cells exceeds Integer.MAX_VALUE, and as a result a large amount of data is lost. The reason is that MemStoreMergerSegmentsIterator uses Integer.MAX_VALUE as the readPt when it creates the scanners, but the seqId of a cell is of type long and may be greater than Integer.MAX_VALUE. The code is as follows:

      public MemStoreMergerSegmentsIterator(List<ImmutableSegment> segments, CellComparator comparator,
          int compactionKVMax) throws IOException {
        super(compactionKVMax);
        // create the list of scanners to traverse over all the data
        // no dirty reads here as these are immutable segments
        AbstractMemStore.addToScanners(segments, Integer.MAX_VALUE, scanners); // bug: should use Long.MAX_VALUE
        heap = new KeyValueHeap(scanners, comparator);
      }
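
To make the failure mode concrete, here is a minimal sketch of what happens to the read point. It assumes the read point is held as a long, as the comparison in the scanner suggests. The class name and the visible method are hypothetical stand-ins for illustration, not HBase APIs.

```java
public class ReadPointWidening {
    // Hypothetical stand-in for the scanner's visibility check:
    // a cell is read only if its sequence id is <= the read point.
    static boolean visible(long seqId, long readPoint) {
        return seqId <= readPoint;
    }

    public static void main(String[] args) {
        // The int constant is silently widened to the long value 2147483647.
        long readPt = Integer.MAX_VALUE;
        // First sequence id that no longer fits in an int.
        long seqId = (long) Integer.MAX_VALUE + 1;

        System.out.println(readPt);                          // 2147483647
        System.out.println(visible(seqId, readPt));          // false: the cell is skipped
        System.out.println(visible(seqId, Long.MAX_VALUE));  // true: the cell is read
    }
}
```

The widening compiles without any warning, which is likely why the mistake is easy to overlook.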
      
      The relevant code in SegmentScanner.java is as follows:
      protected void updateCurrent() {
        Cell startKV = current;
        Cell next = null;
      
        try {
          while (iter.hasNext()) {
            next = iter.next();
            // if seqId > readPoint (Integer.MAX_VALUE), the cell is never read, so many cells are lost
            if (next.getSequenceId() <= this.readPoint) {
              current = next;
              return;// skip irrelevant versions
            }
            if (stopSkippingKVsIfNextRow &&   // for backwardSeek() stay in the
                startKV != null &&        // boundaries of a single row
                segment.compareRows(next, startKV) > 0) {
              current = null;
              return;
            }
          } // end of while
      
          current = null; // nothing found
        } finally {
          if (next != null) {
            // in all cases, remember the last KV we iterated to, needed for reseek()
            last = next;
          }
        }
      }
      
      MemStoreCompactorSegmentsIterator has the same bug:
      public MemStoreCompactorSegmentsIterator(List<ImmutableSegment> segments,
          CellComparator comparator, int compactionKVMax, HStore store) throws IOException {
        super(compactionKVMax);
      
        List<KeyValueScanner> scanners = new ArrayList<KeyValueScanner>();
        AbstractMemStore.addToScanners(segments, Integer.MAX_VALUE, scanners); // bug: should use Long.MAX_VALUE
        // build the scanner based on Query Matcher
        // reinitialize the compacting scanner for each instance of iterator
        compactingScanner = createScanner(store, scanners);
        refillKVS();
      }
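
The effect on a merge or compaction pass can be illustrated with a simplified model. The sketch below is not HBase code: MergeReadPointDemo and mergeVisible are hypothetical names, and cells are reduced to their sequence ids. It only shows how many cells survive the read-point filter under each choice of read point.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MergeReadPointDemo {
    // Hypothetical model of a merge pass: keep only the cells whose
    // sequence id is <= readPoint, as the scanner's filter does.
    static List<Long> mergeVisible(List<Long> seqIds, long readPoint) {
        List<Long> kept = new ArrayList<>();
        for (long seqId : seqIds) {
            if (seqId <= readPoint) {
                kept.add(seqId);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Two sequence ids within the int range, two beyond it.
        List<Long> seqIds = Arrays.asList(100L, 2147483647L, 2147483648L, 5000000000L);

        // Buggy read point: the two cells beyond the int range are dropped.
        System.out.println(mergeVisible(seqIds, Integer.MAX_VALUE).size()); // 2
        // Proposed read point: every cell survives the merge.
        System.out.println(mergeVisible(seqIds, Long.MAX_VALUE).size());    // 4
    }
}
```

Because the segments being merged are immutable, using Long.MAX_VALUE as the read point does not risk dirty reads; it simply ensures no cell is filtered out.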

            People

            • Assignee:
              chen.yang ChenYang
              Reporter:
              HB-CY chenyang