  1. HBase
  2. HBASE-12583

Allow creating reference files even when the split row does not lie in the storefile range, if required

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.0.0, 0.98.9
    • Component/s: None
    • Labels:
    • Hadoop Flags:
      Reviewed

      Description

      Currently, HRegionFileSystem#splitStoreFile does not create a reference file if the split row does not lie within the store file's key range, because in that case one of the child regions would get no data from that file.

          // Check whether the split row lies in the range of the store file
          // If it is outside the range, return directly.
          if (top) {
            //check if larger than last key.
            KeyValue splitKey = KeyValueUtil.createFirstOnRow(splitRow);
            byte[] lastKey = f.createReader().getLastKey();
            // If lastKey is null means storefile is empty.
            if (lastKey == null) return null;
            if (f.getReader().getComparator().compareFlatKey(splitKey.getBuffer(),
                splitKey.getKeyOffset(), splitKey.getKeyLength(), lastKey, 0, lastKey.length) > 0) {
              return null;
            }
          } else {
            //check if smaller than first key
            KeyValue splitKey = KeyValueUtil.createLastOnRow(splitRow);
            byte[] firstKey = f.createReader().getFirstKey();
            // If firstKey is null means storefile is empty.
            if (firstKey == null) return null;
            if (f.getReader().getComparator().compareFlatKey(splitKey.getBuffer(),
                splitKey.getKeyOffset(), splitKey.getKeyLength(), firstKey, 0, firstKey.length) < 0) {
              return null;
            }
          }
      

      In some cases the split row should be compared with only part of the row key (in a composite row key), mainly for secondary index tables. There we need to create reference files even when the split row does not lie in the store file range, so that the file's data can be rewritten to the child regions by a custom half store file reader that compares that part of the row key with the split row.
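To illustrate why a whole-key range check can be misleading for composite row keys, here is a small sketch in plain Java. It does not use any HBase classes, and the key layout (a fixed-length prefix followed by the embedded row), the class name, and the helper names are all hypothetical assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class CompositeKeySketch {
    // Hypothetical layout: PREFIX_LEN bytes of prefix, then the embedded row.
    static final int PREFIX_LEN = 2;

    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Extract the part of the composite key that should be compared with the split row.
    static byte[] embeddedRow(byte[] compositeKey) {
        return Arrays.copyOfRange(compositeKey, PREFIX_LEN, compositeKey.length);
    }

    // Whole-key check: is the split row larger than the file's last key?
    static boolean outsideFileRange(byte[] splitRow, byte[] lastKey) {
        return Arrays.compareUnsigned(splitRow, lastKey) > 0;
    }

    // Partial-key check: does the embedded row sort at or after the split row?
    static boolean belongsToTopChild(byte[] compositeKey, byte[] splitRow) {
        return Arrays.compareUnsigned(embeddedRow(compositeKey), splitRow) >= 0;
    }

    public static void main(String[] args) {
        byte[] splitRow = bytes("m");
        byte[] lastKey = bytes("aaz"); // prefix "aa", embedded row "z"
        // The whole key sorts before the split row, yet the embedded row sorts after it.
        System.out.println(outsideFileRange(splitRow, lastKey));  // true
        System.out.println(belongsToTopChild(lastKey, splitRow)); // true
    }
}
```

In this example the whole-key check would skip the top reference file, even though the row embedded in the key belongs to the top child.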

      The check that compares the split row with the store file range and returns directly could be skipped when a special boolean attribute in the table descriptor is set to true. Alternatively, we could add coprocessor hooks in which the references are created and the default path is bypassed.
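The first option can be sketched roughly as follows. This is a simplified model in plain Java, not the actual patch: it compares rows as unsigned byte arrays instead of using HBase key comparators, and the class name, method name, and the skipRangeCheck flag are hypothetical. Whether empty store files should still be filtered out when the flag is set is also an assumption; here the flag bypasses the whole check.

```java
import java.util.Arrays;

class SplitRangeCheckSketch {
    /**
     * Decide whether a reference file should be created for one half of a split.
     *
     * @param splitRow       row at which the region is split
     * @param firstKey       first row in the store file, or null if the file is empty
     * @param lastKey        last row in the store file, or null if the file is empty
     * @param top            true for the top (upper) child, false for the bottom child
     * @param skipRangeCheck hypothetical flag, e.g. read from a table attribute
     */
    static boolean shouldCreateReference(byte[] splitRow, byte[] firstKey, byte[] lastKey,
                                         boolean top, boolean skipRangeCheck) {
        if (skipRangeCheck) {
            // Assumption: the flag bypasses the whole range check.
            return true;
        }
        if (top) {
            // No reference if the file is empty or the split row is beyond the last key.
            return lastKey != null && Arrays.compareUnsigned(splitRow, lastKey) <= 0;
        }
        // No reference if the file is empty or the split row is before the first key.
        return firstKey != null && Arrays.compareUnsigned(splitRow, firstKey) >= 0;
    }
}
```

With the flag unset, the sketch follows the existing behavior of returning early; with it set, a reference is always created and a custom reader would be responsible for filtering rows.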

        Attachments

        1. HBASE-12583.patch
          8 kB
          rajeshbabu
        2. HBASE-12583_v3.patch
          10 kB
          rajeshbabu
        3. HBASE-12583_v2.patch
          10 kB
          rajeshbabu
        4. HBASE-12583_branch1.patch
          10 kB
          rajeshbabu
        5. HBASE-12583_branch1_v2.patch
          9 kB
          rajeshbabu
        6. HBASE-12583_addendum.patch
          2 kB
          rajeshbabu
        7. HBASE-12583_98.patch
          10 kB
          rajeshbabu
        8. HBASE-12583_98_v2.patch
          9 kB
          rajeshbabu

              People

              • Assignee:
                rajesh23 rajeshbabu
                Reporter:
                rajesh23 rajeshbabu
