  1. HBase
  2. HBASE-22833

MultiRowRangeFilter should provide a method for creating a filter which is functionally equivalent to multiple prefix filters


Details

    • Reviewed
      Provides a public constructor in the MultiRowRangeFilter class to speed up filtering by multiple row prefixes. It expands the row prefixes into multiple row key ranges handled by MultiRowRangeFilter, which is more efficient.
      {code}
      public MultiRowRangeFilter(byte[][] rowKeyPrefixes);
      {code}

    Description

      Hi,

      I think the current standard way to apply multiple prefix filters is to create a FilterList and add PrefixFilter instances to it:

      FilterList allFilters = new FilterList(FilterList.Operator.MUST_PASS_ONE);
      allFilters.addFilter(new PrefixFilter(Bytes.toBytes("123")));
      allFilters.addFilter(new PrefixFilter(Bytes.toBytes("456")));
      allFilters.addFilter(new PrefixFilter(Bytes.toBytes("678")));
      scan.setFilter(allFilters);
      

      (cf. https://stackoverflow.com/questions/41074213/hbase-how-to-specify-multiple-prefix-filters-in-a-single-scan-operation )

      However, for a single prefix, HBase provides the scan.setRowPrefixFilter method.
      This method restricts the scan to a row range by setting a start row and a stop row.
      The stop row is computed by calling calculateTheClosestNextRowKeyForPrefix (cf. https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Scan.java#L574-L597 )
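
To illustrate how a stop row can be derived from a prefix, here is a minimal stand-alone sketch. It does not call HBase; the class and method names (PrefixStopRowSketch, closestNextRowKey) are hypothetical, and the logic is my understanding of the approach taken by the method linked above: drop trailing 0xFF bytes, then increment the last remaining byte.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical helper for illustration only; not part of the HBase API. */
class PrefixStopRowSketch {

    /**
     * Returns the smallest row key that sorts after every key starting with
     * the given prefix. Returns an empty array when no such key exists
     * (empty prefix, or a prefix made only of 0xFF bytes), which is assumed
     * here to mean "no upper bound".
     */
    static byte[] closestNextRowKey(byte[] prefix) {
        // Trailing 0xFF bytes cannot be incremented, so skip over them.
        int end = prefix.length;
        while (end > 0 && prefix[end - 1] == (byte) 0xFF) {
            end--;
        }
        if (end == 0) {
            return new byte[0];
        }
        // Copy the part before the trailing 0xFF bytes and bump its last byte.
        byte[] stop = Arrays.copyOf(prefix, end);
        stop[end - 1]++;
        return stop;
    }

    public static void main(String[] args) {
        byte[] stop = closestNextRowKey("123".getBytes(StandardCharsets.US_ASCII));
        System.out.println(new String(stop, StandardCharsets.US_ASCII)); // prints "124"
    }
}
```

With this rule, the prefix "123" corresponds to the half-open range ["123", "124"), which contains exactly the row keys that start with "123".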

      MultiRowRangeFilter could take a list of start row and stop row pairs, where each start row is a prefix and calculateTheClosestNextRowKeyForPrefix computes the corresponding stop row.
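
The expansion from prefixes to ranges could look roughly like the sketch below. It is self-contained and does not use HBase: Range is a hypothetical stand-in for MultiRowRangeFilter.RowRange, and toRanges, matchesAny and closestNextRowKey are illustrative names, not existing API.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: expands row prefixes into [start, stop) row ranges. */
class PrefixRangesSketch {

    /** Start row inclusive, stop row exclusive; an empty stop row means no upper bound. */
    static final class Range {
        final byte[] start;
        final byte[] stop;

        Range(byte[] start, byte[] stop) {
            this.start = start;
            this.stop = stop;
        }

        boolean contains(byte[] row) {
            // Unsigned lexicographic order, the order HBase uses for row keys.
            return Arrays.compareUnsigned(row, start) >= 0
                && (stop.length == 0 || Arrays.compareUnsigned(row, stop) < 0);
        }
    }

    /** Drops trailing 0xFF bytes and increments the last remaining byte. */
    static byte[] closestNextRowKey(byte[] prefix) {
        int end = prefix.length;
        while (end > 0 && prefix[end - 1] == (byte) 0xFF) {
            end--;
        }
        if (end == 0) {
            return new byte[0];
        }
        byte[] stop = Arrays.copyOf(prefix, end);
        stop[end - 1]++;
        return stop;
    }

    /** One range per prefix: the prefix is the start row, the next key is the stop row. */
    static List<Range> toRanges(byte[][] prefixes) {
        List<Range> ranges = new ArrayList<>();
        for (byte[] prefix : prefixes) {
            ranges.add(new Range(prefix, closestNextRowKey(prefix)));
        }
        return ranges;
    }

    static boolean matchesAny(List<Range> ranges, byte[] row) {
        for (Range range : ranges) {
            if (range.contains(row)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        byte[][] prefixes = {
            "123".getBytes(StandardCharsets.US_ASCII),
            "456".getBytes(StandardCharsets.US_ASCII)
        };
        List<Range> ranges = toRanges(prefixes);
        System.out.println(matchesAny(ranges, "1234".getBytes(StandardCharsets.US_ASCII)));
        System.out.println(matchesAny(ranges, "124".getBytes(StandardCharsets.US_ASCII)));
    }
}
```

In HBase itself, each pair would presumably become a MultiRowRangeFilter.RowRange with an inclusive start row and an exclusive stop row, and the resulting list would be passed to the existing MultiRowRangeFilter constructor that accepts a list of row ranges.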

      I think MultiRowRangeFilter should offer a way to create this kind of filter (one that is functionally equivalent to multiple prefix filters), and that doing so is better than the current FilterList approach.

      Cheers,

            People

              titsuki Itsuki Toyota