HBase
  1. HBase
  2. HBASE-5667

RegexStringComparator supports java.util.regex.Pattern flags

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.95.0
    • Component/s: Filters
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      • Add constructor that takes in a Pattern
      • Add Pattern's flags to Writable fields, and actually use them when recomposing the Filter
      1. HBASE-5667.diff
        5 kB
        David Arthur
      2. HBASE-5667.diff
        6 kB
        David Arthur
      3. HBASE-5667.diff
        6 kB
        David Arthur

        Activity

        Hide
        Ted Yu added a comment -

        This is an incompatible change, right ?

        -    this.pattern = Pattern.compile(expr);
        +    int flags = in.readInt();
        +    this.pattern = Pattern.compile(expr, flags);
        

        What's the advantage of passing Pattern directly ?

        Show
        Ted Yu added a comment - This is an incompatible change, right ? - this .pattern = Pattern.compile(expr); + int flags = in.readInt(); + this .pattern = Pattern.compile(expr, flags); What's the advantage of passing Pattern directly ?
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12520321/HBASE-5667.diff
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.mapreduce.TestImportTsv
        org.apache.hadoop.hbase.mapred.TestTableMapReduce
        org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1334//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1334//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1334//console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12520321/HBASE-5667.diff against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.mapreduce.TestImportTsv org.apache.hadoop.hbase.mapred.TestTableMapReduce org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1334//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1334//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1334//console This message is automatically generated.
        Hide
        David Arthur added a comment -

        @Zhihong, yes it would be incompatible if it tried to read Filters that were serialized with the current code. I'm assuming that these Filters are ephemeral and aren't stored anywhere (could be a very wrong assumption).

        Otherwise, the purpose of this patch is to expose the ability to add regex flags to the comparator. E.g., if I want a case-insensitive match I could construct a Filter like:

        new SingleColumnValueFilter(COLUMN_FAMILY, COLUMN_QUALIFIER, CompareOp.EQUAL,
          new RegexStringComparator(Pattern.compile("foo", Pattern.CASE_INSENSITIVE | Pattern.DOTALL)));
        

        Also, in the current code, DOTALL is added to the underlying Pattern, but doesn't appear to be applied when deserializing.

        Show
        David Arthur added a comment - @Zhihong, yes it would be incompatible if it tried to read Filters that were serialized with the current code. I'm assuming that these Filters are ephemeral and aren't stored anywhere (could be a very wrong assumption). Otherwise, the purpose of this patch is to expose the ability to add regex flags to the comparator. E.g., if I want a case-insensitive match I could construct a Filter like: new SingleColumnValueFilter(COLUMN_FAMILY, COLUMN_QUALIFIER, CompareOp.EQUAL, new RegexStringComparator(Pattern.compile( "foo" , Pattern.CASE_INSENSITIVE | Pattern.DOTALL))); Also, in the current code, DOTALL is added to the underlying Pattern, but doesn't appear to be applied when deserializing.
        Hide
        Ted Yu added a comment -

        Maintaining backward compatibility would be desirable.

        Would moving the read of flags to after the following line be better ?

            final String charset = in.readUTF();
        
        Show
        Ted Yu added a comment - Maintaining backward compatibility would be desirable. Would moving the read of flags to after the following line be better ? final String charset = in.readUTF();
        Hide
        stack added a comment -

        Patch looks good. We could commit it as is to trunk since the next release is going to 'incompatible' anyways, 0.96 (Should take the opportunity and make everything subclass versionedwritable but better, Writables should be on their way out)

        Show
        stack added a comment - Patch looks good. We could commit it as is to trunk since the next release is going to 'incompatible' anyways, 0.96 (Should take the opportunity and make everything subclass versionedwritable but better, Writables should be on their way out)
        Hide
        David Arthur added a comment -

        Anything needed from me to get this in?

        Show
        David Arthur added a comment - Anything needed from me to get this in?
        Hide
        stack added a comment -

        @Jimmy Whats going to happen in 0.96 serializing filters? Will these be pb'd? Or should we go to the bother of making all filters VersionedWritables?

        Show
        stack added a comment - @Jimmy Whats going to happen in 0.96 serializing filters? Will these be pb'd? Or should we go to the bother of making all filters VersionedWritables?
        Hide
        Jimmy Xiang added a comment -

        @Stack, I prefer to change them to pb so we should not bother to make them VersionedWritables for now.
        We have lots of filters. We need to abstract them out and have a generic way to define them in pb.

        Show
        Jimmy Xiang added a comment - @Stack, I prefer to change them to pb so we should not bother to make them VersionedWritables for now. We have lots of filters. We need to abstract them out and have a generic way to define them in pb.
        Hide
        Jimmy Xiang added a comment -

        For this patch, it changes the constructor of RegexStringComparator. A Pattern is hard to be pb'd. Can we specify the flags in a different way,
        for example, using string, and/or some primitive parameters?

        Show
        Jimmy Xiang added a comment - For this patch, it changes the constructor of RegexStringComparator. A Pattern is hard to be pb'd. Can we specify the flags in a different way, for example, using string, and/or some primitive parameters?
        Hide
        David Arthur added a comment -

        If I'm not mistaken, a Pattern is just the pattern string and the flags. The constructor could just as easily be (String, int) rather than a Pattern instance if that jives better with pb

        Show
        David Arthur added a comment - If I'm not mistaken, a Pattern is just the pattern string and the flags. The constructor could just as easily be (String, int) rather than a Pattern instance if that jives better with pb
        Hide
        Jimmy Xiang added a comment -

        Pattern is fine if we can get it.

        Show
        Jimmy Xiang added a comment - Pattern is fine if we can get it.
        Hide
        stack added a comment -

        Yeah, it looks like we can do the String plus int for flags as David suggests given thats all it takes to create a Pattern instance: http://docs.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html#compile(java.lang.String, int)

        Show
        stack added a comment - Yeah, it looks like we can do the String plus int for flags as David suggests given thats all it takes to create a Pattern instance: http://docs.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html#compile(java.lang.String , int)
        Hide
        David Arthur added a comment -
        • Changed constructor to (String, int)
        • Added example in docs
        • Removed DOTALL from (String) constructor
        Show
        David Arthur added a comment - Changed constructor to (String, int) Added example in docs Removed DOTALL from (String) constructor
        Hide
        Ted Yu added a comment -

        The new test failures are most likely inherited from TRUNK which were caused by HBASE-5673

        Show
        Ted Yu added a comment - The new test failures are most likely inherited from TRUNK which were caused by HBASE-5673
        Hide
        Ted Yu added a comment -

        I think we should keep the Pattern.DOTALL in (String) constructor:

        -    this.pattern = Pattern.compile(expr, Pattern.DOTALL);
        +    this.pattern = Pattern.compile(expr);
        
        Show
        Ted Yu added a comment - I think we should keep the Pattern.DOTALL in (String) constructor: - this .pattern = Pattern.compile(expr, Pattern.DOTALL); + this .pattern = Pattern.compile(expr);
        Hide
        stack added a comment -

        Yes, and call through to the other constructor so its super(expr, Pattern.DOTALL).

        Show
        stack added a comment - Yes, and call through to the other constructor so its super(expr, Pattern.DOTALL).
        Hide
        David Arthur added a comment -

        Yea on second thought keeping DOTALL is best for compatability. However, I don't like implicit things like that, so I added a comment to the (String) explaining it.

        Show
        David Arthur added a comment - Yea on second thought keeping DOTALL is best for compatability. However, I don't like implicit things like that, so I added a comment to the (String) explaining it.
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12520628/HBASE-5667.diff
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.mapreduce.TestImportTsv
        org.apache.hadoop.hbase.mapred.TestTableMapReduce
        org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1353//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1353//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1353//console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12520628/HBASE-5667.diff against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.mapreduce.TestImportTsv org.apache.hadoop.hbase.mapred.TestTableMapReduce org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1353//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1353//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1353//console This message is automatically generated.
        Hide
        Ted Yu added a comment -

        Patch v3 looks good.

        Will integrate later if there is no objection.

        Show
        Ted Yu added a comment - Patch v3 looks good. Will integrate later if there is no objection.
        Hide
        Ted Yu added a comment -

        Patch v3 integrated to TRUNK.

        Thanks for the patch David.

        Thanks for the review Stack.

        Show
        Ted Yu added a comment - Patch v3 integrated to TRUNK. Thanks for the patch David. Thanks for the review Stack.
        Hide
        Hudson added a comment -

        Integrated in HBase-TRUNK #2699 (See https://builds.apache.org/job/HBase-TRUNK/2699/)
        HBASE-5667 RegexStringComparator supports java.util.regex.Pattern flags (David Arthur) (Revision 1307580)

        Result = FAILURE
        tedyu :
        Files :

        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/RegexStringComparator.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/filter/TestSingleColumnValueFilter.java
        Show
        Hudson added a comment - Integrated in HBase-TRUNK #2699 (See https://builds.apache.org/job/HBase-TRUNK/2699/ ) HBASE-5667 RegexStringComparator supports java.util.regex.Pattern flags (David Arthur) (Revision 1307580) Result = FAILURE tedyu : Files : /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/RegexStringComparator.java /hbase/trunk/src/test/java/org/apache/hadoop/hbase/filter/TestSingleColumnValueFilter.java
        Hide
        Hudson added a comment -

        Integrated in HBase-TRUNK-security #155 (See https://builds.apache.org/job/HBase-TRUNK-security/155/)
        HBASE-5667 RegexStringComparator supports java.util.regex.Pattern flags (David Arthur) (Revision 1307580)

        Result = SUCCESS
        tedyu :
        Files :

        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/RegexStringComparator.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/filter/TestSingleColumnValueFilter.java
        Show
        Hudson added a comment - Integrated in HBase-TRUNK-security #155 (See https://builds.apache.org/job/HBase-TRUNK-security/155/ ) HBASE-5667 RegexStringComparator supports java.util.regex.Pattern flags (David Arthur) (Revision 1307580) Result = SUCCESS tedyu : Files : /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/RegexStringComparator.java /hbase/trunk/src/test/java/org/apache/hadoop/hbase/filter/TestSingleColumnValueFilter.java
        Hide
        stack added a comment -

        Committed to trunk a while back.

        Show
        stack added a comment - Committed to trunk a while back.

          People

          • Assignee:
            David Arthur
            Reporter:
            David Arthur
          • Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development