Hive / HIVE-236

RLIKE/REGEXP should allow matching partial strings

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.5.0
    • Component/s: Query Processor
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      HIVE-236. RLIKE/REGEXP allowing matching of partial strings. (Paul Yang via zshao)

      Description

      The current behavior is that the regexp needs to match the whole string.

      MySQL, however, matches partial strings ( http://dev.mysql.com/doc/refman/5.0/en/regexp.html#operator_regexp ):

      mysql> SELECT 'fofo' REGEXP '^fo'; -> 1

      We need to make it work the same way as MySQL.
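      The difference between the two behaviors can be sketched with plain java.util.regex. This is only an illustration, not the code from the attached patches: the class and method names (RegexpMatchSketch, fullMatch, partialMatch) are hypothetical, and it assumes the whole-string behavior corresponds to Matcher.matches() and the MySQL-style behavior to Matcher.find().

```java
import java.util.regex.Pattern;

// Hypothetical sketch; not the actual Hive UDF.
final class RegexpMatchSketch {

    // Whole-string semantics: the pattern must consume the entire input.
    static boolean fullMatch(String input, String regex) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    // Partial-string semantics (MySQL-style): the pattern may match
    // any substring of the input.
    static boolean partialMatch(String input, String regex) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // 'fofo' REGEXP '^fo' under each interpretation.
        System.out.println("full:    " + fullMatch("fofo", "^fo"));
        System.out.println("partial: " + partialMatch("fofo", "^fo"));
    }
}
```

      With partial matching, a query that still wants whole-string behavior can anchor the pattern explicitly, for example '^(fo)+$'.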

      1. HIVE-236.3.patch
        5 kB
        Paul Yang
      2. HIVE-236.2.patch
        8 kB
        Paul Yang
      3. HIVE-236.1.patch
        5 kB
        Paul Yang

          Activity

          Zheng Shao added a comment -

          Committed. Thanks Paul!

          Zheng Shao added a comment -

          Looks great! Will commit after testing.

          Paul Yang added a comment -

          Modified REGEXP to log only the first time when regexp is empty.
          Consolidated tests in udf_regexp.q into a single query.

          Zheng Shao added a comment -

          @HIVE-236.2.patch:

          The warning should not be output on every call - we have seen problems with too many repeated log messages filling up the log file.
          Can you add a boolean variable "warned", so that we only warn once?

          Also, it may help if you put all the test cases in a single query. That helps to reduce the time of running tests.
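
          A minimal sketch of the warn-once idea, assuming a per-instance boolean flag. The class name WarnOnceRegexpSketch, the warning text, the counter, and the choice to return false for an empty regexp are all assumptions made for illustration, not taken from the patch. Null handling is left out.

```java
import java.util.regex.Pattern;

// Hypothetical sketch of "warn only once"; not the patch itself.
final class WarnOnceRegexpSketch {
    private boolean warned = false;   // set after the first warning
    private int warningsEmitted = 0;  // kept only so the behavior is observable

    boolean evaluate(String input, String regex) {
        if (regex.isEmpty()) {
            if (!warned) {
                warned = true;
                warningsEmitted++;
                System.err.println("REGEXP: regexp is empty; further warnings suppressed");
            }
            return false; // assumption: an empty regexp matches nothing
        }
        // Partial-string match, as requested in this issue.
        return Pattern.compile(regex).matcher(input).find();
    }

    int getWarningsEmitted() {
        return warningsEmitted;
    }
}
```

          Calling evaluate repeatedly with an empty regexp on the same instance writes the warning a single time, so a query over many rows does not flood the log.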

          Paul Yang added a comment -

          Updated udf1.q.out
          Added logging for empty regexp

          Paul Yang added a comment -

          Not quite ready - need to update udf1.q

          Namit Jain added a comment -

          The changes look good - will commit if the tests pass

          Paul Yang added a comment -

          Fixed partial matches.
          Added tests for REGEXP.


            People

            • Assignee:
              Paul Yang
              Reporter:
              Zheng Shao
            • Votes:
              0
              Watchers:
              1
