  1. Hive
  2. HIVE-26047

Vectorized LIKE UDF should use Re2J regex to address JDK-8203458


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved

    Description

      The LIKE pattern below takes a very long time to validate against the regex on Java 8, with the same stack trace as shown in Java bug JDK-8203458.

      import java.util.regex.Pattern;
      // Minimal reproduction: validating a long LIKE pattern against CHAIN_PATTERN.
      public class Test {
        public static void main(String[] args) {
          String pattern = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa_b"; 
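          // Nested quantifiers: a '+' inside a group that is itself repeated with '+'.
          Pattern CHAIN_PATTERN = Pattern.compile("(%?[^%_\\\\]+%?)+");
          // The '_' near the end can never match, so a backtracking engine retries every split of the 'a' run.
          CHAIN_PATTERN.matcher(pattern).matches();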
        }
      }
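
To see how the cost grows with input length, the reproduction can be scaled down and timed. The following is a minimal sketch, not part of the original report: the class name `ChainPatternTiming` and the helpers `input` and `timeMatch` are hypothetical, and the input sizes are kept small on purpose. It uses only `java.util.regex`, so the printed timings depend on the JDK in use; a JDK without the JDK-8203458 behavior may show no noticeable growth.

```java
import java.util.regex.Pattern;

final class ChainPatternTiming {
    // Same regex as in the reproduction above.
    static final Pattern CHAIN_PATTERN = Pattern.compile("(%?[^%_\\\\]+%?)+");

    // Builds n copies of 'a' followed by "_b", the shape of the failing input.
    static String input(int n) {
        StringBuilder sb = new StringBuilder(n + 2);
        for (int i = 0; i < n; i++) {
            sb.append('a');
        }
        return sb.append("_b").toString();
    }

    // Returns the elapsed nanoseconds for one full-match attempt.
    static long timeMatch(String s) {
        long start = System.nanoTime();
        CHAIN_PATTERN.matcher(s).matches();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // Keep n small: on an affected JDK each extra 'a' can roughly double the time.
        for (int n = 10; n <= 24; n += 2) {
            System.out.printf("n=%d  %d us%n", n, timeMatch(input(n)) / 1000);
        }
    }
}
```

The match always fails because `_` is excluded by the character class, which is what forces a backtracking engine to try every way of splitting the run of `a` characters across group repetitions.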
      

      The same behavior is reproducible with the following SQL:

      create table table1(name string);
      insert into table1 (name) values ('aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa_b');
      select * from table1 where name like "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa_b";
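
The ticket proposes RE2/J, which matches without backtracking. As a rough illustration of why a non-backtracking check does not suffer from this input, here is a hand-written single-pass scan that accepts the same strings as the chain regex shown earlier. This is a sketch under stated assumptions: it is neither RE2/J nor Hive's code, and the names `LinearChainCheck` and `isChain` are hypothetical. The rule it encodes follows from the regex: no `_` or `\`, at least one literal character, at most one `%` at either end, and at most two consecutive `%` between literals.

```java
final class LinearChainCheck {
    // Hypothetical helper: one pass over the LIKE pattern, no backtracking.
    static boolean isChain(String p) {
        int percentRun = 0;          // length of the current run of '%'
        boolean seenLiteral = false; // true once a non-special character was seen
        for (int i = 0; i < p.length(); i++) {
            char c = p.charAt(i);
            if (c == '_' || c == '\\') {
                return false;        // excluded by the character class [^%_\\]
            }
            if (c == '%') {
                percentRun++;
                // One leading '%' is allowed; between literals, one trailing plus one leading.
                if (percentRun > (seenLiteral ? 2 : 1)) {
                    return false;
                }
            } else {
                seenLiteral = true;
                percentRun = 0;
            }
        }
        // A trailing run may hold at most one '%', and the group needs a literal.
        return seenLiteral && percentRun <= 1;
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 100; i++) {
            sb.append('a');
        }
        String likePattern = sb.append("_b").toString();
        System.out.println(isChain(likePattern)); // false: '_' is not part of a chain
        System.out.println(isChain("abc%def%"));  // true
    }
}
```

The scan visits each character once, so the long `a...a_b` pattern is rejected as soon as the `_` is reached, regardless of how many `a` characters precede it.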

            People

              Assignee: Naresh P R (nareshpr)
              Reporter: Naresh P R (nareshpr)
              Votes: 0
              Watchers: 2

              Dates

                Created:
                Updated:

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 50m