Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-7631

Add support for NOT operator in column regex

    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Open
    • Major
    • Resolution: Unresolved
    • 0.12.0
    • None
    • None
    • None

    Description

      Given the following table:
      0: jdbc:hive2://localhost:10000/default> DESCRIBE regex_column_tb;
      -----------------------------------------------------------------

      col_name data_type comment

      -----------------------------------------------------------------

      stage_c1 int None
      stage_c2 int None
      c1 int None
      c2 int None

      -----------------------------------------------------------------

      A simple regex can be used to select certain columns:
      0: jdbc:hive2://localhost:10000/default> SELECT `stage_.*` FROM regex_column_tb;
      --------------------+

      stage_c1 stage_c2

      --------------------+
      --------------------+

      The regex “(?!stage_).*” using the NOT operator is a valid Java regex, but it does not seem to be supported in Hive:
      0: jdbc:hive2://localhost:10000/default> SELECT `(?!stage_).*` FROM regex_column_tb;
      Error: Error while compiling statement: FAILED: ParseException line 1:17 mismatched input ')' expecting FROM near 'stage_' in from clause (state=42000,code=40000)

      The following regex is supported (HIVE-420), but it is not as intuitive:
      0: jdbc:hive2://localhost:10000/default> SELECT `(stage_.*)?.` FROM regex_column_tb;
      --------+

      c1 c2

      --------+
      --------+

      Attachments

        Issue Links

          Activity

            People

              Unassigned Unassigned
              mdominguez@cloudera.com Mariano Dominguez
              Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

                Created:
                Updated: