Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.4.0
    • Component/s: Query Processor
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      It would be very useful to have a function equivalent to String.split in Java.

      1. HIVE-642.1.patch
        6 kB
        Emil Ibrishimov
      2. HIVE-642.2.patch
        7 kB
        Emil Ibrishimov

          Activity

          Raghotham Murthy added a comment -

          You probably have to check for null strings (different from empty strings) and return null.
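
          A minimal sketch of the null check suggested here, assuming a plain-Java helper rather than the actual patch code. The class name NullSafeSplit is hypothetical, and keeping trailing empty strings (limit -1) is a choice made for this sketch, not something the issue specifies.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not the code from HIVE-642.*.patch.
final class NullSafeSplit {
    // Returns null when either argument is null (NULL in, NULL out).
    // An empty input string is a valid input and yields a one-element list.
    static List<String> split(String input, String regex) {
        if (input == null || regex == null) {
            return null;
        }
        // Assumption: a limit of -1 keeps trailing empty strings.
        return Arrays.asList(input.split(regex, -1));
    }
}
```

          With this sketch, split(null, ",") returns null, while split("", ",") returns a list holding one empty string.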

          Emil Ibrishimov added a comment -

          Oops, fixed. Thanks!

          Namit Jain added a comment -

          +1

          looks good

          Min Zhou added a comment -

          It's very useful for us.
          Some comments:

          1. Can you implement it directly with Text? Avoiding string decoding and encoding would be faster. Of course, that trick may lead to another problem, since String.split uses a regular expression for splitting.
          2. getDisplayString() always returns a string in lowercase.
          Namit Jain added a comment -

          Committed. Thanks Emil

          Emil Ibrishimov added a comment -

          There are some easy (compromise) ways to optimize split:

          1. Check if the regex argument actually contains some "regex specific characters" and if it doesn't, do a straightforward split without converting to strings.
          2. Assume some default value for the second argument (for example, split(str) being equivalent to split(str, ' ')) and optimize for this value.
          3. Have two separate split functions - one that does regex and one that splits around plain text.

          I think that 1 is a good choice and can be done rather quickly.
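
          Option 1 could look roughly like the sketch below. It is illustrative only: the class LiteralAwareSplit and its method names are hypothetical, and it works on java.lang.String for brevity, whereas the optimization discussed here would apply the same check and then scan the Text bytes directly. The metacharacter set is the one used by java.util.regex under default flags.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of "option 1"; not taken from the committed patch.
final class LiteralAwareSplit {
    // Characters with special meaning in java.util.regex (default flags).
    private static final String REGEX_META = "\\^$.|?*+()[]{}";

    // True when the delimiter is non-empty and contains no regex metacharacters.
    static boolean isPlainLiteral(String delimiter) {
        if (delimiter.isEmpty()) {
            return false;
        }
        for (int i = 0; i < delimiter.length(); i++) {
            if (REGEX_META.indexOf(delimiter.charAt(i)) >= 0) {
                return false;
            }
        }
        return true;
    }

    static List<String> split(String input, String delimiter) {
        if (!isPlainLiteral(delimiter)) {
            // Slow path: fall back to the regex-based split.
            return Arrays.asList(input.split(delimiter, -1));
        }
        // Fast path: plain substring search, no regex compilation.
        List<String> parts = new ArrayList<>();
        int start = 0;
        int idx;
        while ((idx = input.indexOf(delimiter, start)) >= 0) {
            parts.add(input.substring(start, idx));
            start = idx + delimiter.length();
        }
        parts.add(input.substring(start));
        return parts;
    }
}
```

          Both paths keep trailing empty strings, so a caller should see the same result whichever branch is taken.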

          Namit Jain added a comment -

          I saw Min's comments after committing.

          I can file a follow-up JIRA for that.

          Emil Ibrishimov added a comment -

          Some generic UDFs don't return lowercase from getDisplayString(): when, case, coalesce.

          Namit Jain added a comment -

          Filed https://issues.apache.org/jira/browse/HIVE-664

          Zheng Shao added a comment -

          I noticed some of the new UDFs that got added are not listed in http://wiki.apache.org/hadoop/Hive/LanguageManual/UDF
          Can we add those to the wiki?


            People

            • Assignee: Emil Ibrishimov
            • Reporter: Namit Jain
            • Votes: 0
            • Watchers: 0
