Details

    • Type: New Feature
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: UDF
    • Labels:
      None

      Description

      Here are some UD(A)Fs which can be incorporated into the Hive distribution:

      UDFArgMax - Find the 0-indexed index of the largest argument. e.g., ARGMAX(4, 5, 3) returns 1.
      UDFBucket - Find the bucket in which the first argument belongs. e.g., BUCKET(x, b_1, b_2, b_3, ...) will return the smallest i such that x > b_{i} but <= b_{i+1}. Returns 0 if x is smaller than all the buckets.
      UDFFindInArray - Finds the 1-based index of the first occurrence of the first argument in the array given as the second argument. Returns 0 if not found. Returns NULL if either argument is NULL. E.g., FIND_IN_ARRAY(5, array(1,2,5)) will return 3. FIND_IN_ARRAY(5, array(1,2,3)) will return 0.
      UDFGreatCircleDist - Finds the great circle distance (in km) between two lat/long coordinates (in degrees).
      UDFLDA - Performs LDA inference on a vector given fixed topics.
      UDFNumberRows - Number successive rows starting from 1. Counter resets to 1 whenever any of its parameters changes.
      UDFPmax - Finds the maximum of a set of columns. e.g., PMAX(4, 5, 3) returns 5.
      UDFRegexpExtractAll - Like REGEXP_EXTRACT except that it returns all matches in an array.
      UDFUnescape - Returns the string unescaped (using C/Java style unescaping).
      UDFWhich - Given a boolean array, return the indices which are TRUE.
      UDFJaccard
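
The semantics of a few of the UDFs above can be sketched in plain Java, without any Hive classes. This is only an illustration and not the attached code: the class and method names are hypothetical, ties in ARGMAX are assumed to go to the first argument, BUCKET assumes ascending boundaries and returns the boundary count when x exceeds all of them, and the great circle distance assumes the haversine formula on a sphere of radius 6371 km.

```java
import java.util.List;

/** Plain-Java sketch of the described semantics; hypothetical names, no Hive API. */
final class UdfSemanticsSketch {

    /** ARGMAX: 0-based index of the largest argument (assumption: first one wins ties). */
    static int argMax(double... values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("ARGMAX needs at least one argument");
        }
        int best = 0;
        for (int i = 1; i < values.length; i++) {
            if (values[i] > values[best]) {
                best = i;
            }
        }
        return best;
    }

    /** PMAX: the largest argument itself. */
    static double pMax(double... values) {
        return values[argMax(values)];
    }

    /**
     * BUCKET: the i with b_i < x <= b_{i+1}, or 0 if x does not exceed b_1.
     * Assumptions: boundaries are sorted ascending; an x above every
     * boundary returns the number of boundaries.
     */
    static int bucket(double x, double... boundaries) {
        int i = 0;
        while (i < boundaries.length && x > boundaries[i]) {
            i++;
        }
        return i;
    }

    /** FIND_IN_ARRAY: 1-based position of needle, 0 if absent, null if either argument is null. */
    static <T> Integer findInArray(T needle, List<T> haystack) {
        if (needle == null || haystack == null) {
            return null;
        }
        return haystack.indexOf(needle) + 1; // indexOf yields -1 when absent
    }

    /** Great circle distance in km via haversine (assumption: sphere of radius 6371 km). */
    static double greatCircleKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.pow(Math.sin(dLat / 2), 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.pow(Math.sin(dLon / 2), 2);
        a = Math.min(1.0, a); // guard against rounding just above 1
        return 2 * 6371.0 * Math.asin(Math.sqrt(a));
    }
}
```

With these definitions, argMax(4, 5, 3) and pMax(4, 5, 3) reproduce the ARGMAX and PMAX examples from the list, and findInArray(5, Arrays.asList(1, 2, 5)) reproduces the FIND_IN_ARRAY example.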

      UDAFCollect - Takes all the values associated with a row and converts them into a list. Make sure to run: set hive.map.aggr = false;
      UDAFCollectMap - Like collect except that it takes tuples and generates a map.
      UDAFEntropy - Compute the entropy of a column.
      UDAFPearson (BROKEN!!!) - Computes the Pearson correlation between two columns.
      UDAFTop - TOP(KEY, VAL) - returns the KEY associated with the largest value of VAL.
      UDAFTopN (BROKEN!!!) - Like TOP except returns a list of the keys associated with the N (passed as the third parameter) largest values of VAL.
      UDAFHistogram
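
As a rough illustration of two of the aggregates above, the following plain-Java sketch computes ENTROPY and TOP over in-memory lists rather than through Hive's UDAF interfaces. It is not the attached code: the names are hypothetical, entropy is assumed to be Shannon entropy of the empirical value frequencies in natural-log units with NULLs skipped, and TOP is assumed to keep the first key on ties and to return null when there are no rows.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java sketch of the described aggregates; hypothetical names, no Hive API. */
final class UdafSemanticsSketch {

    /** ENTROPY: Shannon entropy of value frequencies (assumptions: natural log, nulls skipped). */
    static <T> double entropy(List<T> column) {
        Map<T, Integer> counts = new HashMap<>();
        int total = 0;
        for (T value : column) {
            if (value != null) {
                counts.merge(value, 1, Integer::sum);
                total++;
            }
        }
        double h = 0.0;
        for (int count : counts.values()) {
            double p = (double) count / total;
            h -= p * Math.log(p);
        }
        return h;
    }

    /** TOP(KEY, VAL): key paired with the largest value (assumption: first one wins ties). */
    static <K> K top(List<K> keys, List<Double> vals) {
        K bestKey = null;
        double bestVal = 0.0;
        boolean seen = false;
        for (int i = 0; i < keys.size(); i++) {
            double v = vals.get(i);
            if (!seen || v > bestVal) {
                bestKey = keys.get(i);
                bestVal = v;
                seen = true;
            }
        }
        return bestKey; // null when there are no rows
    }
}
```

A real UDAF would keep the counts (for ENTROPY) or the running best pair (for TOP) as partial aggregation state and merge those states across tasks; the loops here only show the final computation.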

      1. UDFLtrim.java
        3 kB
        Jonathan Chang
      2. UDFRtrim.java
        3 kB
        Jonathan Chang
      3. UDFTrim.java
        3 kB
        Jonathan Chang
      4. UDFStartsWith.java
        2 kB
        Jonathan Chang
      5. UDFEndsWith.java
        2 kB
        Jonathan Chang
      6. UDFFindInString.java
        2 kB
        Jonathan Chang
      7. ext.tar.gz
        19 kB
        Jonathan Chang
      8. core.tar.gz
        19 kB
        Jonathan Chang
      9. udfs.tar.gz
        11 kB
        Jonathan Chang
      10. udfs.tar.gz
        7 kB
        Jonathan Chang

        Issue Links

          Activity

          Jonathan Chang created issue -
          Jonathan Chang made changes -
          Field Original Value New Value
          Attachment udfs.tar.gz [ 12452223 ]
          Jeff Hammerbacher made changes -
          Link This issue is related to HIVE-1549 [ HIVE-1549 ]
          Carl Steinbach made changes -
          Component/s UDF [ 12313585 ]
          Jonathan Chang made changes -
          Attachment udfs.tar.gz [ 12456413 ]
          Jonathan Chang made changes -
          Attachment core.tar.gz [ 12484694 ]
          Attachment ext.tar.gz [ 12484695 ]
          John Sichi made changes -
          Link This issue is related to HIVE-2523 [ HIVE-2523 ]
          John Sichi made changes -
          Link This issue depends on HIVE-2524 [ HIVE-2524 ]
          Jonathan Chang made changes -
          Attachment UDFFindInString.java [ 12542017 ]
          Attachment UDFEndsWith.java [ 12542018 ]
          Attachment UDFStartsWith.java [ 12542019 ]
          Attachment UDFTrim.java [ 12542020 ]
          Attachment UDFRtrim.java [ 12542021 ]
          Attachment UDFLtrim.java [ 12542022 ]
          Gavin made changes -
          Link This issue depends on HIVE-2524 [ HIVE-2524 ]
          Gavin made changes -
          Link This issue depends upon HIVE-2524 [ HIVE-2524 ]

            People

            • Assignee:
              Jonathan Chang
              Reporter:
              Jonathan Chang
            • Votes:
              3
              Watchers:
              20

              Dates

              • Created:
                Updated:
