IMPALA-4266

Java UDF expression returning a string in GROUP BY can give incorrect results.

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: Impala 2.3.0, Impala 2.4.0, Impala 2.5.0, Impala 2.6.0, Impala 2.7.0, Impala 2.8.0
    • Fix Version/s: Impala 2.8.0
    • Component/s: Backend

      Description

      I have a simple Java UDF as follows (replaces each occurrence of 's' with 'ss').

      import org.apache.hadoop.hive.ql.exec.UDF;
      import java.text.ParseException;
      import org.apache.hadoop.io.Text;
      
      // Hive-style UDF: doubles every 's' in the input string.
      public class MyReplaceString extends UDF
      {
        public Text evaluate(Text para) throws ParseException {
          // Null or empty input maps to the empty string.
          if ((null == para) || ("".equals(para.toString()))) {
            return new Text("");
          }
          // A new Text object is allocated for every result.
          return new Text(para.toString().replace("s", "ss"));
        }
      }
      
      [localhost:21000] > select * from test_replace_group_by;
      Query: select * from test_replace_group_by
      Query submitted at: 2016-10-10 09:44:33 (Coordinator: http://optimus:25000)
      Query progress can be monitored at: http://optimus:25000/query_plan?query_id=2a42edbc9837b8dc:8d3e5d1500000000
      +------------+
      | s          |
      +------------+
      | blehss     |
      | blahss     |
      | longstring |
      | short      |
      | tataehss   |
      +------------+
      Fetched 5 row(s) in 6.92s
      [localhost:21000] > 
      
      [localhost:21000] > create function my_replace_string(string) returns string location '/tmp/hive_udf_replace.jar' symbol='MyReplaceString';
      
      ------- CORRECT RESULT -------
      [localhost:21000] > select my_replace_string(s) as es from test_replace_group_by;
      Query: select my_replace_string(s) as es from test_replace_group_by
      Query submitted at: 2016-10-10 09:46:17 (Coordinator: http://optimus:25000)
      Query progress can be monitored at: http://optimus:25000/query_plan?query_id=9149a3a3924604f9:dc97fcf200000000
      +-------------+
      | es          |
      +-------------+
      | blehssss    |
      | tataehssss  |
      | blahssss    |
      | longsstring |
      | sshort      |
      +-------------+
      Fetched 5 row(s) in 0.12s
      
      ------- INCORRECT RESULT -------
      [localhost:21000] > select my_replace_string(s) as es from test_replace_group_by group by es;
      Query: select my_replace_string(s) as es from test_replace_group_by group by es
      Query submitted at: 2016-10-10 09:46:24 (Coordinator: http://optimus:25000)
      Query progress can be monitored at: http://optimus:25000/query_plan?query_id=2c413bed4dccd122:cf81137600000000
      +-------------+
      | es          |
      +-------------+
      | sshorttring |  <------ wrong: expected longsstring
      | tataehssss  |
      | sshort      |
      | blehssss    |
      | blahssss    |
      +-------------+
      Fetched 5 row(s) in 0.22s
      [localhost:21000] > 
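      For reference, the expected GROUP BY output can be derived outside Impala by applying the same string logic to the five rows of the table. The sketch below is only an illustration: the class name ExpectedGroupBy and its helpers are hypothetical, and it works on plain String values instead of Text so that it has no Hive dependency.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of the reported UDF jar.
class ExpectedGroupBy {
    // Same string logic as MyReplaceString.evaluate, on plain Strings.
    static String replaceS(String s) {
        if (s == null || s.isEmpty()) {
            return "";
        }
        return s.replace("s", "ss");
    }

    // Distinct UDF results, i.e. what GROUP BY on the expression should return
    // (row order in Impala is not guaranteed; only the set of values matters).
    static Set<String> expectedGroups(List<String> rows) {
        Set<String> groups = new LinkedHashSet<>();
        for (String row : rows) {
            groups.add(replaceS(row));
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> rows =
            Arrays.asList("blehss", "blahss", "longstring", "short", "tataehss");
        System.out.println(expectedGroups(rows));
    }
}
```

      All five inputs map to distinct values, so the grouped query should return the same five strings as the ungrouped one, including longsstring.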
      

      I tried disabling codegen and streaming pre-aggregations, but neither helped.
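      One observation about the bad row: sshorttring is exactly sshort followed by the last five characters of longsstring. That pattern is what one would see if a grouping key kept a (buffer, length) reference into a result buffer that a later UDF call overwrote. The report itself does not state the root cause, so the sketch below is only a plain-Java illustration of that hypothesis. Every name in it (BufferReuseSketch, StringRef, sharedBuffer) is hypothetical and none of it is Impala code.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a reused result buffer; not Impala's implementation.
class BufferReuseSketch {
    // Assumed stand-in for a result slot that every call overwrites.
    static final byte[] sharedBuffer = new byte[64];

    // A (buffer, length) view that does not own its bytes.
    static final class StringRef {
        final byte[] buf;
        final int len;

        StringRef(byte[] buf, int len) {
            this.buf = buf;
            this.len = len;
        }

        // Deep copy, so later writes to the shared buffer cannot affect it.
        StringRef copy() {
            byte[] own = new byte[len];
            System.arraycopy(buf, 0, own, 0, len);
            return new StringRef(own, len);
        }

        String read() {
            return new String(buf, 0, len, StandardCharsets.UTF_8);
        }
    }

    // Writes the replaced string into the shared buffer and returns a view of it.
    static StringRef evaluate(String input) {
        byte[] out = input.replace("s", "ss").getBytes(StandardCharsets.UTF_8);
        System.arraycopy(out, 0, sharedBuffer, 0, out.length);
        return new StringRef(sharedBuffer, out.length);
    }

    // Stores one key per row, then reads the keys back after all rows are processed.
    static List<String> groupKeys(List<String> rows, boolean copyKeys) {
        List<StringRef> keys = new ArrayList<>();
        for (String row : rows) {
            StringRef result = evaluate(row);
            keys.add(copyKeys ? result.copy() : result);
        }
        List<String> values = new ArrayList<>();
        for (StringRef key : keys) {
            values.add(key.read());
        }
        return values;
    }
}
```

      In this model, groupKeys for the rows longstring and short yields sshorttring and sshort when keys are stored without copying, and longsstring and sshort when each key is copied first. This matches the values in the incorrect output above, but it remains a guess about the mechanism.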

              People

              • Assignee: Tim Armstrong (tarmstrong)
              • Reporter: bharath v (bharathv)