HIVE-21119

String UDAF and count(distinct) in the same SELECT give an error

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Labels:
      None

      Description

      With the attached UDAF, the following query fails on Hive with a runtime error.
      CRASHES

      select rs_max(genderkey),count(distinct genderkey) from as_adventure.dimgender;
      

      WORKS

      select rs_max(genderkey) from as_adventure.dimgender;
      

      The table looks like this:

      0: jdbc:hive2://localhost:10000> select * from dimgender;
      OK
      INFO  : Compiling command(queryId=hive_20190111225125_486e6e6b-97fa-4dda-9688-a733180bcfe7): select * from dimgender
      INFO  : Concurrency mode is disabled, not creating a lock manager
      INFO  : Semantic Analysis Completed (retrial = false)
      INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:dimgender.genderkey, type:string, comment:null), FieldSchema(name:dimgender.gendername, type:string, comment:null)], properties:null)
      INFO  : Completed compiling command(queryId=hive_20190111225125_486e6e6b-97fa-4dda-9688-a733180bcfe7); Time taken: 0.2 seconds
      INFO  : Concurrency mode is disabled, not creating a lock manager
      INFO  : Executing command(queryId=hive_20190111225125_486e6e6b-97fa-4dda-9688-a733180bcfe7): select * from dimgender
      INFO  : Completed executing command(queryId=hive_20190111225125_486e6e6b-97fa-4dda-9688-a733180bcfe7); Time taken: 0.004 seconds
      INFO  : OK
      INFO  : Concurrency mode is disabled, not creating a lock manager
      +----------------------+-----------------------+
      | dimgender.genderkey  | dimgender.gendername  |
      +----------------------+-----------------------+
      | M                    | Male                  |
      | F                    | Female                |
      | U                    | Unisex                |
      +----------------------+-----------------------+

      ERROR (from the CRASHES query above)
      
      Vertex failed, vertexName=Reducer 2, vertexId=vertex_1547169244949_0024_2_01, diagnostics=[Task failed, taskId=task_1547169244949_0024_2_01_000000, diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( failure ) : attempt_1547169244949_0024_2_01_000000_0:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"_col0":"F"},"value":{"_col0":"F"}}
      	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:296)
      	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
      	at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
      	at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
      	at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
      	at java.security.AccessController.doPrivileged(Native Method)
      
      

      ...

      Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to execute method public boolean com.sample.MaxUDA$Evaluator.merge(java.lang.String) with arguments {F}:argument type mismatch
      	at org.apache.hadoop.hive.ql.exec.FunctionRegistry.invoke(FunctionRegistry.java:1111)
      	at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFBridge$GenericUDAFBridgeEvaluator.merge(GenericUDAFBridge.java:176)
      	at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.aggregate(GenericUDAFEvaluator.java:216)
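
The "argument type mismatch" text in the trace above is the message Java reflection produces when Method.invoke is handed an argument that is not an instance of the declared parameter type. The sketch below only illustrates that mechanism in plain Java. The Evaluator class is a hypothetical stand-in (only the merge(java.lang.String) signature is taken from the trace), and StringBuilder stands in for any non-String object that prints as F. It does not reproduce Hive's actual code path and is not a diagnosis of this bug.

```java
import java.lang.reflect.Method;

final class MergeMismatchDemo {

    /** Hypothetical evaluator; only the merge(String) signature comes from the stack trace. */
    public static final class Evaluator {
        private String max;

        public boolean merge(String other) {
            // Keep the lexicographically largest value seen so far.
            if (other != null && (max == null || other.compareTo(max) > 0)) {
                max = other;
            }
            return true;
        }

        public String terminate() {
            return max;
        }
    }

    /** Calls merge reflectively, the way a bridge layer might, and reports the outcome. */
    static String invokeMerge(Evaluator evaluator, Object argument) {
        try {
            Method merge = Evaluator.class.getMethod("merge", String.class);
            merge.invoke(evaluator, argument);
            return "ok";
        } catch (IllegalArgumentException e) {
            // Thrown by Method.invoke when the argument is not a java.lang.String.
            return "rejected: " + e.getMessage();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Evaluator evaluator = new Evaluator();
        // A real String is accepted.
        System.out.println(invokeMerge(evaluator, "F"));
        // An object that merely prints as "F" is rejected by reflection.
        System.out.println(invokeMerge(evaluator, new StringBuilder("F")));
    }
}
```

If something similar happens here, the value logged as {F} would be an object whose string form is F but whose runtime class is not java.lang.String. That is an assumption, and the attached run.log would be needed to confirm it.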
      
      

      PLAN

      +----------------------------------------------------+
      |                      Explain                       |
      +----------------------------------------------------+
      | Plan optimized by CBO.                             |
      |                                                    |
      | Vertex dependency in root stage                    |
      | Reducer 2 <- Map 1 (SIMPLE_EDGE)                   |
      | Reducer 3 <- Reducer 2 (CUSTOM_SIMPLE_EDGE)        |
      |                                                    |
      | Stage-0                                            |
      |   Fetch Operator                                   |
      |     limit:-1                                       |
      |     Stage-1                                        |
      |       Reducer 3                                    |
      |       File Output Operator [FS_6]                  |
      |         Group By Operator [GBY_12] (rows=1 width=368) |
      |           Output:["_col0","_col1"],aggregations:["rs_max(VALUE._col0)","count(VALUE._col1)"] |
      |         <-Reducer 2 [CUSTOM_SIMPLE_EDGE]           |
      |           PARTITION_ONLY_SHUFFLE [RS_11]           |
      |             Group By Operator [GBY_10] (rows=1 width=368) |
      |               Output:["_col0","_col1"],aggregations:["rs_max(_col1)","count(_col0)"] |
      |               Group By Operator [GBY_9] (rows=3 width=2) |
      |                 Output:["_col0","_col1"],aggregations:["rs_max(VALUE._col0)"],keys:KEY._col0 |
      |               <-Map 1 [SIMPLE_EDGE]                |
      |                 SHUFFLE [RS_8]                     |
      |                   PartitionCols:_col0              |
      |                   Group By Operator [GBY_7] (rows=3 width=2) |
      |                     Output:["_col0","_col1"],aggregations:["rs_max(genderkey)"],keys:genderkey |
      |                     Select Operator [SEL_1] (rows=3 width=2) |
      |                       Output:["genderkey"]         |
      |                       TableScan [TS_0] (rows=3 width=2) |
      |                         as_adventure@dimgender,dimgender,Tbl:COMPLETE,Col:NONE,Output:["genderkey"] |
      |                                                    |
      +----------------------------------------------------+
      30 rows selected (0.3 seconds)
      0: jdbc:hive2://localhost:10000> 
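
As a rough reading of the plan above, the distinct count makes rs_max run in several stages: partial values are first grouped per genderkey (GBY_7, GBY_9), then merged again without keys alongside the count (GBY_10, GBY_12). The sketch below simulates that flow in plain Java for the three-row table. MaxEvaluator is a hypothetical evaluator written against the classic init / iterate / terminatePartial / merge / terminate contract, not the attached MaxUDA.java, and the mapping of methods to plan operators is an assumption.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class MaxLifecycleSketch {

    /** Hypothetical string-max evaluator following the classic UDAF evaluator contract. */
    static final class MaxEvaluator {
        private String max;

        void init() {
            max = null;
        }

        /** Consumes one raw input value. */
        boolean iterate(String value) {
            return merge(value);
        }

        /** Partial result handed to the next stage. */
        String terminatePartial() {
            return max;
        }

        /** Consumes a partial result produced by an earlier stage. */
        boolean merge(String partial) {
            if (partial != null && (max == null || partial.compareTo(max) > 0)) {
                max = partial;
            }
            return true;
        }

        String terminate() {
            return max;
        }
    }

    private static MaxEvaluator newEvaluator() {
        MaxEvaluator evaluator = new MaxEvaluator();
        evaluator.init();
        return evaluator;
    }

    /** First stage (assumed to correspond to GBY_7 / GBY_9): one partial max per distinct key. */
    static Map<String, String> partialsPerKey(List<String> rows) {
        Map<String, MaxEvaluator> evaluators = new TreeMap<>();
        for (String row : rows) {
            evaluators.computeIfAbsent(row, key -> newEvaluator()).iterate(row);
        }
        Map<String, String> partials = new TreeMap<>();
        evaluators.forEach((key, evaluator) -> partials.put(key, evaluator.terminatePartial()));
        return partials;
    }

    /** Second stage (assumed to correspond to GBY_10 / GBY_12): merge partials, count keys. */
    static String[] finalResult(Map<String, String> partials) {
        MaxEvaluator evaluator = newEvaluator();
        long distinct = 0;
        for (String partial : partials.values()) {
            evaluator.merge(partial);
            distinct++;
        }
        return new String[] {evaluator.terminate(), Long.toString(distinct)};
    }

    public static void main(String[] args) {
        String[] result = finalResult(partialsPerKey(Arrays.asList("M", "F", "U")));
        // prints: max=U distinct=3
        System.out.println("max=" + result[0] + " distinct=" + result[1]);
    }
}
```

In this reading, the query without count(distinct) needs no per-key stage, which may be why only the combined query reaches the failing merge call in Reducer 2.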
      

        Attachments

        1. MaxUDA.java (2 kB, Ravi Shetye)
        2. run.log (86 kB, Ravi Shetye)

            People

            • Assignee: Unassigned
            • Reporter: Ravi Shetye (rshetye)