Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 2.0.0
Component/s: None
Labels: None
Description
As jdere reports:
The current_timestamp() UDF returns its result in different formats in some cases.

select current_timestamp() returns a result with decimal precision:
{noformat}
hive> select current_timestamp();
OK
2016-04-14 18:26:58.875
Time taken: 0.077 seconds, Fetched: 1 row(s)
{noformat}

But the output format is different for select current_timestamp() from all100k union select current_timestamp() from over100k limit 5:
{noformat}
hive> select current_timestamp() from all100k union select current_timestamp() from over100k limit 5;
Query ID = hrt_qa_20160414182956_c4ed48f2-9913-4b3b-8f09-668ebf55b3e3
Total jobs = 1
Launching Job 1 out of 1
Tez session was closed. Reopening...
Session re-established.

Status: Running (Executing on YARN cluster with App id application_1460611908643_0624)

----------------------------------------------------------------------------------------------
        VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
----------------------------------------------------------------------------------------------
Map 1 ..........      llap     SUCCEEDED      1          1        0        0       0       0
Map 4 ..........      llap     SUCCEEDED      1          1        0        0       0       0
Reducer 3 ......      llap     SUCCEEDED      1          1        0        0       0       0
----------------------------------------------------------------------------------------------
VERTICES: 03/03  [==========================>>] 100%  ELAPSED TIME: 0.92 s
----------------------------------------------------------------------------------------------
OK
2016-04-14 18:29:56
Time taken: 10.558 seconds, Fetched: 1 row(s)
{noformat}

Explain plan for select current_timestamp():
{noformat}
hive> explain extended select current_timestamp();
OK
ABSTRACT SYNTAX TREE:

TOK_QUERY
   TOK_INSERT
      TOK_DESTINATION
         TOK_DIR
            TOK_TMP_FILE
      TOK_SELECT
         TOK_SELEXPR
            TOK_FUNCTION
               current_timestamp


STAGE DEPENDENCIES:
  Stage-0 is a root stage

STAGE PLANS:
  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        TableScan
          alias: _dummy_table
          Row Limit Per Split: 1
          GatherStats: false
          Select Operator
            expressions: 2016-04-14 18:30:57.206 (type: timestamp)
            outputColumnNames: _col0
            ListSink

Time taken: 0.062 seconds, Fetched: 30 row(s)
{noformat}

Explain plan for select current_timestamp() from all100k union select current_timestamp() from over100k limit 5:
{noformat}
hive> explain extended select current_timestamp() from all100k union select current_timestamp() from over100k limit 5;
OK
ABSTRACT SYNTAX TREE:

TOK_QUERY
   TOK_FROM
      TOK_SUBQUERY
         TOK_QUERY
            TOK_FROM
               TOK_SUBQUERY
                  TOK_UNIONALL
                     TOK_QUERY
                        TOK_FROM
                           TOK_TABREF
                              TOK_TABNAME
                                 all100k
                        TOK_INSERT
                           TOK_DESTINATION
                              TOK_DIR
                                 TOK_TMP_FILE
                           TOK_SELECT
                              TOK_SELEXPR
                                 TOK_FUNCTION
                                    current_timestamp
                     TOK_QUERY
                        TOK_FROM
                           TOK_TABREF
                              TOK_TABNAME
                                 over100k
                        TOK_INSERT
                           TOK_DESTINATION
                              TOK_DIR
                                 TOK_TMP_FILE
                           TOK_SELECT
                              TOK_SELEXPR
                                 TOK_FUNCTION
                                    current_timestamp
                  _u1
            TOK_INSERT
               TOK_DESTINATION
                  TOK_DIR
                     TOK_TMP_FILE
               TOK_SELECTDI
                  TOK_SELEXPR
                     TOK_ALLCOLREF
         _u2
   TOK_INSERT
      TOK_DESTINATION
         TOK_DIR
            TOK_TMP_FILE
      TOK_SELECT
         TOK_SELEXPR
            TOK_ALLCOLREF
      TOK_LIMIT
         5


STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 depends on stages: Stage-1

STAGE PLANS:
  Stage: Stage-1
    Tez
      DagId: hrt_qa_20160414183119_ec8e109e-8975-4799-a142-4a2289f85910:7
      Edges:
        Map 1 <- Union 2 (CONTAINS)
        Map 4 <- Union 2 (CONTAINS)
        Reducer 3 <- Union 2 (SIMPLE_EDGE)
      DagName:
      Vertices:
        Map 1
            Map Operator Tree:
                TableScan
                  alias: all100k
                  Statistics: Num rows: 100000 Data size: 15801336 Basic stats: COMPLETE Column stats: COMPLETE
                  GatherStats: false
                  Select Operator
                    Statistics: Num rows: 100000 Data size: 4000000 Basic stats: COMPLETE Column stats: COMPLETE
                    Select Operator
                      expressions: 2016-04-14 18:31:19.0 (type: timestamp)
                      outputColumnNames: _col0
                      Statistics: Num rows: 200000 Data size: 8000000 Basic stats: COMPLETE Column stats: COMPLETE
                      Group By Operator
                        keys: _col0 (type: timestamp)
                        mode: hash
                        outputColumnNames: _col0
                        Statistics: Num rows: 1 Data size: 40 Basic stats: COMPLETE Column stats: COMPLETE
                        Reduce Output Operator
                          key expressions: _col0 (type: timestamp)
                          null sort order: a
                          sort order: +
                          Map-reduce partition columns: _col0 (type: timestamp)
                          Statistics: Num rows: 1 Data size: 40 Basic stats: COMPLETE Column stats: COMPLETE
                          tag: -1
                          TopN: 5
                          TopN Hash Memory Usage: 0.04
                          auto parallelism: true
            Execution mode: llap
            LLAP IO: no inputs
            Path -> Alias:
              hdfs://os-r6-qugztu-hive-1-5.novalocal:8020/user/hcat/tests/data/all100k [all100k]
            Path -> Partition:
              hdfs://os-r6-qugztu-hive-1-5.novalocal:8020/user/hcat/tests/data/all100k
                Partition
                  base file name: all100k
                  input format: org.apache.hadoop.mapred.TextInputFormat
                  output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                  properties:
                    COLUMN_STATS_ACCURATE {"BASIC_STATS":"true","COLUMN_STATS":{"t":"true","si":"true","i":"true","b":"true","f":"true","d":"true","s":"true","dc":"true","bo":"true","v":"true","c":"true","ts":"true"}}
                    EXTERNAL TRUE
                    bucket_count -1
                    columns t,si,i,b,f,d,s,dc,bo,v,c,ts,dt
                    columns.comments
                    columns.types tinyint:smallint:int:bigint:float:double:string:decimal(38,18):boolean:varchar(25):char(25):timestamp:date
                    field.delim |
                    file.inputformat org.apache.hadoop.mapred.TextInputFormat
                    file.outputformat org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                    location hdfs://os-r6-qugztu-hive-1-5.novalocal:8020/user/hcat/tests/data/all100k
                    name default.all100k
                    numFiles 1
                    numRows 100000
                    rawDataSize 15801336
                    serialization.ddl struct all100k { byte t, i16 si, i32 i, i64 b, float f, double d, string s, decimal(38,18) dc, bool bo, varchar(25) v, char(25) c, timestamp ts, date dt}
                    serialization.format |
                    serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
                    totalSize 15901336
                    transient_lastDdlTime 1460612683
                  serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

                    input format: org.apache.hadoop.mapred.TextInputFormat
                    output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                    properties:
                      COLUMN_STATS_ACCURATE {"BASIC_STATS":"true","COLUMN_STATS":{"t":"true","si":"true","i":"true","b":"true","f":"true","d":"true","s":"true","dc":"true","bo":"true","v":"true","c":"true","ts":"true"}}
                      EXTERNAL TRUE
                      bucket_count -1
                      columns t,si,i,b,f,d,s,dc,bo,v,c,ts,dt
                      columns.comments
                      columns.types tinyint:smallint:int:bigint:float:double:string:decimal(38,18):boolean:varchar(25):char(25):timestamp:date
                      field.delim |
                      file.inputformat org.apache.hadoop.mapred.TextInputFormat
                      file.outputformat org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                      location hdfs://os-r6-qugztu-hive-1-5.novalocal:8020/user/hcat/tests/data/all100k
                      name default.all100k
                      numFiles 1
                      numRows 100000
                      rawDataSize 15801336
                      serialization.ddl struct all100k { byte t, i16 si, i32 i, i64 b, float f, double d, string s, decimal(38,18) dc, bool bo, varchar(25) v, char(25) c, timestamp ts, date dt}
                      serialization.format |
                      serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
                      totalSize 15901336
                      transient_lastDdlTime 1460612683
                    serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
                    name: default.all100k
                  name: default.all100k
            Truncated Path -> Alias:
              hdfs://os-r6-qugztu-hive-1-5.novalocal:8020/user/hcat/tests/data/all100k [all100k]
        Map 4
            Map Operator Tree:
                TableScan
                  alias: over100k
                  Statistics: Num rows: 100000 Data size: 6631229 Basic stats: COMPLETE Column stats: COMPLETE
                  GatherStats: false
                  Select Operator
                    Statistics: Num rows: 100000 Data size: 4000000 Basic stats: COMPLETE Column stats: COMPLETE
                    Select Operator
                      expressions: 2016-04-14 18:31:19.0 (type: timestamp)
                      outputColumnNames: _col0
                      Statistics: Num rows: 200000 Data size: 8000000 Basic stats: COMPLETE Column stats: COMPLETE
                      Group By Operator
                        keys: _col0 (type: timestamp)
                        mode: hash
                        outputColumnNames: _col0
                        Statistics: Num rows: 1 Data size: 40 Basic stats: COMPLETE Column stats: COMPLETE
                        Reduce Output Operator
                          key expressions: _col0 (type: timestamp)
                          null sort order: a
                          sort order: +
                          Map-reduce partition columns: _col0 (type: timestamp)
                          Statistics: Num rows: 1 Data size: 40 Basic stats: COMPLETE Column stats: COMPLETE
                          tag: -1
                          TopN: 5
                          TopN Hash Memory Usage: 0.04
                          auto parallelism: true
            Execution mode: llap
            LLAP IO: no inputs
            Path -> Alias:
              hdfs://os-r6-qugztu-hive-1-5.novalocal:8020/user/hcat/tests/data/over100k [over100k]
            Path -> Partition:
              hdfs://os-r6-qugztu-hive-1-5.novalocal:8020/user/hcat/tests/data/over100k
                Partition
                  base file name: over100k
                  input format: org.apache.hadoop.mapred.TextInputFormat
                  output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                  properties:
                    COLUMN_STATS_ACCURATE {"BASIC_STATS":"true","COLUMN_STATS":{"t":"true","si":"true","i":"true","b":"true","f":"true","d":"true","bo":"true","s":"true","bin":"true"}}
                    EXTERNAL TRUE
                    bucket_count -1
                    columns t,si,i,b,f,d,bo,s,bin
                    columns.comments
                    columns.types tinyint:smallint:int:bigint:float:double:boolean:string:binary
                    field.delim :
                    file.inputformat org.apache.hadoop.mapred.TextInputFormat
                    file.outputformat org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                    location hdfs://os-r6-qugztu-hive-1-5.novalocal:8020/user/hcat/tests/data/over100k
                    name default.over100k
                    numFiles 1
                    numRows 100000
                    rawDataSize 6631229
                    serialization.ddl struct over100k { byte t, i16 si, i32 i, i64 b, float f, double d, bool bo, string s, binary bin}
                    serialization.format :
                    serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
                    totalSize 6731229
                    transient_lastDdlTime 1460612798
                  serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

                    input format: org.apache.hadoop.mapred.TextInputFormat
                    output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                    properties:
                      COLUMN_STATS_ACCURATE {"BASIC_STATS":"true","COLUMN_STATS":{"t":"true","si":"true","i":"true","b":"true","f":"true","d":"true","bo":"true","s":"true","bin":"true"}}
                      EXTERNAL TRUE
                      bucket_count -1
                      columns t,si,i,b,f,d,bo,s,bin
                      columns.comments
                      columns.types tinyint:smallint:int:bigint:float:double:boolean:string:binary
                      field.delim :
                      file.inputformat org.apache.hadoop.mapred.TextInputFormat
                      file.outputformat org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                      location hdfs://os-r6-qugztu-hive-1-5.novalocal:8020/user/hcat/tests/data/over100k
                      name default.over100k
                      numFiles 1
                      numRows 100000
                      rawDataSize 6631229
                      serialization.ddl struct over100k { byte t, i16 si, i32 i, i64 b, float f, double d, bool bo, string s, binary bin}
                      serialization.format :
                      serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
                      totalSize 6731229
                      transient_lastDdlTime 1460612798
                    serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
                    name: default.over100k
                  name: default.over100k
            Truncated Path -> Alias:
              hdfs://os-r6-qugztu-hive-1-5.novalocal:8020/user/hcat/tests/data/over100k [over100k]
        Reducer 3
            Execution mode: vectorized, llap
            Needs Tagging: false
            Reduce Operator Tree:
              Group By Operator
                keys: KEY._col0 (type: timestamp)
                mode: mergepartial
                outputColumnNames: _col0
                Statistics: Num rows: 1 Data size: 40 Basic stats: COMPLETE Column stats: COMPLETE
                Limit
                  Number of rows: 5
                  Statistics: Num rows: 1 Data size: 40 Basic stats: COMPLETE Column stats: COMPLETE
                  File Output Operator
                    compressed: false
                    GlobalTableId: 0
                    directory: hdfs://os-r6-qugztu-hive-1-5.novalocal:8020/tmp/hive/hrt_qa/ec0773d7-0ac2-45c7-b9cb-568bbed2c49c/hive_2016-04-14_18-31-19_532_3480081382837900888-1/-mr-10001/.hive-staging_hive_2016-04-14_18-31-19_532_3480081382837900888-1/-ext-10002
                    NumFilesPerFileSink: 1
                    Statistics: Num rows: 1 Data size: 40 Basic stats: COMPLETE Column stats: COMPLETE
                    Stats Publishing Key Prefix: hdfs://os-r6-qugztu-hive-1-5.novalocal:8020/tmp/hive/hrt_qa/ec0773d7-0ac2-45c7-b9cb-568bbed2c49c/hive_2016-04-14_18-31-19_532_3480081382837900888-1/-mr-10001/.hive-staging_hive_2016-04-14_18-31-19_532_3480081382837900888-1/-ext-10002/
                    table:
                        input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                        output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                        properties:
                          columns _col0
                          columns.types timestamp
                          escape.delim \
                          hive.serialization.extend.additional.nesting.levels true
                          serialization.escape.crlf true
                          serialization.format 1
                          serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
                        serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
                    TotalFiles: 1
                    GatherStats: false
                    MultiFileSpray: false
        Union 2
            Vertex: Union 2

  Stage: Stage-0
    Fetch Operator
      limit: 5
      Processor Tree:
        ListSink

Time taken: 0.301 seconds, Fetched: 284 row(s)
{noformat}

Both of these queries returned timestamps in the YYYY-MM-DD HH:MM:SS.fff format in past releases.
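The two explain outputs above show where the formats diverge: in both plans the optimizer constant-folds current_timestamp() into a timestamp literal, but the standalone query folds to a value with millisecond precision (2016-04-14 18:30:57.206), while the union query folds to a value whose fractional part is zero (2016-04-14 18:31:19.0), consistent with the union query's output printing no fractional digits at all.

As a workaround sketch (not part of the original report, and assuming a fixed display pattern is acceptable for the use case), date_format, available since Hive 1.2.0, pins the rendering to an explicit pattern regardless of how the constant is folded:
{noformat}
-- Workaround sketch: render the timestamp with an explicit pattern so both
-- query shapes print the same yyyy-MM-dd HH:mm:ss.SSS layout, even when the
-- fractional part of the folded constant happens to be zero.
select date_format(current_timestamp(), 'yyyy-MM-dd HH:mm:ss.SSS');

select date_format(current_timestamp(), 'yyyy-MM-dd HH:mm:ss.SSS') from all100k
union
select date_format(current_timestamp(), 'yyyy-MM-dd HH:mm:ss.SSS') from over100k
limit 5;
{noformat}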