Details
Type: Bug
Status: Open
Priority: Critical
Resolution: Unresolved
Description
Hive UNION ALL produces data in sub-directories under the table/partition directories. For example:
hive (mythdb_hadooppf_17544)> create table source ( foo string, bar string, goo string ) stored as textfile;
OK
Time taken: 0.322 seconds
hive (mythdb_hadooppf_17544)> create table results_partitioned( foo string, bar string, goo string ) partitioned by ( dt string ) stored as orcfile;
OK
Time taken: 0.322 seconds
hive (mythdb_hadooppf_17544)> set hive.merge.tezfiles=false;
                              insert overwrite table results_partitioned partition( dt )
                              select 'goo', 'bar', 'foo', '1' from source
                              UNION ALL
                              select 'go', 'far', 'moo', '1' from source;
...
Loading data to table mythdb_hadooppf_17544.results_partitioned partition (dt=null)
	Time taken for load dynamic partitions : 311
	Loading partition {dt=1}
	Time taken for adding to write entity : 3
OK
Time taken: 27.659 seconds
hive (mythdb_hadooppf_17544)> dfs -ls -R /tmp/mythdb_hadooppf_17544/results_partitioned;
drwxrwxrwt   - dfsload hdfs          0 2017-01-10 23:13 /tmp/mythdb_hadooppf_17544/results_partitioned/dt=1
drwxrwxrwt   - dfsload hdfs          0 2017-01-10 23:13 /tmp/mythdb_hadooppf_17544/results_partitioned/dt=1/1
-rwxrwxrwt   3 dfsload hdfs        349 2017-01-10 23:13 /tmp/mythdb_hadooppf_17544/results_partitioned/dt=1/1/000000_0
drwxrwxrwt   - dfsload hdfs          0 2017-01-10 23:13 /tmp/mythdb_hadooppf_17544/results_partitioned/dt=1/2
-rwxrwxrwt   3 dfsload hdfs        368 2017-01-10 23:13 /tmp/mythdb_hadooppf_17544/results_partitioned/dt=1/2/000000_0
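As a quick illustration (a sketch, not verbatim output), reading the partition back depends entirely on whether recursive input listing is enabled. This assumes an execution path that honours the Hadoop default rather than having the flag forced on (e.g. a plain MapReduce read); the exact failure mode (an exception versus missing rows) depends on the engine and input format:

set mapred.input.dir.recursive=false;
select count(*) from results_partitioned where dt='1';   -- the files under dt=1/1 and dt=1/2 are not read
set mapred.input.dir.recursive=true;
select count(*) from results_partitioned where dt='1';   -- both sub-directories are picked up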
These results can only be read if mapred.input.dir.recursive=true, which TezCompiler::init() appears to set. However, the Hadoop default for this setting is false, which leads to the following problems:
1. Running CONCATENATE on the partition causes data loss (a session-level workaround is sketched after this list).
hive --database mythdb_hadooppf_17544 -e "
  set mapred.input.dir.recursive;
  alter table results_partitioned partition ( dt='1' ) concatenate ;
  set mapred.input.dir.recursive;
"
...
OK
Time taken: 2.151 seconds
mapred.input.dir.recursive=false
Status: Running (Executing on YARN cluster with App id application_1481756273279_5088754)

--------------------------------------------------------------------------------
        VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--------------------------------------------------------------------------------
File Merge           SUCCEEDED      0          0        0        0       0       0
--------------------------------------------------------------------------------
VERTICES: 01/01  [>>--------------------------] 0%    ELAPSED TIME: 0.35 s
--------------------------------------------------------------------------------
Loading data to table mythdb_hadooppf_17544.results_partitioned partition (dt=1)
Moved: 'hdfs://cluster-nn1.mygrid.myth.net:8020/tmp/mythdb_hadooppf_17544/results_partitioned/dt=1/1' to trash at: hdfs://cluster-nn1.mygrid.myth.net:8020/user/dfsload/.Trash/Current
Moved: 'hdfs://cluster-nn1.mygrid.myth.net:8020/tmp/mythdb_hadooppf_17544/results_partitioned/dt=1/2' to trash at: hdfs://cluster-nn1.mygrid.myth.net:8020/user/dfsload/.Trash/Current
OK
Time taken: 25.873 seconds

$ hdfs dfs -count -h /tmp/mythdb_hadooppf_17544/results_partitioned/dt=1
           1            0                  0 /tmp/mythdb_hadooppf_17544/results_partitioned/dt=1
2. hive.merge.tezfiles is broken, because the merge task attempts to merge files across results_partitioned/dt=1/1 and results_partitioned/dt=1/2:
$ hive --database mythdb_hadooppf_17544 -e "
  set hive.merge.tezfiles=true;
  insert overwrite table results_partitioned partition( dt )
  select 'goo', 'bar', 'foo', '1' from source
  UNION ALL
  select 'go', 'far', 'moo', '1' from source;
"
...
Query ID = dfsload_20170110233558_51289333-d9da-4851-8671-bfe653d26e45
Total jobs = 3
Launching Job 1 out of 3
Status: Running (Executing on YARN cluster with App id application_1481756273279_5089989)

--------------------------------------------------------------------------------
        VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--------------------------------------------------------------------------------
Map 1 ..........   SUCCEEDED      1          1        0        0       0       0
Map 3 ..........   SUCCEEDED      1          1        0        0       0       0
--------------------------------------------------------------------------------
VERTICES: 02/02  [==========================>>] 100%  ELAPSED TIME: 13.07 s
--------------------------------------------------------------------------------
Stage-4 is filtered out by condition resolver.
Stage-3 is selected by condition resolver.
Stage-5 is filtered out by condition resolver.
Launching Job 3 out of 3
Status: Running (Executing on YARN cluster with App id application_1481756273279_5089989)

--------------------------------------------------------------------------------
        VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--------------------------------------------------------------------------------
File Merge            RUNNING      1          0        1        0       2       0
--------------------------------------------------------------------------------
VERTICES: 00/01  [>>--------------------------] 0%    ELAPSED TIME: 3.06 s
--------------------------------------------------------------------------------
...
The File Merge fails with the following:
TaskAttempt 3 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: Multiple partitions for one merge mapper: hdfs://cluster-nn1.mygrid.myth.net:8020/tmp/mythdb_hadooppf_17544/results_partitioned/.hive-staging_hive_2017-01-10_23-35-58_881_4062579557908207136-1/-ext-10002/dt=1/2 NOT EQUAL TO hdfs://cluster-nn1.mygrid.myth.net:8020/tmp/mythdb_hadooppf_17544/results_partitioned/.hive-staging_hive_2017-01-10_23-35-58_881_4062579557908207136-1/-ext-10002/dt=1/1
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
	at org.apache.hadoop.hive.ql.exec.tez.MergeFileTezProcessor.run(MergeFileTezProcessor.java:42)
	at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:362)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:192)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:184)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1738)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:184)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:180)
	at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: Multiple partitions for one merge mapper: hdfs://cluster-nn1.mygrid.myth.net:8020/tmp/mythdb_hadooppf_17544/results_partitioned/.hive-staging_hive_2017-01-10_23-35-58_881_4062579557908207136-1/-ext-10002/dt=1/2 NOT EQUAL TO hdfs://cluster-nn1.mygrid.myth.net:8020/tmp/mythdb_hadooppf_17544/results_partitioned/.hive-staging_hive_2017-01-10_23-35-58_881_4062579557908207136-1/-ext-10002/dt=1/1
	at org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.processRow(MergeFileRecordProcessor.java:217)
	at org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.run(MergeFileRecordProcessor.java:151)
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148)
	... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: Multiple partitions for one merge mapper: hdfs://cluster-nn1.mygrid.myth.net:8020/tmp/mythdb_hadooppf_17544/results_partitioned/.hive-staging_hive_2017-01-10_23-35-58_881_4062579557908207136-1/-ext-10002/dt=1/2 NOT EQUAL TO hdfs://cluster-nn1.mygrid.myth.net:8020/tmp/mythdb_hadooppf_17544/results_partitioned/.hive-staging_hive_2017-01-10_23-35-58_881_4062579557908207136-1/-ext-10002/dt=1/1
	at org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.processKeyValuePairs(OrcFileMergeOperator.java:159)
	at org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.process(OrcFileMergeOperator.java:62)
	at org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.processRow(MergeFileRecordProcessor.java:208)
	... 16 more
Caused by: java.io.IOException: Multiple partitions for one merge mapper: hdfs://cluster-nn1.mygrid.myth.net:8020/tmp/mythdb_hadooppf_17544/results_partitioned/.hive-staging_hive_2017-01-10_23-35-58_881_4062579557908207136-1/-ext-10002/dt=1/2 NOT EQUAL TO hdfs://cluster-nn1.mygrid.myth.net:8020/tmp/mythdb_hadooppf_17544/results_partitioned/.hive-staging_hive_2017-01-10_23-35-58_881_4062579557908207136-1/-ext-10002/dt=1/1
	at org.apache.hadoop.hive.ql.exec.AbstractFileMergeOperator.checkPartitionsMatch(AbstractFileMergeOperator.java:174)
	at org.apache.hadoop.hive.ql.exec.AbstractFileMergeOperator.fixTmpPath(AbstractFileMergeOperator.java:191)
	at org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.processKeyValuePairs(OrcFileMergeOperator.java:86)
	... 18 more
]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:0, Vertex vertex_1481756273279_5089989_2_00 [File Merge] killed/failed due to:OWN_TASK_FAILURE]
DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0
3. Data produced with Hive UNION ALL will not be readable by Pig/HCatalog without mapred.input.dir.recursive=true.
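For problems 1 and 3, the session-level equivalent of the hive-site.xml change discussed below would be to force recursive listing explicitly before touching the partition. A minimal sketch (I have not verified this end to end):

hive --database mythdb_hadooppf_17544 -e "
  set mapred.input.dir.recursive=true;
  alter table results_partitioned partition ( dt='1' ) concatenate;
"

Readers outside Hive (e.g. Pig/HCatalog, problem 3) would presumably need the same property set on their side before loading the table.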
Setting mapred.input.dir.recursive=true in hive-site.xml should resolve the first and third problems, but is that the recommended fix? It is intrusive, and it does not solve #2. As far as I understand, Pig's UNION does not produce this kind of layout.
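For what it's worth, a slightly less intrusive variant of the same workaround is to pass the flag per invocation rather than globally; this is only a sketch and, like the hive-site.xml change, does nothing for #2:

hive --hiveconf mapred.input.dir.recursive=true \
     --database mythdb_hadooppf_17544 \
     -e "select count(*) from results_partitioned where dt='1';"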