Details
Bug
Status: Open
Major
Resolution: Unresolved
2.7.3
None
None
This scenario occurs on HDP 2.5 (Hadoop 2.7.3) with HDFS on WASB on the Microsoft Azure platform.
The same query yields the proper result on regular HDFS on an on-premise HDP 2.5 (Hadoop 2.7.3) cluster.
Description
When running multiple Hive queries on Tez combined with UNION (see example), the output of a given mapper task number gets overwritten by the next query in the union. As seen in the Azure snapshot, the directories /1, 2, ..., 100 get overwritten again and again, because the mappers launched for each query reuse the same task numbers and write into the same directories.
On the on-premise Hadoop 2.7.3 cluster, however, the directories are created as 1_copy_0, 1_copy_2 and so on. Creating copies does not overwrite the data.
A typical job unions 600-1000 queries together.
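
To make the difference between the two behaviours concrete, here is a minimal sketch in Python. It is only an illustration of the naming logic described above, not Hive's or Tez's actual code: the function names `next_copy_name` and `write_outputs` are hypothetical, and the suffix counter starting at 0 is an assumption based on the `1_copy_0` name seen on the on-premise cluster.

```python
def next_copy_name(existing, task_id):
    """Return a name for task_id that does not collide with any name in existing."""
    base = str(task_id)
    if base not in existing:
        return base
    n = 0
    # Assumed scheme: append _copy_<n> with the first free n.
    while f"{base}_copy_{n}" in existing:
        n += 1
    return f"{base}_copy_{n}"


def write_outputs(num_queries, num_tasks, use_copy_suffix):
    """Simulate each unioned query writing one output per mapper task number.

    Returns the set of output names that exist after all queries have run.
    """
    existing = set()
    for _ in range(num_queries):
        for task_id in range(1, num_tasks + 1):
            if use_copy_suffix:
                # On-premise behaviour as reported: a new copy is created.
                existing.add(next_copy_name(existing, task_id))
            else:
                # WASB behaviour as reported: the same name is reused,
                # so the earlier output is replaced.
                existing.add(str(task_id))
    return existing


if __name__ == "__main__":
    print(sorted(write_outputs(3, 2, use_copy_suffix=False)))
    print(sorted(write_outputs(3, 2, use_copy_suffix=True)))
```

With 3 queries and 2 task numbers, the overwrite variant ends up with only 2 outputs, while the copy-suffix variant keeps all 6. With 600-1000 unioned queries the amount of lost output grows accordingly.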