Details
- Type: Bug
- Status: Open
- Priority: Critical
- Resolution: Unresolved
- Affects Version/s: 1.2.0
- Fix Version/s: None
- None
Description
Here's one that knoguchi found and root-caused. This one's a doozy.
Under seemingly random conditions, the temporary output (under _SCRATCH1.234*) for HCat's dynamic partitioner isn't promoted correctly to the final table directory.
The namenode logs indicated a botched directory-rename:
2015-08-02 03:24:29,090 INFO FSNamesystem.audit: allowed=true ugi=myth (auth:TOKEN) via wrkflow@GRID.MYTH.NET (auth:TOKEN) ip=/10.192.100.117 cmd=rename src=/projects/hive/myth.db/myth_table_15m/_SCRATCH2.8772158158263395E-4/tc=1/utc_time=201508020145/part-r-00000 dst=/projects/hive/myth.db/myth_table_15mE-4/tc=1/utc_time=201508020145/part-r-00000 perm=myth:madcaps:rw-r--r-- proto=rpc
Note that the table-directory name "myth_table_15m" is appended with "E-4". This'll break anything that uses HDFS-based polling.
knoguchi points out the following code:
if ((idHash = conf.get(HCatConstants.HCAT_OUTPUT_ID_HASH)) == null) {
  idHash = String.valueOf(Math.random());
}

String finalLocn = jobLocation.replaceAll(Path.SEPARATOR + SCRATCH_DIR_NAME + "\\d\\.?\\d+","");
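The problem is that when Math.random() produces a number smaller than 10^-3, String.valueOf(double) uses exponential notation (e.g. 2.8772158158263395E-4). The regex only matches the mantissa digits, so the trailing "E-4" survives the replaceAll() and ends up fused onto the table-directory name.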
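To make the failure concrete, here is a small standalone reproduction. It is only a sketch: the class name is made up, "/" stands in for Path.SEPARATOR, and the "_SCRATCH" value for SCRATCH_DIR_NAME is an assumption inferred from the paths in the audit log above. The pattern itself is copied from the quoted line.

```java
class ScratchDirRegexDemo {
    // Assumed value of SCRATCH_DIR_NAME, inferred from the audit-log paths.
    static final String SCRATCH_DIR_NAME = "_SCRATCH";
    // Stand-in for org.apache.hadoop.fs.Path.SEPARATOR.
    static final String SEPARATOR = "/";

    // Applies the same pattern as the quoted replaceAll() call.
    static String stripScratch(String jobLocation) {
        return jobLocation.replaceAll(SEPARATOR + SCRATCH_DIR_NAME + "\\d\\.?\\d+", "");
    }

    public static void main(String[] args) {
        // Values of 10^-3 and above print in plain decimal notation...
        System.out.println(String.valueOf(0.001));   // 0.001
        // ...while anything below 10^-3 switches to exponential notation.
        System.out.println(String.valueOf(1.0E-4));  // 1.0E-4

        String base = "/projects/hive/myth.db/myth_table_15m";

        // Plain-decimal suffix: the scratch component is removed cleanly.
        System.out.println(stripScratch(base + "/_SCRATCH0.1234/tc=1"));

        // Exponential suffix: only the mantissa is matched, "E-4" stays behind.
        System.out.println(stripScratch(base + "/_SCRATCH2.8772158158263395E-4/tc=1"));
    }
}
```

The last call yields /projects/hive/myth.db/myth_table_15mE-4/tc=1, which has the same shape as the dst= path in the audit log.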
The fix belies the debugging-effort.
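The patch itself isn't quoted here, so the following is NOT the committed fix, just one way the mismatch could be closed: let the pattern swallow an optional exponent as well. The class name, the "/" separator and the "_SCRATCH" constant are assumptions made for the sake of a runnable sketch.

```java
class ScratchDirRegexSketch {
    // Assumed value of SCRATCH_DIR_NAME; "/" stands in for Path.SEPARATOR.
    static final String SCRATCH_DIR_NAME = "_SCRATCH";
    static final String SEPARATOR = "/";

    // Original pattern plus an optional exponent part such as "E-4".
    // Illustrative only; not taken from the actual patch.
    static final String SCRATCH_REGEX =
        SEPARATOR + SCRATCH_DIR_NAME + "\\d\\.?\\d+(?:E-?\\d+)?";

    static String stripScratch(String jobLocation) {
        return jobLocation.replaceAll(SCRATCH_REGEX, "");
    }

    public static void main(String[] args) {
        String base = "/projects/hive/myth.db/myth_table_15m";
        // Both notations now collapse to the same final location.
        System.out.println(stripScratch(base + "/_SCRATCH0.1234/tc=1"));
        System.out.println(stripScratch(base + "/_SCRATCH2.8772158158263395E-4/tc=1"));
    }
}
```

Another option would be to stop deriving the suffix from a double's string form altogether, so the regex never has to know about number formatting.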
Attachments
Issue Links
- is fixed by: HIVE-22771 Partition location incorrectly formed in FileOutputCommitterContainer (Closed)