Details
- Type: Bug
- Status: Resolved
- Priority: Blocker
- Resolution: Invalid
- Affects Version/s: 2.8.3
- Fix Version/s: None
- Component/s: None
Environment: EMR-5.15, Hadoop-2.8.3, Hive-2.3.3, Tez-0.8.4, Beeline.
The target table is defined for ACID transactions with its location on S3; the insert source table is also on S3.
Description
I manually modified yarn-site.xml from within the EMR cluster, set the parameter yarn.nodemanager.local-dirs to point to S3, and reloaded the services on the Master and Core nodes. Disk usage seemed to stay intact, but hdfs dfsadmin -report showed non-DFS usage, and the insert finally failed with the error below.
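For reference, the change described above would look roughly like the following in yarn-site.xml (the bucket path is hypothetical; yarn.nodemanager.local-dirs normally takes comma-separated local filesystem directories, which is why pointing it at an S3 URI leads the LocalDirAllocator in the stack trace below to find no valid local directory):

<property>
  <!-- Hypothetical illustration of the misconfiguration described in this report.
       This property expects local disk paths, e.g. /mnt/yarn, not an object-store URI. -->
  <name>yarn.nodemanager.local-dirs</name>
  <value>s3://my-bucket/yarn/local</value>
</property>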
Error: org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1, vertexId=vertex_1532581073633_0001_2_00, diagnostics=[Task failed, taskId=task_1532581073633_0001_2_00_000898, diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( failure ) : attempt_1532581073633_0001_2_00_000898_0:org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for output/attempt_1532581073633_0001_2_00_000898_0_10013_1/file.out
at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:441)
at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:151)
at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:132)
at org.apache.tez.runtime.library.common.task.local.output.TezTaskOutputFiles.getSpillFileForWrite(TezTaskOutputFiles.java:207)
at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.spill(PipelinedSorter.java:545)
...