[IMPALA-5663] TPC-DS data loading fails with Ubuntu 16.04 and Java 8 - ASF JIRA

XML

Word

Printable

JSON

Details

Type: Sub-task
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: Impala 2.9.0
Fix Version/s: None
Component/s: Infrastructure
Labels:
None
Environment:
GCE or EC2
Ubuntu 16.04
OpenJDK or Oracle JDK 8

Epic Color:
ghx-label-6

Description

OpenJDK7 is not packaged by Canonical for Ubuntu 16.04, and Oracle made their JDK7 harder to get. However, JDK8 data loading fails in TPC-DS:

ERROR : FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask

logs/cluster/hive/hive.log shows the error below, which previous bugs have called an issue with the disk being out of space, but my disk has at least 45GB left on it

~~IMPALA-3246~~, ~~IMPALA-2856~~, ~~IMPALA-2617~~

FATAL ExecReducer (ExecReducer.java:reduce(264)) - org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{},"value":{"_col0":48147,"_col1":17805,"_col2":27944,"_col3":606992,"_col4":3193,"_col5":16641,"_col6":10,"_col7":209,"_col8":44757,"_col9":20,"_col10":5.51,"_col11":9.36,"_col12":9.17,"_col13":0,"_col14":183.4,"_col15":110.2,"_col16":187.2,"_col17":3.66,"_col18":0,"_col19":183.4,"_col20":187.06,"_col21":73.2,"_col22":2452013}}
        at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:253)
        at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:444)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
        at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:346)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /test-warehouse/tpcds.store_sales/.hive-staging_hive_2017-07-12_22-51-18_139_3687815919405186455-760/_task_tmp.-ext-10000/ss_sold_date_sk=2452013/_tmp.000001_0 could only be replicated to 0 nodes instead of minReplication (=1).  There are 3 datanode(s) running and no node(s) are excluded in this operation.
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1724)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3385)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:683)
        at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:214)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:495)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2217)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2213)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2211)

        at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:751)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
        at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
        at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:244)

I am not running out of memory.

Attachments

Activity

People

Assignee:: Unassigned

Reporter:: Jim Apple

Votes:: 0 Vote for this issue

Watchers:: 2 Start watching this issue

Dates

Created:: 13/Jul/17 22:53

Updated:: 11/Aug/18 17:08

Resolved:: 11/Aug/18 17:08