Description
Exception displays while implementing compaction with alter query.
Steps to reproduce:
1) Create a table :
CREATE TABLE CUSTOMER1 ( C_CUSTKEY INT , C_NAME STRING , C_ADDRESS STRING , C_NATIONKEY INT , C_PHONE STRING , C_ACCTBAL DECIMAL(15,2) , C_MKTSEGMENT STRING , C_COMMENT STRING) stored by 'carbondata';
2) Insert data into the table:
a) insert into customer1 values(1,'vandana','noida',1,'123456789',45987.78,'hello','comment')
b) insert into customer1 values(2,'vandana','noida',2,'123456789',487.78,'hello','comment')
c) insert into customer1 values(3,'geetika','delhi',3,'123456789',487897.78,'hello','comment')
d) insert into customer1 values(4,'sangeeta','delhi',3,'123456789',48789.78,'hello','comment')
3) Perform alter table query:
alter table customer1 add columns (intfield int) TBLPROPERTIES ('DEFAULT.VALUE.intfield'='10');
4) show segments for displaying segments before compaction
show segments for table customer1;
output:
--------------------------------------------------------------------------------------------------
SegmentSequenceId | Status | Load Start Time | Load End Time | Merged To | File Format |
--------------------------------------------------------------------------------------------------
3 | Success | 2018-01-10 16:16:53.611 | 2018-01-10 16:16:54.99 | NA | COLUMNAR_V3 |
2 | Success | 2018-01-10 16:16:46.878 | 2018-01-10 16:16:47.75 | NA | COLUMNAR_V3 |
1 | Success | 2018-01-10 16:16:38.096 | 2018-01-10 16:16:38.972 | NA | COLUMNAR_V3 |
0 | Success | 2018-01-10 16:16:31.979 | 2018-01-10 16:16:33.293 | NA | COLUMNAR_V3 |
--------------------------------------------------------------------------------------------------
4 rows selected (0.029 seconds)
5) alter table query for compaction:
alter table customer1 compact 'minor';
Expected Result: Table should be compacted successfully.
Actual Result:
Error: org.apache.spark.sql.AnalysisException: Compaction failed. Please check logs for more info. Exception in compaction Compaction Failure in Merger Rdd.; (state=,code=0)
thriftserver logs:
18/01/10 16:17:12 ERROR CompactionResultSortProcessor: [Executor task launch worker-36][partitionID:customer1;queryID:15798380253871] Compaction failed: java.lang.Long cannot be cast to java.lang.Integer
java.lang.ClassCastException: java.lang.Long cannot be cast to java.lang.Integer
at org.apache.carbondata.processing.sort.sortdata.SortDataRows.writeDataToFile(SortDataRows.java:273)
at org.apache.carbondata.processing.sort.sortdata.SortDataRows.startSorting(SortDataRows.java:214)
at org.apache.carbondata.processing.merger.CompactionResultSortProcessor.processResult(CompactionResultSortProcessor.java:226)
at org.apache.carbondata.processing.merger.CompactionResultSortProcessor.execute(CompactionResultSortProcessor.java:159)
at org.apache.carbondata.spark.rdd.CarbonMergerRDD$$anon$1.<init>(CarbonMergerRDD.scala:234)
at org.apache.carbondata.spark.rdd.CarbonMergerRDD.internalCompute(CarbonMergerRDD.scala:81)
at org.apache.carbondata.spark.rdd.CarbonRDD.compute(CarbonRDD.scala:60)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:99)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
18/01/10 16:17:12 INFO UnsafeMemoryManager: [Executor task launch worker-36][partitionID:customer1;queryID:15798380253871] Total memory used after task 15798371335347 is 5313 Current tasks running now are : [6856382704941, 14621295743743, 14461639534151, 4378916027096, 15798216567589]
18/01/10 16:17:12 INFO UnsafeMemoryManager: [Executor task launch worker-36][partitionID:customer1;queryID:15798380253871] Total memory used after task 15798371335347 is 5313 Current tasks running now are : [6856382704941, 14621295743743, 14461639534151, 4378916027096, 15798216567589]
18/01/10 16:17:12 INFO UnsafeMemoryManager: [Executor task launch worker-36][partitionID:customer1;queryID:15798380253871] Total memory used after task 15798371335347 is 5313 Current tasks running now are : [6856382704941, 14621295743743, 14461639534151, 4378916027096, 15798216567589]
18/01/10 16:17:12 INFO UnsafeMemoryManager: [Executor task launch worker-36][partitionID:customer1;queryID:15798380253871] Total memory used after task 15798371335347 is 5313 Current tasks running now are : [6856382704941, 14621295743743, 14461639534151, 4378916027096, 15798216567589]
18/01/10 16:17:12 INFO CarbonLoaderUtil: LocalFolderDeletionPool:customer1 Deleted the local store location: /tmp/15798371407704_0 : Time taken: 2
18/01/10 16:17:12 INFO Executor: Finished task 0.0 in stage 75.0 (TID 490). 1037 bytes result sent to driver
18/01/10 16:17:12 INFO TaskSetManager: Finished task 0.0 in stage 75.0 (TID 490) in 39 ms on localhost (executor driver) (1/1)
18/01/10 16:17:12 INFO TaskSchedulerImpl: Removed TaskSet 75.0, whose tasks have all completed, from pool
18/01/10 16:17:12 INFO DAGScheduler: ResultStage 75 (collect at CarbonTableCompactor.scala:211) finished in 0.039 s
18/01/10 16:17:12 INFO DAGScheduler: Job 76 finished: collect at CarbonTableCompactor.scala:211, took 0.063051 s
18/01/10 16:17:12 AUDIT CarbonTableCompactor: [knoldus][hduser][Thread-125]Compaction request failed for table newcarbon.customer1
18/01/10 16:17:12 ERROR CarbonTableCompactor: pool-23-thread-7 Compaction request failed for table newcarbon.customer1
18/01/10 16:17:12 ERROR CarbonTableCompactor: pool-23-thread-7 Exception in compaction thread Compaction Failure in Merger Rdd.
java.lang.Exception: Compaction Failure in Merger Rdd.
at org.apache.carbondata.spark.rdd.CarbonTableCompactor.triggerCompaction(CarbonTableCompactor.scala:269)
at org.apache.carbondata.spark.rdd.CarbonTableCompactor.scanSegmentsAndSubmitJob(CarbonTableCompactor.scala:120)
at org.apache.carbondata.spark.rdd.CarbonTableCompactor.executeCompaction(CarbonTableCompactor.scala:71)
at org.apache.carbondata.spark.rdd.CarbonDataRDDFactory$$anon$2.run(CarbonDataRDDFactory.scala:182)
at org.apache.carbondata.spark.rdd.CarbonDataRDDFactory$.startCompactionThreads(CarbonDataRDDFactory.scala:269)
at org.apache.spark.sql.execution.command.management.CarbonAlterTableCompactionCommand.alterTableForCompaction(CarbonAlterTableCompactionCommand.scala:255)
at org.apache.spark.sql.execution.command.management.CarbonAlterTableCompactionCommand.processData(CarbonAlterTableCompactionCommand.scala:111)
at org.apache.spark.sql.execution.command.DataCommand.run(package.scala:71)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:135)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:132)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:113)
at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:87)
at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:87)
at org.apache.spark.sql.Dataset.<init>(Dataset.scala:185)
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:64)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:592)
at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:699)
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:220)
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:163)
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:160)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:173)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
18/01/10 16:17:12 ERROR CarbonDataRDDFactory$: pool-23-thread-7 Exception in compaction thread Compaction Failure in Merger Rdd.
18/01/10 16:17:12 INFO HdfsFileLock: pool-23-thread-7 Deleted the lock file hdfs://localhost:54310/opt/prestocarbonStore/newcarbon/customer1/compaction.lock
18/01/10 16:17:12 ERROR CarbonAlterTableCompactionCommand: pool-23-thread-7 Exception in start compaction thread. Exception in compaction Compaction Failure in Merger Rdd.
18/01/10 16:17:12 ERROR HdfsFileLock: pool-23-thread-7 Not able to delete the lock file because it is not existed in location hdfs://localhost:54310/opt/prestocarbonStore/newcarbon/customer1/compaction.lock
18/01/10 16:17:12 ERROR SparkExecuteStatementOperation: Error executing query, currentState RUNNING,
org.apache.spark.sql.AnalysisException: Compaction failed. Please check logs for more info. Exception in compaction Compaction Failure in Merger Rdd.;
at org.apache.spark.sql.util.CarbonException$.analysisException(CarbonException.scala:23)
at org.apache.spark.sql.execution.command.management.CarbonAlterTableCompactionCommand.processData(CarbonAlterTableCompactionCommand.scala:120)
at org.apache.spark.sql.execution.command.DataCommand.run(package.scala:71)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:135)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:132)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:113)
at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:87)
at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:87)
at org.apache.spark.sql.Dataset.<init>(Dataset.scala:185)
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:64)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:592)
at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:699)
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:220)
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:163)
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:160)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:173)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
18/01/10 16:17:12 ERROR SparkExecuteStatementOperation: Error running hive query:
org.apache.hive.service.cli.HiveSQLException: org.apache.spark.sql.AnalysisException: Compaction failed. Please check logs for more info. Exception in compaction Compaction Failure in Merger Rdd.;
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:258)
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:163)
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:160)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:173)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)