Details
-
Bug
-
Status: Open
-
Major
-
Resolution: Unresolved
-
0.16.0
-
None
-
None
-
None
-
Mac OSX 10.11.6 El Capitan
Description
Hello, I'm trying to run a Pig Job that have to do a FULL Join over two large datasets. I'm running in local mode and I'm testing it on a very limited part of the datasets.
I started with a LIMIT of 5000 and it worked, after that I tried with a LIMIT of 10000 and the job failed. It arised this error:
2017-01-27 10:56:14,197 [Thread-1516] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local1966159905_0005 java.lang.Exception: java.io.IOException: java.io.IOException: java.util.ConcurrentModificationException at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462) at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529) Caused by: java.io.IOException: java.io.IOException: java.util.ConcurrentModificationException at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.runPipeline(PigGenericMapReduce.java:470) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.processOnePackageOutput(PigGenericMapReduce.java:433) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:413) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:262) at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171) at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627) at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389) at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: java.io.IOException: java.util.ConcurrentModificationException at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.StoreFuncDecorator.putNext(StoreFuncDecorator.java:83) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:144) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:97) at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:558) at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89) at org.apache.hadoop.mapreduce.lib.reduce.WrappedReducer$Context.write(WrappedReducer.java:105) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.runPipeline(PigGenericMapReduce.java:468) ... 12 more Caused by: java.util.ConcurrentModificationException at java.util.Hashtable$Enumerator.next(Hashtable.java:1367) at org.apache.pig.data.BinInterSedes.writeMap(BinInterSedes.java:616) at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:478) at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:462) at org.apache.pig.data.utils.SedesHelper.writeGenericTuple(SedesHelper.java:135) at org.apache.pig.data.BinInterSedes.writeTuple(BinInterSedes.java:650) at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:470) at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:462) at org.apache.pig.impl.io.InterRecordWriter.write(InterRecordWriter.java:73) at org.apache.pig.impl.io.InterStorage.putNext(InterStorage.java:88) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.StoreFuncDecorator.putNext(StoreFuncDecorator.java:75) ... 18 more