Details
-
Bug
-
Status: Open
-
Major
-
Resolution: Unresolved
-
0.13.1
-
None
-
None
-
None
-
$pig --version Apache Pig version 0.13.1-mapr-1410 (rexported) compiled Nov 05 2014, 10:16:28
Description
Pig fails mysteriously when I specify the root of a large directory tree as the LOAD input in my script. The exception that it throws offers no insight into what's happening. The same script works perfectly when there are fewer files.
It's a very simple script as you can see below:
SET pig.noSplitCombination true; raw_record = LOAD '/data/directory/tree/root' USING PigStorage(','); filtered = FILTER raw_record by $1 == 251068; filtered_data = FOREACH filtered GENERATE (chararray)$0, (chararray)$1, (chararray)$2; STORE filtered_data INTO '/data/output/directory/' USING PigStorage();
Here's the error message I see :
ERROR 2244: Job scope-594 failed, hadoop does not return any error message org.apache.pig.backend.executionengine.ExecException: ERROR 2244: Job scope-594 failed, hadoop does not return any error message at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:178) at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:232) at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:203) at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81) at org.apache.pig.Main.run(Main.java:608) at org.apache.pig.Main.main(Main.java:156) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
How many files can PIG process at once?