Details
Description
test.pig
-- For Pig 0.9
-- A = LOAD 'input/PigStorageSchema/Temp{1,2}/pss*' USING org.apache.pig.piggybank.storage.PigStorageSchema();
-- For Pig 0.10
A = LOAD 'input/PigStorageSchema/Temp{1,2}/pss*' USING PigStorage('\t', '-schema');
DESCRIBE A;
DUMP A;
Schema file input/PigStorageSchema/Temp{1,2}/.pig_schema
{"fields":[{"name":"name","type":55,"schema":null,"description":"autogenerated from Pig Field Schema"},{"name":"val","type":10,"schema":null,"description":"autogenerated from Pig Field Schema"}],"version":0,"sortKeys":[],"sortKeyOrders":[]}
Header file input/PigStorageSchema/Temp{1,2}/.pig_header
name val
Sample input file input/PigStorageSchema/Temp1/pss.in
peter 1
samir 3
michael 4
peter 2
peter 4
samir 1
Running the above Pig script test.pig with Pig 0.10 produces the following error:
012-01-24 04:07:42,210 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1131: Could not find schema file for hdfs://nameNode:8020/user/mitesh/input/PigStorageSchema/Temp{1,2}/pss*
Pig Stack Trace
---------------
ERROR 1131: Could not find schema file for hdfs://nameNode:8020/user/mitesh/input/PigStorageSchema/Temp{1,2}/pss*

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias A
    at org.apache.pig.PigServer.openIterator(PigServer.java:858)
    at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:655)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:303)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:188)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:164)
    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
    at org.apache.pig.Main.run(Main.java:567)
    at org.apache.pig.Main.main(Main.java:111)
Caused by: org.apache.pig.PigException: ERROR 1002: Unable to store alias A
    at org.apache.pig.PigServer.storeEx(PigServer.java:957)
    at org.apache.pig.PigServer.store(PigServer.java:920)
    at org.apache.pig.PigServer.openIterator(PigServer.java:833)
    ... 7 more
Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2245: <file test.pig, line 2, column 4> Cannot get schema from loadFunc org.apache.pig.builtin.PigStorage
    at org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:154)
    at org.apache.pig.newplan.logical.relational.LOLoad.getSchema(LOLoad.java:109)
    at org.apache.pig.newplan.logical.relational.LOStore.getSchema(LOStore.java:68)
    at org.apache.pig.newplan.logical.visitor.SchemaAliasVisitor.validate(SchemaAliasVisitor.java:60)
    at org.apache.pig.newplan.logical.visitor.SchemaAliasVisitor.visit(SchemaAliasVisitor.java:84)
    at org.apache.pig.newplan.logical.relational.LOStore.accept(LOStore.java:77)
    at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
    at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
    at org.apache.pig.PigServer$Graph.compile(PigServer.java:1618)
    at org.apache.pig.PigServer$Graph.compile(PigServer.java:1612)
    at org.apache.pig.PigServer$Graph.access$200(PigServer.java:1335)
    at org.apache.pig.PigServer.storeEx(PigServer.java:952)
    ... 9 more
Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1131: Could not find schema file for hdfs://nameNode:8020/user/mitesh/input/PigStorageSchema/Temp{1,2}/pss*
    at org.apache.pig.builtin.JsonMetadata.nullOrException(JsonMetadata.java:222)
    at org.apache.pig.builtin.JsonMetadata.getSchema(JsonMetadata.java:191)
    at org.apache.pig.builtin.PigStorage.getSchema(PigStorage.java:438)
    at org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:150)
However, both PigStorageSchema() and PigStorage('\t', '-schema') work with the * wildcard.
For example, the following script works:
test2.pig
A = LOAD 'input/PigStorageSchema/Temp*/pss*' USING PigStorage('\t', '-schema');
DESCRIBE A;
DUMP A;
As a workaround for the Temp{1,2} glob, the input paths can be given as a comma-separated list with the brace glob written out explicitly:
test2.pig
A = LOAD 'input/PigStorageSchema/Temp1/pss*,input/PigStorageSchema/Temp2/pss*' USING PigStorage('\t', '-schema');
DESCRIBE A;
DUMP A;
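
When many paths are involved, the comma-separated form used in the workaround above could be generated rather than typed by hand. The following is a minimal Python sketch, assuming only flat (non-nested) brace groups; expand_braces is a hypothetical helper for illustration and is not part of Pig or Hadoop.

```python
import re
from itertools import product

# Hypothetical helper (not part of Pig): matches one flat {a,b,...} group.
_BRACE = re.compile(r"\{([^{}]*)\}")


def expand_braces(path):
    """Expand every flat {a,b,...} group in path into a comma-separated list.

    Other glob characters such as * are left untouched so that
    Pig/Hadoop can still resolve them at load time.
    """
    # With one capture group, re.split alternates literal text (even
    # indices) and brace-group contents (odd indices).
    parts = _BRACE.split(path)
    choices = [
        part.split(",") if i % 2 else [part]
        for i, part in enumerate(parts)
    ]
    # Take every combination of alternatives and rebuild each full path.
    return ",".join("".join(combo) for combo in product(*choices))


if __name__ == "__main__":
    print(expand_braces("input/PigStorageSchema/Temp{1,2}/pss*"))
```

The resulting string can be pasted into the LOAD statement in place of the brace-glob path. Nested brace groups are deliberately not handled in this sketch.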