Uploaded image for project: 'Pig'
  1. Pig
  2. PIG-3079

AvroStorage fails on STORE if "doc" contains hash character

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Open
    • Major
    • Resolution: Unresolved
    • 0.9.2
    • None
    • piggybank
    • None

    Description

      I have the following Avro schema:

      {
       "name": "test_log",
       "namespace": "com.example.avro",
       "type": "record",
       "doc": "Some long comment containung # character.",
       "fields": [
        {"name": "id", "type": "long"},
        {"name": "field1", "type": "int"}
       ]
      }
      

      The file

      test_log.avro

      contains one record (5673565,123) which I can dump:

      DEFINE AvroStorage org.apache.pig.piggybank.storage.avro.AvroStorage('same', 'test_log.avro');
      data = LOAD 'track_log.avro' USING AvroStorage();
      dump data;
      

      But when I try to store the data...

      DEFINE AvroStorage org.apache.pig.piggybank.storage.avro.AvroStorage('same', 'test_log.avro');
      data = LOAD 'track_log.avro' USING AvroStorage();
      STORE data INTO 'out' USING AvroStorage();
      

      I get the following error:

      Pig Stack Trace
      ---------------
      ERROR 6000:
      <line 3, column 0> Output Location Validation Failed for: 'hdfs://nameservice1/user/schwenk/out More info to follow:
      Expect 2 fields in  character.","fields":[{"name":"id","type":"long"},{"name":"field1","type":"int"}]}
      
      org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1002: Unable to store alias data
              at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1596)
              at org.apache.pig.PigServer.registerQuery(PigServer.java:584)
              at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:967)
              at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:386)
              at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:189)
              at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
              at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
              at org.apache.pig.Main.run(Main.java:495)
              at org.apache.pig.Main.main(Main.java:111)
              at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
              at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
              at java.lang.reflect.Method.invoke(Method.java:597)
              at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
      Caused by: org.apache.pig.impl.plan.VisitorException: ERROR 6000:
      <line 3, column 0> Output Location Validation Failed for: 'hdfs://nameservice1/user/schwenk/out More info to follow:
      Expect 2 fields in  character.","fields":[{"name":"id","type":"long"},{"name":"field1","type":"int"}]}
              at org.apache.pig.newplan.logical.rules.InputOutputFileValidator$InputOutputFileVisitor.visit(InputOutputFileValidator.java:95)
              at org.apache.pig.newplan.logical.relational.LOStore.accept(LOStore.java:77)
              at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:64)
              at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
              at org.apache.pig.newplan.DepthFirstWalker.walk(DepthFirstWalker.java:53)
              at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
              at org.apache.pig.newplan.logical.rules.InputOutputFileValidator.validate(InputOutputFileValidator.java:45)
              at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:294)
              at org.apache.pig.PigServer.compilePp(PigServer.java:1360)
              at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1297)
              at org.apache.pig.PigServer.execute(PigServer.java:1289)
              at org.apache.pig.PigServer.access$400(PigServer.java:125)
              at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1591)
              ... 13 more
      Caused by: java.io.IOException: Expect 2 fields in  character.","fields":[{"name":"id","type":"long"},{"name":"field1","type":"int"}]}
              at org.apache.pig.piggybank.storage.avro.AvroStorage.parseSchemaMap(AvroStorage.java:567)
              at org.apache.pig.piggybank.storage.avro.AvroStorage.getOutputFormat(AvroStorage.java:581)
              at org.apache.pig.newplan.logical.rules.InputOutputFileValidator$InputOutputFileVisitor.visit(InputOutputFileValidator.java:80)
              ... 25 more
      

      Attachments

        Activity

          People

            Unassigned Unassigned
            johannesch Johannes Schwenk
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated: