PIG-2792: Wonderdog stopped working in Pig 0.10.0 (worked in 0.9.2)

Details

    Description

      The Pig UDFs in Wonderdog for ElasticSearch integration, which worked in Pig 0.9.2, stopped working in 0.10.0.

      In 0.10.0 the job fails with an error because Wonderdog is unable to read its configuration from the Hadoop cache.

      If someone can help identify the issue, or advise how Wonderdog or Pig can be modified so that Wonderdog works with Pig 0.10, it would be greatly appreciated.

      This issue is also reported in the Wonderdog project: https://github.com/infochimps-labs/wonderdog/issues/6, https://github.com/infochimps-labs/wonderdog/issues/5 and https://github.com/infochimps-labs/wonderdog/issues/7

      The error is below:

      2012-07-06 16:50:51,501 [main] INFO org.apache.pig.Main - Apache Pig version 0.10.0-SNAPSHOT (rexported) compiled Jun 22 2012, 15:56:16
      2012-07-06 16:50:51,502 [main] INFO org.apache.pig.Main - Logging error messages to: /private/tmp/pig_1341618651472.log
      2012-07-06 16:50:51,829 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///

      {"ok":true}

      % Total % Received % Xferd Average Speed Time Time Time Current
      Dload Upload Total Spent Left Speed
      0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
      100 11 100 11 0 0 647 0 --:--:-- --:--:-- --:--:-- 733
      2012-07-06 16:50:53,206 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: UNKNOWN
      2012-07-06 16:50:53,379 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
      2012-07-06 16:50:53,403 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
      2012-07-06 16:50:53,403 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
      2012-07-06 16:50:53,441 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
      2012-07-06 16:50:53,449 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
      2012-07-06 16:50:53,494 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
      2012-07-06 16:50:53,560 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
      2012-07-06 16:50:53,587 [Thread-7] WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
      2012-07-06 16:50:53,597 [Thread-7] WARN org.apache.hadoop.mapred.JobClient - No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
      ****file:/tmp/emails.json
      2012-07-06 16:50:53,711 [Thread-7] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 2
      2012-07-06 16:50:53,711 [Thread-7] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 2
      2012-07-06 16:50:53,734 [Thread-7] WARN org.apache.hadoop.io.compress.snappy.LoadSnappy - Snappy native library not loaded
      2012-07-06 16:50:53,737 [Thread-7] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 3
      2012-07-06 16:50:54,008 [Thread-8] INFO org.apache.hadoop.mapred.Task - Using ResourceCalculatorPlugin : null
      2012-07-06 16:50:54,023 [Thread-8] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader - Current split being processed file:/tmp/emails.json/part-m-00000:0+33554432
      2012-07-06 16:50:54,029 [Thread-8] INFO com.infochimps.elasticsearch.ElasticSearchOutputFormat - Using field:[message_id] for document ids
      2012-07-06 16:50:54,029 [Thread-8] INFO com.infochimps.elasticsearch.ElasticSearchOutputFormat - Using [null] as es.config
      2012-07-06 16:50:54,029 [Thread-8] INFO com.infochimps.elasticsearch.ElasticSearchOutputFormat - Using [null] as es.plugins.dir
      2012-07-06 16:50:54,033 [Thread-8] WARN org.apache.hadoop.mapred.FileOutputCommitter - Output path is null in cleanup
      2012-07-06 16:50:54,034 [Thread-8] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local_0001
      java.lang.RuntimeException: java.lang.NullPointerException
      at com.infochimps.elasticsearch.ElasticSearchOutputFormat$ElasticSearchRecordWriter.<init>(ElasticSearchOutputFormat.java:133)
      at com.infochimps.elasticsearch.ElasticSearchOutputFormat.getRecordWriter(ElasticSearchOutputFormat.java:262)
      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.getRecordWriter(PigOutputFormat.java:84)
      at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.<init>(MapTask.java:628)
      at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:753)
      at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
      at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
      Caused by: java.lang.NullPointerException
      at java.util.Hashtable.put(Hashtable.java:394)
      at java.util.Properties.setProperty(Properties.java:143)
      at java.lang.System.setProperty(System.java:746)
      at com.infochimps.elasticsearch.ElasticSearchOutputFormat$ElasticSearchRecordWriter.<init>(ElasticSearchOutputFormat.java:130)
      ... 6 more
      2012-07-06 16:50:54,506 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_local_0001
      2012-07-06 16:50:54,506 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
      2012-07-06 16:50:59,022 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_local_0001 has failed! Stop running all dependent jobs
      2012-07-06 16:50:59,023 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
      2012-07-06 16:50:59,024 [main] ERROR org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
      2012-07-06 16:50:59,024 [main] INFO org.apache.pig.tools.pigstats.SimplePigStats - Detected Local mode. Stats reported below may be incomplete
      2012-07-06 16:50:59,025 [main] INFO org.apache.pig.tools.pigstats.SimplePigStats - Script Statistics:

      HadoopVersion PigVersion UserId StartedAt FinishedAt Features
      1.0.2 0.10.0-SNAPSHOT rjurney 2012-07-06 16:50:53 2012-07-06 16:50:59 UNKNOWN

      Failed!

      Failed Jobs:
      JobId Alias Feature Message Outputs
      job_local_0001 json_emails MAP_ONLY Message: Job failed! Error - NA es://email/email?id=message_id&json=true&size=1000,

      Input(s):
      Failed to read data from "/tmp/emails.json"

      Output(s):
      Failed to produce result in "es://email/email?id=message_id&json=true&size=1000"

      Job DAG:
      job_local_0001

      2012-07-06 16:50:59,025 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
      2012-07-06 16:50:59,029 [main] ERROR org.apache.pig.tools.grunt.GruntParser - ERROR 2244: Job failed, hadoop does not return any error message
      2012-07-06 16:50:59,029 [main] ERROR org.apache.pig.tools.grunt.GruntParser - org.apache.pig.backend.executionengine.ExecException: ERROR 2244: Job failed, hadoop does not return any error message
      at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:140)
      at org.apache.pig.tools.grunt.GruntParser.processShCommand(GruntParser.java:1025)
      at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:167)
      at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:189)
      at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
      at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
      at org.apache.pig.Main.run(Main.java:555)
      at org.apache.pig.Main.main(Main.java:111)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      at java.lang.reflect.Method.invoke(Method.java:597)
      at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

      Details also at logfile: /private/tmp/pig_1341618651472.log
      % Total % Received % Xferd Average Speed Time Time Time Current
      Dload Upload Total Spent Left Speed
      0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
      100 193 100 193 0 0 2475 0 --:--:-- --:--:-- --:--:-- 2539

      {
        "took" : 75,
        "timed_out" : false,
        "_shards" : { "total" : 5, "successful" : 5, "failed" : 0 },
        "hits" : { "total" : 0, "max_score" : null, "hits" : [ ] }
      }

      2012-07-06 16:50:59,140 [main] ERROR org.apache.pig.tools.grunt.GruntParser - ERROR 2244: Job failed, hadoop does not return any error message
      2012-07-06 16:50:59,140 [main] ERROR org.apache.pig.tools.grunt.GruntParser - org.apache.pig.backend.executionengine.ExecException: ERROR 2244: Job failed, hadoop does not return any error message
      at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:140)
      at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:193)
      at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
      at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
      at org.apache.pig.Main.run(Main.java:555)
      at org.apache.pig.Main.main(Main.java:111)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      at java.lang.reflect.Method.invoke(Method.java:597)
      at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
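
A note on the first trace above: the log reports "Using [null] as es.config", and the NullPointerException is raised inside System.setProperty (via Hashtable.put), which rejects null values. The sketch below only reproduces that JDK behavior in isolation and shows one possible null guard. It is not Wonderdog's code: the class name EsConfigGuard, the helper names, and the path /tmp/elasticsearch.yml are hypothetical and used purely for illustration. Why the value is null under Pig 0.10.0 remains the open question.

```java
public class EsConfigGuard {

    /**
     * Reproduces the failing pattern: System.setProperty refuses a null value.
     * Returns true if the call threw a NullPointerException.
     */
    static boolean setPropertyThrowsOnNull(String key) {
        try {
            System.setProperty(key, null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    /**
     * Hypothetical guard: set the system property only when a value was
     * actually found. Returns true if the property was set.
     */
    static boolean setIfPresent(String key, String value) {
        if (value == null) {
            return false; // caller can log a clear message instead of crashing
        }
        System.setProperty(key, value);
        return true;
    }

    public static void main(String[] args) {
        System.out.println("null value throws NPE: "
                + setPropertyThrowsOnNull("es.config"));
        // Hypothetical path, standing in for the file that should have been
        // located through the Hadoop cache.
        System.out.println("guarded set with value: "
                + setIfPresent("es.config", "/tmp/elasticsearch.yml"));
        System.out.println("guarded set with null: "
                + setIfPresent("es.plugins.dir", null));
    }
}
```

A guard like this would only turn the crash into a clearer failure; the job still needs a real es.config path, so the lookup that returns null is where the actual fix belongs.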

          People

            Assignee: Unassigned
            Reporter: Russell Jurney (russell.jurney)