[PIG-2792] Wonderdog stopped working in Pig 0.10.0 (worked in 0.9.2) - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Blocker
Resolution: Fixed
Affects Version/s: 0.10.0, 0.11, 0.10.1
Fix Version/s: 0.10.1
Component/s: piggybank
Environment:

Pig with Wonderdog https://github.com/infochimps-labs/wonderdog for elasticsearch integration. Elasticsearch 0.18.6. Pig local mode.


Description

The Pig UDFs in Wonderdog for Elasticsearch integration, which worked in Pig 0.9.2, stopped working in 0.10.0.

In 0.10.0 the job fails with a NullPointerException because Wonderdog is unable to read its configuration from the Hadoop cache: the log shows both es.config and es.plugins.dir resolving to null.

If someone can help identify the issue, or advise how Wonderdog or Pig can be modified so that Wonderdog works with Pig 0.10, it would be greatly appreciated.

This issue is also reported in the Wonderdog project: https://github.com/infochimps-labs/wonderdog/issues/5, https://github.com/infochimps-labs/wonderdog/issues/6 and https://github.com/infochimps-labs/wonderdog/issues/7

The error is below:

2012-07-06 16:50:51,501 [main] INFO org.apache.pig.Main - Apache Pig version 0.10.0-SNAPSHOT (rexported) compiled Jun 22 2012, 15:56:16
2012-07-06 16:50:51,502 [main] INFO org.apache.pig.Main - Logging error messages to: /private/tmp/pig_1341618651472.log
2012-07-06 16:50:51,829 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///

{"ok":true}

% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
100 11 100 11 0 0 647 0 --:--:-- --:--:-- --:--:-- 733
2012-07-06 16:50:53,206 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: UNKNOWN
2012-07-06 16:50:53,379 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2012-07-06 16:50:53,403 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2012-07-06 16:50:53,403 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2012-07-06 16:50:53,441 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
2012-07-06 16:50:53,449 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2012-07-06 16:50:53,494 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2012-07-06 16:50:53,560 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2012-07-06 16:50:53,587 [Thread-7] WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2012-07-06 16:50:53,597 [Thread-7] WARN org.apache.hadoop.mapred.JobClient - No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
****file:/tmp/emails.json
2012-07-06 16:50:53,711 [Thread-7] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 2
2012-07-06 16:50:53,711 [Thread-7] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 2
2012-07-06 16:50:53,734 [Thread-7] WARN org.apache.hadoop.io.compress.snappy.LoadSnappy - Snappy native library not loaded
2012-07-06 16:50:53,737 [Thread-7] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 3
2012-07-06 16:50:54,008 [Thread-8] INFO org.apache.hadoop.mapred.Task - Using ResourceCalculatorPlugin : null
2012-07-06 16:50:54,023 [Thread-8] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader - Current split being processed file:/tmp/emails.json/part-m-00000:0+33554432
2012-07-06 16:50:54,029 [Thread-8] INFO com.infochimps.elasticsearch.ElasticSearchOutputFormat - Using field:[message_id] for document ids
2012-07-06 16:50:54,029 [Thread-8] INFO com.infochimps.elasticsearch.ElasticSearchOutputFormat - Using [null] as es.config
2012-07-06 16:50:54,029 [Thread-8] INFO com.infochimps.elasticsearch.ElasticSearchOutputFormat - Using [null] as es.plugins.dir
2012-07-06 16:50:54,033 [Thread-8] WARN org.apache.hadoop.mapred.FileOutputCommitter - Output path is null in cleanup
2012-07-06 16:50:54,034 [Thread-8] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local_0001
java.lang.RuntimeException: java.lang.NullPointerException
at com.infochimps.elasticsearch.ElasticSearchOutputFormat$ElasticSearchRecordWriter.<init>(ElasticSearchOutputFormat.java:133)
at com.infochimps.elasticsearch.ElasticSearchOutputFormat.getRecordWriter(ElasticSearchOutputFormat.java:262)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.getRecordWriter(PigOutputFormat.java:84)
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.<init>(MapTask.java:628)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:753)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
Caused by: java.lang.NullPointerException
at java.util.Hashtable.put(Hashtable.java:394)
at java.util.Properties.setProperty(Properties.java:143)
at java.lang.System.setProperty(System.java:746)
at com.infochimps.elasticsearch.ElasticSearchOutputFormat$ElasticSearchRecordWriter.<init>(ElasticSearchOutputFormat.java:130)
... 6 more
2012-07-06 16:50:54,506 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_local_0001
2012-07-06 16:50:54,506 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2012-07-06 16:50:59,022 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_local_0001 has failed! Stop running all dependent jobs
2012-07-06 16:50:59,023 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2012-07-06 16:50:59,024 [main] ERROR org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
2012-07-06 16:50:59,024 [main] INFO org.apache.pig.tools.pigstats.SimplePigStats - Detected Local mode. Stats reported below may be incomplete
2012-07-06 16:50:59,025 [main] INFO org.apache.pig.tools.pigstats.SimplePigStats - Script Statistics:

HadoopVersion PigVersion UserId StartedAt FinishedAt Features
1.0.2 0.10.0-SNAPSHOT rjurney 2012-07-06 16:50:53 2012-07-06 16:50:59 UNKNOWN

Failed!

Failed Jobs:
JobId Alias Feature Message Outputs
job_local_0001 json_emails MAP_ONLY Message: Job failed! Error - NA es://email/email?id=message_id&json=true&size=1000,

Input(s):
Failed to read data from "/tmp/emails.json"

Output(s):
Failed to produce result in "es://email/email?id=message_id&json=true&size=1000"

Job DAG:
job_local_0001

2012-07-06 16:50:59,025 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
2012-07-06 16:50:59,029 [main] ERROR org.apache.pig.tools.grunt.GruntParser - ERROR 2244: Job failed, hadoop does not return any error message
2012-07-06 16:50:59,029 [main] ERROR org.apache.pig.tools.grunt.GruntParser - org.apache.pig.backend.executionengine.ExecException: ERROR 2244: Job failed, hadoop does not return any error message
at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:140)
at org.apache.pig.tools.grunt.GruntParser.processShCommand(GruntParser.java:1025)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:167)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:189)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
at org.apache.pig.Main.run(Main.java:555)
at org.apache.pig.Main.main(Main.java:111)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

Details also at logfile: /private/tmp/pig_1341618651472.log
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
100 193 100 193 0 0 2475 0 --:--:-- --:--:-- --:--:-- 2539
{
"took" : 75,
"timed_out" : false,
"_shards" : { "total" : 5, "successful" : 5, "failed" : 0 },
"hits" : { "total" : 0, "max_score" : null, "hits" : [ ] }
}
2012-07-06 16:50:59,140 [main] ERROR org.apache.pig.tools.grunt.GruntParser - ERROR 2244: Job failed, hadoop does not return any error message
2012-07-06 16:50:59,140 [main] ERROR org.apache.pig.tools.grunt.GruntParser - org.apache.pig.backend.executionengine.ExecException: ERROR 2244: Job failed, hadoop does not return any error message
at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:140)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:193)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
at org.apache.pig.Main.run(Main.java:555)
at org.apache.pig.Main.main(Main.java:111)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
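
The inner stack trace above ends in java.lang.System.setProperty, and the preceding log lines report "Using [null] as es.config". That combination suggests the record writer passed a null value to System.setProperty, which throws a NullPointerException for a null key or value. The following is a minimal sketch of that failure mode and of one possible guard. It is not Wonderdog's actual code: the class name NullConfigDemo, the helper setIfPresent, and the property name es.config.demo are hypothetical and exist only for illustration.

```java
// Sketch only: reproduces the NullPointerException seen in the trace
// and shows a null check that would avoid it. All names are hypothetical.
class NullConfigDemo {

    // Set a system property only when a value was actually found.
    // Returns true if the property was set, false if the value was missing.
    static boolean setIfPresent(String key, String value) {
        if (value == null) {
            return false; // caller can log or fail with a clearer message
        }
        System.setProperty(key, value);
        return true;
    }

    public static void main(String[] args) {
        // Stands in for a configuration path that could not be found.
        String esConfig = null;

        try {
            // System.setProperty rejects null values.
            System.setProperty("es.config.demo", esConfig);
        } catch (NullPointerException e) {
            System.out.println("setProperty threw NullPointerException");
        }

        // The guarded variant reports the missing value instead of throwing.
        if (!setIfPresent("es.config.demo", esConfig)) {
            System.out.println("es.config.demo is missing; nothing was set");
        }
    }
}
```

A guard like this would not restore the missing configuration, but it would replace the bare NullPointerException with a message that names the setting that could not be read.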

People

Assignee: Unassigned

Reporter: Russell Jurney

Votes: 0

Watchers: 1

Dates

Created: 07/Jul/12 00:06

Updated: 06/Jan/13 23:57

Resolved: 07/Jul/12 01:02