Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version: 1.1.0
None
None
Environment:
Linux version 3.13.0-37-generic (buildd@kapok) (gcc version 4.8.2 (Ubuntu 4.8.2-19ubuntu1)), 64 bit
Hadoop 1.2.1
Description
I found a problem with Giraph 1.1.0 while trying to run the SimpleShortestPathsComputation example.
This is the command I ran:
$HADOOP_HOME/bin/hadoop jar ~/git/giraph_patched/giraph-examples/target/giraph-examples-1.1.0-for-hadoop-1.2.1-jar-with-dependencies.jar \
    org.apache.giraph.GiraphRunner org.apache.giraph.examples.SimpleShortestPathsComputation \
    -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat \
    -vip /users/hadoop/input/tiny_graph.txt \
    -vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat \
    -op /users/hadoop/output/shortestpath \
    -w 1
And this is the output:
#################################
Warning: $HADOOP_HOME is deprecated.
14/12/15 12:07:36 INFO utils.ConfigurationUtils: No edge input format specified. Ensure your InputFormat does not require one.
14/12/15 12:07:36 INFO utils.ConfigurationUtils: No edge output format specified. Ensure your OutputFormat does not require one.
14/12/15 12:07:36 INFO job.GiraphJob: run: Since checkpointing is disabled (default), do not allow any task retries (setting mapred.map.max.attempts = 0, old value = 4)
14/12/15 12:07:38 INFO job.GiraphJob: Tracking URL: http://VirtualMINT-H023:50030/jobdetails.jsp?jobid=job_201412151205_0001
14/12/15 12:07:38 INFO job.GiraphJob: Waiting for resources... Job will start only when it gets all 2 mappers
14/12/15 12:08:51 INFO job.HaltApplicationUtils$DefaultHaltInstructionsWriter: writeHaltInstructions: To halt after next superstep execute: 'bin/halt-application --zkServer virtualmint-h023:22181 --zkNode /_hadoopBsp/job_201412151205_0001/_haltComputation'
14/12/15 12:08:51 INFO mapred.JobClient: Running job: job_201412151205_0001
14/12/15 12:08:52 INFO mapred.JobClient: map 100% reduce 0%
################################
The computation hangs here until the timeout is reached. This is what I found in the log of the first worker:
2014-12-15 12:12:16,303 INFO org.apache.giraph.master.BspServiceMaster: createVertexInputSplits: Starting to write input split data to zookeeper with 1 threads
2014-12-15 12:12:16,314 INFO org.apache.giraph.master.BspServiceMaster: createVertexInputSplits: Done writing input split data to zookeeper
2014-12-15 12:12:16,332 INFO org.apache.giraph.comm.netty.NettyClient: Using Netty without authentication.
2014-12-15 12:12:16,341 INFO org.apache.giraph.comm.netty.NettyClient: connectAllAddresses: Successfully added 1 connections, (1 total connected) 0 failed, 0 failures total.
2014-12-15 12:12:16,344 INFO org.apache.giraph.partition.PartitionUtils: computePartitionCount: Creating 1, default would have been 1 partitions.
2014-12-15 12:12:16,373 INFO org.apache.giraph.master.BspServiceMaster: barrierOnWorkerList: 0 out of 1 workers finished on superstep -1 on path /_hadoopBsp/job_201412151211_0001/_vertexInputSplitDoneDir
2014-12-15 12:12:16,375 INFO org.apache.giraph.master.BspServiceMaster: barrierOnWorkerList: Waiting on [virtualmint-h023_1]
2014-12-15 12:12:16,393 INFO org.apache.giraph.comm.netty.NettyServer: start: Using Netty without authentication.
2014-12-15 12:12:16,464 ERROR org.apache.giraph.master.BspServiceMaster: barrierOnWorkerList: Missing chosen workers [Worker(hostname=virtualmint-h023, MRtaskID=1, port=30001)] on superstep -1
2014-12-15 12:12:16,464 ERROR org.apache.giraph.master.MasterThread: masterThread: Master algorithm failed with IllegalStateException
java.lang.IllegalStateException: coordinateVertexInputSplits: Worker failed during input split (currently not supported)
    at org.apache.giraph.master.BspServiceMaster.coordinateInputSplits(BspServiceMaster.java:1489)
    at org.apache.giraph.master.BspServiceMaster.coordinateSuperstep(BspServiceMaster.java:1656)
    at org.apache.giraph.master.MasterThread.run(MasterThread.java:124)
2014-12-15 12:12:16,464 FATAL org.apache.giraph.graph.GraphTaskManager: uncaughtException: OverrideExceptionHandler on thread org.apache.giraph.master.MasterThread, msg = java.lang.IllegalStateException: coordinateVertexInputSplits: Worker failed during input split (currently not supported), exiting...
java.lang.IllegalStateException: java.lang.IllegalStateException: coordinateVertexInputSplits: Worker failed during input split (currently not supported)
    at org.apache.giraph.master.MasterThread.run(MasterThread.java:194)
Caused by: java.lang.IllegalStateException: coordinateVertexInputSplits: Worker failed during input split (currently not supported)
    at org.apache.giraph.master.BspServiceMaster.coordinateInputSplits(BspServiceMaster.java:1489)
    at org.apache.giraph.master.BspServiceMaster.coordinateSuperstep(BspServiceMaster.java:1656)
    at org.apache.giraph.master.MasterThread.run(MasterThread.java:124)
2014-12-15 12:12:16,464 WARN org.apache.giraph.zk.ZooKeeperManager: logZooKeeperOutput: Dumping up to last 100 lines of the ZooKeeper process STDOUT and STDERR.
################################
The computation does not even reach the first superstep: during superstep -1 (input split loading) the master reports the chosen worker as missing, so Giraph apparently cannot find the worker. The GIRAPH-904 patch has been applied to BspServiceMaster.
I am running Hadoop 1.2.1 on a single machine with the configuration suggested in the Giraph Quick Start guide. Hadoop itself works fine (tested with the wordcount example).
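In case it helps anyone looking at a longer log, the missing-worker details can be pulled out of the "Missing chosen workers" line with a small script. This is only a sketch: the function name missing_workers and the regular expressions are my own assumptions based on the line format shown in the log above. They are not part of Giraph and may need adjusting for other versions.

```python
import re

# Patterns assumed from the "Missing chosen workers" line in the log above.
WORKER_RE = re.compile(
    r"Worker\(hostname=(?P<host>[^,]+), MRtaskID=(?P<task>\d+), port=(?P<port>\d+)\)"
)
SUPERSTEP_RE = re.compile(r"on superstep (-?\d+)")

# The ERROR line copied from the worker log in this report.
SAMPLE_LINE = (
    "2014-12-15 12:12:16,464 ERROR org.apache.giraph.master.BspServiceMaster: "
    "barrierOnWorkerList: Missing chosen workers "
    "[Worker(hostname=virtualmint-h023, MRtaskID=1, port=30001)] on superstep -1"
)


def missing_workers(log_lines):
    """Return (host, task_id, port, superstep) for each worker reported missing."""
    found = []
    for line in log_lines:
        if "Missing chosen workers" not in line:
            continue
        step = SUPERSTEP_RE.search(line)
        superstep = int(step.group(1)) if step else None
        for match in WORKER_RE.finditer(line):
            found.append(
                (
                    match.group("host"),
                    int(match.group("task")),
                    int(match.group("port")),
                    superstep,
                )
            )
    return found


if __name__ == "__main__":
    for host, task, port, superstep in missing_workers([SAMPLE_LINE]):
        print(f"missing worker {host}:{port} (task {task}) at superstep {superstep}")
```

To run it against a real task log, pass the open file object instead of the sample list, since the function only iterates over lines.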