Details
- Type: Bug
- Status: Resolved
- Priority: Minor
- Resolution: Fixed
- Fix Version/s: 1.0.0
- Component/s: None
- Labels: None
- Environment: None
Description
I discovered this bug while running PageRankBenchmark in localTestMode, built with -Phadoop_1.0, using the following giraph-site.xml:
<configuration>
  <property>
    <name>giraph.SplitMasterWorker</name>
    <value>false</value>
  </property>
  <property>
    <name>giraph.localTestMode</name>
    <value>true</value>
  </property>
  <property>
    <name>giraph.zkJar</name>
    <value>/home/eugene/giraph/target/giraph-0.2-SNAPSHOT-jar-with-dependencies.jar</value>
  </property>
</configuration>
With this configuration, I ran PageRankBenchmark as follows:
java -cp (all the jars..) org.apache.giraph.benchmark.PageRankBenchmark -c 0 -e 3 -s 5 -v -w 1 -V 10
This worked the first time:
12/06/18 15:33:51 INFO mapred.JobClient: Job complete: job_local_0001
12/06/18 15:33:51 INFO mapred.JobClient: Counters: 31
12/06/18 15:33:51 INFO mapred.JobClient:   Giraph Timers
12/06/18 15:33:51 INFO mapred.JobClient:     Total (milliseconds)=5361
12/06/18 15:33:51 INFO mapred.JobClient:     Superstep 3 (milliseconds)=305
12/06/18 15:33:51 INFO mapred.JobClient:     Vertex input superstep (milliseconds)=207
12/06/18 15:33:51 INFO mapred.JobClient:     Superstep 4 (milliseconds)=317
12/06/18 15:33:51 INFO mapred.JobClient:     Superstep 10 (milliseconds)=297
12/06/18 15:33:51 INFO mapred.JobClient:     Setup (milliseconds)=459
12/06/18 15:33:51 INFO mapred.JobClient:     Shutdown (milliseconds)=875
12/06/18 15:33:51 INFO mapred.JobClient:     Superstep 7 (milliseconds)=305
12/06/18 15:33:51 INFO mapred.JobClient:     Superstep 0 (milliseconds)=553
12/06/18 15:33:51 INFO mapred.JobClient:     Superstep 8 (milliseconds)=304
12/06/18 15:33:51 INFO mapred.JobClient:     Superstep 9 (milliseconds)=306
12/06/18 15:33:51 INFO mapred.JobClient:     Superstep 6 (milliseconds)=339
12/06/18 15:33:51 INFO mapred.JobClient:     Superstep 5 (milliseconds)=268
12/06/18 15:33:51 INFO mapred.JobClient:     Superstep 2 (milliseconds)=313
12/06/18 15:33:51 INFO mapred.JobClient:     Superstep 1 (milliseconds)=503
12/06/18 15:33:51 INFO mapred.JobClient:   File Output Format Counters
12/06/18 15:33:51 INFO mapred.JobClient:     Bytes Written=0
12/06/18 15:33:51 INFO mapred.JobClient:   Giraph Stats
12/06/18 15:33:51 INFO mapred.JobClient:     Aggregate edges=100
12/06/18 15:33:51 INFO mapred.JobClient:     Superstep=11
12/06/18 15:33:51 INFO mapred.JobClient:     Current workers=1
12/06/18 15:33:51 INFO mapred.JobClient:     Last checkpointed superstep=0
12/06/18 15:33:51 INFO mapred.JobClient:     Current master task partition=0
12/06/18 15:33:51 INFO mapred.JobClient:     Sent messages=0
12/06/18 15:33:51 INFO mapred.JobClient:     Aggregate finished vertices=10
12/06/18 15:33:51 INFO mapred.JobClient:     Aggregate vertices=10
12/06/18 15:33:51 INFO mapred.JobClient:   File Input Format Counters
12/06/18 15:33:51 INFO mapred.JobClient:     Bytes Read=0
12/06/18 15:33:51 INFO mapred.JobClient:   FileSystemCounters
12/06/18 15:33:51 INFO mapred.JobClient:     FILE_BYTES_READ=88
12/06/18 15:33:51 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=32525
12/06/18 15:33:51 INFO mapred.JobClient:   Map-Reduce Framework
12/06/18 15:33:51 INFO mapred.JobClient:     Map input records=1
12/06/18 15:33:51 INFO mapred.JobClient:     Spilled Records=0
12/06/18 15:33:51 INFO mapred.JobClient:     SPLIT_RAW_BYTES=44
12/06/18 15:33:51 INFO mapred.JobClient:     Map output records=0
but trying to run it again yields the following:
12/06/18 15:35:01 INFO mapred.Task: Task 'attempt_local_0001_m_000000_0' done.
12/06/18 15:35:01 WARN mapred.FileOutputCommitter: Output path is null in cleanup
12/06/18 15:35:02 INFO mapred.JobClient: map 100% reduce 0%
12/06/18 15:35:02 INFO mapred.JobClient: Job complete: job_local_0001
12/06/18 15:35:02 INFO mapred.JobClient: Counters: 8
12/06/18 15:35:02 INFO mapred.JobClient:   File Output Format Counters
12/06/18 15:35:02 INFO mapred.JobClient:     Bytes Written=0
12/06/18 15:35:02 INFO mapred.JobClient:   File Input Format Counters
12/06/18 15:35:02 INFO mapred.JobClient:     Bytes Read=0
12/06/18 15:35:02 INFO mapred.JobClient:   FileSystemCounters
12/06/18 15:35:02 INFO mapred.JobClient:     FILE_BYTES_READ=88
12/06/18 15:35:02 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=32493
12/06/18 15:35:02 INFO mapred.JobClient:   Map-Reduce Framework
12/06/18 15:35:02 INFO mapred.JobClient:     Map input records=1
12/06/18 15:35:02 INFO mapred.JobClient:     Spilled Records=0
12/06/18 15:35:02 INFO mapred.JobClient:     SPLIT_RAW_BYTES=44
12/06/18 15:35:02 INFO mapred.JobClient:     Map output records=0
Disconnected from the target VM, address: '127.0.0.1:33268', transport: 'socket'
This is wrong: the Giraph mapper was never called (note the missing Superstep timers and "Giraph Stats" section in the output above).
A workaround is to run "rm -rf ~/giraph/_bsp/_defaultZkManagerDir" before re-running PageRankBenchmark; the benchmark then runs correctly.
The problem in the code is that the ZooKeeperManager's directory is not being removed as it should be, because zkDirDefault in ZooKeeperManager.java is set incorrectly. It is currently:
System.getProperty("user.dir") + "/_bspZooKeeper";
but it should be:
System.getProperty("user.dir") + GiraphJob.ZOOKEEPER_MANAGER_DIR_DEFAULT;
Attachments
Issue Links
- is related to:
  - GIRAPH-198 running Giraph trunk on Hadoop 2.0.0-alpha leads to an exception (Open)
  - GIRAPH-442 FileNotFoundException: <baseDirectory>/_zkServer is checked before it is created (Resolved)
  - GIRAPH-312 Giraph needs an admin script (Resolved)