Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Not A Bug
Affects Version: 1.3.2
Environment: CentOS Linux release 7.3.1611
Important
Description
Hello, I am currently running and measuring the Flink Gelly examples (the Connected Components and PageRank algorithms) with different SNAP datasets. When running with the Twitter dataset, for example (https://snap.stanford.edu/data/egonets-Twitter.html), which has 81,306 vertices, everything executes and finishes OK and I get the reported job runtime. On the other hand, executions with datasets that have more vertices, e.g. https://snap.stanford.edu/data/com-Youtube.html with 1,134,890 vertices, hang with no result and no reported runtime, even though I get "Job execution switched to status FINISHED."
I thought this could be a memory issue, so I increased the RAM assigned to my taskmanagers (and jobmanager) up to 125GB, but still no luck.
The exact command I am running is:
./bin/flink run examples/gelly/flink-gelly-examples_*.jar --algorithm PageRank --directed false --input_filename hdfs://sith0:9000/user/xx.txt --input CSV --type integer --input_field_delimiter $' ' --output print
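For anyone reproducing this, it may help to confirm the vertex count of a local copy of the input before submitting the job. The following is only a sketch: the function name count_vertices and the file name edges.txt are hypothetical, and it assumes a whitespace-delimited edge list (one "source target" pair per line) in which lines starting with # are comments.

```shell
# Count the distinct vertex IDs in a whitespace-delimited edge list.
# Lines beginning with '#' are treated as comments and skipped.
count_vertices() {
  awk '
    !/^#/ && NF >= 2 {
      # Record each endpoint the first time it is seen.
      if (!($1 in seen)) { seen[$1] = 1; n++ }
      if (!($2 in seen)) { seen[$2] = 1; n++ }
    }
    END { print n + 0 }
  ' "$1"
}

# Hypothetical usage on a local copy of the input file:
# count_vertices edges.txt
```

The count can then be compared with the figure published for the dataset to check that the file was parsed with the expected delimiter.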