Details
Type: Question
Status: Resolved
Priority: Major
Resolution: Invalid
Affects Version: 2.0.2
Fix Version: None
Description
Hello, I don't know if I'm writing in the right place, but if anyone can help me that would be great.
I have to run PageRank on a really big graph (Wikipedia's graph: 400 million edges, 12 million vertices), but I am hitting an execution-time problem: after 10+ iterations of the algorithm, the time per iteration rises abnormally from about 10 minutes to dozens of hours: https://d.pr/svBR.
My code is really simple and is taken directly from the GraphX documentation.
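For reference, the PageRank example in the GraphX programming guide looks roughly like the sketch below. It is only an illustration of that kind of code and may differ from what was actually run: the file path is a hypothetical placeholder, and the tolerance value is the one used in the guide.

```scala
import org.apache.spark.graphx.GraphLoader

// Run inside spark-shell, where `sc` (the SparkContext) is already defined.
// "data/wiki-edges.txt" is a hypothetical path; the file is assumed to hold
// one "srcId dstId" pair per line.
val graph = GraphLoader.edgeListFile(sc, "data/wiki-edges.txt")

// Run PageRank until convergence with tolerance 0.0001, as in the guide.
val ranks = graph.pageRank(0.0001).vertices

// Show the ten highest-ranked vertices as (vertexId, rank) pairs.
ranks.sortBy(_._2, ascending = false).take(10).foreach(println)
```

`graph.staticPageRank(numIter)` is the fixed-iteration variant, which is the easier one to use when timing individual iterations.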
The machine has two Intel Xeon E5-2697 v3 CPUs, 64 GB of RAM, and a 500 GB hard disk, and it runs Windows Server 2012 R2 Standard.
I allocated 8 cores and 50 GB of RAM to Spark when invoking spark-shell from the command line.
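A quick way to confirm that the shell really received those resources is to inspect the running context. The sketch below assumes the shell was started with something like `spark-shell --master local[8] --driver-memory 50g`; the exact command used is not shown in this report, so those flags are an assumption.

```scala
// Run inside spark-shell, where `sc` (the SparkContext) is already defined.

// Master URL; "local[8]" would mean 8 worker threads on this machine.
println(s"master             = ${sc.master}")

// Default number of partitions Spark uses for parallel operations.
println(s"defaultParallelism = ${sc.defaultParallelism}")

// Driver memory as recorded in the Spark configuration, if it was set.
println(s"driver memory      = ${sc.getConf.getOption("spark.driver.memory").getOrElse("not set")}")

// Maximum heap actually available to the JVM, in gigabytes.
val maxHeapGb = Runtime.getRuntime.maxMemory / (1024.0 * 1024 * 1024)
println(f"JVM max heap       = $maxHeapGb%.1f GB")
```

If the reported heap is far below 50 GB, the memory setting did not take effect.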
What could the problem be?
Thanks for any help!