Description
Running GraphX triangle count on a large-ish file fails with the "Invalid initial capacity" error on Spark 2.0 (tested on Spark 2.0, 2.0.1, and 2.0.2). You can see the results at: http://bit.ly/2eQKWDN
The same code on Spark 1.6 completes without any problems: http://bit.ly/2fATO1M
The GraphFrames version of this code also completes without problems (Spark 2.0, GraphFrames 0.2): http://bit.ly/2fAS8W8
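For readers who cannot open the linked notebooks, here is a minimal sketch of what a GraphX triangle count job of this kind might look like. It is not the exact code from the notebooks: the object name `TriangleCountRepro` and the input path are hypothetical, and the input is assumed to be a plain edge list with one "srcId dstId" pair per line. Whether this particular sketch triggers the error will depend on the data.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{GraphLoader, PartitionStrategy}

// Hypothetical reproduction sketch; the real code is in the linked notebooks.
object TriangleCountRepro {
  def main(args: Array[String]): Unit = {
    // Assumed input: an edge list file, one "srcId dstId" pair per line.
    // The default path is a placeholder, not the file from this report.
    val edgeFile = args.headOption.getOrElse("/tmp/edges.txt")

    // The master URL is expected to come from spark-submit.
    val conf = new SparkConf().setAppName("TriangleCountRepro")
    val sc = new SparkContext(conf)
    try {
      // Load the edges in canonical orientation (srcId < dstId) and
      // repartition them before counting triangles.
      val graph = GraphLoader
        .edgeListFile(sc, edgeFile, canonicalOrientation = true)
        .partitionBy(PartitionStrategy.RandomVertexCut)

      // Per-vertex triangle counts; this is the step the report says fails on 2.0.x.
      val triangles = graph.triangleCount().vertices
      triangles.take(10).foreach { case (id, count) => println(s"$id -> $count") }
    } finally {
      sc.stop()
    }
  }
}
```

Submitting the same jar with `spark-submit` against a 1.6 and a 2.0.x installation, with the same input file, would be one way to compare the two behaviours described above.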
Reference Stack Overflow question:
Spark GraphX: requirement failed: Invalid initial capacity (http://stackoverflow.com/questions/40337366/spark-graphx-requirement-failed-invalid-initial-capacity)