Spark / SPARK-18200

GraphX Invalid initial capacity when running triangleCount

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.0, 2.0.1, 2.0.2
    • Fix Version/s: 2.0.3, 2.1.0
    • Component/s: GraphX
    • Environment: Databricks, Ubuntu 16.04, macOS Sierra

    Description

      Running GraphX triangle count on a large-ish file fails with the "Invalid initial capacity" error on Spark 2.0 (tested on Spark 2.0, 2.0.1, and 2.0.2). You can see the results at: http://bit.ly/2eQKWDN

      The same code completes without any problems on Spark 1.6: http://bit.ly/2fATO1M

      The GraphFrames version of this code also runs successfully (Spark 2.0, GraphFrames 0.2): http://bit.ly/2fAS8W8

      Reference Stack Overflow question:
      Spark GraphX: requirement failed: Invalid initial capacity (http://stackoverflow.com/questions/40337366/spark-graphx-requirement-failed-invalid-initial-capacity)
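
For readers without access to the linked notebooks, the following is a minimal sketch of the kind of GraphX call described above. It is not the reporter's code: the edge-list path `edges.txt`, the object name `TriangleCountSketch`, and the choice of partition strategy are assumptions made for illustration, and the sketch is not guaranteed to trigger the error on any particular input.

```scala
import org.apache.spark.graphx.{GraphLoader, PartitionStrategy}
import org.apache.spark.sql.SparkSession

// Illustrative only: the object name and input path are hypothetical.
object TriangleCountSketch {
  def main(args: Array[String]): Unit = {
    // Intended to be launched with spark-submit, which supplies the master URL.
    val spark = SparkSession.builder().appName("TriangleCountSketch").getOrCreate()
    val sc = spark.sparkContext

    // Assumed input: a text file with one "srcId dstId" pair per line.
    val edgePath = args.headOption.getOrElse("edges.txt")

    // Load the edge list and repartition the edges before counting.
    val graph = GraphLoader
      .edgeListFile(sc, edgePath, canonicalOrientation = true)
      .partitionBy(PartitionStrategy.RandomVertexCut)

    // triangleCount() returns a graph whose vertex attribute is the
    // number of triangles passing through that vertex.
    val triangles = graph.triangleCount().vertices

    triangles.take(10).foreach { case (id, count) => println(s"$id -> $count") }

    spark.stop()
  }
}
```

According to the report, a job of this shape fails on Spark 2.0.x with "requirement failed: Invalid initial capacity" and completes on Spark 1.6.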

          People

            Assignee: Dongjoon Hyun (dongjoon)
            Reporter: Denny Lee (dennyglee)
            Votes: 0
            Watchers: 5