Spark / SPARK-18200

GraphX Invalid initial capacity when running triangleCount

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.0, 2.0.1, 2.0.2
    • Fix Version/s: 2.0.3, 2.1.0
    • Component/s: GraphX
    • Labels:
    • Environment: Databricks, Ubuntu 16.04, macOS Sierra

      Description

      Running GraphX triangleCount on a large-ish file fails with the "Invalid initial capacity" error on Spark 2.0 (tested on Spark 2.0.0, 2.0.1, and 2.0.2). You can see the results at: http://bit.ly/2eQKWDN

      The same code completes without any problems on Spark 1.6: http://bit.ly/2fATO1M

      The GraphFrames version of this code also runs without problems (Spark 2.0, GraphFrames 0.2): http://bit.ly/2fAS8W8

      Related Stack Overflow question:
      Spark GraphX: requirement failed: Invalid initial capacity (http://stackoverflow.com/questions/40337366/spark-graphx-requirement-failed-invalid-initial-capacity)
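
      For readers who cannot open the linked notebooks, the following is a minimal sketch of the kind of GraphX call the description refers to. The toy edge list, the app name, and the local master setting are hypothetical and are not taken from the reporter's notebook. This small graph only illustrates the shape of the API call; it is not claimed to reproduce the failure.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, PartitionStrategy}

// Hypothetical local setup; the report itself ran on Databricks.
val conf = new SparkConf().setMaster("local[*]").setAppName("triangle-count-sketch")
val sc = new SparkContext(conf)

// Hypothetical toy graph: one triangle (1, 2, 3) plus an extra edge to vertex 4.
// Edges are written with srcId < dstId, which older GraphX releases expect.
val edges = sc.parallelize(Seq(
  Edge(1L, 2L, 1),
  Edge(2L, 3L, 1),
  Edge(1L, 3L, 1),
  Edge(3L, 4L, 1)
))

// Build the graph and partition it before counting, as older GraphX releases require.
val graph = Graph
  .fromEdges(edges, defaultValue = 0)
  .partitionBy(PartitionStrategy.RandomVertexCut)

// triangleCount() returns a graph whose vertex attribute is the number of
// triangles that pass through each vertex.
val triangles: Map[Long, Int] = graph.triangleCount().vertices.collect().toMap

sc.stop()
```

      In this toy graph, vertices 1, 2, and 3 each belong to one triangle and vertex 4 belongs to none. The report describes the same call failing on a much larger input.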

            People

            • Assignee: Dongjoon Hyun (dongjoon)
            • Reporter: Denny Lee (dennyglee)
