  1. Spark
  2. SPARK-5790

Add tests for: VertexRDDs won't zip properly for `diff` capability


    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.4.0
    • Component/s: GraphX, Tests
    • Labels:
      None

      Description

      For VertexRDDs with differing numbers of partitions, operations such as `diff` cannot be run: the call throws an IllegalArgumentException. The code below provides an example:

      import org.apache.spark.graphx._
      import org.apache.spark.rdd._
      // setA and setB are each built from a single parallelized range,
      // so they end up with the same number of partitions.
      val setA: VertexRDD[Int] = VertexRDD(sc.parallelize(0L until 3L).map(id => (id, id.toInt+1)))
      setA.collect.foreach(println(_))
      val setB: VertexRDD[Int] = VertexRDD(sc.parallelize(2L until 4L).map(id => (id, id.toInt+2)))
      setB.collect.foreach(println(_))
      // diff works here because the partition counts match.
      val diff = setA.diff(setB)
      diff.collect.foreach(println(_))
      // setC is the union (++) of two parallelized ranges, so it has more partitions than setA.
      val setC: VertexRDD[Int] = VertexRDD(sc.parallelize(2L until 4L).map(id => (id, id.toInt+2)) ++ sc.parallelize(6L until 8L).map(id => (id, id.toInt+2)))
      setA.diff(setC).collect
      // java.lang.IllegalArgumentException: Can't zip RDDs with unequal numbers of partitions
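      The reproduction above can also be run outside the shell as a small standalone check. The following is a minimal sketch, not the test that was added for this issue: the object name `DiffPartitionCheck`, the helper `buildSets`, the `local[2]` master, and the explicit slice count of 2 are all assumptions made for illustration. It prints both partition counts and then reports whether `diff` succeeds or fails, without assuming either outcome.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx._
import scala.util.{Failure, Success, Try}

// Hypothetical harness; the object and method names are illustrative only.
object DiffPartitionCheck {

  // Builds setA and setC as in the description. The explicit slice count (2)
  // is an assumption that makes the partition counts predictable: setA is
  // built from 2 slices, setC from the union of two 2-slice RDDs.
  def buildSets(sc: SparkContext): (VertexRDD[Int], VertexRDD[Int]) = {
    val setA = VertexRDD(sc.parallelize(0L until 3L, 2).map(id => (id, id.toInt + 1)))
    val setC = VertexRDD(
      sc.parallelize(2L until 4L, 2).map(id => (id, id.toInt + 2)) ++
        sc.parallelize(6L until 8L, 2).map(id => (id, id.toInt + 2)))
    (setA, setC)
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("diff-partition-check")
    val sc = new SparkContext(conf)
    try {
      val (setA, setC) = buildSets(sc)
      println(s"setA partitions: ${setA.partitions.length}")
      println(s"setC partitions: ${setC.partitions.length}")

      // Wrap the call so the check reports the outcome instead of crashing.
      Try(setA.diff(setC).collect()) match {
        case Success(rows) => println(s"diff succeeded with ${rows.length} row(s)")
        case Failure(e)    => println(s"diff failed: ${e.getMessage}")
      }
    } finally {
      sc.stop()
    }
  }
}
```

      On a version affected by this issue, the failure branch would be expected to report the exception quoted in the description. With the fix listed for 1.4.0, the same call should complete.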
      

              People

              • Assignee:
                boyork Brennon York
              • Reporter:
                boyork Brennon York
