Uploaded image for project: 'Mahout'
  1. Mahout
  2. MAHOUT-1597

A + 1.0 (element-wise scala operation) gives wrong result if rdd is missing rows, Spark side

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • 0.9
    • 0.10.0
    • None
    • None

    Description

          // Concoct an rdd with missing rows
          val aRdd: DrmRdd[Int] = sc.parallelize(
            0 -> dvec(1, 2, 3) ::
                3 -> dvec(3, 4, 5) :: Nil
          ).map { case (key, vec) => key -> (vec: Vector)}
      
          val drmA = drmWrap(rdd = aRdd)
      
          val controlB = inCoreA + 1.0
      
          val drmB = drmA + 1.0
      
          (drmB -: controlB).norm should be < 1e-10
      
      

      should not fail.

      it was failing due to elementwise scalar operator only evaluates rows actually present in dataset.

      In case of Int-keyed row matrices, there are implied rows that yet may not be present in RDD.

      Our goal is to detect the condition and evaluate missing rows prior to physical operators that don't work with missing implied rows.

      Attachments

        Activity

          People

            dlyubimov Dmitriy Lyubimov
            dlyubimov Dmitriy Lyubimov
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: