Spark / SPARK-16961

Utils.randomizeInPlace does not shuffle arrays uniformly

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.0.0
    • Fix Version/s: 2.0.1, 2.1.0
    • Component/s: Spark Core
    • Labels: None

    Description

      The Utils.randomizeInPlace method, which is meant to uniformly shuffle the elements of an input array, never leaves an element in its starting position. That is, any permutation in which some element remains where it began is never output, so the shuffle is not uniform over all permutations.
      The cause is that rand.nextInt(i) returns a value in the range 0 until i (exclusive of i), so the swap index j can never equal i. Line 827 of Utils.scala should be
      val j = rand.nextInt(i + 1)
      instead of
      val j = rand.nextInt(i)
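
The difference between the two bounds can be sketched as follows. This is a minimal illustration that assumes the surrounding loop is a standard backwards Fisher-Yates swap loop; the loop body is not quoted from Utils.scala, and the names ShuffleSketch and buggyRandomizeInPlace are hypothetical.

```scala
import scala.util.Random

object ShuffleSketch {
  // Assumed loop shape (not copied from Spark): walk i from the last index
  // down to 1 and swap arr(i) with arr(j), where j is uniform in [0, i].
  def randomizeInPlace[T](arr: Array[T], rand: Random = new Random): Array[T] = {
    for (i <- (arr.length - 1) to 1 by -1) {
      val j = rand.nextInt(i + 1) // bound is exclusive, so i + 1 allows j == i
      val tmp = arr(j)
      arr(j) = arr(i)
      arr(i) = tmp
    }
    arr
  }

  // Hypothetical reproduction of the reported behaviour: j is uniform in
  // [0, i), so arr(i) is always swapped with a different slot.
  def buggyRandomizeInPlace[T](arr: Array[T], rand: Random = new Random): Array[T] = {
    for (i <- (arr.length - 1) to 1 by -1) {
      val j = rand.nextInt(i) // never equals i
      val tmp = arr(j)
      arr(j) = arr(i)
      arr(i) = tmp
    }
    arr
  }
}
```

Under these assumptions, running buggyRandomizeInPlace on Array(0, 1, 2, 3) many times should never return an array with any value at its original index, while randomizeInPlace can return any ordering, including the unchanged input.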

          People

            nick.lavers@videoamp.com Nicholas