Spark / SPARK-5026

PySpark rdd.randomSplit() is not documented

    Details

    • Type: Documentation
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.2.0
    • Fix Version/s: 1.2.1
    • Component/s: Documentation, PySpark
    • Labels:
      None

      Description

      In the latest version of Spark (1.2.0), the RDD section of the Python API documentation has no entry for rdd.randomSplit(): http://spark.apache.org/docs/latest/api/python/pyspark.html#pyspark.RDD

      Nevertheless, the method is used in an example in the 1.2.0 documentation for MLlib: http://spark.apache.org/docs/latest/mllib-ensembles.html#regression

      (It's in the Python code tab, you can Ctrl+F and search for "randomSplit").

      However, looking at the source, the method does appear to be implemented: https://github.com/apache/spark/blob/branch-1.2/python/pyspark/rdd.py#L322
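
      For readers who land here looking for what the method does: as far as I can tell, rdd.randomSplit(weights, seed) returns one RDD per weight, normalizes the weights if they do not sum to 1, and places each element in exactly one of the resulting splits. The sketch below imitates that behavior on a plain Python list so it runs without a Spark installation. The helper name random_split and its internals are my own illustration, not Spark's code, and the corresponding PySpark call appears only in a comment.

```python
import random
from bisect import bisect_right


def random_split(items, weights, seed=None):
    """Hypothetical pure-Python stand-in for the behavior of rdd.randomSplit().

    Returns len(weights) lists; every item lands in exactly one of them.
    """
    if not weights or any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("weights must be non-negative with a positive sum")

    # Normalize the weights and build the cumulative upper bound of each
    # split's range inside [0, 1).
    total = float(sum(weights))
    bounds = []
    acc = 0.0
    for w in weights:
        acc += w / total
        bounds.append(acc)

    rng = random.Random(seed)
    splits = [[] for _ in weights]
    for item in items:
        # Draw a number in [0, 1) and find the range it falls into. The last
        # bound can land slightly below 1.0 through rounding, so clamp.
        idx = min(bisect_right(bounds, rng.random()), len(weights) - 1)
        splits[idx].append(item)
    return splits


# Roughly the local analogue of the PySpark call (not executed here):
#     train_rdd, test_rdd = rdd.randomSplit([0.7, 0.3], seed=17)
train, test = random_split(range(1000), [0.7, 0.3], seed=17)
```

      The split sizes are only approximately proportional to the weights, because each element is assigned independently at random. Passing the same seed reproduces the same split in this sketch.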

            People

            • Assignee: Unassigned
            • Reporter: Sebastián Ramírez (tiangolo)
            • Votes: 0
            • Watchers: 2
