  1. Spark
  2. SPARK-11658

simplify documentation for PySpark combineByKey

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.5.1
    • Fix Version/s: 1.6.0
    • Component/s: Documentation, PySpark
    • Labels: None

    Description

      The current documentation for combineByKey looks like this:

              >>> x = sc.parallelize([("a", 1), ("b", 1), ("a", 1)])
              >>> def f(x): return x
              >>> def add(a, b): return a + str(b)
              >>> sorted(x.combineByKey(str, add, add).collect())
              [('a', '11'), ('b', '1')]
              """
      

      I think it could be simplified to:

              >>> x = sc.parallelize([("a", 1), ("b", 1), ("a", 1)])
              >>> def add(a, b): return a + str(b)
              >>> x.combineByKey(str, add, add).collect()
              [('a', '11'), ('b', '1')]
              """
      

      I'll shortly add a patch for this.
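
      For readers unfamiliar with the three arguments in the doctest above, here is a minimal pure-Python sketch of the semantics. It is an illustration only, not PySpark's implementation: the helper name combine_by_key_local and the way the data is split into two partitions are assumptions made for this example.

```python
# Hypothetical helper (not part of PySpark): mimics combineByKey on plain lists.
def combine_by_key_local(partitions, create_combiner, merge_value, merge_combiners):
    # Step 1: combine values within each partition.
    per_partition = []
    for part in partitions:
        combiners = {}
        for key, value in part:
            if key in combiners:
                # Key already seen in this partition: fold the value in.
                combiners[key] = merge_value(combiners[key], value)
            else:
                # First value for this key in this partition.
                combiners[key] = create_combiner(value)
        per_partition.append(combiners)

    # Step 2: merge the per-partition combiners for each key.
    merged = {}
    for combiners in per_partition:
        for key, combiner in combiners.items():
            if key in merged:
                merged[key] = merge_combiners(merged[key], combiner)
            else:
                merged[key] = combiner
    return merged


def add(a, b):
    return a + str(b)


# Assumed partitioning of the doctest data: two partitions.
data = [[("a", 1), ("b", 1)], [("a", 1)]]
result = sorted(combine_by_key_local(data, str, add, add).items())
print(result)  # [('a', '11'), ('b', '1')]
```

      Here str plays the role of createCombiner, and add serves as both mergeValue and mergeCombiners. The sketch sorts the result before printing because dictionary order, like the order returned by collect() on a shuffled RDD, should not be relied on in a doctest.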

          People

            Assignee: chris snow (snowch)
            Reporter: chris snow (snowch)
            Votes: 0
            Watchers: 2
