Spark / SPARK-23060

RDD's apply function

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Minor
    • Resolution: Won't Fix
    • Affects Version/s: 2.2.1
    • Fix Version/s: None
    • Component/s: PySpark

    Description

      Proposed new function for RDDs: apply, which passes the RDD to a user-supplied function and returns that function's result.

      >>> def foo(rdd):
      ...     return rdd.map(lambda x: x.split('|')).filter(lambda x: x[0] == 'ERROR')
      ...
      >>> rdd = sc.parallelize(['ERROR|10', 'ERROR|12', 'WARNING|10', 'INFO|2'])
      >>> result = rdd.apply(foo)
      >>> result.collect()
      [['ERROR', '10'], ['ERROR', '12']]
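Since the proposal was resolved as Won't Fix, the same behaviour can be approximated without changing the RDD class. The following is a minimal sketch, not part of PySpark: `rdd_apply`, `keep_errors`, and `main` are hypothetical names introduced here for illustration, and `main` assumes a local PySpark installation.

```python
def rdd_apply(rdd, func, *args, **kwargs):
    """Hypothetical helper: call func with the RDD as its first argument."""
    return func(rdd, *args, **kwargs)


def keep_errors(rdd):
    """Split each 'LEVEL|value' record and keep only the ERROR rows."""
    return (rdd.map(lambda line: line.split('|'))
               .filter(lambda parts: parts[0] == 'ERROR'))


def main():
    # Imported here so the helpers above stay usable without PySpark installed.
    from pyspark import SparkContext

    sc = SparkContext.getOrCreate()
    rdd = sc.parallelize(['ERROR|10', 'ERROR|12', 'WARNING|10', 'INFO|2'])
    # Equivalent to calling keep_errors(rdd) directly.
    print(rdd_apply(rdd, keep_errors).collect())
    sc.stop()
```

Calling `main()` runs the example from the description. Because `rdd_apply(rdd, func)` is just `func(rdd)`, the helper adds only a chaining-friendly spelling and no new capability.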

          People

            Assignee: Unassigned
            Reporter: Gianmarco Donetti (gianmarco.donetti)
            Votes: 0
            Watchers: 3

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

                Estimated: 1h
                Remaining: 1h
                Logged: Not Specified