Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.6.0
    • Fix Version/s: 2.0.0
    • Component/s: SparkR
    • Labels:
      None

      Description

      dapply() applies an R function to each partition of a DataFrame and returns a new DataFrame.

      The function signature is:

      	dapply(df, function(localDF) {}, schema = NULL)
      

      R function input: a local data.frame holding the rows of one partition, on the node that processes it
      R function output: a local data.frame

      The schema argument specifies the row format of the resulting DataFrame. It must match the R function's output.
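
      The per-partition contract can be illustrated without a Spark cluster. The sketch below is a local simulation in base R, not the SparkR implementation: local_dapply(), the list-of-data.frames stand-in for partitions, and the name-to-class vector used as a "schema" are all hypothetical and exist only for illustration.

```r
# Local, Spark-free simulation of the contract described above.
# local_dapply() is a hypothetical helper, not part of SparkR.
local_dapply <- function(partitions, func, schema) {
  results <- lapply(partitions, function(local_df) {
    out <- func(local_df)
    # The R function must return a data.frame ...
    stopifnot(is.data.frame(out))
    # ... whose column names and types match the declared schema.
    stopifnot(identical(names(out), names(schema)))
    col_classes <- vapply(out, function(col) class(col)[1], character(1))
    stopifnot(identical(unname(col_classes), unname(schema)))
    out
  })
  combined <- do.call(rbind, results)
  rownames(combined) <- NULL
  combined
}

# Two "partitions" of one logical data set (assumed sample data).
partitions <- list(
  data.frame(a = 1:2, b = c(1.5, 2.5)),
  data.frame(a = 3:4, b = c(3.5, 4.5))
)

# Declared output format: column name -> R class (an assumed stand-in
# for a real schema object).
schema <- c(a = "integer", b = "numeric", total = "numeric")

result <- local_dapply(
  partitions,
  function(local_df) {
    local_df$total <- local_df$a + local_df$b
    local_df
  },
  schema
)
print(result)
```

      If the function returned a column that the schema does not declare, the stopifnot() checks in this sketch would fail, which mirrors the requirement that the schema match the function's output.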

      If schema is not specified, each partition of the resulting DataFrame is serialized in R into a single byte array. Such a DataFrame can be processed by subsequent calls to dapply() or by collect(), but not by ordinary DataFrame operations.
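
      The schema-less behaviour can be pictured with base R's serialize() and unserialize(). This is again a local simulation under assumptions, not SparkR internals: opaque_apply() is a hypothetical helper, and a plain list stands in for the partitions.

```r
# Local simulation only: opaque_apply() is a hypothetical helper, not SparkR API.
opaque_apply <- function(partitions, func) {
  lapply(partitions, function(part) {
    # A partition is either a data.frame or bytes produced by an earlier call.
    local_df <- if (is.raw(part)) unserialize(part) else part
    # serialize(..., connection = NULL) returns a raw vector (a byte array).
    serialize(func(local_df), connection = NULL)
  })
}

partitions <- list(data.frame(x = 1:3), data.frame(x = 4:6))

# Chained calls work because each step deserializes the previous result.
step1 <- opaque_apply(partitions, function(d) { d$y <- d$x * 2L; d })
step2 <- opaque_apply(step1, function(d) d[d$y > 4L, , drop = FALSE])

# Each partition is now an opaque raw vector, so column-based operations
# have nothing to work with.
stopifnot(all(vapply(step2, is.raw, logical(1))))

# "collect": deserialize every partition and bind the rows together.
collected <- do.call(rbind, lapply(step2, unserialize))
rownames(collected) <- NULL
print(collected)
```

      Because the bytes are only meaningful to R, anything that needs column names or types has to deserialize first, which is why only further R-side processing or collection applies to such a result.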

            People

            • Assignee:
              sunrui Sun Rui
              Reporter:
              sunrui Sun Rui
            • Votes:
              0
              Watchers:
              7
