Spark / SPARK-9315

SparkR DataFrame improvements to be more R-friendly


Details

    • Type: Umbrella
    • Status: Resolved
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: SparkR

    Description

      This umbrella tracks a set of functions that make SparkR DataFrames friendlier to R users. The goal is to add functions that make SparkR DataFrames behave more like local R data frames. A similar effort for dplyr compatibility was tracked in https://issues.apache.org/jira/browse/SPARK-7231

      Sub-Tasks

        #. Summary | Type | Status | Assignee
        1. Add support for filtering using `[` (synonym for filter / select) | Sub-task | Resolved | Felix Cheung
        2. Add `merge` as synonym for join | Sub-task | Resolved | Hossein Falaki
        3. Add support for setting column names, types | Sub-task | Resolved | Felix Cheung
        4. Add `summary` as a synonym for `describe` | Sub-task | Resolved | Hossein Falaki
        5. Add nrow, ncol, dim for SparkR data frames | Sub-task | Resolved | Hossein Falaki
        6. Add rbind as a synonym for `unionAll` | Sub-task | Resolved | Hossein Falaki
        7. Add `unique` as a synonym for `distinct` | Sub-task | Resolved | Hossein Falaki
        8. Support `collect` on DataFrame columns | Sub-task | Resolved | Unassigned
        9. Add transform and subset to DataFrame | Sub-task | Resolved | Felix Cheung
        10. DataFrame show method - show(df) should show first N number of rows, similar to R | Sub-task | Closed | Unassigned
        11. SparkR: Add sort function to dataframe | Sub-task | Resolved | Narine Kokhlikyan
        12. SparkR: Add correlation function to dataframe | Sub-task | Resolved | Unassigned
        13. Method coltypes() to return the R column types of a DataFrame | Sub-task | Resolved | Oscar D. Lara Yejas
        14. Add as.DataFrame as a synonym for createDataFrame | Sub-task | Resolved | Narine Kokhlikyan
        15. Simplify SQLContext method signatures and use a singleton | Sub-task | Resolved | Felix Cheung
        16. Add attach() function for DataFrame | Sub-task | Resolved | Weiqiang Zhuang
        17. SparkR: Add merge to DataFrame | Sub-task | Resolved | Narine Kokhlikyan
        18. SparkR str() method on DataFrame objects | Sub-task | Resolved | Oscar D. Lara Yejas
        19. Improve R context management story and add getOrCreate | Sub-task | Resolved | Hossein Falaki
        20. SparkR: Documentation change for merge function | Sub-task | Resolved | Unassigned
        21. Add 'with' API | Sub-task | Resolved | Weiqiang Zhuang
        22. Add parameter drop to subsetting operator [ | Sub-task | Resolved | Oscar D. Lara Yejas
        23. Add support for head on DataFrame Column | Sub-task | Resolved | Unassigned
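
For orientation, the sketch below shows the local base R data frame idioms that several of these sub-tasks aim to mirror in SparkR. It runs on plain local `data.frame` objects only and does not start a Spark session. The `people` and `cities` tables and all variable names are made up for illustration. How closely the SparkR versions match this behavior is not stated in this issue and should be checked against the SparkR documentation.

```r
# Local base R data frames only; no SparkR session is created here.
# `people` and `cities` are hypothetical example tables.
people <- data.frame(
  id = c(1, 2, 3, 3),
  name = c("Ann", "Bob", "Cy", "Cy"),
  age = c(31, 45, 27, 27),
  stringsAsFactors = FALSE
)
cities <- data.frame(
  id = c(1, 2, 3),
  city = c("Oslo", "Lima", "Pune"),
  stringsAsFactors = FALSE
)

# Shape helpers (sub-task 5): nrow, ncol, dim.
dims <- c(nrow(people), ncol(people))  # same values as dim(people)

# `[` for row filtering and column selection (sub-tasks 1 and 22).
over30 <- people[people$age > 30, c("name", "age")]

# subset and transform (sub-task 9).
young <- subset(people, age < 30, select = c(id, name))
aged <- transform(people, age_next = age + 1)

# unique drops duplicate rows; the list calls it a synonym for `distinct` (sub-task 7).
deduped <- unique(people)

# rbind stacks rows; the list calls it a synonym for `unionAll` (sub-task 6).
stacked <- rbind(people, people)

# merge joins two tables on a shared key (sub-tasks 2 and 17).
joined <- merge(deduped, cities, by = "id")

# with evaluates an expression using columns as variables (sub-task 21).
mean_age <- with(deduped, mean(age))

# summary and str print quick overviews (sub-tasks 4 and 18).
print(summary(people$age))
str(people)
```

The point of the umbrella, as described above, is that an R user could write calls like these against a SparkR DataFrame instead of learning separate names such as `filter`, `join`, `describe`, `unionAll`, and `distinct`.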

          People

            Assignee: Unassigned
            Reporter: Shivaram Venkataraman (shivaram)
            Votes: 0
            Watchers: 6
