SPARK-14883

Fix wrong R examples and make them up-to-date

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.0.0
    • Component/s: Documentation, Examples
    • Labels: None

    Description

      This issue fixes errors in the R examples in the docs and example modules and brings them up to date.

      • Remove the incorrect usage of `map`. `SparkR` should use `lapply` where needed, but `lapply` is currently private, so the correct usage will be added later.
        -teenNames <- map(teenagers, function(p) { paste("Name:", p$name)})
        ...
        
      • Fix the wrong example in Section `Generic Load/Save Functions` of `docs/sql-programming-guide.md` for consistency.
        -df <- loadDF(sqlContext, "people.parquet")
        -saveDF(select(df, "name", "age"), "namesAndAges.parquet")
        +df <- read.df(sqlContext, "examples/src/main/resources/users.parquet")
        +write.df(select(df, "name", "favorite_color"), "namesAndFavColors.parquet")
        
      • Fix datatypes in `sparkr.md`.
        -#  |-- age: integer (nullable = true)
        +#  |-- age: long (nullable = true)
        
        -## DataFrame[eruptions:double, waiting:double]
        +## SparkDataFrame[eruptions:double, waiting:double]
        
      • Update the example output so that it matches the current results.
         head(summarize(groupBy(df, df$waiting), count = n(df$waiting)))
         ##  waiting count
        -##1      81    13
        -##2      60     6
        -##3      68     1
        +##1      70     4
        +##2      67     1
        +##3      69     2
        
      • Replace deprecated functions: jsonFile -> read.json, parquetFile -> read.parquet
        df <- jsonFile(sqlContext, "examples/src/main/resources/people.json")
        Warning message:
        'jsonFile' is deprecated.
        Use 'read.json' instead.
        See help("Deprecated") 
        
      • Use up-to-date R-like functions: loadDF -> read.df, saveDF -> write.df, saveAsParquetFile -> write.parquet
      • Replace `SparkR DataFrame` with `SparkDataFrame` in `dataframe.R` and `data-manipulation.R`.
      • Fix other minor syntax errors and typos.
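
      The renames listed above can be sketched together as follows. This is only an illustrative sketch, not the patch itself: it assumes a SparkR build from the period of this issue (readers still take an explicit `sqlContext`), that it is run from the Spark home directory so the `examples/src/main/resources/` paths resolve, and a local master. The application name and the `mode = "overwrite"` argument are additions for the sketch and do not appear in the issue.

```r
library(SparkR)

# Assumption: context setup in the style used when this issue was filed.
sc <- sparkR.init(master = "local[1]", appName = "SPARK-14883-sketch")
sqlContext <- sparkRSQL.init(sc)

# jsonFile -> read.json
people <- read.json(sqlContext, "examples/src/main/resources/people.json")
printSchema(people)

# loadDF -> read.df (no source given, so the default data source is used)
users <- read.df(sqlContext, "examples/src/main/resources/users.parquet")

# saveDF -> write.df; mode = "overwrite" is added here so the sketch can be rerun
write.df(select(users, "name", "favorite_color"),
         "namesAndFavColors.parquet",
         mode = "overwrite")

# parquetFile -> read.parquet, reading back what was just written
saved <- read.parquet(sqlContext, "namesAndFavColors.parquet")
head(saved)

sparkR.stop()
```

      The `map` example is deliberately left out, since the issue states that its replacement will be added later.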

          People

            Assignee: Dongjoon Hyun (dongjoon)
            Reporter: Dongjoon Hyun (dongjoon)