Spark / SPARK-12526

`ifelse`, `when`, `otherwise` unable to take Column as value

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.5.2, 1.6.0
    • Fix Version/s: 1.6.1, 2.0.0
    • Component/s: SparkR
    • Labels:
      None

      Description

      Passing a Column as the value to ifelse, when, or otherwise fails with:

      attempt to replicate an object of type 'environment'
      

      The problem lies in the use of the base R ifelse function, which is a vectorized version of the if ... else ... idiom. It cannot replicate a Column's underlying Java object reference (its jobj), because that reference is an environment.

      Since callJMethod was never designed to be vectorized, the safe option is to replace ifelse with a plain if ... else .... Technically, however, this is inconsistent with base R's ifelse, which is meant to be vectorized.
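The failure mode can be sketched in base R alone, without a Spark session. This is only an illustration under stated assumptions: `fake_jobj` is a hypothetical stand-in for SparkR's Java object reference (an environment carrying an id), not the real SparkR class. The sketch relies on the fact that base R's ifelse recycles its `yes` and `no` arguments with rep(), and rep() refuses environments, whereas a scalar if ... else passes the object through untouched.

```r
# Hypothetical stand-in for a SparkR Java object reference:
# an environment that carries an id. Not the real SparkR class.
fake_jobj <- function(id) {
  obj <- new.env(parent = emptyenv())
  obj$id <- id
  obj
}

value <- fake_jobj("1")

# The vectorized path of ifelse() recycles `yes`/`no` with rep().
# rep() cannot replicate an environment, so capture the error message.
msg <- tryCatch(
  rep(value, length.out = 1),
  error = function(e) conditionMessage(e)
)
print(msg)

# A scalar if ... else does no recycling and returns the object as-is.
picked <- if (is.environment(value)) value else NULL
print(identical(picked, value))
```

The trade-off is the one raised above: the scalar form is safe for a single value, but it gives up the element-wise semantics of base R's ifelse.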

      I can send a PR for review first, and then we can discuss whether there is any scenario in which `ifelse`, `when`, or `otherwise` would be used in a vectorized way.

      A dummy example is:

      ifelse(lit(1) == lit(1), lit(2), lit(3))
      

      A concrete example might be:

      ifelse(df$mpg > 0, df$mpg, 0)
      

            People

            • Assignee: Sen Fang (saurfang)
            • Reporter: Sen Fang (saurfang)
            • Votes: 0
            • Watchers: 4
