  1. Spark
  2. SPARK-36031 Keep same behavior with pandas for operations of series with nan
  3. SPARK-36143

Adjust `astype` of fractional Series with missing values to follow pandas

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.2.0
    • Fix Version/s: 3.2.0
    • Component/s: PySpark
    • Labels: None

    Description

      >>> import numpy as np
      >>> import pandas as pd
      >>> import pyspark.pandas as ps
      >>> pser = pd.Series([1, 2, np.nan], dtype=float)
      >>> psser = ps.from_pandas(pser)
      >>> pser.astype(int)
      ...
      ValueError: Cannot convert non-finite values (NA or inf) to integer
      >>> psser.astype(int)
      0    1.0
      1    2.0
      2    NaN
      dtype: float64
      

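      As shown above, `astype(int)` on a fractional Series that contains missing values does not behave the same as in pandas: pandas raises a `ValueError`, while pandas-on-Spark returns the Series unchanged with dtype float64. We should adjust pandas-on-Spark to follow pandas.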
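
The target behavior can be sketched in plain Python, independent of either library. This is only an illustration of the rule pandas applies in the example above (reject non-finite values before an integer cast); the helper name `astype_int_strict` is hypothetical, and this is not the actual pandas or pandas-on-Spark implementation.

```python
import math


def astype_int_strict(values):
    """Hypothetical helper: cast floats to int, rejecting NaN and inf.

    Mirrors the pandas behavior quoted in the description; it is not the
    actual pandas or pandas-on-Spark code.
    """
    if any(not math.isfinite(v) for v in values):
        # Same message that pandas raises in the example above.
        raise ValueError(
            "Cannot convert non-finite values (NA or inf) to integer"
        )
    # int() truncates toward zero.
    return [int(v) for v in values]


print(astype_int_strict([1.0, 2.0]))  # [1, 2]

try:
    astype_int_strict([1.0, 2.0, float("nan")])
except ValueError as exc:
    print(exc)
```

The point of the sketch is that the cast fails loudly when a missing value is present, instead of quietly leaving the data as float64.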

          People

            Assignee: Xinrong Meng
            Reporter: Xinrong Meng