Spark / SPARK-44464

Fix applyInPandasWithStatePythonRunner to output rows that have Null as first column value

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.3.3
    • Fix Version/s: 3.4.2, 3.5.0
    • Component/s: Structured Streaming
    • Labels: None

    Description

      The current implementation of ApplyInPandasWithStatePythonRunner cannot handle outputs in which the first column of a row is null. It cannot distinguish between two cases: the column value is genuinely null, or the field was filled in as padding because the number of data records is smaller than the number of state records. In the former case it produces incorrect results.
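
      The ambiguity described above can be illustrated with a minimal plain-Python sketch. It is not taken from the Spark source: the helper names (bundle, unbundle_by_first_column, unbundle_by_count) are hypothetical, and carrying an explicit row count is only one possible way to tell the two cases apart; the approach used by the actual fix is not stated in this issue.

      ```python
      def bundle(data_rows, state_rows, width):
          """Pad the shorter side with nulls so both sides have the same length."""
          n = max(len(data_rows), len(state_rows))
          pad = (None,) * width  # an all-null filler row
          data = list(data_rows) + [pad] * (n - len(data_rows))
          state = list(state_rows) + [None] * (n - len(state_rows))
          return data, state


      def unbundle_by_first_column(data):
          """Ambiguous: treats every row whose first column is null as padding."""
          return [row for row in data if row[0] is not None]


      def unbundle_by_count(data, num_data_rows):
          """Unambiguous (assumed approach): uses an explicit count of real rows."""
          return data[:num_data_rows]


      # Two real output rows; the second has a genuine null in its first column.
      data_rows = [("a", 1), (None, 2)]
      state_rows = ["s1", "s2", "s3"]  # more state records than data records

      data, state = bundle(data_rows, state_rows, width=2)
      naive = unbundle_by_first_column(data)            # loses the (None, 2) row
      fixed = unbundle_by_count(data, len(data_rows))   # keeps both real rows
      ```

      With the first-column check, the row (None, 2) is indistinguishable from the padding row and is dropped, which is the incorrect result this issue describes.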

          People

            Assignee: Siying Dong (siying)
            Reporter: Siying Dong (siying)
            Votes: 0
            Watchers: 2
