[SPARK-35337] pandas API on Spark: Separate basic operations into data type based structures

Details

    • Type: Umbrella
    • Status: Resolved
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: 3.2.0
    • Fix Version/s: None
    • Component/s: PySpark
    • Labels: None

    Description

      Currently, each basic operation is defined in a single function that handles all data types, so it is difficult to extend or change its behavior for a specific data type. For example, the binary operation Series + Series behaves differently depending on the data type: it adds numerical operands, concatenates string operands, and so on. These differences are implemented with if-else branches inside the function, which makes the logic messy and difficult to maintain or reuse.

      We should provide an infrastructure to manage the differences in these operations.
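
As a rough illustration of the idea, the following minimal sketch moves the per-type behavior of `+` out of one if-else function and into one class per data type, selected through a small registry. It uses plain Python lists instead of Series so that it runs without Spark. All names here (`DataTypeOps`, `NumericOps`, `StringOps`, `BooleanOps`, `_OPS_BY_TYPE`, `add`) are hypothetical and chosen for this sketch; treating boolean addition as logical OR is likewise an assumption made for the example, not a statement about the actual implementation.

```python
class DataTypeOps:
    """Fallback: an operation is unsupported unless a subclass defines it."""

    type_name = "unknown"

    def add(self, left, right):
        raise TypeError(f"Addition is not supported for {self.type_name} operands.")


class NumericOps(DataTypeOps):
    type_name = "numeric"

    def add(self, left, right):
        # Element-wise arithmetic addition.
        return [a + b for a, b in zip(left, right)]


class StringOps(DataTypeOps):
    type_name = "string"

    def add(self, left, right):
        # Element-wise concatenation; reject non-string right operands.
        if not all(isinstance(b, str) for b in right):
            raise TypeError("String addition requires string operands.")
        return [a + b for a, b in zip(left, right)]


class BooleanOps(DataTypeOps):
    type_name = "boolean"

    def add(self, left, right):
        # Assumption for this sketch: boolean addition means logical OR.
        return [a or b for a, b in zip(left, right)]


# Hypothetical registry: exact element type -> the object owning its behavior.
_OPS_BY_TYPE = {
    bool: BooleanOps(),
    int: NumericOps(),
    float: NumericOps(),
    str: StringOps(),
}


def add(left, right):
    """Dispatch on the type of the first element of a non-empty left operand."""
    ops = _OPS_BY_TYPE.get(type(left[0]), DataTypeOps())
    return ops.add(left, right)
```

With this layout, supporting a new data type means adding one class and one registry entry; the shared `add` entry point and the existing types stay untouched.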

      Please refer to the document "pandas APIs on Spark: Separate basic operations into data type based structures" for details.

          There are no Sub-Tasks for this issue.

            People

              Assignee: Xinrong Meng (XinrongM)
              Reporter: Xinrong Meng (XinrongM)
