Spark / SPARK-21187

Complete support for remaining Spark data types in Arrow Converters

    Details

    • Type: Umbrella
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.3.0
    • Fix Version/s: None
    • Component/s: PySpark, SQL
    • Labels: None

      Description

      This is to track adding the remaining type support in Arrow Converters. Currently, only primitive data types are supported.

      Remaining types:

      • Date
      • Timestamp
      • Complex: Struct, Array, Arrays of Date/Timestamps, Map
      • Decimal
      • Binary
      • Categorical when converting from Pandas
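
      As a quick illustration of the user-facing path these converters sit behind, here is a minimal Spark-to-pandas sketch. It is not taken from this issue: the local session, app name, and sample values are assumptions, and the spark.sql.execution.arrow.enabled flag is the Spark 2.3-era way to opt in to the Arrow path.

      import datetime

      from pyspark.sql import Row, SparkSession

      spark = (SparkSession.builder
               .master("local[2]")
               .appName("arrow-date-timestamp")   # illustrative app name
               .getOrCreate())

      # Opt in to the Arrow-based conversion path.  On versions where a column
      # type is not yet supported by the converters, toPandas() may fall back
      # to the slower row-based path or raise, depending on configuration.
      spark.conf.set("spark.sql.execution.arrow.enabled", "true")

      df = spark.createDataFrame([
          Row(d=datetime.date(2017, 6, 23), ts=datetime.datetime(2017, 6, 23, 12, 0, 0)),
          Row(d=datetime.date(2017, 6, 24), ts=datetime.datetime(2017, 6, 24, 12, 0, 0)),
      ])

      # Date and timestamp columns are among the types this umbrella tracks.
      pdf = df.toPandas()
      print(pdf.dtypes)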

      Some things to do before closing this out:

      • Look into upgrading to Arrow 0.7 for better Decimal support (values can now be written as BigDecimal); see the pandas-to-Spark sketch after this list
      • Need to add some user docs
      • Make sure Python tests are thorough
      • Check into the complex type support mentioned in the comments by Leif Walsh; should we support multi-indexing?
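
      For the pandas-to-Spark direction (the Categorical item above and the Decimal to-do), a minimal sketch follows. It assumes a local session and illustrative column names; on versions without the relevant converter support, createDataFrame() may need an explicit schema or a non-Arrow fallback.

      import decimal

      import pandas as pd
      from pyspark.sql import SparkSession

      spark = (SparkSession.builder
               .master("local[2]")
               .appName("arrow-pandas-to-spark")   # illustrative app name
               .getOrCreate())
      spark.conf.set("spark.sql.execution.arrow.enabled", "true")

      # A pandas frame with a categorical column and Python Decimal values.
      # Whether these go through Arrow, or require a fallback or an explicit
      # schema, depends on the converter support tracked by this issue.
      pdf = pd.DataFrame({
          "category": pd.Categorical(["a", "b", "a"]),
          "amount": [decimal.Decimal("1.10"), decimal.Decimal("2.20"),
                     decimal.Decimal("3.30")],
      })

      df = spark.createDataFrame(pdf)
      df.printSchema()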


    People

    • Assignee: Bryan Cutler (bryanc)
    • Reporter: Bryan Cutler (bryanc)
    • Votes: 2
    • Watchers: 30
