Spark / SPARK-21187

Complete support for remaining Spark data types in Arrow Converters


Details

    • Type: Umbrella
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.3.0
    • Fix Version/s: 3.1.0
    • Component/s: PySpark, SQL
    • Labels: None

    Description

      This is to track adding the remaining type support in Arrow Converters. Currently, only primitive data types are supported.

      Remaining types:

      • Date
      • Timestamp
      • Complex: Struct, Array, Map
      • Decimal
      • Binary
      • Categorical when converting from Pandas

      Some things to do before closing this out:

      • Look into upgrading to Arrow 0.7 for better Decimal support (values can now be written as BigDecimal)
      • Need to add some user docs
      • Make sure Python tests are thorough
      • Check into the complex type support mentioned in comments by Leif Mortenson; should we support multi-indexing?

          People

            Assignee: Bryan Cutler (bryanc)
            Reporter: Bryan Cutler (bryanc)
            Votes: 3
            Watchers: 30

            Dates

              Created:
              Updated:
              Resolved:
