Spark / SPARK-35525

Define UDTs in schemas using string format

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.1.1
    • Fix Version/s: None
    • Component/s: SQL

    Description

      In PySpark, where UDTs are public (in 3.1.1, for example), you can define a schema that uses a UDT by constructing it explicitly:

      schema = StructType([StructField("Stuff", MyUDT())])

      but the format

      schema = "Stuff MyUDT"

      does not work.

      UDTs are officially being made public again for Scala in 3.2.0, so supporting them in string-format schemas is more important now.
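
      To illustrate the behaviour being requested, here is a minimal user-side sketch in plain Python. It is not Spark's parser and not a proposed patch: `UDT_REGISTRY` and `resolve_schema_string` are hypothetical names, and `MyUDT` is a stand-in for a real subclass of `pyspark.sql.types.UserDefinedType`. It assumes a flat schema string with comma-separated `name type` entries and does not handle nested or parameterised types.

```python
from typing import Callable, Dict, List, Tuple


class MyUDT:
    """Hypothetical stand-in for a PySpark UserDefinedType subclass."""

    def __repr__(self) -> str:
        return "MyUDT()"


# Hypothetical registry: names allowed in schema strings -> UDT factories.
UDT_REGISTRY: Dict[str, Callable[[], object]] = {"MyUDT": MyUDT}


def resolve_schema_string(
    schema: str,
    registry: Dict[str, Callable[[], object]] = UDT_REGISTRY,
) -> List[Tuple[str, object]]:
    """Split a flat 'name type, name type' string into (name, type) pairs.

    Registered UDT names are replaced by UDT instances; any other type
    name is passed through unchanged as a string.
    """
    fields: List[Tuple[str, object]] = []
    for part in schema.split(","):
        tokens = part.split(None, 1)  # field name, then the rest as the type
        if len(tokens) != 2:
            raise ValueError(f"expected 'name type', got {part!r}")
        name, type_name = tokens[0], tokens[1].strip()
        factory = registry.get(type_name)
        fields.append((name, factory() if factory else type_name))
    return fields


pairs = resolve_schema_string("Stuff MyUDT, n int")
print(pairs)
```

      With PySpark available, each UDT pair could then be wrapped as `StructField(name, udt)`, which is the explicit form the description already reports as working.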

          People

            Assignee: Unassigned
            Reporter: Julian Shalaby (jshalaby)

              Time Tracking

                Estimated: 3h
                Remaining: 3h
                Logged: Not Specified
