Apache Sedona / SEDONA-14

Saving dataframe to CSV or Parquet fails due to unknown type

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.0.0
    • Fix Version/s: 1.0.1

    Description

      Tested with Sedona 1.0.0 RC1 in two environments: PySpark 3.0.1 (Python 3.8.6) on a MacBook Pro (macOS Big Sur), and PySpark 3.0.1 (Python 3.7.9) on AWS EMR 6.2.0.

      After adding a point geometry field to a dataframe, I'm unable to save the new dataframe. Saving to CSV raises a Python error (see csv_save_error.txt), while saving to Parquet produces a Java stack trace (see parquet_save_error.txt).

      A test script that reproduces the problem is also attached (test_save_df_issue.py).
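
A possible workaround on an affected version, sketched here as an assumption rather than anything confirmed in this issue, is to convert geometry columns to WKT text before writing. The helper below only builds the `selectExpr` strings, so it runs without Spark. `ST_AsText` is a Sedona SQL function; the helper name `wkt_select_exprs`, the column names, and the output path are hypothetical.

```python
from typing import Iterable, List


def wkt_select_exprs(columns: Iterable[str], geometry_columns: Iterable[str]) -> List[str]:
    """Build selectExpr strings that wrap geometry columns in ST_AsText.

    Non-geometry columns are passed through unchanged. Column names are
    backtick-quoted so names containing spaces or dots stay valid.
    """
    geometry = set(geometry_columns)
    exprs = []
    for name in columns:
        if name in geometry:
            # Replace the geometry value with its WKT string, keeping the column name.
            exprs.append(f"ST_AsText(`{name}`) AS `{name}`")
        else:
            exprs.append(f"`{name}`")
    return exprs


# Hypothetical usage with a Sedona-enabled SparkSession (not executed here):
#   exprs = wkt_select_exprs(df.columns, ["geom"])
#   df.selectExpr(*exprs).write.mode("overwrite").csv("/tmp/points_csv")
```

Writing WKT strings sidesteps the geometry type in the output schema; the geometry can be rebuilt on read with a function such as `ST_GeomFromWKT`.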

      Attachments

        1. parquet_save_error.txt
          30 kB
          Hugh Saalmans
        2. csv_save_error.txt
          1 kB
          Hugh Saalmans
        3. test_save_df_issue.py
          1.0 kB
          Hugh Saalmans

          People

            Assignee: Unassigned
            Reporter: Hugh Saalmans (minus34)
            Votes: 0
            Watchers: 4