Spark / SPARK-3103

Fix UTF8 encoding in PySpark saveAsTextFile().

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.0.2, 1.1.0
    • Fix Version/s: 1.1.0
    • Component/s: PySpark

    Description

      This is a follow-up JIRA for https://github.com/apache/spark/pull/1914, where Ahir and Davies identified a bug in Python JsonRDD when trying to encode non-ASCII strings into unicode.

      The same underlying issue affects saveAsTextFile, so we should apply the same fix there, too, and search for any other code that needs to be updated (and maybe refactor this out into a utility function).
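
      A minimal sketch of the kind of utility function suggested above, written for Python 3. The names to_utf8_bytes and encode_records are hypothetical and are not part of the Spark API; the code only illustrates normalizing each record to UTF-8 bytes before it is written, and it is not the patch that resolved this issue.

```python
def to_utf8_bytes(value):
    """Return `value` as UTF-8 encoded bytes (hypothetical helper, not a Spark API)."""
    if isinstance(value, bytes):
        # Already encoded: pass through unchanged.
        return value
    if not isinstance(value, str):
        # Non-string records are converted to their text form first.
        value = str(value)
    # Encode explicitly as UTF-8 so non-ASCII characters are preserved.
    return value.encode("utf-8")


def encode_records(iterator):
    """Lazily encode every record of a partition before it is written out."""
    for record in iterator:
        yield to_utf8_bytes(record)
```

      Under these assumptions, a function shaped like encode_records could be passed to RDD.mapPartitions so that every save path shares one encoding routine instead of repeating the logic.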

          People

            Assignee: Davies Liu (davies)
            Reporter: Josh Rosen (joshrosen)
            Votes: 0
            Watchers: 2
