SPARK-970: PySpark's saveAsTextFile() throws UnicodeEncodeError when saving unicode strings

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.7.0, 0.7.1, 0.7.2, 0.7.3, 0.8.0
    • Fix Version/s: 0.8.1, 0.9.0
    • Component/s: PySpark
    • Labels: None

    Description

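      PySpark throws a UnicodeEncodeError when saving unicode objects that contain non-ASCII characters to text files. saveAsTextFile() calls str() to get each object's string representation; under Python 2, str() on a unicode object implicitly encodes it as ASCII, which fails on any non-ASCII character. It should call unicode() instead.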

      This is probably a one-line fix.

      This was originally reported on the mailing list at https://mail-archives.apache.org/mod_mbox/incubator-spark-user/201311.mbox/%3CCAPS2vjrorZGbxt7Nyqb1ZLZABk2MZy1O1p-KfF%3DxGJzSN0oq9g%40mail.gmail.com%3E
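
The failure mode described above can be illustrated without Spark. The following is a minimal sketch, not the actual patch: the helper names `line_via_ascii` and `line_via_utf8` are hypothetical, and the implicit ASCII encoding that Python 2's str() performs on unicode objects is emulated explicitly so that the snippet also runs on Python 3.

```python
# -*- coding: utf-8 -*-
# Hypothetical helpers for illustration only; these are not PySpark functions.

def line_via_ascii(obj):
    """Emulate Python 2's str(u"...") by encoding the text as ASCII."""
    return u"{0}".format(obj).encode("ascii")


def line_via_utf8(obj):
    """Assumed shape of a fix: take the text form, then encode it as UTF-8."""
    return u"{0}".format(obj).encode("utf-8")


sample = u"caf\u00e9"  # contains one non-ASCII character (e with acute accent)

# The ASCII path raises the error named in this issue.
try:
    line_via_ascii(sample)
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# The UTF-8 path succeeds and yields bytes that can be written to a file.
utf8_bytes = line_via_utf8(sample)
```

ASCII-only input passes through both helpers unchanged, which may be why the problem only surfaces once the data contains non-ASCII text.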

          People

            Assignee: Josh Rosen (joshrosen)
            Reporter: Josh Rosen (joshrosen)