SPARK-11772

DataFrame.show() fails with non-ASCII strings

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Not A Problem
    • Affects Version/s: 1.5.1
    • Fix Version/s: None
    • Component/s: SQL
    • Labels: None

    Description

      When a DataFrame contains a non-ASCII string (observed in PySpark at least), the DataFrame.show() method fails with a UnicodeEncodeError.

      df = sqlContext.createDataFrame([[u'ab\u0255']])
      df.show()
      

      Results in:

      15/11/16 21:36:54 INFO DAGScheduler: ResultStage 1 (showString at NativeMethodAccessorImpl.java:-2) finished in 0.148 s
      15/11/16 21:36:54 INFO DAGScheduler: Job 1 finished: showString at NativeMethodAccessorImpl.java:-2, took 0.192634 s
      Traceback (most recent call last):
        File ".../show_bug.py", line 8, in <module>
          df.show()
        File ".../spark-1.5.2-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/sql/dataframe.py", line 256, in show
      UnicodeEncodeError: 'ascii' codec can't encode character u'\u0255' in position 21: ordinal not in range(128)
      15/11/16 21:36:54 INFO SparkContext: Invoking stop() from shutdown hook
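The traceback shows Python's 'ascii' codec rejecting the character u'\u0255', which suggests the failure happens when the rendered table is encoded for output rather than inside Spark's query execution. As a minimal sketch, assuming (the report does not confirm this) that the interpreter's output encoding was ASCII, the same error class can be reproduced without Spark. The names try_ascii, ascii_error, and utf8_bytes are hypothetical and used only for illustration.

```python
# -*- coding: utf-8 -*-
# Sketch only: reproduces the codec failure from the traceback without Spark.
# Assumption: the original failure came from encoding the output as ASCII.

text = u'ab\u0255'  # the same string used in the report


def try_ascii(s):
    """Return the codec error message if s cannot be ASCII-encoded, else None."""
    try:
        s.encode('ascii')
    except UnicodeEncodeError as exc:
        return str(exc)
    return None


# Encoding as ASCII fails because U+0255 is outside the range 0-127.
ascii_error = try_ascii(text)

# Encoding as UTF-8 succeeds; U+0255 becomes the two bytes 0xC9 0x95.
utf8_bytes = text.encode('utf-8')
```

If the output encoding is indeed the cause, one commonly used workaround is to start Python with the PYTHONIOENCODING environment variable set to utf8, so that text written to stdout is encoded as UTF-8 instead of ASCII.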
      

People

    • Assignee: Unassigned
    • Reporter: Greg Baker (ggbaker)