  1. Spark
  2. SPARK-33064

Spark-shell does not display accented chara


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Cannot Reproduce
    • Affects Version/s: 3.0.1
    • Fix Version/s: None
    • Component/s: Spark Shell
    • Labels: None
    • Environment: Windows 10

      "Beta: Use Unicode UTF-8 for worldwide language support" has been checked.

      Café
      Café

      +-----+
       | _c0|
      +-----+
       | Caf|
       |Café|
      +-----+

    Description

      It seems to be a duplicate of FLEX-18425, which is a duplicate of SDK-17398, which no longer exists. But the bug remains.

      (1) I create a text file "café.txt" that contains two lines:

      Café

      Café

      (2) I type the following command:

      spark.read.csv("café.txt").show()

      The shell displays it as follows:

      spark.read.csv("caf.txt").show()

      But it works and returns this:

      -----
       |   _c0|
      -----
       |  Caf|
       |Café|
      -----

      We notice a shift after "Caf" and "Café".

      (3) The following two commands work. The written text files have the same content as "café.txt".

      spark.read.csv("café.txt").write.format("text").save("café2")

      sc.textFile("café.txt").saveAsTextFile("café3")

       

      Once again, the Spark shell displays this:

      spark.read.csv("caf.txt").write.format("text").save("caf2")

      sc.textFile("caf.txt").saveAsTextFile("caf3")
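      Step (3) suggests that the data is read and written correctly and that only the console echo is affected. One way to check this without relying on how the console renders "é" is to print the code points of each value in plain ASCII. The following is a minimal sketch; `CodePointDump` and `dump` are hypothetical names introduced here for illustration and are not part of Spark.

```scala
object CodePointDump {
  // Render each UTF-16 code unit as U+XXXX so the output is pure ASCII
  // and does not depend on the console's handling of accented characters.
  def dump(s: String): String =
    s.map(c => f"U+${c.toInt}%04X").mkString(" ")
}

// Possible use inside spark-shell (needs a running Spark session):
//   spark.read.csv("café.txt").collect()
//     .foreach(r => println(CodePointDump.dump(r.getString(0))))
```

      If the last code point printed for each row is U+00E9, the "é" was read correctly and the problem is limited to display.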

       

      (4) If I type "é" 7 times and then Backspace 7 times, using the "é" key of my French keyboard, the Scala prompt disappears. I get a new prompt when I press Return.

       

      Issue (4) and the shift in (2) both seem to be related to a difference between the number of characters counted and the number of characters displayed.
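      To illustrate how such a count mismatch can arise, the sketch below compares the number of characters in "Café" with the number of bytes in its UTF-8 encoding, and shows what happens when those bytes are decoded with a single-byte charset. This is only an illustration of the general mechanism; it does not claim to reproduce what the Windows console or the shell's line editor actually does. `WidthMismatch` is a hypothetical name.

```scala
import java.nio.charset.StandardCharsets

object WidthMismatch {
  // "Café" written with a precomposed é (U+00E9)
  val word: String = "Caf\u00e9"

  // Number of UTF-16 code units in the string
  val charCount: Int = word.length

  // Number of bytes once encoded as UTF-8 (é takes two bytes)
  val utf8Bytes: Array[Byte] = word.getBytes(StandardCharsets.UTF_8)
  val byteCount: Int = utf8Bytes.length

  // Decoding the UTF-8 bytes with a single-byte charset
  // turns the two bytes of é into two separate characters.
  val misdecoded: String = new String(utf8Bytes, StandardCharsets.ISO_8859_1)
}
```

      Here the string has 4 characters but 5 UTF-8 bytes, so any component that counts bytes where another counts characters would be off by one per accented letter.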

       

      (5) I do not get this issue when launching Spark from Ubuntu under "Windows Subsystem for Linux" version 2.
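      Since the behaviour differs between native Windows and WSL 2, it may help to compare the encodings the JVM reports in each environment. The sketch below collects a few standard JVM settings; `EncodingReport` is a hypothetical helper, and `sun.jnu.encoding` is a JDK-specific property that may be absent (it then prints as null).

```scala
import java.nio.charset.Charset

object EncodingReport {
  // Collect the encodings and platform name the JVM reports.
  def lines(): Seq[String] = Seq(
    "default charset  = " + Charset.defaultCharset().name(),
    "file.encoding    = " + System.getProperty("file.encoding"),
    "sun.jnu.encoding = " + System.getProperty("sun.jnu.encoding"),
    "os.name          = " + System.getProperty("os.name")
  )
}

// Possible use inside spark-shell:
//   EncodingReport.lines().foreach(println)
```

      Running this in both shells and comparing the output would show whether the two JVMs start with different default charsets.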


          People

            Assignee: Unassigned
            Reporter: Laurent GUEMAPPE (lologuem)